Preventing recovery failures

Protecting from recovery faults

The advantage of designing the recovery interaction in the form of a wizard is that the operator has only a limited set of options, and it is easy to make sure that the system supports all of them. The only additional support needed is a way for operators to undo some of their recovery decisions. Wizard forms let the user specify parameters gradually and go back to previous forms before submitting the recovery command. Other means of protecting against recovery faults are typically application specific.

Recovery interaction style

Typically, users do not follow recovery procedures very often. Therefore, we should assume that the recovery screens are new to the users. Also, in an emergency, the operator’s mental effort is entirely devoted to solving the problem, and nothing is left for learning new screens. This implies that recovery screens should be designed as wizards that present one option at a time, each with a safe default. If the screen is overloaded with irrelevant data, the user will fail, as happened in the NYC blackout accident.
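
The wizard behavior described above (one question at a time, a safe default, and the ability to go back before submitting) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names Step and RecoveryWizard, the method names, and the example questions are all hypothetical and do not come from any particular system.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Step:
    """One wizard form: a single question with a safe default (hypothetical)."""
    prompt: str
    options: tuple[str, ...]
    safe_default: str


@dataclass
class RecoveryWizard:
    """Hypothetical wizard that collects decisions one step at a time."""
    steps: list[Step]
    answers: list[str] = field(default_factory=list)
    submitted: bool = False

    def current(self) -> Step | None:
        """Return the step awaiting an answer, or None when all are answered."""
        if len(self.answers) < len(self.steps):
            return self.steps[len(self.answers)]
        return None

    def choose(self, option: str | None = None) -> None:
        """Answer the current step; passing None selects the safe default."""
        step = self.current()
        if step is None or self.submitted:
            raise RuntimeError("no step is awaiting an answer")
        choice = step.safe_default if option is None else option
        if choice not in step.options:
            # Only the limited set of supported options is accepted.
            raise ValueError(f"unsupported option: {choice!r}")
        self.answers.append(choice)

    def back(self) -> None:
        """Undo the most recent decision; allowed only before submission."""
        if self.submitted:
            raise RuntimeError("the recovery command was already submitted")
        if self.answers:
            self.answers.pop()

    def submit(self) -> list[str]:
        """Submit the recovery command once every step has an answer."""
        if self.current() is not None:
            raise RuntimeError("some steps are still unanswered")
        self.submitted = True
        return list(self.answers)


# Example session with made-up recovery questions.
wizard = RecoveryWizard([
    Step("Restart which service?", ("none", "database", "all"), safe_default="none"),
    Step("Restore from backup?", ("no", "yes"), safe_default="no"),
])
wizard.choose("all")   # operator picks a drastic option...
wizard.back()          # ...then reconsiders before submitting
wizard.choose()        # accepts the safe default instead
wizard.choose("yes")
command = wizard.submit()
```

Because each step accepts only its listed options, it is straightforward to check that the system supports every path through the wizard.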
Preventing decision errors

Suppose that the user chose an option that we consider unsafe. Typically, we require the user’s confirmation by prompting “Are you sure? Y/N”. But then, how should the user respond to this question? If the action was unintentional, the user may choose the N option, but what if the user intended to choose the risky option? Users rarely understand the risks of confirming a risky option unless we give them all the details about it: Why is my confirmation required? What does the option Y mean? What does the option N mean? How will the system respond if I confirm my command? Is it safe? Can I undo it later? Can I give up this option and still accomplish my task? Can I try before deciding, and see the possible outcome?

We need to provide all these details so that the user feels safe confirming the command. Also, in the case of automatic recovery, the system should clearly indicate why and how it is recovering, and which stable state it is targeting.
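
One way to make sure a confirmation prompt answers the questions listed above is to treat each answer as a required field of the prompt. The Python sketch below illustrates the idea; the class RiskyActionPrompt, its field names, the example texts, and the rule that anything other than an explicit "yes" counts as N are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RiskyActionPrompt:
    """Hypothetical confirmation prompt that carries its own explanation."""
    action: str                     # what the user asked for
    why_confirm: str                # why confirmation is required
    if_yes: str                     # what Y means and how the system responds
    if_no: str                      # what N means
    reversible: bool                # can the decision be undone later?
    alternative: str | None = None  # another way to accomplish the task
    preview: str | None = None      # how to try it and see the outcome first

    def render(self) -> str:
        """Build the text shown to the user in place of a bare 'Are you sure?'."""
        lines = [
            f"Confirm: {self.action}",
            f"Why we ask: {self.why_confirm}",
            f"Y: {self.if_yes}",
            f"N: {self.if_no}",
            "Can be undone later: " + ("yes" if self.reversible else "no"),
        ]
        if self.alternative:
            lines.append(f"Alternative: {self.alternative}")
        if self.preview:
            lines.append(f"Try first: {self.preview}")
        return "\n".join(lines)


def confirmed(reply: str) -> bool:
    """Assumed policy: only an explicit yes confirms; anything else means N."""
    return reply.strip().lower() in {"y", "yes"}


# Example with made-up content.
prompt = RiskyActionPrompt(
    action="Restore the database from last night's backup",
    why_confirm="Changes made after the backup will be lost.",
    if_yes="The restore starts and the service is offline until it ends.",
    if_no="Nothing changes and you return to the previous form.",
    reversible=False,
    alternative="Repair only the damaged table.",
)
print(prompt.render())
```

Making the explanatory fields mandatory means a prompt cannot be constructed without stating what Y and N mean, while the alternative and preview fields remain optional for actions that have none.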