On Sat, 19 Jan 2019, Scott Bauer wrote:

On Thu, Jan 17, 2019 at 09:31:55PM +0000, David Kozub wrote:

-       for (state = 0; !error && state < n_steps; state++) {
-               step = &steps[state];
-
-               error = step->fn(dev, step->data);
-               if (error) {
-                       pr_debug("Step %zu (%pS) failed with error %d: %s\n",
-                                state, step->fn, error,
-                                opal_error_to_human(error));
-
-                       /* For each OPAL command we do a discovery0 then we
-                        * start some sort of session.
-                        * If we haven't passed state 1 then there was an error
-                        * on discovery0 or during the attempt to start a
-                        * session. Therefore we shouldn't attempt to terminate
-                        * a session, as one has not yet been created.
-                        */
-                       if (state > 1) {
-                               end_opal_session_error(dev);
-                               return error;
-                       }




+       /* first do a discovery0 */
+       error = opal_discovery0_step(dev);

-               }
-       }
+       for (state = 0; !error && state < n_steps; state++)
+               error = execute_step(dev, &steps[state], state);
+
+       /* For each OPAL command the first step in steps starts some sort
+        * of session. If an error occurred in the initial discovery0 or if
+        * an error stopped the loop in state 0 then there was an error
+        * before or during the attempt to start a session. Therefore we
+        * shouldn't attempt to terminate a session, as one has not yet
+        * been created.
+        */
+       if (error && state > 0)
+               end_opal_session_error(dev);


This is subtly wrong. There was some state that was encoded by having the
loop the way we did.

the tl;dr is the check needs to be if (error && state > 1) still.

The why is that in the previous implementation we wanted to 
end_opal_session_error
only if a successful discovery0 AND a successful start_*_session. In the 
previous loop,
discovery0 would complete, we'd do state++, then we'd go and attempt to start 
our
session. If we failed on that session starting we'd still be in state 1, and 
wouldn't
attempt to close the session.

In the current form, discovery0 is gone, so start session is on state 0. If we 
fail
on the start session, we set error = true, state gets ++'d, then we look at 
!error
and we don't loop again.

We go down to the check and attempt to end a session that was never started.

Ouch! You're right. I'll fix this for v3 by comparing against 1.

There is one more issue that was bugging me. If next() fails at the discovery0 step, or at steps[0], in both cases the error message will say step 0 failed. But as it's just a pr_debug message, the function address is included and I don't see a short and nice solution (should I report the steps as starting from 1? but that might be confusing; or a different string? sounds like not worth it), I kept it that way.

But if someone thinks this is worth improving, please let me know.

Reply via email to