George, thanks for taking a look. Your patch looks good and I can confirm it
fixes the hang I am seeing on our XE6.
-Nathan
On Thu, 3 May 2012, George Bosilca wrote:
Nathan,
You're right, when we loop trying to restart a failed request we must reset the
convertor. However:
1. the position in this case is always zero, so we don't have to save the
previous position in order to restore it.
2. all cases must be protected, not only the
mca_pml_ob1_send_request_s
I ran across a problem in a failure path of start_prepare in ob1. If prepare_src
succeeds but the send fails, the send request convertor needs to be rolled back
to the correct position. Can someone with more knowledge of ob1 check whether
this is indeed an error? The patch is below.
-Nathan
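
To make the failure path concrete, here is a minimal, self-contained sketch of
the problem. It is not the patch and it does not use the ob1 code: every name
in it (toy_convertor_t, toy_prepare_src, toy_send, toy_start) is a hypothetical
stand-in, and the convertor is reduced to a byte counter. It assumes, as George
notes, that the position to restore is always zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a convertor: it only tracks how many
 * bytes of the user buffer have been packed so far. */
typedef struct {
    size_t position;
    size_t total;
} toy_convertor_t;

/* Hypothetical stand-in for prepare_src: packs up to max bytes and
 * advances the convertor position as a side effect. */
static size_t toy_prepare_src(toy_convertor_t *c, size_t max)
{
    assert(c->position <= c->total);
    size_t remaining = c->total - c->position;
    size_t packed = remaining < max ? remaining : max;
    c->position += packed;
    return packed;
}

/* Hypothetical send that fails for the first *failures_left calls. */
static bool toy_send(int *failures_left)
{
    if (*failures_left > 0) {
        (*failures_left)--;
        return false;
    }
    return true;
}

/* Retry loop: returns the number of bytes packed by the attempt that
 * was finally sent. If reset_on_failure is false, a retry resumes from
 * wherever the failed attempt left the convertor. */
static size_t toy_start(toy_convertor_t *c, size_t max, int failures,
                        bool reset_on_failure)
{
    for (;;) {
        size_t packed = toy_prepare_src(c, max);
        if (toy_send(&failures)) {
            return packed;
        }
        if (reset_on_failure) {
            c->position = 0; /* roll back before retrying */
        }
    }
}
```

With a 100-byte message, a 64-byte limit, and one failed send, the loop without
the reset sends the trailing 36 bytes as if they were the first fragment; with
the reset it sends the first 64 bytes again, as intended.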
diff --git a/ompi/m