Re: [OMPI devel] Potential ob1 bug

2012-05-07 Thread Nathan Hjelm
George, thanks for taking a look. Your patch looks good, and I can confirm it fixes the hang I am seeing on our XE6. -Nathan On Thu, 3 May 2012, George Bosilca wrote: Nathan, you're right: when we loop trying to restart a failed request, we must reset the convertor. However: 1. the position i

Re: [OMPI devel] Potential ob1 bug

2012-05-03 Thread George Bosilca
Nathan, you're right: when we loop trying to restart a failed request, we must reset the convertor. However: 1. the position in this case is always zero, so we don't have to save the previous position in order to restore it. 2. all cases must be protected, not only the mca_pml_ob1_send_request_s

[OMPI devel] Potential ob1 bug

2012-05-01 Thread Hjelm, Nathan T
I ran across a problem in a failure path of start_prepare in ob1. If prepare_src succeeds but the send fails, the send request convertor needs to be rolled back to the correct position. Can someone with more knowledge of ob1 check whether this is indeed an error? Patch is below. -Nathan diff --git a/ompi/m
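
For readers following the thread, the failure mode can be sketched with a toy model in C. This is not the ob1 code or the patch under discussion: toy_convertor_t, toy_prepare_src, toy_send, and toy_start_request are hypothetical stand-ins invented for illustration. The sketch assumes, as George notes in his reply, that the convertor position at the start of a request is always zero, so the retry path resets to zero rather than saving and restoring an earlier position.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a datatype convertor (hypothetical, not the OPAL type):
 * it only tracks how many bytes of the user buffer have been packed. */
typedef struct {
    size_t position; /* bytes already consumed from the user buffer */
    size_t total;    /* total bytes in the message */
} toy_convertor_t;

/* Hypothetical "prepare" step: packs up to max bytes and, as a side
 * effect, advances the convertor position. */
static size_t toy_prepare_src(toy_convertor_t *conv, size_t max)
{
    size_t remaining = conv->total - conv->position;
    size_t packed = remaining < max ? remaining : max;
    conv->position += packed;
    return packed;
}

/* Hypothetical send that fails while *failures_left is positive. */
static bool toy_send(int *failures_left)
{
    if (*failures_left > 0) {
        (*failures_left)--;
        return false;
    }
    return true;
}

/* Start a request, retrying when the send fails. Returns the number of
 * attempts used, or -1 if every attempt failed. On success, *sent_offset
 * is the buffer offset of the first byte that was sent. */
static int toy_start_request(toy_convertor_t *conv, size_t frag_size,
                             int *failures_left, int max_attempts,
                             size_t *sent_offset)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        /* Each attempt must begin from the start of the buffer. */
        assert(conv->position == 0);

        size_t packed = toy_prepare_src(conv, frag_size);
        if (toy_send(failures_left)) {
            *sent_offset = conv->position - packed;
            return attempt;
        }

        /* The prepare step succeeded but the send failed: rewind the
         * convertor so the retry packs the same bytes again. Without
         * this reset, the next attempt would skip the first fragment. */
        conv->position = 0;
    }
    return -1;
}
```

With a 100-byte message, a 32-byte fragment, and two injected send failures, the third attempt succeeds and still sends from offset 0. Removing the reset would leave the position at 32 after the first failure, which is the kind of stale state the original report describes.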