On Apr 8, 2008, at 2:21 PM, Kyle Schochenmaier wrote:
I think this is a fairly critical fix and will affect the majority of
the ib users I know of, who are all running cvs head for various
reasons.
The patch only moves two #defines around so I'm not sure why we should
hold off on it, maybe I'm missing something?

Nope, I just need to know whether to merge it to the release branch or not.
-sam



Kyle

On Tue, Apr 8, 2008 at 1:47 PM, Sam Lang <[EMAIL PROTECTED]> wrote:

Does this need to go in the next release?
-sam



On Apr 8, 2008, at 1:43 PM, Pete Wyckoff wrote:




[EMAIL PROTECTED] wrote on Tue, 08 Apr 2008 13:17 -0500:

Troy and I stumbled across this bug, that at least for our
configurations, causes a double-free on the server when cleaning up
'stale' connections.

It seems that some logic is duplicated here when cleaning up
connections that disappear to the server:

#if !MEMCACHE_BOUNCEBUF
         if (rq->state.recv == RQ_RTS_WAITING_RTS_DONE)  /*
--------------- here-------------- */
memcache_deregister(ib_device->memcache, &rq- >buflist);
#  if MEMCACHE_EARLY_REG
         /* pin on post, dereg all these */
if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION ||
             rq->state.recv == RQ_RTS_WAITING_RTS_DONE)    /*
----------------- and here ------------ */
memcache_deregister(ib_device->memcache, &rq- >buflist);
         if (rq->state.recv == RQ_WAITING_INCOMING
           && rq->buflist.tot_len > ib_device->eager_buf_payload)
memcache_deregister(ib_device->memcache, &rq- >buflist);
#  endif

The patch below worked cleans it up, and compiles and runs *here*, but we have some weird hanging issues after closing connections, basically
no new unexpected requests are processed until after the BMItimeout
and/or FlowTimeout (pvfs2.conf) have expired. We're not sure if that
is a seperate issue, it hangs and dies w/o this patch, now we just
hang.  This fixes the segfault which is a more major issue.
Patch is against CVS head from 20 minutes ago.


Yep, good stuff.  I was too eager in adding extra cancel states
back when Troy pointed out there were problems.  This fixes the
bugs that went in with that patch on 15 feb 08.  I think it should
do the saame thing as what you were aiming at.  Let me know if
you like it.

              -- Pete

P.S.  Put a line "diff -uNp" in your ~/.cvsrc.  Unified diffs are
so much easier to read.

commit 0872d4cace8e9cc98ac1a0831d56fb784f2fc0ff
Author: Pete Wyckoff <[EMAIL PROTECTED]>
Date:   Tue Apr 8 14:40:37 2008 -0400

 bmi ib: fix double free in cancel

 Earlier checkin to clean up more thoroughly during a cancel
 errantly introduced a double free in certain states.  Remove
 those.

diff --git a/src/io/bmi/bmi_ib/ib.c b/src/io/bmi/bmi_ib/ib.c
index f8d6b41..76deb7d 100644
--- a/src/io/bmi/bmi_ib/ib.c
+++ b/src/io/bmi/bmi_ib/ib.c
@@ -1496,8 +1496,7 @@ BMI_ib_cancel(bmi_op_id_t id, bmi_context_id
context_id __unused)
/* pin when sending rts, so also must dereg in this state */
          if (sq->state.send == SQ_WAITING_RTS_SEND_COMPLETION ||
sq->state.send == SQ_WAITING_RTS_SEND_COMPLETION_GOT_CTS ||
-               sq->state.send == SQ_WAITING_CTS ||
-               sq->state.send == SQ_WAITING_DATA_SEND_COMPLETION)
+               sq->state.send == SQ_WAITING_CTS)
memcache_deregister(ib_device->memcache, &sq- >buflist);
#  endif
#endif
@@ -1512,8 +1511,7 @@ BMI_ib_cancel(bmi_op_id_t id, bmi_context_id
context_id __unused)
memcache_deregister(ib_device->memcache, &rq- >buflist);
#  if MEMCACHE_EARLY_REG
          /* pin on post, dereg all these */
- if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION ||
-               rq->state.recv == RQ_RTS_WAITING_RTS_DONE)
+ if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION) memcache_deregister(ib_device->memcache, &rq- >buflist);
          if (rq->state.recv == RQ_WAITING_INCOMING
            && rq->buflist.tot_len > ib_device->eager_buf_payload)
_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers







--
Kyle Schochenmaier


Attachment: smime.p7s
Description: S/MIME cryptographic signature

_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers

Reply via email to