> On Nov. 15, 2012, 9:17 p.m., Cliff Jansen wrote:
> > I am going to disagree (with the proposed patch, and agree with Andrew). I
> > managed to reproduce (took closer to 15 minutes on my laptop) and get a
> > similar stack trace.
> >
> > I see a shared_ptr with non-null value and use_count of 3. impl->get()
> > would also return true and trigger the assert.
> >
> > On a separate thread, I see the same PollableQueue instance in
> > dispatch()/process(), waiting to reacquire the lock after line 152.
> >
> > I think the explanation is more likely that differences in the poller
> > implementations expose a different scheduling opportunity for the bug on
> > Windows compared to Linux.
>
> Cliff Jansen wrote:
>     Speculating further: on Linux, the PollableCondition has a pipe fd
>     plugged into the poller, and the poller can't see a second event to dispatch
>     until the pipe is re-enabled after the first callback completes.
>
>     On Windows, perhaps there are two async poke's, allowing two dispatches
>     to occur and enabling the window of opportunity. That's just a wild guess,
>     but I think the main clue is that there are two dispatches running on the
>     same PollableQueue in separate threads.

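For illustration only, the scenario in the quoted analysis (two dispatches
running on the same queue in separate threads) can be sketched as below. This
is not the qpid PollableQueue code and not a proposed patch: SketchQueue,
dispatching_, and runSketch are hypothetical names, and the guard flag is just
one possible way to keep a second thread out of process() while the first has
dropped the lock around the callback.

```cpp
#include <atomic>
#include <cassert>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a pollable queue; NOT the qpid class.
class SketchQueue {
public:
    void push(int item) {
        std::lock_guard<std::mutex> l(lock_);
        queue_.push_back(item);
    }

    // May be called from several poller threads at once.
    void dispatch() {
        std::unique_lock<std::mutex> l(lock_);
        if (dispatching_) return;      // another thread already owns dispatch
        dispatching_ = true;
        while (!queue_.empty()) {
            std::deque<int> batch;
            batch.swap(queue_);
            l.unlock();                // callback runs without the lock held
            process(batch);
            l.lock();                  // without the guard, a second thread
                                       // could enter process() in this window
        }
        dispatching_ = false;
    }

    int maxConcurrent() const { return maxInProcess_.load(); }
    int processed() const { return processed_.load(); }

private:
    void process(const std::deque<int>& batch) {
        int now = ++inProcess_;
        // Record the highest number of threads seen inside process().
        int prev = maxInProcess_.load();
        while (now > prev && !maxInProcess_.compare_exchange_weak(prev, now)) {}
        processed_ += static_cast<int>(batch.size());
        std::this_thread::yield();     // widen the scheduling window
        --inProcess_;
    }

    std::mutex lock_;
    std::deque<int> queue_;
    bool dispatching_ = false;         // protected by lock_
    std::atomic<int> inProcess_{0};
    std::atomic<int> maxInProcess_{0};
    std::atomic<int> processed_{0};
};

// Hammers one queue from several threads and returns the largest number of
// threads observed inside process() at the same time.
int runSketch(int threads = 4, int itemsPerThread = 1000) {
    SketchQueue q;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&q, itemsPerThread] {
            for (int i = 0; i < itemsPerThread; ++i) {
                q.push(i);
                q.dispatch();
            }
        });
    }
    for (auto& th : pool) th.join();
    q.dispatch();  // drain items pushed while another thread held the guard
    assert(q.processed() == threads * itemsPerThread);
    return q.maxConcurrent();
}
```

With the guard in place, runSketch() should never observe more than one
thread inside process(). Removing the dispatching_ check allows several
threads into process() at once, which is the kind of window described above.
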
Thank you for working on this, Cliff! Your analysis made me look at the
PollableQueue code (awake this time :-) and it's easy to see how multiple
threads can get into dispatch. The code assumes that once dispatch() is
entered, no other thread will enter it and call process() until the first
thread is done. I'll look further at how to resolve this in PollableQueue.

- Steve


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8072/#review13485
-----------------------------------------------------------


On Nov. 15, 2012, 3:05 a.m., Steve Huston wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8072/
> -----------------------------------------------------------
>
> (Updated Nov. 15, 2012, 3:05 a.m.)
>
>
> Review request for qpid, Andrew Stitcher, Chug Rolke, and Cliff Jansen.
>
>
> Description
> -------
>
> The assert in QPID-4424 was a check for a Thread object not set. This change
> resolves that problem, but could it really be that easy? Why doesn't the
> Linux code fail the same way?
>
>
> This addresses bug QPID-4424.
>     https://issues.apache.org/jira/browse/QPID-4424
>
>
> Diffs
> -----
>
>   http://svn.apache.org/repos/asf/qpid/trunk/qpid/cpp/src/qpid/sys/windows/Thread.cpp 1409628
>
> Diff: https://reviews.apache.org/r/8072/diff/
>
>
> Testing
> -------
>
> Original reproducing test case in QPID-4424 (run broker quiet for 15
> seconds). I set a breakpoint at the assert and stepped across it without
> error.
>
>
> Thanks,
>
> Steve Huston
>
