[
https://issues.apache.org/jira/browse/THRIFT-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rush Manbert updated THRIFT-488:
--------------------------------
Attachment: ThreadManagerAssertPatch.txt
For testing, a patch that just adds some asserts to ThreadManager.cpp. See my
last comment for more detail.
> ThreadManager crashing bugs
> ---------------------------
>
> Key: THRIFT-488
> URL: https://issues.apache.org/jira/browse/THRIFT-488
> Project: Thrift
> Issue Type: Bug
> Components: Library (C++)
> Affects Versions: 0.1
> Environment: Mac OS X 10.5.6, Xcode
> Reporter: Rush Manbert
> Attachments: ThreadManager.cpp.patch, ThreadManagerAssertPatch.txt
>
>
> The ThreadManager::Impl and the ThreadManager::Worker classes work together
> to execute client threads. There are race conditions between the two classes
> that can cause Bus Error exceptions in certain cases. In general, these occur
> when a ThreadManager instance is destructed, so might not be seen in long
> running programs. They happen frequently enough, though, that looped
> repetitions of ThreadManagerTests::blockTest() (part of the concurrency_test
> program) fail quite often.
> These errors are generally not seen with the current version of
> ThreadManagerTests::blockTest() due to errors in the test itself that cause
> failures at a far higher frequency. In order to see them, you need to apply
> the patches that are attached to THRIFT-487
> (https://issues.apache.org/jira/browse/THRIFT-487).
> Test procedure:
> 1) Apply the patch from THRIFT-487 for the Tests.cpp file.
> 2) Run make in lib/cpp in order to rebuild concurrency_test
> 3) Run concurrency_test with the command line argument "thread-manager" and
> observe that the test hangs in no time.
> 4) Apply the patch from THRIFT-487 for the ThreadManagerTests.h file.
> 5) Run make in lib/cpp
> 6) Run concurrency_test as before. Observe that now it runs for longer
> (generally) and usually fails with an assert in Monitor.cpp. This failure is
> because of one of the bugs in ThreadManager.
> 7) Apply the attached patch file for ThreadManager.cpp
> 8) Run make in lib/cpp
> 9) Run concurrency_test as before. It should just run, and run, and run.
> Note that there is a possible path through the original
> ThreadManager::Worker::run() method where active never becomes true. In
> practice, exercising this code path is difficuly. The way that I exercised it
> was to edit line 322 in the patched version of ThreadManager.cpp. I changed
> the for statement to read:
> for (size_t ix = 0; ix < value + 1; ix++)
> so that the ThreadManager always created more workers than were needed. That
> extra worker caused quite a bit of trouble until I moved his handling up to
> the top of the run() method. I don't understand how this situation could
> occur in real life, but this code appears to handle it correctly.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.