Re: Possible bug in LockingTransaction

Brandon Ibach Thu, 12 Sep 2013 02:11:57 -0700

Thanks for the confirmation, Alex.  Ticket filed: 
<http://dev.clojure.org/jira/browse/CLJ-1260> :)


Aaron, as shown in the test case attached to the ticket, I'm calling the 
Clojure library from Java code, so I use the Agent.dispatch() method.  The 
project where this issue was found is attempting to use Clojure's STM and 
Agent support to fix concurrency issues in an existing, rather large, Java 
code base, which we're not quite ready to port to another language, just 
yet.

Meikel, thanks for the suggestion.  I wasn't very clear about it in my 
description, but a workaround much like that is what I'd already employed 
in order to move forward with my work.  The version of it that is shown in 
the test case on the ticket just references a static field from the RT 
class, which is enough to trigger the runtime initialization.
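
For anyone who wants to see the shape of that workaround without pulling in 
the Clojure jar, here is a minimal sketch in plain Java.  FakeRuntime and 
its MARKER field are hypothetical stand-ins (they are not the RT class or 
any of its fields); the only point is that reading a non-constant static 
field forces the class's static initializer to run, so doing it up front 
keeps initialization from happening in the middle of something delicate.

```java
import java.util.ArrayList;
import java.util.List;

public class EagerInitSketch {
    // Records the order of events so the effect of the warm-up is visible.
    static final List<String> EVENTS = new ArrayList<>();

    // Hypothetical stand-in for a runtime class with a costly static initializer.
    static class FakeRuntime {
        // Not a compile-time constant, so reading it triggers class initialization.
        static final Object MARKER = new Object();

        static {
            EVENTS.add("runtime initialized");
        }
    }

    // Stand-in for code that must not be interrupted by runtime initialization.
    static void criticalSection() {
        EVENTS.add("critical section start");
        Object touched = FakeRuntime.MARKER; // already initialized if warmed up earlier
        EVENTS.add("critical section end");
    }

    public static void main(String[] args) {
        // The workaround: touch a static field before doing any real work.
        Object warmUp = FakeRuntime.MARKER;
        criticalSection();
        System.out.println(EVENTS);
    }
}
```

With the warm-up line in place, "runtime initialized" is recorded before 
the critical section starts; without it, the initializer would run in the 
middle of criticalSection().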

-Brandon :)


On Wednesday, September 11, 2013 8:25:31 AM UTC-4, Alex Miller wrote:
>
> I have not gone to look at the code but the description certainly sounds 
> like a recipe for a bug. 
>
> If you can a) create a reproducible case and b) check that it happens on 
> 1.5 as well we would greatly appreciate a ticket:
>
> Create a jira account - 
> http://dev.clojure.org/jira/secure/Signup!default.jspa
> Then create a jira ticket - 
> http://dev.clojure.org/display/community/Creating+Tickets
>
> Thanks,
> Alex
>
>
> On Wednesday, September 11, 2013 1:47:56 AM UTC-5, Brandon Ibach wrote:
>>
>> I have found what appears to be a bug in LockingTransaction, albeit one 
>> that probably wouldn't occur often.  But, I suppose that's a given for a 
>> previously undiscovered problem in oft-used code that hasn't changed for 
>> some while. :)
>>
>> I'm using the Clojure 1.4 library strictly from Java code and I have a 
>> fairly simple transaction which dispatches an action to an agent (send, not 
>> send-off).  When called from a JUnit test, such that we jump right in to 
>> things, skipping some of the initialization we normally do in our app, I 
>> get a ConcurrentModificationException from inside LockingTransaction.run() 
>> while it is iterating through the "actions" list, dispatching the actions 
>> after committing the transaction.
>>
>> Based on some debugging, here's what seems to be happening:
>>
>> 1. My transaction is run, dispatching an action to an agent.
>> 2. The transaction completes and is successfully committed.
>> 3. LockingTransaction does post-commit cleanup, freeing locks and putting 
>> a stop() to the transaction, which nulls the transaction's Info object 
>> reference.
>> 4. Notifications are sent and we start iterating the list of actions to 
>> be dispatched.
>> 5. The run() method calls Agent.dispatchAction().  Because the thread's 
>> transaction is no longer "running" (due to the Info object being null) and 
>> no action is being processed on the thread (so its "nested" vector is 
>> null), the action is enqueue()d with the agent.
>> 6. As part of the enqueue process, the action is consed onto the agent's 
>> ActionQueue.  Here's where the unique circumstances come into play.
>>    a. At this point, we haven't really interacted with the Clojure 
>> runtime, specifically the RT class, so its initialization machinery kicks in.
>>    b. Down in the depths, it executes a transaction to add a library to 
>> its list of loaded libraries.
>>    c. The still-existing-but-not-running thread-local transaction, with 
>> its existing action list intact, fires up, runs and commits.
>>    d. The post-commit stuff runs, including a nested attempt to dispatch 
>> that same action, again, which apparently succeeds.
>>    e. The action list is cleared before exiting the run() method.
>> 7. Upon returning way back up the stack to our 
>> not-quite-finished-post-processing transaction, we continue iterating the 
>> now-cleared action list, which promptly throws the 
>> ConcurrentModificationException.
>>
>> A quick perusal of the LockingTransaction code shows that the only 
>> interaction with the action list is adding an item to it in the enqueue() 
>> method, iterating it in the post-processing section of run() and clearing 
>> it in the "finally" clause of that section, so it's easy to see how a 
>> transaction started by any of the action-dispatching machinery would cause 
>> problems.  Any such activity in the actions themselves would not be an 
>> issue, since they'd occur on other threads, but the dispatch stuff all runs 
>> on the same thread.  The few moving parts that occur in this code seem 
>> fairly safe, as long as the runtime is already initialized, but if that 
>> occurs during this phase, I think all bets are off.
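
The sequence described above can be modeled in a few lines of plain Java.  
This is only a sketch under stated assumptions, not the LockingTransaction 
source: the class, its fields and the "first dispatch triggers a nested 
run" flag are all hypothetical, chosen to mimic one list being iterated, 
re-entered on the same thread, and cleared in a finally clause.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

public class ReentrantClearSketch {
    // Hypothetical stand-in for the thread-local transaction's pending actions.
    final List<String> actions = new ArrayList<>();
    // Records every dispatch so the double dispatch can be observed.
    final List<String> dispatched = new ArrayList<>();
    // Models "the runtime has not been initialized yet".
    private boolean firstDispatch = true;

    void enqueue(String action) {
        actions.add(action);
    }

    // Models the post-commit phase: dispatch each action, then clear the list.
    void run() {
        try {
            for (String action : actions) {
                dispatch(action);
            }
        } finally {
            actions.clear();
        }
    }

    // On first use only, dispatching re-enters run() on the same object,
    // standing in for a nested transaction started during initialization.
    void dispatch(String action) {
        if (firstDispatch) {
            firstDispatch = false;
            run(); // nested run dispatches the same action and clears the list
        }
        dispatched.add(action);
    }

    public static void main(String[] args) {
        ReentrantClearSketch tx = new ReentrantClearSketch();
        tx.enqueue("send-to-agent");
        try {
            tx.run();
            System.out.println("completed normally");
        } catch (ConcurrentModificationException e) {
            System.out.println("outer iteration failed: " + e);
        }
    }
}
```

In this model the nested run() dispatches the action once and clears the 
list; the outer dispatch then completes as well, and the outer loop fails 
on its next iterator step because the list was modified underneath it.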
>>
>> Does this analysis seem sound?  If so, is there a more formal process for 
>> filing this as a bug?  I can probably create a nice, compact example to 
>> reproduce it.
>>
>> Thanks!
>>
>> -Brandon :)
>>
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
