[ 
https://issues.apache.org/jira/browse/ACCUMULO-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478002#comment-15478002
 ] 

Ivan Bella commented on ACCUMULO-4391:
--------------------------------------

For case #1, the issue is more that the same decompressor was being added back 
into the pool multiple times, most likely from the same thread.  I still need 
to research why close() was being called multiple times.  My fix still allows 
the repeated close() calls, but the decompressor is only returned to the pool 
once.
For case #2, it was the root RFile.Reader being closed on the main thread, or 
perhaps on a cleanup thread, because the root scan was complete or the client 
was gone, while one of my threads was still reading from it.  This is normally 
not a problem because we no longer care about the results.  However, when it 
leads to array creation issues and subsequent memory issues, we have problems.
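
As a rough illustration of the case #1 behavior (close() may be called more 
than once, but the decompressor goes back to the pool only once), here is a 
minimal sketch.  Pool, Decompressor, and PooledStream are hypothetical 
stand-ins, not the actual Accumulo or Hadoop classes, and the AtomicBoolean 
guard is just one way to get this behavior, not necessarily what the patch does.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicBoolean;

class ReturnOnceSketch {
  /** Hypothetical stand-in for a pooled decompressor. */
  static final class Decompressor {}

  /** Hypothetical pool of idle decompressors (not Hadoop's CodecPool). */
  static final class Pool {
    private final Deque<Decompressor> idle = new ArrayDeque<>();

    synchronized void giveBack(Decompressor d) {
      idle.push(d);
    }

    synchronized int size() {
      return idle.size();
    }
  }

  /** Hypothetical stream wrapper that owns one borrowed decompressor. */
  static final class PooledStream implements AutoCloseable {
    private final Pool pool;
    private final Decompressor decompressor;
    // Flips to true on the first close(); later calls see true and do nothing.
    private final AtomicBoolean returned = new AtomicBoolean(false);

    PooledStream(Pool pool, Decompressor decompressor) {
      this.pool = pool;
      this.decompressor = decompressor;
    }

    @Override
    public void close() {
      // Only the caller that wins the compareAndSet returns the decompressor.
      if (returned.compareAndSet(false, true)) {
        pool.giveBack(decompressor);
      }
    }
  }

  public static void main(String[] args) {
    Pool pool = new Pool();
    PooledStream stream = new PooledStream(pool, new Decompressor());
    stream.close();
    stream.close(); // repeated close is tolerated
    System.out.println("decompressors in pool: " + pool.size());
  }
}
```

Without the guard, the second close() would put the same decompressor into 
the pool twice, and two later readers could then be handed the same instance.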


> Source deepcopies cannot be used safely in separate threads in tserver
> ----------------------------------------------------------------------
>
>                 Key: ACCUMULO-4391
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4391
>             Project: Accumulo
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.6.5
>            Reporter: Ivan Bella
>            Assignee: Ivan Bella
>             Fix For: 1.6.6, 1.7.3, 1.8.1, 2.0.0
>
>   Original Estimate: 24h
>          Time Spent: 12h 50m
>  Remaining Estimate: 11h 10m
>
> We have iterators that create deep copies of the source and use them in 
> separate threads.  As it turns out, this is not safe, and we end up with many 
> exceptions, mostly down in the ZlibDecompressor library.  Curiously, if you 
> turn on the data cache for the table being scanned, the errors disappear.
> After much hunting, it turns out that the real bug is in 
> BoundedRangeFileInputStream.  Its read() method appropriately 
> synchronizes on the underlying FSDataInputStream, but its available() 
> method does not.  Adding similar synchronization on that stream fixes the 
> issues.  On a side note, available() is only invoked within the 
> Hadoop CompressionInputStream for use in the getPos() call.  That call does 
> not appear to actually be used, at least in this context.
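
The synchronization described in the issue text can be sketched roughly as 
follows.  This is not the Accumulo source: SeekableBytes is a hypothetical 
in-memory stand-in for the shared FSDataInputStream, and BoundedRangeStream 
is a simplified stand-in for BoundedRangeFileInputStream.  The seek before 
available() is an assumption of this sketch, needed because the stand-in 
derives available() from its shared position.

```java
import java.io.IOException;
import java.io.InputStream;

class BoundedRangeSketch {
  /** Hypothetical seekable source shared by several range streams. */
  static final class SeekableBytes {
    private final byte[] data;
    private long pos;

    SeekableBytes(byte[] data) {
      this.data = data;
    }

    void seek(long p) {
      pos = p;
    }

    int read(byte[] b, int off, int len) {
      if (pos >= data.length) {
        return -1;
      }
      int n = (int) Math.min(len, data.length - pos);
      System.arraycopy(data, (int) pos, b, off, n);
      pos += n;
      return n;
    }

    int available() {
      return (int) (data.length - pos);
    }
  }

  /** Simplified view over [offset, offset + length) of the shared source. */
  static final class BoundedRangeStream extends InputStream {
    private final SeekableBytes in;
    private long pos;
    private final long end;

    BoundedRangeStream(SeekableBytes in, long offset, long length) {
      this.in = in;
      this.pos = offset;
      this.end = offset + length;
    }

    @Override
    public int read() throws IOException {
      byte[] one = new byte[1];
      return read(one, 0, 1) == 1 ? (one[0] & 0xff) : -1;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
      if (len == 0) {
        return 0;
      }
      int n = (int) Math.min(len, end - pos);
      if (n <= 0) {
        return -1;
      }
      int ret;
      // Seek and read as one atomic step on the shared source.
      synchronized (in) {
        in.seek(pos);
        ret = in.read(b, off, n);
      }
      if (ret < 0) {
        return -1;
      }
      pos += ret;
      return ret;
    }

    @Override
    public int available() throws IOException {
      // Same lock as read(), so this never interleaves with another
      // stream's seek/read on the shared source.
      synchronized (in) {
        in.seek(pos);
        return (int) Math.min((long) in.available(), end - pos);
      }
    }
  }
}
```

The point of the sketch is only that read() and available() take the same 
lock on the shared underlying stream, so a call to available() from one 
thread cannot observe or disturb a seek/read in progress on another.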



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
