[ https://issues.apache.org/jira/browse/ACCUMULO-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15476871#comment-15476871 ]

marco polo commented on ACCUMULO-4391:
--------------------------------------

Why is that decompressor being shared? Why isn't each thread given its own 
decompressor for its own block read?
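
(For context, a minimal sketch of what a per-read checkout from Hadoop's shared 
CodecPool could look like; the readBlock helper and its parameters are 
hypothetical, not existing Accumulo code:)

    import java.io.IOException;
    import java.io.InputStream;
    import org.apache.hadoop.io.compress.CodecPool;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.CompressionInputStream;
    import org.apache.hadoop.io.compress.Decompressor;

    final class PerReadDecompression {
      // Hypothetical helper: each block read checks its own decompressor out
      // of the pool and returns it afterwards, so no two threads ever share
      // one decompressor instance.
      static byte[] readBlock(CompressionCodec codec, InputStream raw)
          throws IOException {
        Decompressor d = CodecPool.getDecompressor(codec);
        try (CompressionInputStream in = codec.createInputStream(raw, d)) {
          return in.readAllBytes();
        } finally {
          CodecPool.returnDecompressor(d);
        }
      }
    }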

> Source deepcopies cannot be used safely in separate threads in tserver
> ----------------------------------------------------------------------
>
>                 Key: ACCUMULO-4391
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4391
>             Project: Accumulo
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.6.5
>            Reporter: Ivan Bella
>            Assignee: Ivan Bella
>             Fix For: 1.6.6, 1.7.3, 1.8.1, 2.0.0
>
>   Original Estimate: 24h
>          Time Spent: 12.5h
>  Remaining Estimate: 11.5h
>
> We have iterators that create deep copies of the source and use them in 
> separate threads.  As it turns out, this is not safe, and we end up with many 
> exceptions, mostly down in the ZlibDecompressor library.  Curiously, if you 
> turn on the data cache for the table being scanned, then the errors disappear.
> After much hunting, it turns out that the real bug is in the 
> BoundedRangeFileInputStream.  The read() method therein appropriately 
> synchronizes on the underlying FSDataInputStream; however, the available() 
> method does not.  Adding similar synchronization on that stream fixes the 
> issues.  On a side note, the available() call is only invoked within the 
> Hadoop CompressionInputStream for use in the getPos() call.  That call does 
> not appear to actually be used, at least in this context.
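
For reference, a minimal sketch of the kind of fix described above. The class 
and method names follow the description, but the body is simplified and 
illustrative, not the exact Accumulo source:

    import java.io.IOException;
    import java.io.InputStream;
    import org.apache.hadoop.fs.FSDataInputStream;

    // Simplified BoundedRangeFileInputStream: restricts reads on a shared
    // FSDataInputStream to the byte range [pos, end).
    class BoundedRangeFileInputStream extends InputStream {
      private final FSDataInputStream in; // shared across deep copies
      private long pos;                   // current position in the range
      private final long end;             // exclusive end of the range

      BoundedRangeFileInputStream(FSDataInputStream in, long offset, long length) {
        this.in = in;
        this.pos = offset;
        this.end = offset + length;
      }

      @Override
      public int read() throws IOException {
        // read() already synchronizes on the shared underlying stream ...
        synchronized (in) {
          if (pos >= end) {
            return -1;
          }
          in.seek(pos);
          int b = in.read();
          if (b >= 0) {
            pos++;
          }
          return b;
        }
      }

      @Override
      public int available() {
        // ... and the fix is to synchronize here as well, so a concurrent
        // deep-copied source sees a consistent position instead of racing
        // with a read() in another thread.
        synchronized (in) {
          return (int) (end - pos);
        }
      }
    }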


