Ivan Bella created ACCUMULO-4391:
------------------------------------

             Summary: Source deepcopies cannot be used safely in separate threads in tserver
                 Key: ACCUMULO-4391
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4391
             Project: Accumulo
          Issue Type: Bug
          Components: core
    Affects Versions: 1.6.5
            Reporter: Ivan Bella
            Assignee: Ivan Bella
             Fix For: 1.6.6


We have iterators that create deep copies of the source and use them in 
separate threads.  As it turns out, this is not safe: we end up with many 
exceptions, mostly down in the ZlibDecompressor library.  Curiously, if you turn 
on the data cache for the table being scanned, the errors disappear.

After much hunting, it turns out that the real bug is in 
BoundedRangeFileInputStream.  Its read() method appropriately 
synchronizes on the underlying FSDataInputStream, but its available() 
method does not.  Adding the same synchronization to available() fixes the 
issues.  As a side note, available() is only invoked within the Hadoop 
CompressionInputStream, for use in its getPos() call.  That call does not appear 
to actually be used, at least in this context.
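
To illustrate the shape of the fix, here is a minimal, self-contained sketch.
It is not the actual Accumulo patch.  SharedSource is a hypothetical stand-in
for FSDataInputStream, BoundedRangeStream is a simplified stand-in for
BoundedRangeFileInputStream, and the seek() inside available() is an assumption
of this sketch (it makes the result independent of where another thread left
the shared stream), not something stated above.

```java
import java.io.InputStream;

class BoundedRangeSketch {

    /** Hypothetical stand-in for a shared, seekable stream (not thread safe by itself). */
    static class SharedSource {
        private final byte[] data;
        private int pos;
        SharedSource(byte[] data) { this.data = data; }
        void seek(long p) { pos = (int) p; }
        int read(byte[] b, int off, int len) {
            if (pos >= data.length) return -1;
            int n = Math.min(len, data.length - pos);
            System.arraycopy(data, pos, b, off, n);
            pos += n;
            return n;
        }
        int available() { return data.length - pos; }
    }

    /** Simplified bounded view over [offset, offset + length) of the shared source. */
    static class BoundedRangeStream extends InputStream {
        private final SharedSource in;
        private long pos;
        private final long end;

        BoundedRangeStream(SharedSource in, long offset, long length) {
            this.in = in;
            this.pos = offset;
            this.end = offset + length;
        }

        @Override
        public int read() {
            byte[] one = new byte[1];
            return read(one, 0, 1) == -1 ? -1 : (one[0] & 0xff);
        }

        @Override
        public int read(byte[] b, int off, int len) {
            if (len == 0) return 0;
            int n = (int) Math.min(len, end - pos);
            if (n <= 0) return -1;
            int got;
            synchronized (in) {          // seek + read must be atomic on the shared stream
                in.seek(pos);
                got = in.read(b, off, n);
            }
            if (got < 0) return -1;
            pos += got;
            return got;
        }

        @Override
        public int available() {
            int avail;
            synchronized (in) {          // same lock as read(); this is the missing piece
                in.seek(pos);            // assumption of this sketch, see lead-in
                avail = in.available();
            }
            if (pos + avail > end) avail = (int) (end - pos);
            return avail;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        byte[] data = new byte[1 << 16];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        SharedSource shared = new SharedSource(data);
        int half = data.length / 2;
        Thread[] workers = new Thread[2];
        for (int t = 0; t < workers.length; t++) {
            final int offset = t * half;
            workers[t] = new Thread(() -> {
                // Each thread reads its own range through its own bounded view.
                BoundedRangeStream s = new BoundedRangeStream(shared, offset, half);
                int expected = offset;
                int b;
                while ((b = s.read()) != -1) {
                    if (b != (expected++ & 0xff)) throw new IllegalStateException("corrupt read");
                    s.available();       // interleave available() calls with reads
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();
        System.out.println("done");
    }
}
```

In this sketch both methods take the lock on the shared stream, so a call to
available() in one thread can never observe or disturb the stream in the middle
of another thread's seek-and-read.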



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
