[ 
https://issues.apache.org/jira/browse/AVRO-541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Carey updated AVRO-541:
-----------------------------

    Attachment: AVRO-541.patch

This patch addresses the issue here.

Furthermore, it cleans up and refactors DataFileStream and DataFileWriter a 
little bit, encapsulating block write, decode, and encode work in 
DataFileStream.DataBlock for consistency.

The bug here was caused by a quirk in how Inflater.java works. This quirk 
ONLY affects deflate in 'nowrap' mode. Simply changing nowrap to false avoids 
the bug, but the output then no longer conforms to the spec.
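
For readers unfamiliar with the flag, here is a minimal sketch of what nowrap 
changes in the bytes produced by java.util.zip.Deflater. It is not taken from 
the patch; the class name NowrapDemo and the method deflate() are made up for 
illustration. With nowrap set to false the JDK emits the zlib container (a 
2-byte header and a 4-byte Adler-32 trailer) around the deflate data, which is 
why flipping the flag changes the on-disk format.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Deflater;

// Hypothetical demo class, not part of Avro or of the attached patch.
public final class NowrapDemo {

  // Compress the whole input in one go, with or without the zlib wrapper.
  public static byte[] deflate(byte[] input, boolean nowrap) {
    Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, nowrap);
    try {
      deflater.setInput(input);
      deflater.finish();
      byte[] buf = new byte[input.length + 64];
      int n = 0;
      while (!deflater.finished()) {
        if (n == buf.length) {
          buf = Arrays.copyOf(buf, buf.length * 2); // grow if output did not fit
        }
        n += deflater.deflate(buf, n, buf.length - n);
      }
      return Arrays.copyOf(buf, n);
    } finally {
      deflater.end(); // release native zlib resources
    }
  }

  public static void main(String[] args) {
    byte[] data = "some block data, some block data".getBytes(StandardCharsets.UTF_8);
    byte[] raw = deflate(data, true);      // raw deflate, no header or checksum
    byte[] wrapped = deflate(data, false); // zlib header + deflate data + Adler-32
    System.out.println("raw=" + raw.length + " bytes, wrapped=" + wrapped.length + " bytes");
  }
}
```

A reader expecting raw deflate cannot consume the wrapped form unchanged, so 
switching nowrap off on the write side is not a compatible fix.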

The simplest work-around was to use InflaterOutputStream instead of 
InflaterInputStream.  This also allows for sharing more code between compress() 
and decompress().

The OutputStream variants avoid the complexity of detecting the end of the 
stream, which the read() methods of the InputStream interface force on the 
caller. That makes everything simpler, both in our code and in the internals 
of InflaterOutputStream and DeflaterOutputStream compared to the InputStream 
variants. It's just easier to 'push' to the Inflater and Deflater API than to 
pull.
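
As a rough illustration of the 'push' style (a sketch only, not the code in 
the attached patch), a compress()/decompress() pair over ByteBuffers might 
look like the following. The class name RawDeflateSketch and the helper 
writeRemaining() are invented for this example; the java.util.zip classes are 
the standard JDK ones. Both directions write the buffer's remaining bytes into 
an output stream and let close() finish the work, so neither side has to 
detect end-of-stream while reading.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterOutputStream;

// Hypothetical sketch, not part of Avro or of the attached patch.
public final class RawDeflateSketch {

  public static ByteBuffer compress(ByteBuffer in) throws IOException {
    Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true); // nowrap
    try {
      ByteArrayOutputStream sink = new ByteArrayOutputStream();
      try (OutputStream out = new DeflaterOutputStream(sink, deflater)) {
        writeRemaining(in, out); // close() finishes the deflate stream
      }
      return ByteBuffer.wrap(sink.toByteArray());
    } finally {
      deflater.end();
    }
  }

  public static ByteBuffer decompress(ByteBuffer in) throws IOException {
    Inflater inflater = new Inflater(true); // nowrap, matching compress()
    try {
      ByteArrayOutputStream sink = new ByteArrayOutputStream();
      try (OutputStream out = new InflaterOutputStream(sink, inflater)) {
        writeRemaining(in, out); // push compressed bytes; no read() loop needed
      }
      return ByteBuffer.wrap(sink.toByteArray());
    } finally {
      inflater.end();
    }
  }

  // Shared by both directions: push only the buffer's remaining bytes, which
  // may be a slice in the middle of a larger backing array.
  private static void writeRemaining(ByteBuffer in, OutputStream out) throws IOException {
    if (in.hasArray()) {
      out.write(in.array(), in.arrayOffset() + in.position(), in.remaining());
    } else {
      byte[] copy = new byte[in.remaining()];
      in.duplicate().get(copy);
      out.write(copy);
    }
  }

  public static void main(String[] args) throws IOException {
    byte[] original = "block one, block one, block one".getBytes(StandardCharsets.UTF_8);
    ByteBuffer packed = compress(ByteBuffer.wrap(original));

    // Place the compressed block inside a larger array, so the end of the
    // array is not the end of the stream.
    byte[] padded = new byte[packed.remaining() + 16];
    System.arraycopy(packed.array(), 0, padded, 8, packed.remaining());
    ByteBuffer slice = ByteBuffer.wrap(padded, 8, packed.remaining());

    ByteBuffer unpacked = decompress(slice);
    System.out.println(new String(unpacked.array(), 0, unpacked.remaining(), StandardCharsets.UTF_8));
  }
}
```

Because the caller hands over exactly the bytes between position and limit, 
the inflater never sees data past the end of the block, even when the backing 
array is larger.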

For some information on the sorts of things that were happening, see this Java 
bug: 
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4795299

The work-arounds there do not work well for a case where the end of the array 
is not guaranteed to be the end of the stream, which it is not when abstracted 
through a ByteBuffer for input in decompress().


> Java: TestDataFileConcat sometimes fails
> ----------------------------------------
>
>                 Key: AVRO-541
>                 URL: https://issues.apache.org/jira/browse/AVRO-541
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Doug Cutting
>            Assignee: Scott Carey
>            Priority: Critical
>             Fix For: 1.4.0
>
>         Attachments: AVRO-541.patch, AVRO-541.patch
>
>
> TestDataFileConcat intermittently fails with:
> {code}
> Testcase: testConcateateFiles[5] took 0.032 sec
>         Caused an ERROR
> java.io.IOException: Block read partially, the data may be corrupt
> org.apache.avro.AvroRuntimeException: java.io.IOException: Block read 
> partially, the data may be corrupt
>         at 
> org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:173)
>         at org.apache.avro.file.DataFileStream.next(DataFileStream.java:193)
>         at 
> org.apache.avro.TestDataFileConcat.testConcateateFiles(TestDataFileConcat.java:141)
> Caused by: java.io.IOException: Block read partially, the data may be corrupt
>         at 
> org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:157)
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
