[jira] [Created] (COMPRESS-174) BZip2CompressorInputStream doesn't handle being given a wrong-format compressed file

2012-01-31 Thread Andrew Pavlin (Created) (JIRA)


             Key: COMPRESS-174
             URL: https://issues.apache.org/jira/browse/COMPRESS-174
         Project: Commons Compress
      Issue Type: Bug
      Components: Compressors
Affects Versions: 1.3
     Environment: Linux and Windows
        Reporter: Andrew Pavlin
        Priority: Minor


When a file of the wrong type (such as ZIP or GZIP) is read through 
BZip2CompressorInputStream, the read fails with an obscure 
ArrayIndexOutOfBoundsException instead of reporting immediately that the input 
data is in the wrong format.

BZip2CompressorInputStream should be able to identify whether a stream is in 
BZip2 format and, if it is not, reject it immediately with a meaningful 
exception (example: ProtocolException: not a BZip2 compressed file).

Alternatively, are there functions in commons-compress that can identify the 
compression type of an InputStream by inspection?
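
As an illustration of identifying a format by inspection, here is a minimal 
sketch that peeks at the leading magic bytes of a stream and then rewinds it. 
It uses only the JDK; FormatSniffer, Format, sniff and classify are 
hypothetical names made up for this example and are not part of 
commons-compress. The signatures checked are the usual ones: "BZh" plus a 
block-size digit for bzip2, 0x1f 0x8b for gzip, and "PK" 0x03 0x04 for a ZIP 
local file header.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper, not part of commons-compress: guesses a format from magic bytes. */
final class FormatSniffer {

    enum Format { BZIP2, GZIP, ZIP, UNKNOWN }

    private FormatSniffer() { }

    /**
     * Peeks at the first four bytes and rewinds, so the same stream can still
     * be handed to a decompressor afterwards. The stream must support
     * mark/reset (for example a BufferedInputStream).
     */
    static Format sniff(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] sig = new byte[4];
        int n = 0;
        in.mark(sig.length);
        try {
            // read() may return fewer bytes than requested, so loop until full or EOF
            while (n < sig.length) {
                int r = in.read(sig, n, sig.length - n);
                if (r < 0) {
                    break;
                }
                n += r;
            }
        } finally {
            in.reset();
        }
        return classify(sig, n);
    }

    /** Classifies the first len bytes of sig by their well-known signatures. */
    static Format classify(byte[] sig, int len) {
        // bzip2: "BZh" followed by a block-size digit '1'..'9'
        if (len >= 4 && sig[0] == 'B' && sig[1] == 'Z' && sig[2] == 'h'
                && sig[3] >= '1' && sig[3] <= '9') {
            return Format.BZIP2;
        }
        // gzip: 0x1f 0x8b
        if (len >= 2 && (sig[0] & 0xff) == 0x1f && (sig[1] & 0xff) == 0x8b) {
            return Format.GZIP;
        }
        // ZIP local file header: "PK" 0x03 0x04
        if (len >= 4 && sig[0] == 'P' && sig[1] == 'K' && sig[2] == 3 && sig[3] == 4) {
            return Format.ZIP;
        }
        return Format.UNKNOWN;
    }
}
```

A caller could wrap the file stream in a BufferedInputStream, call sniff, and 
only construct the BZip2 decoder when the result is BZIP2. The library's 
CompressorStreamFactory may already offer comparable signature-based 
detection; that is worth checking against the release in use.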

Example stack trace when using a ZIP input file:

Exception in thread "OSM Decompressor" java.lang.ArrayIndexOutOfBoundsException: 90
    at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.getAndMoveToFrontDecode(BZip2CompressorInputStream.java:688)
    at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.initBlock(BZip2CompressorInputStream.java:322)
    at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.setupNoRandPartA(BZip2CompressorInputStream.java:880)
    at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.setupNoRandPartB(BZip2CompressorInputStream.java:936)
    at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.read0(BZip2CompressorInputStream.java:228)
    at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.read(BZip2CompressorInputStream.java:180)
    at java.io.InputStream.read(InputStream.java:82)
    at org.ka2ddo.yaac.osm.OsmXmlSegmenter$1.run(OsmXmlSegmenter.java:129)
    at java.lang.Thread.run(Thread.java:680)
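
Until the decoder rejects bad input itself, one possible caller-side 
workaround is to wrap the decompressing stream so that unchecked exceptions 
thrown while reading surface as IOException. This is only a sketch under that 
assumption; CorruptInputGuard is a hypothetical name and the class is not part 
of commons-compress.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical wrapper: reports a decoder's RuntimeExceptions as IOExceptions. */
final class CorruptInputGuard extends FilterInputStream {

    CorruptInputGuard(InputStream decoder) {
        super(decoder);
    }

    @Override
    public int read() throws IOException {
        try {
            return super.read();
        } catch (RuntimeException e) {
            throw new IOException("input is corrupt or not in the expected format", e);
        }
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        try {
            // Note: this also converts exceptions caused by bad off/len arguments.
            return super.read(buf, off, len);
        } catch (RuntimeException e) {
            throw new IOException("input is corrupt or not in the expected format", e);
        }
    }
}
```

The original exception is kept as the cause, so the decoder's stack trace is 
still available for diagnosis.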
 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (COMPRESS-162) BZip2CompressorInputStream still stops after 900,000 decompressed bytes of large compressed file

2011-11-07 Thread Andrew Pavlin (Created) (JIRA)


             Key: COMPRESS-162
             URL: https://issues.apache.org/jira/browse/COMPRESS-162
         Project: Commons Compress
      Issue Type: Bug
      Components: Compressors
Affects Versions: 1.3
     Environment: Linux (Fedora Cores 13 [2.6.34.9-69.fc13.i686.PAE] and 
                  15, at latest 'yum upgrade' as of 7 Nov 2011), Sun Java 1.6.0_22
        Reporter: Andrew Pavlin


Attempting to decompress the planet-110921.osm.bz2 file downloaded directly 
from planet.openstreetmap.org stops after exactly 900,000 bytes have been 
decompressed. The decompressed content looks like valid XML as far as it goes, 
but because it is cut off, my application's parser fails with XML syntax 
errors about missing closing tags. I tried using the example code to just 
decompress the file and got exactly the same behavior.
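
To confirm that the truncation comes from the decoder and not from the XML 
parser, one option is to drain the decompressed stream and count the bytes it 
delivers. The sketch below does this for any InputStream; StreamDrain and 
countBytes are hypothetical names, and the code does not depend on 
commons-compress.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical diagnostic helper: counts how many bytes a stream yields before EOF. */
final class StreamDrain {

    private StreamDrain() { }

    /** Reads the stream to EOF and returns the number of bytes seen. */
    static long countBytes(InputStream in) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        // read() returns -1 at end of stream; the data itself is discarded
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }
}
```

Passing the BZip2CompressorInputStream to countBytes and comparing the total 
with the size of the file produced by the bzip2 command-line utility would 
show exactly where the two diverge.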

Uncompressing the same file planet-110921.osm.bz2 (19357793489 bytes long 
compressed) with the Linux bzip2 command-line utility 
(bzip2-1.0.6-1.fc13.i686.rpm) succeeds and produces a valid (and enormous) XML 
file that can be successfully parsed.

I also took a Subversion snapshot of the commons-compress trunk on 7 Nov 2011 
and replaced the org.apache.commons.compress.compressors.bzip2 package in 
commons-compress-1.3.jar with code compiled from the trunk (the Subversion log 
indicated that the fix for COMPRESS-146 (?) was in). The failure is unchanged.
