[ 
https://issues.apache.org/jira/browse/COMPRESS-222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Katsubo updated COMPRESS-222:
------------------------------------

    Attachment: md5.correct.txt
    
> ZipArchiveInputStream may read incorrect bytes from stream when processing 
> nested ZIP
> -------------------------------------------------------------------------------------
>
>                 Key: COMPRESS-222
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-222
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Archivers
>    Affects Versions: 1.5
>            Reporter: Dmitry Katsubo
>         Attachments: ArchiveTest.java, log_read_whole_entry.txt, log.txt, 
> md5.correct.txt
>
>
> The problem is related to COMPRESS-189; in particular, it concerns the 
> processing of inner ZIP files.
> Problem description:
> If an archive entry is not fully read, the partial read returns 
> incorrect contents.
> In particular, the given example loops through all entries of the 
> "09815141_4.zip" ZIP archive, probing whether each entry is a TIFF file. The 
> probe assumes that a given file is a TIFF if it starts with the bytes 
> [0x49 0x49 0x2A 0x0 0x8 0x0 0x0 0x0 0x14 0x0].
> Most entries are correctly reported as TIFF, except:
> {code}
> [ArchiveTest] 000017.tif is something else
> [ArchiveTest] Header contents: 0x49 0x49 0x2A 0x0 0x8 0x0 0x0 0x0 0x0 0x0 
> [ArchiveTest] 000033.tif is something else
> [ArchiveTest] Header contents: 0x49 0x49 0x2A 0x0 0x0 0x0 0x0 0x0 0x0 0x0 
> [ArchiveTest] 000056.tif is something else
> [ArchiveTest] Header contents: 0x49 0x49 0x2A 0x0 0x8 0x0 0x0 0x0 0x0 0x0 
> [ArchiveTest] 000069.tif is something else
> [ArchiveTest] Header contents: 0x49 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 
> {code}
> As far as I can see, the corruption can start at an arbitrary byte.
> If the program is run with {{READ_WHOLE_ENTRY=true}}, all entries are fully 
> read and an MD5 sum is calculated. The MD5 sums match and all entries are 
> correctly reported as TIFF. Thus the problem only appears when an entry is 
> not fully read and {{ArchiveInputStream.getNextEntry()}} is then called.
> Test ZIP can be downloaded from: 
> https://www.dropbox.com/s/h20wo6t0mwbgsqc/09815141_4.zip
> It was originally taken from the WIPO FTP, i.e. it is in the public domain.
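
The probing pattern described in the report (read only the first bytes of each entry, then move on to the next entry) can be sketched roughly as follows. This is not the attached ArchiveTest.java. To stay runnable without extra dependencies it assumes the JDK's java.util.zip.ZipInputStream as a stand-in for ZipArchiveInputStream, and it works on a small in-memory, non-nested ZIP rather than the reporter's archive. The names TiffProbeSketch, probe, readFully and buildSampleZip are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class TiffProbeSketch {
    // Byte signature quoted in the report.
    static final byte[] TIFF_MAGIC =
        {0x49, 0x49, 0x2A, 0x00, 0x08, 0x00, 0x00, 0x00, 0x14, 0x00};

    /** Fills buf as far as possible; read() may return fewer bytes than requested. */
    public static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break; // end of the current entry
            }
            total += n;
        }
        return total;
    }

    /** Returns the names of entries whose first bytes equal TIFF_MAGIC. */
    public static List<String> probe(InputStream raw) throws IOException {
        List<String> tiffs = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(raw)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                byte[] header = new byte[TIFF_MAGIC.length];
                int n = readFully(zip, header);
                if (n == header.length && Arrays.equals(header, TIFF_MAGIC)) {
                    tiffs.add(entry.getName());
                }
                // The rest of the entry is deliberately left unread, so the
                // next getNextEntry() call has to skip it.
            }
        }
        return tiffs;
    }

    /** Builds an in-memory ZIP with one TIFF-like entry and one text entry. */
    public static byte[] buildSampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            byte[] tiffLike = new byte[4096];
            System.arraycopy(TIFF_MAGIC, 0, tiffLike, 0, TIFF_MAGIC.length);
            out.putNextEntry(new ZipEntry("a.tif"));
            out.write(tiffLike);
            out.closeEntry();
            out.putNextEntry(new ZipEntry("b.txt"));
            out.write("not a tiff, just some text".getBytes(StandardCharsets.US_ASCII));
            out.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(probe(new ByteArrayInputStream(buildSampleZip())));
    }
}
```

With a correctly behaving stream, only the entry that really starts with the signature is reported. The failure described in the report would show up as a TIFF entry being missed because its header bytes were read incorrectly after the previous entry was left partially read.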

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
