[ https://issues.apache.org/jira/browse/COMPRESS-500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005262#comment-17005262 ]
Anvesh Mora commented on COMPRESS-500:
--------------------------------------

[~bodewig], yes, your understanding is right. We are decompressing the gzip files contained in the zip file.

I determine the size of each file after the gzip file has been decompressed and written to disk, once with the commons-compress library and once with unzip & gunzip on CentOS, and compare the two. The entry size reported by ZipEntry matches neither. It simply reports -1:

{code:java}
Entry Name: cloud_3672_20191209220000.log.gz
Entry size: -1
{code}

I wrote a small code snippet to test with the JDK's ZipFile and GZIPInputStream. The output file size was similar to the one on CentOS, and there was no EOF exception:

{code:java}
// Extract every entry of the zip and gunzip it to disk using JDK classes only.
Enumeration<? extends ZipEntry> zipEntries = zipFile.entries();
while (zipEntries.hasMoreElements()) {
    ZipEntry entry = zipEntries.nextElement();
    // try-with-resources closes both streams, which also flushes the BufferedOutputStream.
    try (InputStream gzipCompressorInputStream = new GZIPInputStream(zipFile.getInputStream(entry));
         OutputStream os = new BufferedOutputStream(
                 new FileOutputStream("/home/amora/Work/" + entry.getName()))) {
        IOUtils.copy(gzipCompressorInputStream, os);
    }
}
{code}

The decompressed file size is 2032922454 (Dec 30 11:16).

> Discrepancy in file size extracted using ZipArchieveInputStream and Gzip decompress component
> ----------------------------------------------------------------------------------------------
>
>                 Key: COMPRESS-500
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-500
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Compressors
>    Affects Versions: 1.8, 1.18
>            Reporter: Anvesh Mora
>            Priority: Major
>
> I recently raised a bug about an "invalid Entry Size" issue, COMPRESS-494 (not resolved yet).
>
> We are now seeing a new issue. Before explaining it: our file has the structure described below, and it is received as a stream of data over HTTPS.
>
> *File Structure*:
> In the zip file:
> We have zero or more gz files, which need to be decompressed.
> Metadata at the end of the zip entries (end of stream), as plain text, used for downloading the next zip file.
>
> In production we are now seeing a new issue where the gz file is not decompressed completely. We found that the utilities on CentOS 7 are able to extract and decompress the entire file, whereas our library fails. The sizes differ as follows:
> Using the API: *765460480* bytes
> Using the CentOS 7 Linux utilities: *2032925215* bytes
>
> We are getting an EOF exception at GzipCompressorInputStream.java:278, and I'm not sure why.
>
> We need your help on this, as we are blocked in production. This could be a potential fix for our library to make it more robust.
>
> Let me know how we can increase the priority if needed.
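
For reference, the size check discussed in this thread (gunzip each .gz entry of a streamed zip and count the decompressed bytes, rather than relying on the entry size) can be sketched with JDK classes only. This is a minimal sketch under stated assumptions, not the reporter's code and not the Commons Compress code path: the class name `GzInZipSizeCheck`, the helper `sampleZip`, and the entry names `sample.log.gz` and `metadata.txt` are hypothetical, and the input is built in memory instead of being received over HTTPS. It counts bytes itself because `ZipEntry.getSize()` is documented to return -1 when the size is not known, which is the value reported above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class GzInZipSizeCheck {

    /** Gunzips every ".gz" entry of a zip stream; returns entry name -> decompressed byte count. */
    static Map<String, Long> gunzippedSizes(InputStream zipData) throws IOException {
        Map<String, Long> sizes = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(zipData)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().endsWith(".gz")) {
                    continue; // e.g. a trailing plain-text metadata entry
                }
                // Shield the zip stream: closing the gzip stream must not close the whole zip.
                InputStream entryData = new FilterInputStream(zip) {
                    @Override
                    public void close() {
                        // intentionally left open for the next entry
                    }
                };
                long total = 0;
                try (GZIPInputStream gz = new GZIPInputStream(entryData)) {
                    byte[] buf = new byte[8192];
                    int n;
                    while ((n = gz.read(buf)) != -1) {
                        total += n; // count decompressed bytes instead of trusting getSize()
                    }
                }
                sizes.put(entry.getName(), total);
            }
        }
        return sizes;
    }

    /** Hypothetical helper: builds a zip holding one gzip entry plus a plain-text metadata entry. */
    static byte[] sampleZip(String gzName, byte[] payload) throws IOException {
        ByteArrayOutputStream gzBytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(gzBytes)) {
            gz.write(payload);
        }
        ByteArrayOutputStream zipBytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(zipBytes)) {
            zip.putNextEntry(new ZipEntry(gzName));
            zip.write(gzBytes.toByteArray());
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("metadata.txt"));
            zip.write("placeholder metadata".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return zipBytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = sampleZip("sample.log.gz", new byte[100_000]);
        System.out.println(gunzippedSizes(new ByteArrayInputStream(zip)));
    }
}
```

Comparing the count returned here with the size of the file produced by gunzip gives a library-independent baseline for the numbers quoted in the issue.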