[ https://issues.apache.org/jira/browse/HADOOP-7076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978206#action_12978206 ]
Hadoop QA commented on HADOOP-7076:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12467585/HADOOP-7076.patch
  against trunk revision 1055206.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    -1 javac.  The applied patch generated 1049 javac compiler warnings (more than the trunk's current 1048 warnings).

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    -1 release audit.  The applied patch generated 2 release audit warnings (more than the trunk's current 1 warnings).

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

    +1 system test framework.  The patch passed system test framework compile.

Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/159//testReport/
Release audit warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/159//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/159//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/159//console

This message is automatically generated.

> Splittable Gzip
> ---------------
>
>                 Key: HADOOP-7076
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7076
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: io
>            Reporter: Niels Basjes
>         Attachments: HADOOP-7076.patch
>
>
> Files compressed with the gzip codec are not splittable due to the nature of
> the codec.
> This limits the options you have scaling out when reading large gzipped input
> files.
> Given the fact that gunzipping a 1GiB file usually takes only 2 minutes I
> figured that for some use cases wasting some resources may result in a
> shorter job time under certain conditions.
> So reading the entire input file from the start for each split (wasting
> resources!!) may lead to additional scalability.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
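
The idea in the quoted issue description, where every split decompresses the gzip stream from the very beginning and throws away everything before its own range, can be sketched roughly as below. This is only an illustration and not the code in HADOOP-7076.patch: the function name `read_split` and the `chunk_size` parameter are hypothetical, split offsets are assumed to be positions in the uncompressed stream, and record (line) boundaries are ignored.

```python
import gzip
import io


def read_split(gz_bytes: bytes, start: int, end: int, chunk_size: int = 64 * 1024) -> bytes:
    """Return uncompressed bytes [start, end) of a gzip stream.

    Hypothetical helper: gzip cannot seek, so decompression always begins at
    offset 0 and the bytes before `start` are decompressed and discarded.
    """
    out = bytearray()
    pos = 0  # uncompressed offset of the next byte to be read
    with gzip.GzipFile(fileobj=io.BytesIO(gz_bytes)) as stream:
        while pos < end:
            # Never read past the end of this split.
            chunk = stream.read(min(chunk_size, end - pos))
            if not chunk:
                break  # reached end of file before the end of the split
            chunk_start = pos
            pos += len(chunk)
            if pos <= start:
                continue  # wasted work: this chunk lies entirely before the split
            # Keep only the part of the chunk that falls inside the split.
            out += chunk[max(0, start - chunk_start):]
    return bytes(out)


# Usage: cut one compressed input into fixed-size splits and read each one
# independently, as separate tasks would.
data = bytes(range(256)) * 1000
compressed = gzip.compress(data)
split_size = 64000
splits = [
    read_split(compressed, s, min(s + split_size, len(data)))
    for s in range(0, len(data), split_size)
]
assert b"".join(splits) == data
```

The trade-off is the one the description names: a split starting at offset N still pays for decompressing the first N bytes, so later splits do more discarded work than earlier ones, and the total CPU spent grows with the number of splits even if wall-clock time drops.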