[jira] Commented: (MAPREDUCE-830) Providing BZip2 splitting support for Text data

2009-09-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755306#action_12755306
 ] 

Hudson commented on MAPREDUCE-830:
--

Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #6 (See 
[http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/6/])
MAPREDUCE-830. Add support for splittable compression to TextInputFormats. 
Contributed by Abdul Qadeer


 Providing BZip2 splitting support for Text data
 ---

 Key: MAPREDUCE-830
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-830
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 0.21.0
Reporter: Abdul Qadeer
Assignee: Abdul Qadeer
 Fix For: 0.21.0

 Attachments: M830-2.patch, M830-3.patch, M830-4.patch, M830-4.patch, 
 MapReduce-830-version1.patch


 HADOOP-4012 (https://issues.apache.org/jira/browse/HADOOP-4012) adds support 
 for handling BZip2 compressed data such that the compressed input file can 
 be split at arbitrary points.  This JIRA uses that functionality in 
 LineRecordReader.  The benefit is that when a user provides BZip2 compressed 
 Text data, Hadoop splits it so that multiple mappers process it, and the job 
 can use the whole cluster.  Currently a BZip2 compressed Text file goes to a 
 single mapper and is not split.  The enhancement in this JIRA therefore 
 provides splitting support and a considerable performance gain.
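
 As a rough illustration of what reading text from an arbitrary split 
 involves, the sketch below assigns each line to the split in which its first 
 byte falls.  It is not the code from the patch: it works on uncompressed 
 bytes held in memory, and the class and method names (SplitLineSketch, 
 readSplit) are made up for this example.  For BZip2 input the codec would 
 also have to move the split boundaries to compressed block boundaries, which 
 is not shown here.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the LineRecordReader from the patch.
class SplitLineSketch {
    /** Returns the lines whose first byte lies in the range [start, end). */
    static List<String> readSplit(byte[] data, int start, int end) {
        int pos = start;
        if (start != 0) {
            // Back up one byte and discard through the next newline. A line
            // that begins exactly at 'start' is kept; a partial line is
            // skipped because the previous split already read it.
            pos = start - 1;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step past the newline
        }
        List<String> lines = new ArrayList<>();
        // Keep reading while the next line starts inside this split, even if
        // that line runs past 'end'.
        while (pos < end && pos < data.length) {
            int lineEnd = pos;
            while (lineEnd < data.length && data[lineEnd] != '\n') {
                lineEnd++;
            }
            lines.add(new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = lineEnd + 1;
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbravo\ncharlie\ndelta\n".getBytes(StandardCharsets.UTF_8);
        int cut = 9; // falls in the middle of "bravo"
        System.out.println(readSplit(data, 0, cut));           // [alpha, bravo]
        System.out.println(readSplit(data, cut, data.length)); // [charlie, delta]
    }
}
```

 With this rule every line is read exactly once wherever the cut falls, which 
 is what lets several mappers process one file independently.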

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-830) Providing BZip2 splitting support for Text data

2009-09-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753832#action_12753832
 ] 

Hadoop QA commented on MAPREDUCE-830:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12418869/M830-3.patch
  against trunk revision 813585.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The patch appears to cause tar ant target to fail.

-1 findbugs.  The patch appears to cause Findbugs to fail.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/24/testReport/
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/24/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/24/console

This message is automatically generated.




[jira] Commented: (MAPREDUCE-830) Providing BZip2 splitting support for Text data

2009-09-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753930#action_12753930
 ] 

Hadoop QA commented on MAPREDUCE-830:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12419222/M830-4.patch
  against trunk revision 813585.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/59/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/59/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/59/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/59/console

This message is automatically generated.




[jira] Commented: (MAPREDUCE-830) Providing BZip2 splitting support for Text data

2009-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753979#action_12753979
 ] 

Hudson commented on MAPREDUCE-830:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #30 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/30/])
MAPREDUCE-830. Add support for splittable compression to TextInputFormats. 
Contributed by Abdul Qadeer





[jira] Commented: (MAPREDUCE-830) Providing BZip2 splitting support for Text data

2009-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753999#action_12753999
 ] 

Hudson commented on MAPREDUCE-830:
--

Integrated in Hadoop-Hdfs-trunk-Commit #27 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/27/])
MAPREDUCE-830. Add support for splittable compression to TextInputFormats. 
Contributed by Abdul Qadeer





[jira] Commented: (MAPREDUCE-830) Providing BZip2 splitting support for Text data

2009-09-07 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752304#action_12752304
 ] 

Chris Douglas commented on MAPREDUCE-830:
-

(also includes a workaround for MAPREDUCE-959, which was getting irritating, 
and updates the unit tests to JUnit4 semantics)




[jira] Commented: (MAPREDUCE-830) Providing BZip2 splitting support for Text data

2009-08-30 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749333#action_12749333
 ] 

Chris Douglas commented on MAPREDUCE-830:
-

(related comments in HADOOP-4012)
* Though bzip does not change it, {{getEnd}} is part of the API, so it 
should be called in {{LineRecordReader}}.
* Since the codec has state, the API demands that {{LineRecordReader}} 
synchronize on the codec before creating a splittable stream and calling 
{{getStart}} and {{getEnd}}, to avoid race conditions (unless a better 
solution is found in HADOOP-4012).
* The default dir for unit tests is usually /tmp, not "." (the current 
directory).
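
The synchronization point in the second bullet could look roughly like the 
sketch below. It assumes a codec that stores the adjusted split bounds as 
instance state; StatefulCodec, its made-up block size, and openSplit are 
hypothetical stand-ins, not the API proposed in HADOOP-4012. Only the names 
getStart and getEnd are taken from the comment above.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

// Hypothetical sketch of locking a stateful codec while a split is opened.
class CodecSyncSketch {
    /** Stand-in codec that keeps the adjusted split bounds as instance state. */
    static class StatefulCodec {
        static final long BLOCK = 100; // made-up block size for illustration
        private long start;
        private long end;

        InputStream createInputStream(InputStream raw, long s, long e) {
            // Pretend to align the requested range to block boundaries.
            start = roundUp(s);
            end = roundUp(e);
            return raw;
        }

        long getStart() { return start; }
        long getEnd() { return end; }

        private static long roundUp(long v) {
            return (v + BLOCK - 1) / BLOCK * BLOCK;
        }
    }

    /** Creates the stream and reads both bounds while holding the codec's lock. */
    static long[] openSplit(StatefulCodec codec, InputStream raw, long start, long end) {
        synchronized (codec) {
            // No other caller using openSplit can overwrite the codec's state
            // between the create call and the two getters.
            codec.createInputStream(raw, start, end);
            return new long[] { codec.getStart(), codec.getEnd() };
        }
    }

    public static void main(String[] args) throws InterruptedException {
        StatefulCodec shared = new StatefulCodec();
        Thread[] readers = new Thread[4];
        for (int i = 0; i < readers.length; i++) {
            final long s = i * 250L;
            final long e = s + 250L;
            readers[i] = new Thread(() -> {
                long[] b = openSplit(shared, new ByteArrayInputStream(new byte[0]), s, e);
                System.out.println(s + "-" + e + " -> " + b[0] + "-" + b[1]);
            });
            readers[i].start();
        }
        for (Thread t : readers) {
            t.join();
        }
    }
}
```

Without the synchronized block, two readers sharing one codec instance could 
interleave so that one of them reads bounds computed for the other's split.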
