[jira] Commented: (PIG-956) Reduce patch testing time
[ https://issues.apache.org/jira/browse/PIG-956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12760503#action_12760503 ] Hadoop QA commented on PIG-956: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12420749/PIG-956.patch against trunk revision 819691. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 75 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 281 release audit warnings (more than the trunk's current 279 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/49/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/49/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/49/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/49/console This message is automatically generated. 
Reduce patch testing time - Key: PIG-956 URL: https://issues.apache.org/jira/browse/PIG-956 Project: Pig Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Olga Natkovich Assignee: Olga Natkovich Fix For: 0.6.0 Attachments: PIG-956.patch The proposal is to split the tests into 2 groups: (1) Ten-minute tests - this is a set of tests that runs with every patch submission and takes approximately 10 minutes (2) All tests - these include all tests and they will run nightly This is similar to work done in Hadoop: http://issues.apache.org/jira/browse/HDFS-458 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-721) redirecting releaseaudit o/p to build/test/releaseaudit/pig-releaseaudit-report.txt
[ https://issues.apache.org/jira/browse/PIG-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12760544#action_12760544 ] Giridharan Kesavan commented on PIG-721: I understand that redirecting releaseaudit warnings to a log file would break the current test-patch process. Thanks! redirecting releaseaudit o/p to build/test/releaseaudit/pig-releaseaudit-report.txt Key: PIG-721 URL: https://issues.apache.org/jira/browse/PIG-721 Project: Pig Issue Type: Improvement Components: build Reporter: Giridharan Kesavan Assignee: Giridharan Kesavan Attachments: PIG-721.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (PIG-982) [zebra] Prevent checkin test cases from running twice in nightly test.
[ https://issues.apache.org/jira/browse/PIG-982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang reassigned PIG-982: - Assignee: Chao Wang [zebra] Prevent checkin test cases from running twice in nightly test. -- Key: PIG-982 URL: https://issues.apache.org/jira/browse/PIG-982 Project: Pig Issue Type: Bug Reporter: Chao Wang Assignee: Chao Wang Currently check-in test cases are running twice in nightly test. This jira is to fix this problem and also make some other polishing changes to Zebra's build script. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-982) [zebra] Prevent checkin test cases from running twice in nightly test.
[ https://issues.apache.org/jira/browse/PIG-982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-982: -- Status: Patch Available (was: Open) [zebra] Prevent checkin test cases from running twice in nightly test. -- Key: PIG-982 URL: https://issues.apache.org/jira/browse/PIG-982 Project: Pig Issue Type: Bug Reporter: Chao Wang Assignee: Chao Wang Currently check-in test cases are running twice in nightly test. This jira is to fix this problem and also make some other polishing changes to Zebra's build script. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-982) [zebra] Prevent checkin test cases from running twice in nightly test.
[ https://issues.apache.org/jira/browse/PIG-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12760702#action_12760702 ] Chao Wang commented on PIG-982: --- The previous patch did not get submitted properly due to a network issue - will resubmit. [zebra] Prevent checkin test cases from running twice in nightly test. -- Key: PIG-982 URL: https://issues.apache.org/jira/browse/PIG-982 Project: Pig Issue Type: Bug Reporter: Chao Wang Assignee: Chao Wang Currently check-in test cases are running twice in nightly test. This jira is to fix this problem and also make some other polishing changes to Zebra's build script. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-982) [zebra] Prevent checkin test cases from running twice in nightly test.
[ https://issues.apache.org/jira/browse/PIG-982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-982: -- Status: Open (was: Patch Available) [zebra] Prevent checkin test cases from running twice in nightly test. -- Key: PIG-982 URL: https://issues.apache.org/jira/browse/PIG-982 Project: Pig Issue Type: Bug Reporter: Chao Wang Assignee: Chao Wang Currently check-in test cases are running twice in nightly test. This jira is to fix this problem and also make some other polishing changes to Zebra's build script. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-956) Reduce patch testing time
[ https://issues.apache.org/jira/browse/PIG-956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12760705#action_12760705 ] Olga Natkovich commented on PIG-956: The two warnings are there because the 2 files that I added to list the tests don't have the Apache header. I don't think JUnit will allow a header there. I checked that Hadoop does not have the header there. Once the patch is reviewed, I will commit it to trunk and also to branch-0.5. Reduce patch testing time - Key: PIG-956 URL: https://issues.apache.org/jira/browse/PIG-956 Project: Pig Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Olga Natkovich Assignee: Olga Natkovich Fix For: 0.6.0 Attachments: PIG-956.patch The proposal is to split the tests into 2 groups: (1) Ten-minute tests - this is a set of tests that runs with every patch submission and takes approximately 10 minutes (2) All tests - these include all tests and they will run nightly This is similar to work done in Hadoop: http://issues.apache.org/jira/browse/HDFS-458 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-982) [zebra] Prevent checkin test cases from running twice in nightly test.
[ https://issues.apache.org/jira/browse/PIG-982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-982: -- Attachment: patch_build This patch only involves the build script, so there are no unit test cases. [zebra] Prevent checkin test cases from running twice in nightly test. -- Key: PIG-982 URL: https://issues.apache.org/jira/browse/PIG-982 Project: Pig Issue Type: Bug Reporter: Chao Wang Assignee: Chao Wang Attachments: patch_build Currently check-in test cases are running twice in nightly test. This jira is to fix this problem and also make some other polishing changes to Zebra's build script. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-982) [zebra] Prevent checkin test cases from running twice in nightly test.
[ https://issues.apache.org/jira/browse/PIG-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12760707#action_12760707 ] Hadoop QA commented on PIG-982: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org against trunk revision 819691. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/51/console This message is automatically generated. [zebra] Prevent checkin test cases from running twice in nightly test. -- Key: PIG-982 URL: https://issues.apache.org/jira/browse/PIG-982 Project: Pig Issue Type: Bug Reporter: Chao Wang Assignee: Chao Wang Attachments: patch_build Currently check-in test cases are running twice in nightly test. This jira is to fix this problem and also make some other polishing changes to Zebra's build script. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-953) Enable merge join in pig to work with loaders and store functions which can internally index sorted data
[ https://issues.apache.org/jira/browse/PIG-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12760710#action_12760710 ] Pradeep Kamath commented on PIG-953: Dmitriy, I looked at the ResourceSchema proposed in http://wiki.apache.org/pig/LoadStoreRedesignProposal and also spoke with Alan to understand the intent better. The eventual goal is for the setSchema() call in StoreFunc to give the ResourceSchema to the store implementation. The ResourceSchema will contain both pig schema information and sort column information. So Zebra or any other storage function which needs to know about sort columns will get the information from the ResourceSchema passed in setSchema(). However, today the pig runtime conveys the pig schema to store functions through StoreConfig. We need a separate way to give sort information since the pig schema cannot carry it. Since this problem will be solved through setSchema() after the rewrite of the load/store interfaces, whatever solution we come up with now in this jira will need to be rewritten anyway. So it is cleaner to keep only sort column information in SortColInfo and have an array of SortColInfo in SortInfo. If instead we use ResourceSchema, then StoreConfig will have a pig Schema and a ResourceSchema, which would be confusing to callers. In short, since this piece of code will need a rewrite later, it is better not to make it generic now and to address only immediate needs; the rewrite should remove the multiple representations of schema/sort information. 
Enable merge join in pig to work with loaders and store functions which can internally index sorted data - Key: PIG-953 URL: https://issues.apache.org/jira/browse/PIG-953 Project: Pig Issue Type: Improvement Affects Versions: 0.3.0 Reporter: Pradeep Kamath Assignee: Pradeep Kamath Attachments: PIG-953-2.patch, PIG-953.patch Currently merge join implementation in pig includes construction of an index on sorted data and use of that index to seek into the right input to efficiently perform the join operation. Some loaders (notably the zebra loader) internally implement an index on sorted data and can perform this seek efficiently using their index. So the use of the index needs to be abstracted in such a way that when the loader supports indexing, pig uses it (indirectly through the loader) and does not construct an index. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
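The shape described in the comment above (a SortInfo that holds an array of SortColInfo and nothing else) can be sketched as follows. This is only an illustration and is not taken from the PIG-953 patches: the field names, the Order enum, and the accessor names are assumptions.

```java
import java.util.Arrays;

/** Hypothetical sketch of one sort column; fields are assumed, not from the patch. */
final class SortColInfo {
    enum Order { ASCENDING, DESCENDING }

    private final String colName; // column alias (assumed to be available)
    private final int colIndex;   // position of the column in the stored tuple
    private final Order order;    // direction in which the column is sorted

    SortColInfo(String colName, int colIndex, Order order) {
        this.colName = colName;
        this.colIndex = colIndex;
        this.order = order;
    }

    String getColName() { return colName; }
    int getColIndex() { return colIndex; }
    Order getOrder() { return order; }
}

/** Hypothetical sketch: carries only sort columns, no schema information. */
final class SortInfo {
    private final SortColInfo[] sortColumns;

    SortInfo(SortColInfo[] sortColumns) {
        // Defensive copy so later changes to the caller's array do not leak in.
        this.sortColumns = Arrays.copyOf(sortColumns, sortColumns.length);
    }

    SortColInfo[] getSortColumns() {
        return Arrays.copyOf(sortColumns, sortColumns.length);
    }
}
```

Keeping the type this narrow matches the argument in the comment: a store function would read sort columns from it and continue to read the pig schema from StoreConfig, so the two representations do not overlap.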
[jira] Commented: (PIG-956) Reduce patch testing time
[ https://issues.apache.org/jira/browse/PIG-956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12760716#action_12760716 ] Daniel Dai commented on PIG-956: Patch looks good. It takes less than 8 min to run test-commit on my machine. For the release audit warning, I think it is unnecessary to add license comments to a plain text file. +1 for the patch. Reduce patch testing time - Key: PIG-956 URL: https://issues.apache.org/jira/browse/PIG-956 Project: Pig Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Olga Natkovich Assignee: Olga Natkovich Fix For: 0.6.0 Attachments: PIG-956.patch The proposal is to split the tests into 2 groups: (1) Ten-minute tests - this is a set of tests that runs with every patch submission and takes approximately 10 minutes (2) All tests - these include all tests and they will run nightly This is similar to work done in Hadoop: http://issues.apache.org/jira/browse/HDFS-458 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-922: --- Attachment: PIG-922-p3_3.patch Attach PIG-922-p3_3.patch to address concerns and comments by Pradeep. Logical optimizer: push up project -- Key: PIG-922 URL: https://issues.apache.org/jira/browse/PIG-922 Project: Pig Issue Type: New Feature Components: impl Affects Versions: 0.3.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.6.0 Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, PIG-922-p3_2.patch, PIG-922-p3_3.patch This is a continuation of the work in [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add another rule to the logical optimizer: push up project, i.e., prune columns as early as possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-922: --- Status: Open (was: Patch Available) Logical optimizer: push up project -- Key: PIG-922 URL: https://issues.apache.org/jira/browse/PIG-922 Project: Pig Issue Type: New Feature Components: impl Affects Versions: 0.3.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.6.0 Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, PIG-922-p3_2.patch, PIG-922-p3_3.patch This is a continuation of the work in [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add another rule to the logical optimizer: push up project, i.e., prune columns as early as possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-922: --- Status: Patch Available (was: Open) Logical optimizer: push up project -- Key: PIG-922 URL: https://issues.apache.org/jira/browse/PIG-922 Project: Pig Issue Type: New Feature Components: impl Affects Versions: 0.3.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.6.0 Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, PIG-922-p3_2.patch, PIG-922-p3_3.patch This is a continuation of the work in [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add another rule to the logical optimizer: push up project, i.e., prune columns as early as possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12760749#action_12760749 ] Daniel Dai commented on PIG-948: Since most comments are in favor of this change, I am going to commit this patch, including the url construction part. I will change sleepTime from 500 to 1000. In all cases I have experimented with, I can get the correct jobid after 1000ms. [Usability] Relating pig script with MR jobs Key: PIG-948 URL: https://issues.apache.org/jira/browse/PIG-948 Project: Pig Issue Type: Improvement Components: impl Affects Versions: 0.4.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Priority: Minor Fix For: 0.6.0 Attachments: pig-948-2.patch, pig-948.patch Currently it's hard to find a way to relate a pig script with a specific MR job. In a loaded cluster with multiple simultaneous job submissions, it's not easy to figure out which specific MR jobs were launched for a given pig script. If Pig can provide this info, it will be useful to debug and monitor the jobs resulting from a pig script. At the very least, Pig should be able to provide the user with the following information 1) Job id of the launched job. 2) Complete web url of jobtracker running this job. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
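The comment above describes waiting a fixed sleepTime before the job id can be read. As a hedged illustration only, the sketch below shows a related but different technique: a bounded retry that sleeps sleepTimeMs between attempts instead of relying on one fixed delay. The class name, the method name, and the Supplier standing in for whatever call returns the job id are all hypothetical; none of them come from the patch or from the Hadoop API.

```java
import java.util.function.Supplier;

/** Hypothetical helper; not part of Pig or Hadoop. */
final class JobIdPoller {
    private JobIdPoller() {}

    /**
     * Asks jobIdSource for a job id up to maxAttempts times, sleeping
     * sleepTimeMs after each attempt that yields null.
     * Returns the id, or null if it never became available or the
     * thread was interrupted while waiting.
     */
    static String waitForJobId(Supplier<String> jobIdSource, long sleepTimeMs, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String jobId = jobIdSource.get();
            if (jobId != null) {
                return jobId;
            }
            try {
                Thread.sleep(sleepTimeMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up.
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return null;
    }
}
```

A caller might use something like waitForJobId(source, 1000L, 5); with a retry loop the exact sleep value matters less than it does with a single fixed wait, at the cost of a slightly longer worst case.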
[jira] Updated: (PIG-956) Reduce patch testing time
[ https://issues.apache.org/jira/browse/PIG-956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-956: --- Resolution: Fixed Status: Resolved (was: Patch Available) patch committed to trunk and branch-0.5 Reduce patch testing time - Key: PIG-956 URL: https://issues.apache.org/jira/browse/PIG-956 Project: Pig Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Olga Natkovich Assignee: Olga Natkovich Fix For: 0.6.0 Attachments: PIG-956.patch The proposal is to split the tests into 2 groups: (1) Ten-minute tests - this is a set of tests that runs with every patch submission and takes approximately 10 minutes (2) All tests - these include all tests and they will run nightly This is similar to work done in Hadoop: http://issues.apache.org/jira/browse/HDFS-458 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-922) Logical optimizer: push up project
[ https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12760801#action_12760801 ] Hadoop QA commented on PIG-922: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12420840/PIG-922-p3_3.patch against trunk revision 820111. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 27 new or modified tests. -1 javadoc. The javadoc tool appears to have generated 1 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 4 new Findbugs warnings. -1 release audit. The applied patch generated 288 release audit warnings (more than the trunk's current 281 warnings). -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/52/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/52/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/52/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/52/console This message is automatically generated. 
Logical optimizer: push up project -- Key: PIG-922 URL: https://issues.apache.org/jira/browse/PIG-922 Project: Pig Issue Type: New Feature Components: impl Affects Versions: 0.3.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.6.0 Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, PIG-922-p3_2.patch, PIG-922-p3_3.patch This is a continuation of the work in [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add another rule to the logical optimizer: push up project, i.e., prune columns as early as possible. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs
[ https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12760808#action_12760808 ] Hadoop QA commented on PIG-948: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12420851/pig-948-3.patch against trunk revision 820111. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 405 javac compiler warnings (more than the trunk's current 403 warnings). +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/11/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/11/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/11/console This message is automatically generated. [Usability] Relating pig script with MR jobs Key: PIG-948 URL: https://issues.apache.org/jira/browse/PIG-948 Project: Pig Issue Type: Improvement Components: impl Affects Versions: 0.4.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Priority: Minor Fix For: 0.6.0 Attachments: pig-948-2.patch, pig-948-3.patch, pig-948.patch Currently it's hard to find a way to relate a pig script with a specific MR job. In a loaded cluster with multiple simultaneous job submissions, it's not easy to figure out which specific MR jobs were launched for a given pig script. If Pig can provide this info, it will be useful to debug and monitor the jobs resulting from a pig script. At the very least, Pig should be able to provide the user with the following information 1) Job id of the launched job. 2) Complete web url of jobtracker running this job. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-960) Using Hadoop's optimized LineRecordReader for reading Tuples in PigStorage
[ https://issues.apache.org/jira/browse/PIG-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12760822#action_12760822 ] Ankit Modi commented on PIG-960: Thanks for the comments, Daniel. Answers: 1. PigLineRecordReader (PLRR) needs to know the type of InputStream it is handling: BZip2 or uncompressed. Depending on the type of input stream, it chooses which Reader to use. BPIS (BufferedPositionedInputStream) stores the input stream as a protected member. PLRR can access it in one of the following ways: - making the member public, - adding a get method to access it, or - inheriting. I implemented the last one as it makes the fewest changes to BPIS. 2. Good one. Will be fixed in the next patch. 3. Will be added in the next patch. 4. Corrected in the next patch. Using Hadoop's optimized LineRecordReader for reading Tuples in PigStorage --- Key: PIG-960 URL: https://issues.apache.org/jira/browse/PIG-960 Project: Pig Issue Type: Improvement Components: impl Reporter: Ankit Modi Attachments: pig_rlr.patch PigStorage's reading of Tuples (lines) can be optimized using Hadoop's {{LineRecordReader}}. This can help in the following areas: - Improving the performance of reading Tuples (lines) in {{PigStorage}} - Any future improvements in line reading made in Hadoop's {{LineRecordReader}} are automatically carried over to Pig. Issues that are handled by this patch: - BZip uses internal buffers and positioning for determining the number of bytes read. Hence the buffering done by {{LineRecordReader}} has to be turned off - The current implementation of {{LocalSeekableInputStream}} does not implement the {{available}} method. This method has to be implemented. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
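The third option in answer 1 above (reaching a protected member through inheritance rather than widening its visibility) can be illustrated with the minimal sketch below. Every class here is a hypothetical stand-in written for this example: PositionedStream plays the role of BufferedPositionedInputStream, FakeBzip2Stream plays the role of a bzip2 stream, and StreamTypeProbe plays the role of the subclass. None of this code is taken from pig_rlr.patch.

```java
import java.io.FilterInputStream;
import java.io.InputStream;

/** Hypothetical stand-in for a bzip2 decompression stream. */
class FakeBzip2Stream extends FilterInputStream {
    FakeBzip2Stream(InputStream in) {
        super(in);
    }
}

/** Stand-in for BPIS: the wrapped stream is a protected member, with no getter. */
class PositionedStream {
    protected final InputStream in;

    PositionedStream(InputStream in) {
        this.in = in;
    }
}

/** A subclass can inspect the protected field, so the base class needs no changes. */
class StreamTypeProbe extends PositionedStream {
    StreamTypeProbe(InputStream in) {
        super(in);
    }

    /** True when the wrapped stream is the bzip2 stand-in, i.e. reader-side buffering should be off. */
    boolean needsUnbufferedReader() {
        return in instanceof FakeBzip2Stream;
    }
}
```

The trade-off is the one stated in the comment: the base class stays untouched, while a public field or a new getter would change its API for every caller.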