[jira] Updated: (PIG-653) Make fieldsToRead work in loader
[ https://issues.apache.org/jira/browse/PIG-653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Jain updated PIG-653: Attachment: PIG-653.patch Zebra changes for the proposed feature. Please review at your earliest convenience. Make fieldsToRead work in loader Key: PIG-653 URL: https://issues.apache.org/jira/browse/PIG-653 Project: Pig Issue Type: New Feature Reporter: Alan Gates Assignee: Pradeep Kamath Attachments: PIG-653-2.comment, PIG-653-3-proposal.txt, PIG-653.patch Currently pig does not call the fieldsToRead function in LoadFunc, thus it does not provide information to load functions on what fields are needed. We need to implement a visitor that determines (where possible) which fields in a file will be used and relays that information to the load function. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
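The idea in the PIG-653 description (the planner works out which fields a script uses and tells the loader, so the loader can skip the rest) can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: `ProjectingLoader`, `setRequiredFields`, and the tab-delimited input are hypothetical names and choices made for the example, and none of it is Pig's actual LoadFunc API or the fieldsToRead signature.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a load function; NOT Pig's real LoadFunc interface.
class ProjectingLoader {
    // Column indexes the planner says are needed; null means "read everything".
    private int[] requiredFields;

    // Assumed hook: a planner-side visitor would call this once it knows
    // which columns the rest of the script actually uses.
    void setRequiredFields(int[] requiredFields) {
        this.requiredFields = requiredFields == null ? null : requiredFields.clone();
    }

    // Parse one tab-delimited line, keeping only the required columns.
    List<String> parse(String line) {
        String[] all = line.split("\t", -1);
        if (requiredFields == null) {
            return new ArrayList<>(Arrays.asList(all));
        }
        List<String> projected = new ArrayList<>();
        for (int idx : requiredFields) {
            // A short row yields null for a missing column rather than failing.
            projected.add(idx < all.length ? all[idx] : null);
        }
        return projected;
    }
}
```

With no required-field information the sketch returns every column, which mirrors the "where possible" wording in the issue: when the planner cannot determine the used fields, the loader falls back to reading everything.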
[jira] Updated: (PIG-1119) [zebra] group is a Pig preserved word, zebra needs to use other string for table's group information
[ https://issues.apache.org/jira/browse/PIG-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Jain updated PIG-1119: - Attachment: PIG-1119.patch Changes incorporated as part for code review feedback [zebra] group is a Pig preserved word, zebra needs to use other string for table's group information -- Key: PIG-1119 URL: https://issues.apache.org/jira/browse/PIG-1119 Project: Pig Issue Type: Bug Affects Versions: 0.6.0 Reporter: Jing Huang Fix For: 0.6.0 Attachments: PIG-1119.patch, PIG-1119.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1119) [zebra] group is a Pig preserved word, zebra needs to use other string for table's group information
[ https://issues.apache.org/jira/browse/PIG-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Jain updated PIG-1119: - Status: Open (was: Patch Available) Providing an updated version [zebra] group is a Pig preserved word, zebra needs to use other string for table's group information -- Key: PIG-1119 URL: https://issues.apache.org/jira/browse/PIG-1119 Project: Pig Issue Type: Bug Affects Versions: 0.6.0 Reporter: Jing Huang Fix For: 0.6.0 Attachments: PIG-1119.patch, PIG-1119.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-653) Make fieldsToRead work in loader
[ https://issues.apache.org/jira/browse/PIG-653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Jain updated PIG-653: Status: Patch Available (was: Open) Make fieldsToRead work in loader Key: PIG-653 URL: https://issues.apache.org/jira/browse/PIG-653 Project: Pig Issue Type: New Feature Reporter: Alan Gates Assignee: Pradeep Kamath Attachments: PIG-653-2.comment, PIG-653-3-proposal.txt, PIG-653.patch Currently pig does not call the fieldsToRead function in LoadFunc, thus it does not provide information to load functions on what fields are needed. We need to implement a visitor that determines (where possible) which fields in a file will be used and relays that information to the load function. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1119) [zebra] group is a Pig preserved word, zebra needs to use other string for table's group information
[ https://issues.apache.org/jira/browse/PIG-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Jain updated PIG-1119: - Status: Patch Available (was: Open) [zebra] group is a Pig preserved word, zebra needs to use other string for table's group information -- Key: PIG-1119 URL: https://issues.apache.org/jira/browse/PIG-1119 Project: Pig Issue Type: Bug Affects Versions: 0.6.0 Reporter: Jing Huang Fix For: 0.6.0 Attachments: PIG-1119.patch, PIG-1119.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure
[ https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriranjan Manjunath updated PIG-1105: - Attachment: PIG-1105.patch COUNT_STAR accumulate interface implementation cases failure Key: PIG-1105 URL: https://issues.apache.org/jira/browse/PIG-1105 Project: Pig Issue Type: Bug Components: impl Reporter: Thejas M Nair Assignee: Sriranjan Manjunath Fix For: 0.6.0 Attachments: PIG-1105.1.patch, PIG-1105.patch COUNT_STAR.accumulate is calling sum() which is supposed to be used by intermediate and final parts of algebraic interface. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
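The distinction behind PIG-1105 can be illustrated with a small Java sketch. It assumes, for illustration only, that the accumulate path receives batches of raw input rows while the intermediate and final algebraic stages receive partial counts; `CountStarSketch` and its method names are hypothetical and are not taken from the COUNT_STAR source or from the attached patch.

```java
import java.util.List;

// Hypothetical illustration; not the actual COUNT_STAR source in Pig.
class CountStarSketch {
    private long running = 0;

    // Accumulate path: each batch holds raw input rows, so count the rows.
    void accumulate(List<String> batchOfRows) {
        running += batchOfRows.size();
    }

    // Algebraic intermediate/final path: inputs are partial counts, so add them.
    static long sumPartialCounts(List<Long> partialCounts) {
        long total = 0;
        for (long c : partialCounts) {
            total += c;
        }
        return total;
    }

    // Count accumulated so far across all batches.
    long getValue() {
        return running;
    }
}
```

Under these assumptions, applying the summing routine to raw rows would try to interpret rows as partial counts, which is the kind of mismatch the issue describes.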
Build failed in Hudson: Pig-trunk #636
See http://hudson.zones.apache.org/hudson/job/Pig-trunk/636/changes Changes: [yanz] PIG- Multiple Outputs Support (Gaurav Jain via yanz) [olga] PIG-1084: Pig 0.6.0 Documentation improvements (chandec via olgan) [daijy] PIG-922: Logical optimizer: push up project [gates] PIG-1068: COGROUP fails with 'Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableTuple' [yanz] PIG-1122 Changed version number of pig dev core jar used in Zebra build from 0.6.0 to 0.7.0 to match Pig version number (yanz) -- [...truncated 2701 lines...] ivy-init-dirs: ivy-probe-antlib: ivy-init-antlib: ivy-init: ivy-buildJar: [ivy:resolve] :: resolving dependencies :: org.apache.pig#Pig;2009-12-04_10-05-58 [ivy:resolve] confs: [buildJar] [ivy:resolve] found com.jcraft#jsch;0.1.38 in maven2 [ivy:resolve] found jline#jline;0.9.94 in maven2 [ivy:resolve] found net.java.dev.javacc#javacc;4.2 in maven2 [ivy:resolve] found junit#junit;4.5 in default [ivy:resolve] :: resolution report :: resolve 88ms :: artifacts dl 5ms - | |modules|| artifacts | | conf | number| search|dwnlded|evicted|| number|dwnlded| - | buildJar | 4 | 0 | 0 | 0 || 4 | 0 | - [ivy:retrieve] :: retrieving :: org.apache.pig#Pig [ivy:retrieve] confs: [buildJar] [ivy:retrieve] 1 artifacts copied, 3 already retrieved (288kB/4ms) buildJar: [echo] svnString 887139 [jar] Building jar: http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/pig-2009-12-04_10-05-58.jar [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk jarWithOutSvn: findbugs: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs [findbugs] Executing findbugs from ant task [findbugs] Running FindBugs... 
[findbugs] The following classes needed for analysis were missing: [findbugs] com.jcraft.jsch.SocketFactory [findbugs] com.jcraft.jsch.Logger [findbugs] jline.Completor [findbugs] com.jcraft.jsch.Session [findbugs] com.jcraft.jsch.HostKeyRepository [findbugs] com.jcraft.jsch.JSch [findbugs] com.jcraft.jsch.UserInfo [findbugs] jline.ConsoleReaderInputStream [findbugs] com.jcraft.jsch.HostKey [findbugs] jline.ConsoleReader [findbugs] com.jcraft.jsch.ChannelExec [findbugs] jline.History [findbugs] com.jcraft.jsch.ChannelDirectTCPIP [findbugs] com.jcraft.jsch.JSchException [findbugs] com.jcraft.jsch.Channel [findbugs] Warnings generated: 20 [findbugs] Missing classes: 16 [findbugs] Calculating exit code... [findbugs] Setting 'missing class' flag (2) [findbugs] Setting 'bugs found' flag (1) [findbugs] Exit code set to: 3 [findbugs] Java Result: 3 [findbugs] Classes needed for analysis were missing [findbugs] Output saved to http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs/pig-findbugs-report.xml [xslt] Processing http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs/pig-findbugs-report.xml to http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs/pig-findbugs-report.html [xslt] Loading stylesheet /homes/gkesavan/tools/findbugs/latest/src/xsl/default.xsl BUILD SUCCESSFUL Total time: 2 minutes 52 seconds + mv build/pig-2009-12-04_10-05-58.tar.gz http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk + mv build/test/findbugs http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk + mv build/docs/api http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk + /homes/hudson/tools/ant/apache-ant-1.7.0/bin/ant clean Buildfile: build.xml clean: [delete] Deleting directory http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/src-gen [delete] Deleting directory http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/src/docs/build [delete] Deleting directory 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build [delete] Deleting directory http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/test/org/apache/pig/test/utils/dotGraph/parser BUILD SUCCESSFUL Total time: 0 seconds + /homes/hudson/tools/ant/apache-ant-1.7.0/bin/ant -Dtest.junit.output.format=xml -Dtest.output=yes -Dcheckstyle.home=/homes/hudson/tools/checkstyle/latest -Drun.clover=true -Dclover.home=/homes/hudson/tools/clover/latest clover test generate-clover-reports Buildfile: build.xml clover.setup: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/clover/db [clover-setup] Clover Version 2.4.3, built on
[jira] Updated: (PIG-1104) [zebra] Provide streaming support in Zebra.
[ https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-1104: -- Status: Patch Available (was: Open) [zebra] Provide streaming support in Zebra. --- Key: PIG-1104 URL: https://issues.apache.org/jira/browse/PIG-1104 Project: Pig Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Chao Wang Assignee: Chao Wang Fix For: 0.6.0, 0.7.0 Attachments: PIG1104.patch Hadoop streaming is very popular among Hadoop users. The main attraction is the simplicity of use. A user can write the application logic in any language and process large amounts of data using the Hadoop framework. As more people start to use Zebra to store their data, we expect users would like to run Hadoop streaming scripts to easily process Zebra tables. The following is a simple example of using Hadoop streaming to access Zebra data. It loads data from the foo table using Zebra's TableInputFormat and then writes the data into output using the default TextOutputFormat. $ hadoop jar hadoop-streaming.jar -D mapred.reduce.tasks=0 -input foo -output output -mapper 'cat' -inputformat org.apache.hadoop.zebra.mapred.TableInputFormat In more detail, Zebra uses Pig's DefaultTuple implementation of Tuple for its records. Currently, when Zebra's TableInputFormat is used for input, the user script sees each line containing key_if_any\tTuple.toString() . We plan to generate a CSV-format representation of our Pig tuples. To this end, we plan to do the following: 1) Derive a subclass ZebraTuple from Pig's DefaultTuple class and override its toString() method to present the data in CSV format. 2) On the Zebra side, the tuple factory should be changed to create ZebraTuple objects instead of DefaultTuple objects. Note that we can only support streaming on the input side, that is, the ability to use streaming to read data from Zebra tables. 
For the output side, the streaming support is not feasible, since the streaming mapper or reducer only emits Text\tText, the output collector has no way of knowing how to convert this to (BytesWritable,Tuple). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
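The two-step plan in the PIG-1104 description (subclass the tuple class, override toString() to emit CSV) might look roughly like the Java sketch below. To keep it self-contained, `SimpleTuple` is a hypothetical stand-in for Pig's DefaultTuple and `CsvTuple` stands in for the proposed ZebraTuple; the quoting rule is also an assumption, since the issue does not say how embedded commas or quotes should be handled.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for Pig's DefaultTuple (assumption: a simple list of fields).
class SimpleTuple {
    protected final List<Object> fields = new ArrayList<>();

    void append(Object value) {
        fields.add(value);
    }

    @Override
    public String toString() {
        return fields.toString();
    }
}

// Stand-in for the proposed ZebraTuple: same data, CSV rendering.
class CsvTuple extends SimpleTuple {
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(escape(fields.get(i)));
        }
        return sb.toString();
    }

    // Assumed quoting rule: quote fields containing a comma, quote, or newline,
    // and render null as an empty field.
    private static String escape(Object value) {
        if (value == null) {
            return "";
        }
        String s = value.toString();
        if (s.contains(",") || s.contains("\"") || s.contains("\n")) {
            return "\"" + s.replace("\"", "\"\"") + "\"";
        }
        return s;
    }
}
```

Because only toString() changes, a streaming script that reads each input line would see comma-separated fields without any change to how the tuple stores its data.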
[jira] Commented: (PIG-653) Make fieldsToRead work in loader
[ https://issues.apache.org/jira/browse/PIG-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12785928#action_12785928 ] Hadoop QA commented on PIG-653: --- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426879/PIG-653.patch against trunk revision 887049. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 97 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 395 release audit warnings (more than the trunk's current 368 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/89/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/89/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/89/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/89/console This message is automatically generated. Make fieldsToRead work in loader Key: PIG-653 URL: https://issues.apache.org/jira/browse/PIG-653 Project: Pig Issue Type: New Feature Reporter: Alan Gates Assignee: Pradeep Kamath Attachments: PIG-653-2.comment, PIG-653-3-proposal.txt, PIG-653.patch Currently pig does not call the fieldsToRead function in LoadFunc, thus it does not provide information to load functions on what fields are needed. 
We need to implement a visitor that determines (where possible) which fields in a file will be used and relays that information to the load function. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-653) Make fieldsToRead work in loader
[ https://issues.apache.org/jira/browse/PIG-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12785936#action_12785936 ] Yan Zhou commented on PIG-653: -- The 27 release audit failures are from 25 pig test scripts and 2 test data files, none of them are source files and should be ignored. Make fieldsToRead work in loader Key: PIG-653 URL: https://issues.apache.org/jira/browse/PIG-653 Project: Pig Issue Type: New Feature Reporter: Alan Gates Assignee: Pradeep Kamath Attachments: PIG-653-2.comment, PIG-653-3-proposal.txt, PIG-653.patch Currently pig does not call the fieldsToRead function in LoadFunc, thus it does not provide information to load functions on what fields are needed. We need to implement a visitor that determines (where possible) which fields in a file will be used and relays that information to the load function. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-653) Make fieldsToRead work in loader
[ https://issues.apache.org/jira/browse/PIG-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12785937#action_12785937 ] Yan Zhou commented on PIG-653: -- A typo in my last comment. should have been 27 audit *warnings* not *failures* Make fieldsToRead work in loader Key: PIG-653 URL: https://issues.apache.org/jira/browse/PIG-653 Project: Pig Issue Type: New Feature Reporter: Alan Gates Assignee: Pradeep Kamath Attachments: PIG-653-2.comment, PIG-653-3-proposal.txt, PIG-653.patch Currently pig does not call the fieldsToRead function in LoadFunc, thus it does not provide information to load functions on what fields are needed. We need to implement a visitor that determines (where possible) which fields in a file will be used and relays that information to the load function. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1119) [zebra] group is a Pig preserved word, zebra needs to use other string for table's group information
[ https://issues.apache.org/jira/browse/PIG-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12785939#action_12785939 ] Yan Zhou commented on PIG-1119: --- +1 [zebra] group is a Pig preserved word, zebra needs to use other string for table's group information -- Key: PIG-1119 URL: https://issues.apache.org/jira/browse/PIG-1119 Project: Pig Issue Type: Bug Affects Versions: 0.6.0 Reporter: Jing Huang Fix For: 0.6.0 Attachments: PIG-1119.patch, PIG-1119.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1126) piggybank loaders need to update fieldsToRead function
[ https://issues.apache.org/jira/browse/PIG-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-1126: Attachment: PIG-1126.patch All unit tests passed. Please, review. (This is for both the trunk and 0.6 branch) piggybank loaders need to update fieldsToRead function -- Key: PIG-1126 URL: https://issues.apache.org/jira/browse/PIG-1126 Project: Pig Issue Type: Bug Reporter: Olga Natkovich Assignee: Olga Natkovich Fix For: 0.6.0 Attachments: PIG-1126.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1126) piggybank loaders need to update fieldsToRead function
[ https://issues.apache.org/jira/browse/PIG-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786013#action_12786013 ] Daniel Dai commented on PIG-1126: - +1 piggybank loaders need to update fieldsToRead function -- Key: PIG-1126 URL: https://issues.apache.org/jira/browse/PIG-1126 Project: Pig Issue Type: Bug Reporter: Olga Natkovich Assignee: Olga Natkovich Fix For: 0.6.0 Attachments: PIG-1126.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1119) [zebra] group is a Pig preserved word, zebra needs to use other string for table's group information
[ https://issues.apache.org/jira/browse/PIG-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786027#action_12786027 ] Hadoop QA commented on PIG-1119: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426881/PIG-1119.patch against trunk revision 887049. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 39 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/90/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/90/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/90/console This message is automatically generated. [zebra] group is a Pig preserved word, zebra needs to use other string for table's group information -- Key: PIG-1119 URL: https://issues.apache.org/jira/browse/PIG-1119 Project: Pig Issue Type: Bug Affects Versions: 0.6.0 Reporter: Jing Huang Fix For: 0.6.0 Attachments: PIG-1119.patch, PIG-1119.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure
[ https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786031#action_12786031 ] Hadoop QA commented on PIG-1105: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426887/PIG-1105.patch against trunk revision 887290. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/91/console This message is automatically generated. COUNT_STAR accumulate interface implementation cases failure Key: PIG-1105 URL: https://issues.apache.org/jira/browse/PIG-1105 Project: Pig Issue Type: Bug Components: impl Reporter: Thejas M Nair Assignee: Sriranjan Manjunath Fix For: 0.6.0 Attachments: PIG-1105.1.patch, PIG-1105.patch COUNT_STAR.accumulate is calling sum() which is supposed to be used by intermediate and final parts of algebraic interface. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-653) Make fieldsToRead work in loader
[ https://issues.apache.org/jira/browse/PIG-653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-653: --- Resolution: Duplicate Status: Resolved (was: Patch Available) PIG-922 Make fieldsToRead work in loader Key: PIG-653 URL: https://issues.apache.org/jira/browse/PIG-653 Project: Pig Issue Type: New Feature Reporter: Alan Gates Assignee: Pradeep Kamath Attachments: PIG-653-2.comment, PIG-653-3-proposal.txt, PIG-653.patch Currently pig does not call the fieldsToRead function in LoadFunc, thus it does not provide information to load functions on what fields are needed. We need to implement a visitor that determines (where possible) which fields in a file will be used and relays that information to the load function. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-653) Make fieldsToRead work in loader
[ https://issues.apache.org/jira/browse/PIG-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786046#action_12786046 ] Yan Zhou commented on PIG-653: -- Zebra changes committed to both trunk and the 6.0 branch. Make fieldsToRead work in loader Key: PIG-653 URL: https://issues.apache.org/jira/browse/PIG-653 Project: Pig Issue Type: New Feature Reporter: Alan Gates Assignee: Pradeep Kamath Attachments: PIG-653-2.comment, PIG-653-3-proposal.txt, PIG-653.patch Currently pig does not call the fieldsToRead function in LoadFunc, thus it does not provide information to load functions on what fields are needed. We need to implement a visitor that determines (where possible) which fields in a file will be used and relays that information to the load function. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (PIG-1126) piggybank loaders need to update fieldsToRead function
[ https://issues.apache.org/jira/browse/PIG-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich resolved PIG-1126. - Resolution: Fixed Patch committed to trunk and 0.6 branch. piggybank loaders need to update fieldsToRead function -- Key: PIG-1126 URL: https://issues.apache.org/jira/browse/PIG-1126 Project: Pig Issue Type: Bug Reporter: Olga Natkovich Assignee: Olga Natkovich Fix For: 0.6.0 Attachments: PIG-1126.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1086) Nested sort by * throw exception
[ https://issues.apache.org/jira/browse/PIG-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Ding updated PIG-1086: -- Status: Patch Available (was: Open) Nested sort by * throw exception Key: PIG-1086 URL: https://issues.apache.org/jira/browse/PIG-1086 Project: Pig Issue Type: Bug Affects Versions: 0.5.0 Reporter: Daniel Dai Assignee: Richard Ding Attachments: PIG-1086.patch The following script fails: A = load '1.txt' as (a0, a1, a2); B = group A by a0; C = foreach B { D = order A by *; generate group, D;}; explain C; Here is the stack trace: Caused by: java.lang.ArrayIndexOutOfBoundsException: -1 at java.util.ArrayList.get(ArrayList.java:324) at org.apache.pig.impl.logicalLayer.schema.Schema.getField(Schema.java:752) at org.apache.pig.impl.logicalLayer.LOSort.getSortInfo(LOSort.java:332) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:1365) at org.apache.pig.impl.logicalLayer.LOSort.visit(LOSort.java:176) at org.apache.pig.impl.logicalLayer.LOSort.visit(LOSort.java:43) at org.apache.pig.impl.plan.DependencyOrderWalkerWOSeenChk.walk(DependencyOrderWalkerWOSeenChk.java:69) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:1274) at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:130) at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:45) at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:69) at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51) at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:234) at org.apache.pig.PigServer.compilePp(PigServer.java:864) at org.apache.pig.PigServer.explain(PigServer.java:583) ... 8 more -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-1127) Logical operator should contains individual copy of schema object
Logical operator should contains individual copy of schema object - Key: PIG-1127 URL: https://issues.apache.org/jira/browse/PIG-1127 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.6.0 Currently some logical operators contain only a reference to the predecessor's schema object. These logical operators include: LOSplitOutput, LOLimit, LOSplit, LOFilter, LOSort, LODistinct, LOUnion. This was acceptable before because we did not change a schema object once it was set. Now, with the column pruner (PIG-922), we need to change individual schema objects, so sharing a reference is no longer acceptable. For example, the following script fails: {code} a = load '1.txt' as (a0, a1:map[], a2); b = foreach a generate a1; c = limit b 10; dump c; {code} We need to fix it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
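The aliasing problem described in PIG-1127 can be reproduced in miniature with plain Java collections. In this sketch a list of column names stands in for Pig's Schema class, and `OpSketch` with its `sharing` and `copying` factory methods is hypothetical; it shows the general pattern only, not the code in the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical operator; a list of column names stands in for Pig's Schema.
class OpSketch {
    final List<String> schema;

    OpSketch(List<String> schema) {
        this.schema = schema;
    }

    // Problem pattern: the downstream operator reuses the predecessor's object,
    // so pruning a column downstream also changes the predecessor's schema.
    static OpSketch sharing(OpSketch predecessor) {
        return new OpSketch(predecessor.schema);
    }

    // Suggested pattern: the downstream operator gets its own copy,
    // so each operator's schema can be changed independently.
    static OpSketch copying(OpSketch predecessor) {
        return new OpSketch(new ArrayList<>(predecessor.schema));
    }
}
```

Removing a column from an operator built with `sharing` silently shrinks the predecessor's schema as well, whereas an operator built with `copying` can be pruned without side effects. A real schema with nested fields would presumably need a deep copy rather than the shallow list copy shown here.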
[jira] Updated: (PIG-1118) expression with aggregate functions returning null, with accumulate interface
[ https://issues.apache.org/jira/browse/PIG-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-1118: Resolution: Fixed Status: Resolved (was: Patch Available) patch committed on trunk and branch 0.6 expression with aggregate functions returning null, with accumulate interface - Key: PIG-1118 URL: https://issues.apache.org/jira/browse/PIG-1118 Project: Pig Issue Type: Bug Reporter: Thejas M Nair Assignee: Ying He Fix For: 0.6.0 Attachments: PIG_1118.patch The problem is in trunk . It works fine in 0.6 branch. l = load '/tmp/students.txt' as (a : chararray,b : chararray,c : int); grunt g = group l by 1; grunt dump g; (1,{(asdfxc,M,23),(qwer,F,21),(uhsdf,M,34),(zxldf,M,21),(qwer,F,23),(oiue,M,54)}) grunt f = foreach g generate SUM(l.c), 1 + SUM(l.c) + SUM(l.c); grunt dump f; (176L,) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-747) Logical to Physical Plan Translation fails when temporary alias are created within foreach
[ https://issues.apache.org/jira/browse/PIG-747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-747: --- Fix Version/s: (was: 0.6.0) 0.7.0 Unlinking from 0.6.0 release. The change is too large to make this late. Logical to Physical Plan Translation fails when temporary alias are created within foreach -- Key: PIG-747 URL: https://issues.apache.org/jira/browse/PIG-747 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Viraj Bhat Assignee: Daniel Dai Fix For: 0.7.0 Attachments: physicalplan.txt, physicalplanprob.pig, PIG-747-1.patch Consider the Pig script which calculates a new column F inside the foreach as: {code} A = load 'physicalplan.txt' as (col1,col2,col3); B = foreach A { D = col1/col2; E = col3/col2; F = E - (D*D); generate F as newcol; }; dump B; {code} This gives the following error: === Caused by: org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogicalToPhysicalTranslatorException: ERROR 2015: Invalid physical operators in the physical plan at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:377) at org.apache.pig.impl.logicalLayer.LOMultiply.visit(LOMultiply.java:63) at org.apache.pig.impl.logicalLayer.LOMultiply.visit(LOMultiply.java:29) at org.apache.pig.impl.plan.DependencyOrderWalkerWOSeenChk.walk(DependencyOrderWalkerWOSeenChk.java:68) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:908) at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:122) at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:41) at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68) at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51) at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:246) ... 
10 more Caused by: org.apache.pig.impl.plan.PlanException: ERROR 0: Attempt to give operator of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Divide multiple outputs. This operator does not support multiple outputs. at org.apache.pig.impl.plan.OperatorPlan.connect(OperatorPlan.java:158) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhysicalPlan.connect(PhysicalPlan.java:89) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:373) ... 19 more === -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1124) Unable to set Custom Job Name using the -Dmapred.job.name parameter
[ https://issues.apache.org/jira/browse/PIG-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-1124: Fix Version/s: (was: 0.6.0) 0.7.0 Unable to set Custom Job Name using the -Dmapred.job.name parameter --- Key: PIG-1124 URL: https://issues.apache.org/jira/browse/PIG-1124 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.6.0 Reporter: Viraj Bhat Priority: Minor Fix For: 0.7.0 As a Hadoop user I want to control the Job name for my analysis via the command line using the following construct:: java -cp pig.jar:$HADOOP_HOME/conf -Dmapred.job.name=hadoop_junkie org.apache.pig.Main broken.pig -Dmapred.job.name should normally set my Hadoop Job name, but somehow during the formation of the job.xml in Pig this information is lost and the job name turns out to be: PigLatin:broken.pig The current workaround seems to be wiring it in the script itself, using the following ( or using parameter substitution). set job.name 'my job' Viraj -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1127) Logical operator should contains individual copy of schema object
[ https://issues.apache.org/jira/browse/PIG-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-1127: Status: Patch Available (was: Open) Logical operator should contains individual copy of schema object - Key: PIG-1127 URL: https://issues.apache.org/jira/browse/PIG-1127 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.6.0 Attachments: PIG-1127-1.patch Currently, some logical operators contain only a reference to their predecessor's schema object. These logical operators include: LOSplitOutput, LOLimit, LOSplit, LOFilter, LOSort, LODistinct, LOUnion. This was acceptable before, because we did not change a schema object once it was set. Now, with the column pruner (PIG-922), we need to change individual schema objects, so sharing is no longer acceptable. For example, the following script fails: {code} a = load '1.txt' as (a0, a1:map[], a2); b = foreach a generate a1; c = limit b 10; dump c; {code} We need to fix it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
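The aliasing problem described in PIG-1127 can be illustrated with a minimal sketch. This is not Pig's code: `Schema`, `Operator`, and `prune` are hypothetical stand-ins, assumed only for illustration, showing why mutating a schema that is shared by reference also changes the predecessor, while an individual copy does not.

```python
import copy


class Schema:
    """Hypothetical stand-in for an operator schema: an ordered list of field names."""

    def __init__(self, fields):
        self.fields = list(fields)


class Operator:
    """Hypothetical logical operator that holds a schema object."""

    def __init__(self, name, schema):
        self.name = name
        self.schema = schema


def prune(op, keep):
    """Mutate op's schema in place, keeping only the named fields."""
    op.schema.fields = [f for f in op.schema.fields if f in keep]


# Shared reference: the downstream operator aliases its predecessor's schema.
load_shared = Operator("load", Schema(["a0", "a1", "a2"]))
limit_shared = Operator("limit", load_shared.schema)
prune(limit_shared, {"a1"})
shared_result = load_shared.schema.fields  # the predecessor changed as well

# Individual copy: the downstream operator owns its own schema object.
load_copy = Operator("load", Schema(["a0", "a1", "a2"]))
limit_copy = Operator("limit", copy.deepcopy(load_copy.schema))
prune(limit_copy, {"a1"})
copied_result = load_copy.schema.fields  # the predecessor is untouched

print(shared_result)
print(copied_result)
```

With the shared reference, pruning the downstream schema silently prunes the upstream one too; with the copy, each operator can be adjusted independently.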
[jira] Updated: (PIG-1127) Logical operator should contains individual copy of schema object
[ https://issues.apache.org/jira/browse/PIG-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-1127: Attachment: PIG-1127-1.patch Logical operator should contains individual copy of schema object - Key: PIG-1127 URL: https://issues.apache.org/jira/browse/PIG-1127 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.6.0 Attachments: PIG-1127-1.patch Currently, some logical operators contain only a reference to their predecessor's schema object. These logical operators include: LOSplitOutput, LOLimit, LOSplit, LOFilter, LOSort, LODistinct, LOUnion. This was acceptable before, because we did not change a schema object once it was set. Now, with the column pruner (PIG-922), we need to change individual schema objects, so sharing is no longer acceptable. For example, the following script fails: {code} a = load '1.txt' as (a0, a1:map[], a2); b = foreach a generate a1; c = limit b 10; dump c; {code} We need to fix it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1104) [zebra] Provide streaming support in Zebra.
[ https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786188#action_12786188 ] Hadoop QA commented on PIG-1104: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426801/PIG1104.patch against trunk revision 887290. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 8 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/92/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/92/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/92/console This message is automatically generated. [zebra] Provide streaming support in Zebra. --- Key: PIG-1104 URL: https://issues.apache.org/jira/browse/PIG-1104 Project: Pig Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Chao Wang Assignee: Chao Wang Fix For: 0.6.0, 0.7.0 Attachments: PIG1104.patch Hadoop streaming is very popular among Hadoop users. The main attraction is the simplicity of use. A user can write the application logic in any language and process large amounts of data using Hadoop framework. As more people start to use Zebra to store their data, we expect users would like to run Hadoop streaming scripts to easily process Zebra tables. 
The following lists a simple example of using Hadoop streaming to access Zebra data. It loads data from foo table using Zebra's TableInputFormat and then writes the data into output using default TextOutputFormat. $ hadoop jar hadoop-streaming.jar -D mapred.reduce.tasks=0 -input foo -output output -mapper 'cat' -inputformat org.apache.hadoop.zebra.mapred.TableInputFormat In more detail, Zebra uses Pig's DefaultTuple implementation of Tuple for its records. Currently, when Zebra's TableInputFormat is used for input, the user script sees each line containing key_if_any\tTuple.toString() . We plan to generate a CSV representation of our Pig tuples. To this end, we plan to do the following: 1) Derive a subclass ZebraTuple from Pig's DefaultTuple class and override its toString() method to present the data in CSV format. 2) On Zebra side, the tuple factory should be changed to create ZebraTuple objects, instead of DefaultTuple objects. Note that we can only support streaming on the input side - ability to use streaming to read data from Zebra tables. On the output side, streaming support is not feasible: since the streaming mapper or reducer only emits Text\tText, the output collector has no way of knowing how to convert this to (BytesWritable,Tuple). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
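The two-step plan above (a tuple subclass whose string form is CSV, plus a factory that creates it) can be sketched as follows. This is an illustrative sketch only, not the Zebra patch: `DefaultTupleSketch`, `ZebraTupleSketch`, `make_tuple`, and `streaming_line` are hypothetical names, and the parenthesized default string form is an assumption made for contrast.

```python
import csv
import io


class DefaultTupleSketch:
    """Hypothetical stand-in for a default tuple; the '(a,b)' string form is assumed."""

    def __init__(self, fields):
        self.fields = list(fields)

    def __str__(self):
        return "(" + ",".join(str(f) for f in self.fields) + ")"


class ZebraTupleSketch(DefaultTupleSketch):
    """Step 1: subclass that overrides the string form to emit one CSV record."""

    def __str__(self):
        buf = io.StringIO()
        # lineterminator="" keeps the record on a single line without a trailing newline.
        csv.writer(buf, lineterminator="").writerow(self.fields)
        return buf.getvalue()


def make_tuple(fields, csv_output=True):
    """Step 2: factory that hands out the CSV-printing subclass by default."""
    cls = ZebraTupleSketch if csv_output else DefaultTupleSketch
    return cls(fields)


def streaming_line(key, tup):
    """Compose the line a streaming script would read: key, a tab, then the tuple text."""
    return f"{key}\t{tup}"


print(streaming_line("k1", make_tuple(["a, b", 3])))
print(streaming_line("k1", make_tuple(["a, b", 3], csv_output=False)))
```

Using a CSV writer rather than a plain join means a field that itself contains a comma is quoted, so the streaming script can split records unambiguously.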
[jira] Updated: (PIG-1128) column pruning causing failure when foreach has user-specified schema
[ https://issues.apache.org/jira/browse/PIG-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-1128: Fix Version/s: 0.6.0 column pruning causing failure when foreach has user-specified schema - Key: PIG-1128 URL: https://issues.apache.org/jira/browse/PIG-1128 Project: Pig Issue Type: Bug Affects Versions: 0.6.0, 0.7.0 Reporter: Thejas M Nair Fix For: 0.6.0 Issue is seen in 0.6.0 and trunk. grunt> l = load 'dummy.txt' as ( c1 : chararray, c2 : int); grunt> f1 = foreach l generate c1 as c1 : chararray, c2 as c2 : int, 'CA' as state : chararray; grunt> f2 = foreach f1 generate c1 as c1 : chararray; grunt> explain f2; 2009-12-04 13:11:19,010 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1022: Type mismatch merging schema prefix. Field Schema: chararray. Other Field Schema: c2: int (It does not matter if the new schema has new/different column names.) grunt> l = load 'dummy.txt' as ( c1 : chararray, c2 : int); grunt> f1 = foreach l generate c1 as c11 : chararray, c2 as c22 : int, 'CA' as state : chararray; grunt> f2 = foreach f1 generate c11 as c111 : chararray; grunt> explain f2; 2009-12-04 13:13:01,462 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1022: Type mismatch merging schema prefix. Field Schema: chararray. Other Field Schema: c22: int -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Build failed in Hudson: Pig-trunk #637
See http://hudson.zones.apache.org/hudson/job/Pig-trunk/637/changes Changes: [olga] PIG-1118: expression with aggregate functions returning null, with accumulate interface (yinghe via olgan) [olga] PIG-1126: updated fieldsToRead function (olgan) [yanz] PIG-653 Pig Projection Push Down (Gaurav Jain via yanz) -- [...truncated 2733 lines...] ivy-init-dirs: ivy-probe-antlib: ivy-init-antlib: ivy-init: ivy-buildJar: [ivy:resolve] :: resolving dependencies :: org.apache.pig#Pig;2009-12-04_22-05-57 [ivy:resolve] confs: [buildJar] [ivy:resolve] found com.jcraft#jsch;0.1.38 in maven2 [ivy:resolve] found jline#jline;0.9.94 in maven2 [ivy:resolve] found net.java.dev.javacc#javacc;4.2 in maven2 [ivy:resolve] found junit#junit;4.5 in default [ivy:resolve] :: resolution report :: resolve 76ms :: artifacts dl 4ms - | |modules|| artifacts | | conf | number| search|dwnlded|evicted|| number|dwnlded| - | buildJar | 4 | 0 | 0 | 0 || 4 | 0 | - [ivy:retrieve] :: retrieving :: org.apache.pig#Pig [ivy:retrieve] confs: [buildJar] [ivy:retrieve] 1 artifacts copied, 3 already retrieved (288kB/4ms) buildJar: [echo] svnString 887379 [jar] Building jar: http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/pig-2009-12-04_22-05-57.jar [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk jarWithOutSvn: findbugs: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs [findbugs] Executing findbugs from ant task [findbugs] Running FindBugs... 
[findbugs] The following classes needed for analysis were missing: [findbugs] com.jcraft.jsch.SocketFactory [findbugs] com.jcraft.jsch.Logger [findbugs] jline.Completor [findbugs] com.jcraft.jsch.Session [findbugs] com.jcraft.jsch.HostKeyRepository [findbugs] com.jcraft.jsch.JSch [findbugs] com.jcraft.jsch.UserInfo [findbugs] jline.ConsoleReaderInputStream [findbugs] com.jcraft.jsch.HostKey [findbugs] jline.ConsoleReader [findbugs] com.jcraft.jsch.ChannelExec [findbugs] jline.History [findbugs] com.jcraft.jsch.ChannelDirectTCPIP [findbugs] com.jcraft.jsch.JSchException [findbugs] com.jcraft.jsch.Channel [findbugs] Warnings generated: 20 [findbugs] Missing classes: 16 [findbugs] Calculating exit code... [findbugs] Setting 'missing class' flag (2) [findbugs] Setting 'bugs found' flag (1) [findbugs] Exit code set to: 3 [findbugs] Java Result: 3 [findbugs] Classes needed for analysis were missing [findbugs] Output saved to http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs/pig-findbugs-report.xml [xslt] Processing http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs/pig-findbugs-report.xml to http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs/pig-findbugs-report.html [xslt] Loading stylesheet /homes/gkesavan/tools/findbugs/latest/src/xsl/default.xsl BUILD SUCCESSFUL Total time: 2 minutes 53 seconds + mv build/pig-2009-12-04_22-05-57.tar.gz http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk + mv build/test/findbugs http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk + mv build/docs/api http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk + /homes/hudson/tools/ant/apache-ant-1.7.0/bin/ant clean Buildfile: build.xml clean: [delete] Deleting directory http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/src-gen [delete] Deleting directory http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/src/docs/build [delete] Deleting directory 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build [delete] Deleting directory http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/test/org/apache/pig/test/utils/dotGraph/parser BUILD SUCCESSFUL Total time: 0 seconds + /homes/hudson/tools/ant/apache-ant-1.7.0/bin/ant -Dtest.junit.output.format=xml -Dtest.output=yes -Dcheckstyle.home=/homes/hudson/tools/checkstyle/latest -Drun.clover=true -Dclover.home=/homes/hudson/tools/clover/latest clover test generate-clover-reports Buildfile: build.xml clover.setup: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/clover/db [clover-setup] Clover Version 2.4.3, built on March 09 2009 (build-756) [clover-setup] Loaded from: /homes/hudson/tools/clover/latest/lib/clover.jar [clover-setup] Clover: Open Source License registered to Apache. [clover-setup] Clover is enabled with initstring
[jira] Updated: (PIG-1110) Handle compressed file formats -- Gz, BZip with the new proposal
[ https://issues.apache.org/jira/browse/PIG-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Ding updated PIG-1110: -- Attachment: PIG-1110.patch Handle compressed file formats -- Gz, BZip with the new proposal Key: PIG-1110 URL: https://issues.apache.org/jira/browse/PIG-1110 Project: Pig Issue Type: Sub-task Reporter: Richard Ding Assignee: Richard Ding Attachments: PIG-1110.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1110) Handle compressed file formats -- Gz, BZip with the new proposal
[ https://issues.apache.org/jira/browse/PIG-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Ding updated PIG-1110: -- Release Note: For compressed BZip files, the load-store branch only supports file extension .bz2. It ignores the file extension .bz and treats those files as regular text files. This change is due to the new version of PigStorage which uses Hadoop's TextInputFormat as its InputFormat. Hadoop Flags: [Incompatible change] Handle compressed file formats -- Gz, BZip with the new proposal Key: PIG-1110 URL: https://issues.apache.org/jira/browse/PIG-1110 Project: Pig Issue Type: Sub-task Reporter: Richard Ding Assignee: Richard Ding Attachments: PIG-1110.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
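The behavior in the release note can be pictured as a lookup keyed on the file extension. The sketch below is an assumption-laden illustration, not the load-store branch's code: `detect_codec` is a hypothetical helper, and the `.gz` mapping is assumed from the issue title rather than stated in the note.

```python
import os


def detect_codec(path):
    """Return a codec name chosen purely from the file extension, or None for plain text."""
    ext = os.path.splitext(path)[1].lower()
    if ext == ".bz2":
        return "bzip2"
    if ext == ".gz":  # assumed mapping; the release note only discusses BZip
        return "gzip"
    # Anything else, including ".bz", is read as a regular text file.
    return None


for name in ("part-00000.bz2", "part-00000.bz", "part-00000.gz", "part-00000"):
    print(name, detect_codec(name))
```

Under this model a file named with `.bz` is never decompressed, which is why the note flags the change as incompatible.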
[jira] Created: (PIG-1129) Pig UDF doc: fieldsToRead function
Pig UDF doc: fieldsToRead function --- Key: PIG-1129 URL: https://issues.apache.org/jira/browse/PIG-1129 Project: Pig Issue Type: Task Components: documentation Affects Versions: 0.6.0 Reporter: Corinne Chandel Assignee: Corinne Chandel Priority: Blocker Fix For: 0.6.0 Updated Pig UDF doc to include information about the fieldsToRead function. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1129) Pig UDF doc: fieldsToRead function
[ https://issues.apache.org/jira/browse/PIG-1129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Corinne Chandel updated PIG-1129: - Attachment: Pig-6-UDF.patch Patch: Pig-6-UDF.patch Pig UDF doc: fieldsToRead function --- Key: PIG-1129 URL: https://issues.apache.org/jira/browse/PIG-1129 Project: Pig Issue Type: Task Components: documentation Affects Versions: 0.6.0 Reporter: Corinne Chandel Assignee: Corinne Chandel Priority: Blocker Fix For: 0.6.0 Attachments: Pig-6-UDF.patch Updated Pig UDF doc to include information about the fieldsToRead function. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1129) Pig UDF doc: fieldsToRead function
[ https://issues.apache.org/jira/browse/PIG-1129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Corinne Chandel updated PIG-1129: - Status: Patch Available (was: Open) Apply patch to Pig trunk: http://svn.apache.org/repos/asf/hadoop/pig/trunk Note: No new test code required; changes to documentation only. Pig UDF doc: fieldsToRead function --- Key: PIG-1129 URL: https://issues.apache.org/jira/browse/PIG-1129 Project: Pig Issue Type: Task Components: documentation Affects Versions: 0.6.0 Reporter: Corinne Chandel Assignee: Corinne Chandel Priority: Blocker Fix For: 0.6.0 Attachments: Pig-6-UDF.patch Updated Pig UDF doc to include information about the fieldsToRead function. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure
[ https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriranjan Manjunath updated PIG-1105: - Attachment: (was: PIG-1105.patch) COUNT_STAR accumulate interface implementation cases failure Key: PIG-1105 URL: https://issues.apache.org/jira/browse/PIG-1105 Project: Pig Issue Type: Bug Components: impl Reporter: Thejas M Nair Assignee: Sriranjan Manjunath Fix For: 0.6.0 Attachments: PIG-1105.1.patch COUNT_STAR.accumulate is calling sum() which is supposed to be used by intermediate and final parts of algebraic interface. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
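The distinction behind PIG-1105 can be sketched as follows: an accumulate step receives batches of raw tuples and should count them, whereas a sum step belongs to the intermediate and final stages, where the inputs are partial counts. The names below (`partial_count`, `sum_partials`, `CountStarAccumulator`) are hypothetical and only illustrate that contract; they are not Pig's COUNT_STAR implementation.

```python
def partial_count(raw_tuples):
    """Initial stage: count the raw input tuples in one chunk."""
    return len(raw_tuples)


def sum_partials(partials):
    """Intermediate/final stage: add up partial counts produced upstream."""
    return sum(partials)


class CountStarAccumulator:
    """Accumulator sketch: each call sees raw tuples, so it counts rather than sums."""

    def __init__(self):
        self.total = 0

    def accumulate(self, raw_tuples):
        # The batch holds raw tuples, not partial counts, so count its rows.
        self.total += len(raw_tuples)

    def get_value(self):
        return self.total


batches = [[("a", 1), ("b", 2)], [("c", 3)]]

acc = CountStarAccumulator()
for batch in batches:
    acc.accumulate(batch)
accumulated = acc.get_value()

algebraic = sum_partials([partial_count(b) for b in batches])

print(accumulated, algebraic)
```

Both paths agree only because each consumes the kind of input it expects; feeding raw tuples to the summing step would treat rows as if they were counts.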
[jira] Updated: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure
[ https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriranjan Manjunath updated PIG-1105: - Status: Open (was: Patch Available) COUNT_STAR accumulate interface implementation cases failure Key: PIG-1105 URL: https://issues.apache.org/jira/browse/PIG-1105 Project: Pig Issue Type: Bug Components: impl Reporter: Thejas M Nair Assignee: Sriranjan Manjunath Fix For: 0.6.0 Attachments: PIG-1105.1.patch, PIG-1105.2.patch COUNT_STAR.accumulate is calling sum() which is supposed to be used by intermediate and final parts of algebraic interface. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure
[ https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriranjan Manjunath updated PIG-1105: - Status: Patch Available (was: Open) COUNT_STAR accumulate interface implementation cases failure Key: PIG-1105 URL: https://issues.apache.org/jira/browse/PIG-1105 Project: Pig Issue Type: Bug Components: impl Reporter: Thejas M Nair Assignee: Sriranjan Manjunath Fix For: 0.6.0 Attachments: PIG-1105.1.patch, PIG-1105.2.patch COUNT_STAR.accumulate is calling sum() which is supposed to be used by intermediate and final parts of algebraic interface. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-480) PERFORMANCE: Use identity mapper in a chain of M-R jobs
[ https://issues.apache.org/jira/browse/PIG-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ying He updated PIG-480: Status: Open (was: Patch Available) This patch conflicts with the new code that was just checked in, which results in a compilation error. PERFORMANCE: Use identity mapper in a chain of M-R jobs --- Key: PIG-480 URL: https://issues.apache.org/jira/browse/PIG-480 Project: Pig Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Olga Natkovich Assignee: Ying He Attachments: PIG_480.patch, PIG_480.patch For jobs with two or more MR jobs, use identity mapper wherever possible in second and subsequent MR jobs. Identity mapper is about 50% faster than Pig's empty map job because it doesn't parse the data. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-480) PERFORMANCE: Use identity mapper in a chain of M-R jobs
[ https://issues.apache.org/jira/browse/PIG-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ying He updated PIG-480: Attachment: PIG_480.patch Fix the compilation error. PERFORMANCE: Use identity mapper in a chain of M-R jobs --- Key: PIG-480 URL: https://issues.apache.org/jira/browse/PIG-480 Project: Pig Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Olga Natkovich Assignee: Ying He Attachments: PIG_480.patch, PIG_480.patch For jobs with two or more MR jobs, use identity mapper wherever possible in second and subsequent MR jobs. Identity mapper is about 50% faster than Pig's empty map job because it doesn't parse the data. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-480) PERFORMANCE: Use identity mapper in a chain of M-R jobs
[ https://issues.apache.org/jira/browse/PIG-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-480: --- Status: Patch Available (was: Open) PERFORMANCE: Use identity mapper in a chain of M-R jobs --- Key: PIG-480 URL: https://issues.apache.org/jira/browse/PIG-480 Project: Pig Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Olga Natkovich Assignee: Ying He Attachments: PIG_480.patch, PIG_480.patch For jobs with two or more MR jobs, use identity mapper wherever possible in second and subsequent MR jobs. Identity mapper is about 50% faster than Pig's empty map job because it doesn't parse the data. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1104) [zebra] Provide streaming support in Zebra.
[ https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1104: --- Attachment: PIG-1104.patch [zebra] Provide streaming support in Zebra. --- Key: PIG-1104 URL: https://issues.apache.org/jira/browse/PIG-1104 Project: Pig Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Chao Wang Assignee: Chao Wang Fix For: 0.6.0, 0.7.0 Attachments: PIG-1104.patch, PIG1104.patch Hadoop streaming is very popular among Hadoop users. The main attraction is the simplicity of use. A user can write the application logic in any language and process large amounts of data using Hadoop framework. As more people start to use Zebra to store their data, we expect users would like to run Hadoop streaming scripts to easily process Zebra tables. The following lists a simple example of using Hadoop streaming to access Zebra data. It loads data from foo table using Zebra's TableInputFormat and then writes the data into output using default TextOutputFormat. $ hadoop jar hadoop-streaming.jar -D mapred.reduce.tasks=0 -input foo -output output -mapper 'cat' -inputformat org.apache.hadoop.zebra.mapred.TableInputFormat In more detail, Zebra uses Pig's DefaultTuple implementation of Tuple for its records. Currently, when Zebra's TableInputFormat is used for input, the user script sees each line containing key_if_any\tTuple.toString() . We plan to generate a CSV representation of our Pig tuples. To this end, we plan to do the following: 1) Derive a subclass ZebraTuple from Pig's DefaultTuple class and override its toString() method to present the data in CSV format. 2) On Zebra side, the tuple factory should be changed to create ZebraTuple objects, instead of DefaultTuple objects. Note that we can only support streaming on the input side - ability to use streaming to read data from Zebra tables. 
On the output side, streaming support is not feasible: since the streaming mapper or reducer only emits Text\tText, the output collector has no way of knowing how to convert this to (BytesWritable,Tuple). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1104) [zebra] Provide streaming support in Zebra.
[ https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1104: --- Attachment: (was: PIG1104.patch) [zebra] Provide streaming support in Zebra. --- Key: PIG-1104 URL: https://issues.apache.org/jira/browse/PIG-1104 Project: Pig Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Chao Wang Assignee: Chao Wang Fix For: 0.6.0, 0.7.0 Attachments: PIG-1104.patch Hadoop streaming is very popular among Hadoop users. The main attraction is the simplicity of use. A user can write the application logic in any language and process large amounts of data using Hadoop framework. As more people start to use Zebra to store their data, we expect users would like to run Hadoop streaming scripts to easily process Zebra tables. The following lists a simple example of using Hadoop streaming to access Zebra data. It loads data from foo table using Zebra's TableInputFormat and then writes the data into output using default TextOutputFormat. $ hadoop jar hadoop-streaming.jar -D mapred.reduce.tasks=0 -input foo -output output -mapper 'cat' -inputformat org.apache.hadoop.zebra.mapred.TableInputFormat In more detail, Zebra uses Pig's DefaultTuple implementation of Tuple for its records. Currently, when Zebra's TableInputFormat is used for input, the user script sees each line containing key_if_any\tTuple.toString() . We plan to generate a CSV representation of our Pig tuples. To this end, we plan to do the following: 1) Derive a subclass ZebraTuple from Pig's DefaultTuple class and override its toString() method to present the data in CSV format. 2) On Zebra side, the tuple factory should be changed to create ZebraTuple objects, instead of DefaultTuple objects. Note that we can only support streaming on the input side - ability to use streaming to read data from Zebra tables. 
On the output side, streaming support is not feasible: since the streaming mapper or reducer only emits Text\tText, the output collector has no way of knowing how to convert this to (BytesWritable,Tuple). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1104) [zebra] Provide streaming support in Zebra.
[ https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1104: --- Status: Open (was: Patch Available) [zebra] Provide streaming support in Zebra. --- Key: PIG-1104 URL: https://issues.apache.org/jira/browse/PIG-1104 Project: Pig Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Chao Wang Assignee: Chao Wang Fix For: 0.6.0, 0.7.0 Attachments: PIG-1104.patch Hadoop streaming is very popular among Hadoop users. The main attraction is the simplicity of use. A user can write the application logic in any language and process large amounts of data using Hadoop framework. As more people start to use Zebra to store their data, we expect users would like to run Hadoop streaming scripts to easily process Zebra tables. The following lists a simple example of using Hadoop streaming to access Zebra data. It loads data from foo table using Zebra's TableInputFormat and then writes the data into output using default TextOutputFormat. $ hadoop jar hadoop-streaming.jar -D mapred.reduce.tasks=0 -input foo -output output -mapper 'cat' -inputformat org.apache.hadoop.zebra.mapred.TableInputFormat In more detail, Zebra uses Pig's DefaultTuple implementation of Tuple for its records. Currently, when Zebra's TableInputFormat is used for input, the user script sees each line containing key_if_any\tTuple.toString() . We plan to generate a CSV representation of our Pig tuples. To this end, we plan to do the following: 1) Derive a subclass ZebraTuple from Pig's DefaultTuple class and override its toString() method to present the data in CSV format. 2) On Zebra side, the tuple factory should be changed to create ZebraTuple objects, instead of DefaultTuple objects. Note that we can only support streaming on the input side - ability to use streaming to read data from Zebra tables. 
On the output side, streaming support is not feasible: since the streaming mapper or reducer only emits Text\tText, the output collector has no way of knowing how to convert this to (BytesWritable,Tuple). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure
[ https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriranjan Manjunath updated PIG-1105: - Status: Open (was: Patch Available) Cancelling since the patch does not have all the changes. COUNT_STAR accumulate interface implementation cases failure Key: PIG-1105 URL: https://issues.apache.org/jira/browse/PIG-1105 Project: Pig Issue Type: Bug Components: impl Reporter: Thejas M Nair Assignee: Sriranjan Manjunath Fix For: 0.6.0 Attachments: PIG-1105.1.patch, PIG-1105.2.patch COUNT_STAR.accumulate is calling sum() which is supposed to be used by intermediate and final parts of algebraic interface. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure
[ https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriranjan Manjunath updated PIG-1105: - Attachment: (was: PIG-1105.2.patch) COUNT_STAR accumulate interface implementation cases failure Key: PIG-1105 URL: https://issues.apache.org/jira/browse/PIG-1105 Project: Pig Issue Type: Bug Components: impl Reporter: Thejas M Nair Assignee: Sriranjan Manjunath Fix For: 0.6.0 Attachments: PIG-1105.1.patch, PIG-1105.2.patch COUNT_STAR.accumulate is calling sum() which is supposed to be used by intermediate and final parts of algebraic interface. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure
[ https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriranjan Manjunath updated PIG-1105: - Status: Patch Available (was: Open) COUNT_STAR accumulate interface implementation cases failure Key: PIG-1105 URL: https://issues.apache.org/jira/browse/PIG-1105 Project: Pig Issue Type: Bug Components: impl Reporter: Thejas M Nair Assignee: Sriranjan Manjunath Fix For: 0.6.0 Attachments: PIG-1105.1.patch, PIG-1105.2.patch COUNT_STAR.accumulate is calling sum() which is supposed to be used by intermediate and final parts of algebraic interface. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1086) Nested sort by * throw exception
[ https://issues.apache.org/jira/browse/PIG-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786300#action_12786300 ] Hadoop QA commented on PIG-1086: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12426934/PIG-1086.patch against trunk revision 887318. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/93/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/93/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/93/console This message is automatically generated. 
Nested sort by * throw exception Key: PIG-1086 URL: https://issues.apache.org/jira/browse/PIG-1086 Project: Pig Issue Type: Bug Affects Versions: 0.5.0 Reporter: Daniel Dai Assignee: Richard Ding Attachments: PIG-1086.patch The following script fails: A = load '1.txt' as (a0, a1, a2); B = group A by a0; C = foreach B { D = order A by *; generate group, D;}; explain C; Here is the stack: Caused by: java.lang.ArrayIndexOutOfBoundsException: -1 at java.util.ArrayList.get(ArrayList.java:324) at org.apache.pig.impl.logicalLayer.schema.Schema.getField(Schema.java:752) at org.apache.pig.impl.logicalLayer.LOSort.getSortInfo(LOSort.java:332) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:1365) at org.apache.pig.impl.logicalLayer.LOSort.visit(LOSort.java:176) at org.apache.pig.impl.logicalLayer.LOSort.visit(LOSort.java:43) at org.apache.pig.impl.plan.DependencyOrderWalkerWOSeenChk.walk(DependencyOrderWalkerWOSeenChk.java:69) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:1274) at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:130) at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:45) at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:69) at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51) at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:234) at org.apache.pig.PigServer.compilePp(PigServer.java:864) at org.apache.pig.PigServer.explain(PigServer.java:583) ... 8 more -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (PIG-1128) column pruning causing failure when foreach has user-specified schema
[ https://issues.apache.org/jira/browse/PIG-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai reassigned PIG-1128:

Assignee: Daniel Dai

column pruning causing failure when foreach has user-specified schema

Key: PIG-1128
URL: https://issues.apache.org/jira/browse/PIG-1128
Project: Pig
Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0
Reporter: Thejas M Nair
Assignee: Daniel Dai
Fix For: 0.6.0

The issue is seen in 0.6.0 and trunk.

grunt> l = load 'dummy.txt' as ( c1 : chararray, c2 : int);
grunt> f1 = foreach l generate c1 as c1 : chararray, c2 as c2 : int, 'CA' as state : chararray;
grunt> f2 = foreach f1 generate c1 as c1 : chararray;
grunt> explain f2;
2009-12-04 13:11:19,010 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1022: Type mismatch merging schema prefix. Field Schema: chararray. Other Field Schema: c2: int

(It does not matter whether the new schema uses new or different column names.)

grunt> l = load 'dummy.txt' as ( c1 : chararray, c2 : int);
grunt> f1 = foreach l generate c1 as c11 : chararray, c2 as c22 : int, 'CA' as state : chararray;
grunt> f2 = foreach f1 generate c11 as c111 : chararray;
grunt> explain f2;
2009-12-04 13:13:01,462 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1022: Type mismatch merging schema prefix. Field Schema: chararray. Other Field Schema: c22: int
[jira] Updated: (PIG-1128) column pruning causing failure when foreach has user-specified schema
[ https://issues.apache.org/jira/browse/PIG-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1128:

Status: Patch Available (was: Open)

column pruning causing failure when foreach has user-specified schema

Key: PIG-1128
URL: https://issues.apache.org/jira/browse/PIG-1128
Project: Pig
Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0
Reporter: Thejas M Nair
Assignee: Daniel Dai
Fix For: 0.6.0
Attachments: PIG-1128-1.patch

The issue is seen in 0.6.0 and trunk.

grunt> l = load 'dummy.txt' as ( c1 : chararray, c2 : int);
grunt> f1 = foreach l generate c1 as c1 : chararray, c2 as c2 : int, 'CA' as state : chararray;
grunt> f2 = foreach f1 generate c1 as c1 : chararray;
grunt> explain f2;
2009-12-04 13:11:19,010 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1022: Type mismatch merging schema prefix. Field Schema: chararray. Other Field Schema: c2: int

(It does not matter whether the new schema uses new or different column names.)

grunt> l = load 'dummy.txt' as ( c1 : chararray, c2 : int);
grunt> f1 = foreach l generate c1 as c11 : chararray, c2 as c22 : int, 'CA' as state : chararray;
grunt> f2 = foreach f1 generate c11 as c111 : chararray;
grunt> explain f2;
2009-12-04 13:13:01,462 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1022: Type mismatch merging schema prefix. Field Schema: chararray. Other Field Schema: c22: int
[jira] Updated: (PIG-1128) column pruning causing failure when foreach has user-specified schema
[ https://issues.apache.org/jira/browse/PIG-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1128:

Attachment: PIG-1128-1.patch

column pruning causing failure when foreach has user-specified schema

Key: PIG-1128
URL: https://issues.apache.org/jira/browse/PIG-1128
Project: Pig
Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0
Reporter: Thejas M Nair
Assignee: Daniel Dai
Fix For: 0.6.0
Attachments: PIG-1128-1.patch

The issue is seen in 0.6.0 and trunk.

grunt> l = load 'dummy.txt' as ( c1 : chararray, c2 : int);
grunt> f1 = foreach l generate c1 as c1 : chararray, c2 as c2 : int, 'CA' as state : chararray;
grunt> f2 = foreach f1 generate c1 as c1 : chararray;
grunt> explain f2;
2009-12-04 13:11:19,010 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1022: Type mismatch merging schema prefix. Field Schema: chararray. Other Field Schema: c2: int

(It does not matter whether the new schema uses new or different column names.)

grunt> l = load 'dummy.txt' as ( c1 : chararray, c2 : int);
grunt> f1 = foreach l generate c1 as c11 : chararray, c2 as c22 : int, 'CA' as state : chararray;
grunt> f2 = foreach f1 generate c11 as c111 : chararray;
grunt> explain f2;
2009-12-04 13:13:01,462 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1022: Type mismatch merging schema prefix. Field Schema: chararray. Other Field Schema: c22: int
[jira] Updated: (PIG-1127) Logical operator should contains individual copy of schema object
[ https://issues.apache.org/jira/browse/PIG-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1127:

Attachment: PIG-1127-2.patch

Logical operator should contains individual copy of schema object

Key: PIG-1127
URL: https://issues.apache.org/jira/browse/PIG-1127
Project: Pig
Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Fix For: 0.6.0
Attachments: PIG-1127-1.patch, PIG-1127-2.patch

Currently some logical operators hold only a reference to their predecessor's schema object. These logical operators include: LOSplitOutput, LOLimit, LOSplit, LOFilter, LOSort, LODistinct, LOUnion. This was acceptable before because a schema object was never changed once it was set. Now, with the column pruner (PIG-922), we need to change individual schema objects, so sharing a reference is no longer acceptable. For example, the following script fails:

{code}
a = load '1.txt' as (a0, a1:map[], a2);
b = foreach a generate a1;
c = limit b 10;
dump c;
{code}

We need to fix it.
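The shared-reference problem described in PIG-1127 can be illustrated with a minimal Java sketch. Everything here is hypothetical and for illustration only: ToySchema, shareSchema, ownSchema, and prune are invented names and are not Pig's Schema or logical operator classes, and the sketch does not reproduce the contents of the attached patches. It only shows why an operator that keeps its predecessor's schema object sees (and causes) in-place changes, while an operator that keeps its own copy does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; NOT Pig's real classes.
public class SchemaAliasingSketch {

    /** A toy schema: an ordered, mutable list of field names. */
    static class ToySchema {
        final List<String> fields;

        ToySchema(List<String> fields) {
            // Copy the incoming list so each ToySchema owns its field list.
            this.fields = new ArrayList<>(fields);
        }

        /** Returns an independent copy that can be modified separately. */
        ToySchema copy() {
            return new ToySchema(fields);
        }

        /** Simulates a column pruner removing a field in place. */
        void prune(String field) {
            fields.remove(field);
        }
    }

    /** The operator keeps the predecessor's schema object itself. */
    static ToySchema shareSchema(ToySchema predecessor) {
        return predecessor;
    }

    /** The operator keeps its own copy of the predecessor's schema. */
    static ToySchema ownSchema(ToySchema predecessor) {
        return predecessor.copy();
    }

    public static void main(String[] args) {
        ToySchema predecessor = new ToySchema(Arrays.asList("a0", "a1", "a2"));
        ToySchema shared = shareSchema(predecessor);
        ToySchema owned = ownSchema(predecessor);

        // Pruning through the shared reference also changes the predecessor.
        shared.prune("a0");
        System.out.println(predecessor.fields); // [a1, a2]

        // The copy taken earlier is unaffected by that pruning.
        System.out.println(owned.fields);       // [a0, a1, a2]

        // Pruning the copy leaves the predecessor untouched.
        owned.prune("a2");
        System.out.println(predecessor.fields); // [a1, a2]
    }
}
```

In this sketch the copy is taken when the operator is created; whether the real fix copies eagerly or at some other point is not stated in the issue.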
[jira] Updated: (PIG-1127) Logical operator should contains individual copy of schema object
[ https://issues.apache.org/jira/browse/PIG-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1127:

Status: Open (was: Patch Available)

Logical operator should contains individual copy of schema object

Key: PIG-1127
URL: https://issues.apache.org/jira/browse/PIG-1127
Project: Pig
Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Fix For: 0.6.0
Attachments: PIG-1127-1.patch, PIG-1127-2.patch

Currently some logical operators hold only a reference to their predecessor's schema object. These logical operators include: LOSplitOutput, LOLimit, LOSplit, LOFilter, LOSort, LODistinct, LOUnion. This was acceptable before because a schema object was never changed once it was set. Now, with the column pruner (PIG-922), we need to change individual schema objects, so sharing a reference is no longer acceptable. For example, the following script fails:

{code}
a = load '1.txt' as (a0, a1:map[], a2);
b = foreach a generate a1;
c = limit b 10;
dump c;
{code}

We need to fix it.
[jira] Commented: (PIG-1127) Logical operator should contains individual copy of schema object
[ https://issues.apache.org/jira/browse/PIG-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786352#action_12786352 ]

Hadoop QA commented on PIG-1127:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12426964/PIG-1127-1.patch
against trunk revision 887401.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/94/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/94/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/94/console

This message is automatically generated.

Logical operator should contains individual copy of schema object

Key: PIG-1127
URL: https://issues.apache.org/jira/browse/PIG-1127
Project: Pig
Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Fix For: 0.6.0
Attachments: PIG-1127-1.patch, PIG-1127-2.patch

Currently some logical operators hold only a reference to their predecessor's schema object. These logical operators include: LOSplitOutput, LOLimit, LOSplit, LOFilter, LOSort, LODistinct, LOUnion. This was acceptable before because a schema object was never changed once it was set. Now, with the column pruner (PIG-922), we need to change individual schema objects, so sharing a reference is no longer acceptable. For example, the following script fails:

{code}
a = load '1.txt' as (a0, a1:map[], a2);
b = foreach a generate a1;
c = limit b 10;
dump c;
{code}

We need to fix it.