[jira] Created: (PIG-837) docs ant target is broken
docs ant target is broken
--

Key: PIG-837
URL: https://issues.apache.org/jira/browse/PIG-837
Project: Pig
Issue Type: Bug
Reporter: Giridharan Kesavan

The docs ant target is broken; this fails the trunk builds:

     [exec] Java Result: 1
     [exec]
     [exec] Copying broken links file to site root.
     [exec]
     [exec] Copying 1 file to /home/hudson/hudson-slave/workspace/Pig-Patch-minerva.apache.org/trunk/src/docs/build/site
     [exec]
     [exec] BUILD FAILED
     [exec] /home/nigel/tools/forrest/latest/main/targets/site.xml:180: Error building site.
     [exec]
     [exec] There appears to be a problem with your site build.
     [exec]
     [exec] Read the output above:
     [exec] * Cocoon will report the status of each document:
     [exec]   - in column 1: *=okay X=brokenLink ^=pageSkipped (see FAQ).
     [exec] * Even if only one link is broken, you will still get "failed".
     [exec] * Your site would still be generated, but some pages would be broken.
     [exec]   - See /home/hudson/hudson-slave/workspace/Pig-Patch-minerva.apache.org/trunk/src/docs/build/site/broken-links.xml
     [exec]
     [exec] Total time: 28 seconds

BUILD FAILED
/home/hudson/hudson-slave/workspace/Pig-Patch-minerva.apache.org/trunk/build.xml:326: exec returned: 1

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Build failed in Hudson: Pig-Patch-minerva.apache.org #72
See http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/72/changes

Changes:
[olga] PIG-813: documentation updates (chandec via olgan)
[pradeepkth] PIG-796: support conversion from numeric types to chararray (Ashutosh Chauhan via pradeepkth)
--
started
Building remotely on minerva.apache.org (Ubuntu)
Updating http://svn.apache.org/repos/asf/hadoop/pig/trunk
U    test/org/apache/pig/test/TestPOCast.java
C    CHANGES.txt
U    src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java
U    src/org/apache/pig/data/DataType.java
D    src/docs/src/documentation/content/xdocs/quickstart.xml
U    src/docs/src/documentation/content/xdocs/site.xml
U    src/docs/src/documentation/content/xdocs/index.xml
U    src/docs/src/documentation/content/xdocs/piglatin.xml
Fetching 'http://svn.apache.org/repos/asf/hadoop/core/nightly/test-patch' at -1 into 'http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/ws/trunk/test/bin'
At revision 781914
At revision 781914
no change for http://svn.apache.org/repos/asf/hadoop/core/nightly/test-patch since the previous build
[Pig-Patch-minerva.apache.org] $ /bin/bash /tmp/hudson7154531927977690732.sh
/home/hudson/tools/java/latest1.6/bin/java
Buildfile: build.xml

check-for-findbugs:

findbugs.check:

java5.check:

forrest.check:

hudson-test-patch:
     [exec]
     [exec] ==
     [exec] ==
     [exec] Testing patch for PIG-765.
     [exec] ==
     [exec] ==
     [exec]
     [exec]
     [exec] Reverted 'CHANGES.txt'
     [exec]
     [exec] Fetching external item into 'test/bin'
     [exec] A    test/bin/test-patch.sh
     [exec] Updated external to revision 781914.
     [exec]
     [exec] Updated to revision 781914.
     [exec] PIG-765 patch is being downloaded at Thu Jun 4 22:48:13 PDT 2009 from
     [exec] http://issues.apache.org/jira/secure/attachment/12409932/pig-765.patch
     [exec]
     [exec]
     [exec] ==
     [exec] ==
     [exec] Pre-building trunk to determine trunk number
     [exec] of release audit, javac, and Findbugs warnings.
     [exec] ==
     [exec] ==
     [exec]
     [exec]
     [exec] /home/hudson/tools/ant/latest/bin/ant -Djava5.home=/home/hudson/tools/java/latest1.5 -Dforrest.home=/home/nigel/tools/forrest/latest -DPigPatchProcess= releaseaudit > http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/ws/patchprocess/trunkReleaseAuditWarnings.txt 2>&1
     [exec] /home/hudson/tools/ant/latest/bin/ant -Djavac.args=-Xlint -Xmaxwarns 1000 -Declipse.home=/home/nigel/tools/eclipse/latest -Djava5.home=/home/hudson/tools/java/latest1.5 -Dforrest.home=/home/nigel/tools/forrest/latest -DPigPatchProcess= clean tar > http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/ws/patchprocess/trunkJavacWarnings.txt 2>&1
     [exec] Trunk compilation is broken?
     [exec]   % Total    % Received % Xferd  Average Speed   Time    Time    Time  Current
     [exec]                                  Dload  Upload   Total   Spent   Left  Speed
     [exec]   0  0  0  0  0  0  0  0 --:--:-- --:--:-- --:--:--  0
     [exec]
     [exec]
     [exec] ==
     [exec] ==
     [exec] Finished build.
     [exec] ==
     [exec] ==
     [exec]

BUILD FAILED
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/ws/trunk/build.xml:653: exec returned: 1

Total time: 1 minute 45 seconds
Recording test results
Description found: PIG-765
[jira] Created: (PIG-836) Allow setting of end-of-record delimiter in PigStorage
Allow setting of end-of-record delimiter in PigStorage
--

Key: PIG-836
URL: https://issues.apache.org/jira/browse/PIG-836
Project: Pig
Issue Type: Improvement
Components: impl
Reporter: George Mavromatis
Fix For: 0.2.0

PigStorage allows overriding the default field delimiter ('\t'), but does not allow overriding the record delimiter ('\n'). Fields that contain newlines are a valid use case, e.g. when a field holds the contents of a document or web page. A user can work around this with a custom load/store UDF, but that is extra work that many users would each have to repeat, and the UDF would be an exact duplicate of PigStorage except for the delimiter. PigStorage should therefore allow configuring both the field and record separators.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
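The requested behavior can be sketched outside Pig. The snippet below is a hypothetical illustration in plain Python (not Pig's actual LoadFunc/StoreFunc API; the function name `parse` is invented) of parsing with both a configurable field delimiter and a configurable record delimiter:

```python
def parse(data, field_delim='\t', record_delim='\n'):
    """Split raw text into records and fields, PigStorage-style.

    Both delimiters are configurable here; today's PigStorage only
    exposes the field delimiter and hardwires '\n' for records.
    """
    records = data.split(record_delim)
    # A trailing record delimiter produces one empty record; drop it.
    if records and records[-1] == '':
        records.pop()
    return [r.split(field_delim) for r in records]

# With CTRL-A as the record delimiter, fields may safely contain newlines.
rows = parse('a\tline1\nline2\x01b\tc\x01', record_delim='\x01')
```

With the default delimiters this behaves like plain tab-separated parsing; with a non-newline record delimiter, a field holding a multi-line document survives intact.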
[jira] Updated: (PIG-765) to implement jdiff
[ https://issues.apache.org/jira/browse/PIG-765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giridharan Kesavan updated PIG-765: --- Attachment: pig-765.patch this jdiff patch is created after resolving the author tag issue mentioned in pig-806. > to implement jdiff > -- > > Key: PIG-765 > URL: https://issues.apache.org/jira/browse/PIG-765 > Project: Pig > Issue Type: Improvement > Components: build >Reporter: Giridharan Kesavan >Assignee: Giridharan Kesavan > Attachments: pig-765.patch, pig-765.patch, pig-765.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-765) to implement jdiff
[ https://issues.apache.org/jira/browse/PIG-765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giridharan Kesavan updated PIG-765: --- Status: Patch Available (was: In Progress) > to implement jdiff > -- > > Key: PIG-765 > URL: https://issues.apache.org/jira/browse/PIG-765 > Project: Pig > Issue Type: Improvement > Components: build >Reporter: Giridharan Kesavan >Assignee: Giridharan Kesavan > Attachments: pig-765.patch, pig-765.patch, pig-765.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-765) to implement jdiff
[ https://issues.apache.org/jira/browse/PIG-765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giridharan Kesavan updated PIG-765: --- Status: In Progress (was: Patch Available) > to implement jdiff > -- > > Key: PIG-765 > URL: https://issues.apache.org/jira/browse/PIG-765 > Project: Pig > Issue Type: Improvement > Components: build >Reporter: Giridharan Kesavan >Assignee: Giridharan Kesavan > Attachments: pig-765.patch, pig-765.patch, pig-765.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-833) Storage access layer
[ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716471#action_12716471 ] Jeff Hammerbacher commented on PIG-833: --- Hey Hong, I never mentioned SQL or an ecosystem in my comment, but thanks for your observation. I was simply referring to the existence of a fairly detailed discussion in a related subproject that the Pig team may not have been following. I'll add an additional one here: https://issues.apache.org/jira/browse/HIVE-279 addresses the predicate pushdown feature. Regards, Jeff > Storage access layer > > > Key: PIG-833 > URL: https://issues.apache.org/jira/browse/PIG-833 > Project: Pig > Issue Type: New Feature >Reporter: Jay Tang > > A layer is needed to provide a high level data access abstraction and a > tabular view of data in Hadoop, and could free Pig users from implementing > their own data storage/retrieval code. This layer should also include a > columnar storage format in order to provide fast data projection, > CPU/space-efficient data serialization, and a schema language to manage > physical storage metadata. Eventually it could also support predicate > pushdown for further performance improvement. Initially, this layer could be > a contrib project in Pig and become a hadoop subproject later on. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-833) Storage access layer
[ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716470#action_12716470 ]

Hong Tang commented on PIG-833:
---

Jeff, just like the SQL effort, the space of columnar storage is also wide open, and I think it is more beneficial to the overall health of the hadoop ecosystem. That said, I also looked at the patch attached to HIVE-352. It appears that what the patch does is a level below our stated objectives. Specifically, the guts of the implementation (RCFile) are very close in spirit to TFile as described in HADOOP-3315, which seems to have had its first comprehensive patch back in December 2008.

> Storage access layer
> --
>
> Key: PIG-833
> URL: https://issues.apache.org/jira/browse/PIG-833
> Project: Pig
> Issue Type: New Feature
> Reporter: Jay Tang
>
> A layer is needed to provide a high level data access abstraction and a tabular view of data in Hadoop, and could free Pig users from implementing their own data storage/retrieval code. This layer should also include a columnar storage format in order to provide fast data projection, CPU/space-efficient data serialization, and a schema language to manage physical storage metadata. Eventually it could also support predicate pushdown for further performance improvement. Initially, this layer could be a contrib project in Pig and become a hadoop subproject later on.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-823) Hadoop Metadata Service
[ https://issues.apache.org/jira/browse/PIG-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716463#action_12716463 ] Jeff Hammerbacher commented on PIG-823: --- Hey Olga, Really looking forward to seeing more discussion on this issue. The NameNode already contains file metadata like ctime, mtime, the block list, permissions, etc. Will the proposed metadata service subsume those attributes as well? Curious to see the proposed design. Thanks, Jeff > Hadoop Metadata Service > --- > > Key: PIG-823 > URL: https://issues.apache.org/jira/browse/PIG-823 > Project: Pig > Issue Type: New Feature >Reporter: Olga Natkovich > > This JIRA is created to track development of a metadata system for Hadoop. > The goal of the system is to allow users and applications to register data > stored on HDFS, search for the data available on HDFS, and associate metadata > such as schema, statistics, etc. with a particular data unit or a data set > stored on HDFS. The initial goal is to provide a fairly generic, low level > abstraction that any user or application on HDFS can use to store an retrieve > metadata. Over time a higher level abstractions closely tied to particular > applications or tools can be developed. > Over time, it would make sense for the metadata service to become a > subproject within Hadoop. For now, the proposal is to make it a contrib to > Pig since Pig SQL is likely to be the first user of the system. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-833) Storage access layer
[ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716462#action_12716462 ] Jeff Hammerbacher commented on PIG-833: --- You may want to see the Hive project, where a columnar storage format has been developed and benchmarked: https://issues.apache.org/jira/browse/HIVE-352. > Storage access layer > > > Key: PIG-833 > URL: https://issues.apache.org/jira/browse/PIG-833 > Project: Pig > Issue Type: New Feature >Reporter: Jay Tang > > A layer is needed to provide a high level data access abstraction and a > tabular view of data in Hadoop, and could free Pig users from implementing > their own data storage/retrieval code. This layer should also include a > columnar storage format in order to provide fast data projection, > CPU/space-efficient data serialization, and a schema language to manage > physical storage metadata. Eventually it could also support predicate > pushdown for further performance improvement. Initially, this layer could be > a contrib project in Pig and become a hadoop subproject later on. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-835) Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)
Multiquery optimization does not handle the case where the map keys in the split plans have different key types (tuple and non tuple key type)
--

Key: PIG-835
URL: https://issues.apache.org/jira/browse/PIG-835
Project: Pig
Issue Type: Bug
Affects Versions: 0.2.1
Reporter: Pradeep Kamath
Assignee: Pradeep Kamath
Fix For: 0.3.0

A query like the following results in an exception on execution:

{noformat}
a = load 'mult.input' as (name, age, gpa);
b = group a ALL;
c = foreach b generate group, COUNT(a);
store c into 'foo';
d = group a by (name, gpa);
e = foreach d generate flatten(group), MIN(a.age);
store e into 'bar';
{noformat}

Exception on execution:

09/06/04 16:56:11 INFO mapred.TaskInProgress: Error from attempt_200906041655_0001_r_00_3: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.Tuple
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:312)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getNext(POStore.java:117)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.runPipeline(PODemux.java:248)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.getNext(PODemux.java:238)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:320)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:288)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:268)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:142)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:318)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
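The ClassCastException arises because the two split pipelines share one shuffle: `group a ALL` emits a non-tuple map key while `group a by (name, gpa)` emits a tuple key, and the demuxed reduce plan then treats a scalar as a Tuple. A minimal Python model of one plausible remedy (hypothetical names, not Pig code): normalize every key reaching the shared shuffle by wrapping non-tuple keys in single-field tuples, so all pipelines agree on one key type.

```python
def normalize_key(key):
    """Wrap a non-tuple map key in a 1-field tuple so split pipelines
    with mixed key types (tuple and non-tuple) can share one shuffle
    without type-cast failures on the reduce side."""
    return key if isinstance(key, tuple) else (key,)

# Pipeline 1 (group a ALL) emits a constant scalar key;
# pipeline 2 (group a by (name, gpa)) emits a tuple key.
keys = [normalize_key('all'), normalize_key(('alice', 3.9))]
```

After normalization, a single demux/package stage can unwrap keys uniformly instead of guessing each record's key type.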
[jira] Updated: (PIG-831) Records and bytes written reported by pig are wrong in a multi-store program
[ https://issues.apache.org/jira/browse/PIG-831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated PIG-831: --- Status: Patch Available (was: Open) > Records and bytes written reported by pig are wrong in a multi-store program > > > Key: PIG-831 > URL: https://issues.apache.org/jira/browse/PIG-831 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.3.0 >Reporter: Alan Gates >Assignee: Alan Gates >Priority: Minor > Attachments: PIG-831.patch > > > The stats features checked in as part of PIG-626 (reporting the number of > records and bytes written at the end of the query) print wrong values (often > but not always 0) when the pig script being run contains more than 1 store. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-834) incorrect plan when algebraic functions are nested
incorrect plan when algebraic functions are nested
--

Key: PIG-834
URL: https://issues.apache.org/jira/browse/PIG-834
Project: Pig
Issue Type: Bug
Components: impl
Reporter: Thejas M Nair
Priority: Critical

a = load 'students.txt' as (c1,c2,c3,c4);
c = group a by c2;
f = foreach c generate COUNT(org.apache.pig.builtin.Distinct($1.$2));

Notice that the Distinct udf is missing in the Combiner and reduce stage. As a result distinct does not function, and incorrect results are produced. Distinct should have been evaluated in the 3 stages and the output of Distinct should be given to COUNT in the reduce stage.

# Map Reduce Plan
#--
MapReduce node 1-122
Map Plan
Local Rearrange[tuple]{bytearray}(false) - 1-139
|   |
|   Project[bytearray][1] - 1-140
|
|---New For Each(false,false)[bag] - 1-127
    |   |
    |   POUserFunc(org.apache.pig.builtin.COUNT$Initial)[tuple] - 1-125
    |   |
    |   |---POUserFunc(org.apache.pig.builtin.Distinct)[bag] - 1-126
    |       |
    |       |---Project[bag][2] - 1-123
    |           |
    |           |---Project[bag][1] - 1-124
    |   |
    |   Project[bytearray][0] - 1-133
    |
    |---Pre Combiner Local Rearrange[tuple]{Unknown} - 1-141
        |
        |---Load(hdfs://wilbur11.labs.corp.sp1.yahoo.com/user/tejas/students.txt:org.apache.pig.builtin.PigStorage) - 1-111
Combine Plan
Local Rearrange[tuple]{bytearray}(false) - 1-143
|   |
|   Project[bytearray][1] - 1-144
|
|---New For Each(false,false)[bag] - 1-132
    |   |
    |   POUserFunc(org.apache.pig.builtin.COUNT$Intermediate)[tuple] - 1-130
    |   |
    |   |---Project[bag][0] - 1-135
    |   |
    |   Project[bytearray][1] - 1-134
    |
    |---POCombinerPackage[tuple]{bytearray} - 1-137
Reduce Plan
Store(fakefile:org.apache.pig.builtin.PigStorage) - 1-121
|
|---New For Each(false)[bag] - 1-120
    |   |
    |   POUserFunc(org.apache.pig.builtin.COUNT$Final)[long] - 1-119
    |   |
    |   |---Project[bag][0] - 1-136
    |
    |---POCombinerPackage[tuple]{bytearray} - 1-145
Global sort: false

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
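Algebraic UDFs such as COUNT run in three stages: Initial on the map side, Intermediate in the combiner, Final in the reduce. The toy Python model below (not Pig's Algebraic interface; all names are invented) shows why dropping Distinct from the later stages corrupts the result: duplicates that only meet at the combiner or reducer are never removed, so the summed partial counts over-count.

```python
def count_initial(bag):
    # Map side: partial count of the local bag.
    return len(bag)

def count_combine(partials):
    # Combiner/reduce side: sum the partial counts.
    return sum(partials)

def distinct(bags):
    # Distinct must run at every stage too: de-duplicate across
    # all incoming bags, not just within one map's bag.
    out = set()
    for b in bags:
        out |= set(b)
    return out

# Two map tasks see overlapping values for the same group key.
map1, map2 = ['x', 'y', 'x'], ['y', 'z']

# Correct semantics: COUNT over the globally distinct bag.
correct = count_combine([count_initial(distinct([map1, map2]))])

# Broken plan (PIG-834): Distinct applied only map-side, then the
# per-map partial counts are summed -- 'y' is counted twice.
broken = count_combine([count_initial(distinct([map1])),
                        count_initial(distinct([map2]))])
```

Here `correct` is 3 (x, y, z) while `broken` is 4, mirroring the incorrect results the report describes.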
[jira] Updated: (PIG-834) incorrect plan when algebraic functions are nested
[ https://issues.apache.org/jira/browse/PIG-834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated PIG-834: -- Description: a = load 'students.txt' as (c1,c2,c3,c4); c = group a by c2; f = foreach c generate COUNT(org.apache.pig.builtin.Distinct($1.$2)); Notice that Distinct udf is missing in Combiner and reduce stage. As a result distinct does not function, and incorrect results are produced. Distinct should have been evaluated in the 3 stages and output of Distinct should be given to COUNT in reduce stage. {code} # Map Reduce Plan #-- MapReduce node 1-122 Map Plan Local Rearrange[tuple]{bytearray}(false) - 1-139 | | | Project[bytearray][1] - 1-140 | |---New For Each(false,false)[bag] - 1-127 | | | POUserFunc(org.apache.pig.builtin.COUNT$Initial)[tuple] - 1-125 | | | |---POUserFunc(org.apache.pig.builtin.Distinct)[bag] - 1-126 | | | |---Project[bag][2] - 1-123 | | | |---Project[bag][1] - 1-124 | | | Project[bytearray][0] - 1-133 | |---Pre Combiner Local Rearrange[tuple]{Unknown} - 1-141 | |---Load(hdfs://wilbur11.labs.corp.sp1.yahoo.com/user/tejas/students.txt:org.apache.pig.builtin.PigStorage) - 1-111 Combine Plan Local Rearrange[tuple]{bytearray}(false) - 1-143 | | | Project[bytearray][1] - 1-144 | |---New For Each(false,false)[bag] - 1-132 | | | POUserFunc(org.apache.pig.builtin.COUNT$Intermediate)[tuple] - 1-130 | | | |---Project[bag][0] - 1-135 | | | Project[bytearray][1] - 1-134 | |---POCombinerPackage[tuple]{bytearray} - 1-137 Reduce Plan Store(fakefile:org.apache.pig.builtin.PigStorage) - 1-121 | |---New For Each(false)[bag] - 1-120 | | | POUserFunc(org.apache.pig.builtin.COUNT$Final)[long] - 1-119 | | | |---Project[bag][0] - 1-136 | |---POCombinerPackage[tuple]{bytearray} - 1-145 Global sort: false {code} was: a = load 'students.txt' as (c1,c2,c3,c4); c = group a by c2; f = foreach c generate COUNT(org.apache.pig.builtin.Distinct($1.$2)); Notice that Distinct udf is missing in Combiner and reduce stage. 
As a result distinct does not function, and incorrect results are produced. Distinct should have been evaluated in the 3 stages and output of Distinct should be given to COUNT in reduce stage. # Map Reduce Plan #-- MapReduce node 1-122 Map Plan Local Rearrange[tuple]{bytearray}(false) - 1-139 | | | Project[bytearray][1] - 1-140 | |---New For Each(false,false)[bag] - 1-127 | | | POUserFunc(org.apache.pig.builtin.COUNT$Initial)[tuple] - 1-125 | | | |---POUserFunc(org.apache.pig.builtin.Distinct)[bag] - 1-126 | | | |---Project[bag][2] - 1-123 | | | |---Project[bag][1] - 1-124 | | | Project[bytearray][0] - 1-133 | |---Pre Combiner Local Rearrange[tuple]{Unknown} - 1-141 | |---Load(hdfs://wilbur11.labs.corp.sp1.yahoo.com/user/tejas/students.txt:org.apache.pig.builtin.PigStorage) - 1-111 Combine Plan Local Rearrange[tuple]{bytearray}(false) - 1-143 | | | Project[bytearray][1] - 1-144 | |---New For Each(false,false)[bag] - 1-132 | | | POUserFunc(org.apache.pig.builtin.COUNT$Intermediate)[tuple] - 1-130 | | | |---Project[bag][0] - 1-135 | | | Project[bytearray][1] - 1-134 | |---POCombinerPackage[tuple]{bytearray} - 1-137 Reduce Plan Store(fakefile:org.apache.pig.builtin.PigStorage) - 1-121 | |---New For Each(false)[bag] - 1-120 | | | POUserFunc(org.apache.pig.builtin.COUNT$Final)[long] - 1-119 | | | |---Project[bag][0] - 1-136 | |---POCombinerPackage[tuple]{bytearray} - 1-145 Global sort: false > incorrect plan when algebraic functions are nested > -- > > Key: PIG-834 > URL: https://issues.apache.org/jira/browse/PIG-834 > Project: Pig > Issue Type: Bug > Components: impl >Reporter: Thejas M Nair >Priority: Critical > > a = load 'students.txt' as (c1,c2,c3,c4); > c = group a by c2; > f = foreach c generate COUNT(org.apache.pig.builtin.Distinct($1.$2)); > Notice that Distinct udf is missing in Combiner and reduce stage. As a result > distinct does not function, and incorrect results are produced. 
> Distinct should have been evaluated in the 3 stages and output of Distinct > should be given to COUNT in reduce stage. > {code} > # Map Reduce Plan > #-- > MapReduce node 1-122 > Map Plan > Local Rearrange[t
[jira] Commented: (PIG-831) Records and bytes written reported by pig are wrong in a multi-store program
[ https://issues.apache.org/jira/browse/PIG-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716392#action_12716392 ]

Olga Natkovich commented on PIG-831:
---

+1 on the patch. Please keep the bug open, since we should at some point correctly report the numbers for multiquery.

> Records and bytes written reported by pig are wrong in a multi-store program
> --
>
> Key: PIG-831
> URL: https://issues.apache.org/jira/browse/PIG-831
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.3.0
> Reporter: Alan Gates
> Assignee: Alan Gates
> Priority: Minor
> Attachments: PIG-831.patch
>
> The stats features checked in as part of PIG-626 (reporting the number of records and bytes written at the end of the query) print wrong values (often but not always 0) when the pig script being run contains more than 1 store.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-817) Pig Docs for 0.3.0 Release
[ https://issues.apache.org/jira/browse/PIG-817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716370#action_12716370 ] Corinne Chandel commented on PIG-817: - Please delete this file (no longer in use): quickstart.xml \Trunk\src\docs\src\documentation\content\xdocs\quickstart.xml > Pig Docs for 0.3.0 Release > -- > > Key: PIG-817 > URL: https://issues.apache.org/jira/browse/PIG-817 > Project: Pig > Issue Type: Task > Components: documentation >Affects Versions: 0.3.0 >Reporter: Corinne Chandel > Attachments: PIG-817-2.patch > > > Update Pig docs for 0.3.0 release > > Getting Started > > Pig Latin -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-817) Pig Docs for 0.3.0 Release
[ https://issues.apache.org/jira/browse/PIG-817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Corinne Chandel updated PIG-817: Attachment: PIG-817-2.patch Updated patch #2. Apply this patch to: http://svn.apache.org/repos/asf/hadoop/pig/trunk Note: No new test code; changes to documentation only. > Pig Docs for 0.3.0 Release > -- > > Key: PIG-817 > URL: https://issues.apache.org/jira/browse/PIG-817 > Project: Pig > Issue Type: Task > Components: documentation >Affects Versions: 0.3.0 >Reporter: Corinne Chandel > Attachments: PIG-817-2.patch > > > Update Pig docs for 0.3.0 release > > Getting Started > > Pig Latin -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (PIG-817) Pig Docs for 0.3.0 Release
[ https://issues.apache.org/jira/browse/PIG-817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712496#action_12712496 ] Corinne Chandel edited comment on PIG-817 at 6/4/09 11:50 AM: -- (1) PIG-817-2.patch - patch file was (Author: chandec): (1) PIG_817.patch - patch file (2) Doc-Build.zip - local doc build (for review) (3) Doc-XML-Files - copies of the updated XML files (in case you need them) > Pig Docs for 0.3.0 Release > -- > > Key: PIG-817 > URL: https://issues.apache.org/jira/browse/PIG-817 > Project: Pig > Issue Type: Task > Components: documentation >Affects Versions: 0.3.0 >Reporter: Corinne Chandel > Attachments: PIG-817-2.patch > > > Update Pig docs for 0.3.0 release > > Getting Started > > Pig Latin -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-833) Storage access layer
Storage access layer
--

Key: PIG-833
URL: https://issues.apache.org/jira/browse/PIG-833
Project: Pig
Issue Type: New Feature
Reporter: Jay Tang

A layer is needed to provide a high level data access abstraction and a tabular view of data in Hadoop, and could free Pig users from implementing their own data storage/retrieval code. This layer should also include a columnar storage format in order to provide fast data projection, CPU/space-efficient data serialization, and a schema language to manage physical storage metadata. Eventually it could also support predicate pushdown for further performance improvement. Initially, this layer could be a contrib project in Pig and become a hadoop subproject later on.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
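The "fast data projection" benefit of a columnar format can be made concrete with a tiny sketch. The Python below is illustrative only (not the proposed storage layer): with one sequence per column, a query touching a single column reads exactly that column, while a row layout forces a scan over every field of every record.

```python
# Row layout: each record stores all of its fields together.
rows = [('alice', 21, 3.9), ('bob', 22, 3.5), ('carol', 20, 3.7)]

# Columnar layout: one sequence per column.
columns = {
    'name': ['alice', 'bob', 'carol'],
    'age':  [21, 22, 20],
    'gpa':  [3.9, 3.5, 3.7],
}

def project_rowwise(rows, index):
    # Must visit every row (and deserialize all its fields)
    # just to keep one field per record.
    return [r[index] for r in rows]

def project_columnar(columns, name):
    # Touches exactly one column; the others are never read.
    return columns[name]
```

On disk the difference is I/O, not just iteration: projecting one column out of many skips the bytes of every other column entirely.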
[jira] Updated: (PIG-796) support conversion from numeric types to chararray
[ https://issues.apache.org/jira/browse/PIG-796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-796:
---
    Resolution: Fixed
    Fix Version/s: 0.3.0
    Hadoop Flags: [Reviewed]
    Status: Resolved (was: Patch Available)

Patch committed - thanks for contributing, Ashutosh!

> support conversion from numeric types to chararray
> --
>
> Key: PIG-796
> URL: https://issues.apache.org/jira/browse/PIG-796
> Project: Pig
> Issue Type: Improvement
> Affects Versions: 0.2.0
> Reporter: Olga Natkovich
> Fix For: 0.3.0
>
> Attachments: 796.patch, pig-796.patch, pig-796.patch

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-830) Port Apache Log parsing piggybank contrib to Pig 0.2
[ https://issues.apache.org/jira/browse/PIG-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716336#action_12716336 ]

Hadoop QA commented on PIG-830:
---

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12409730/pig-830-v3.patch
against trunk revision 781599.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 27 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/71/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/71/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/71/console

This message is automatically generated.

> Port Apache Log parsing piggybank contrib to Pig 0.2
> --
>
> Key: PIG-830
> URL: https://issues.apache.org/jira/browse/PIG-830
> Project: Pig
> Issue Type: New Feature
> Affects Versions: 0.2.0
> Reporter: Dmitriy V. Ryaboy
> Priority: Minor
> Attachments: pig-830-v2.patch, pig-830-v3.patch, pig-830.patch, TEST-org.apache.pig.piggybank.test.storage.TestMyRegExLoader.txt
>
> The piggybank contribs (pig-472, pig-473, pig-474, pig-476, pig-486, pig-487, pig-488, pig-503, pig-509) got dropped after the types branch was merged in.
> They should be updated to work with the current APIs and added back into trunk.

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-564) Parameter Substitution using -param option does not seem to work when parameters contain special characters such as +,=,-,?,' "
     [ https://issues.apache.org/jira/browse/PIG-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-564:
-------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

patch committed

> Parameter Substitution using -param option does not seem to work when
> parameters contain special characters such as +,=,-,?,' "
> --------------------------------------------------------------------
>
>                 Key: PIG-564
>                 URL: https://issues.apache.org/jira/browse/PIG-564
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.2.0
>            Reporter: Viraj Bhat
>            Assignee: Olga Natkovich
>         Attachments: PIG-564.patch
>
>
> Consider the following Pig script, which uses parameter substitution:
> {code}
> %default qual '/user/viraj'
> %default mydir 'mydir_myextraqual'
> VISIT_LOGS = load '$qual/$mydir' as (a,b,c);
> dump VISIT_LOGS;
> {code}
> If you run the script as:
> ==
> java -cp pig.jar:${HADOOP_HOME}/conf/ -Dhod.server='' org.apache.pig.Main -param mydir=mydir-myextraqual mypigparamsub.pig
> ==
> you get the following error:
> ==
> 2008-12-15 19:49:43,964 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - java.io.IOException: /user/viraj/mydir does not exist
>         at org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:109)
>         at org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
>         at org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:200)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
>         at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
>         at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>         at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
>         at java.lang.Thread.run(Thread.java:619)
> java.io.IOException: Unable to open iterator for alias: VISIT_LOGS [Job terminated with anomalous status FAILED]
>         at org.apache.pig.PigServer.openIterator(PigServer.java:389)
>         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:269)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:178)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64)
>         at org.apache.pig.Main.main(Main.java:306)
> Caused by: java.io.IOException: Job terminated with anomalous status FAILED
>         ... 6 more
> ==
> Also tried using: -param mydir='mydir\-myextraqual'
> This behavior occurs whenever the parameter value contains characters such as +, =, or ?.
> A workaround is to use a param_file containing one param_name=param_value pair per line, with the param_value enclosed in quotes. For example, with a line mydir='mydir-myextraqual', the Pig script is run as:
> ==
> java -cp pig.jar:${HADOOP_HOME}/conf/ -Dhod.server='' org.apache.pig.Main -param_file myparamfile mypigparamsub.pig
> ==
> The following issues need to be fixed:
> 1) With the -param option, if the parameter value contains special characters, it is truncated.
> 2) In a param_file, if the param_value contains special characters, it must be enclosed in quotes.
> 3) If 2 is a known issue, it should be documented at http://wiki.apache.org/pig/ParameterSubstitution

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
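[Editorial sketch, not part of the original message: the param_file workaround described in the report can be demonstrated as below. The file names (myparamfile, mypigparamsub.pig) are the ones from the report; the quoting convention is the one the workaround relies on.]

```shell
# Build the param_file described in the workaround: one
# param_name=param_value pair per line, with values containing
# special characters (+ = - ?) enclosed in single quotes so the
# hyphen in mydir-myextraqual is not truncated.
cat > myparamfile <<'EOF'
qual='/user/viraj'
mydir='mydir-myextraqual'
EOF

# Inspect the file that will be passed via -param_file.
cat myparamfile
```

The script is then run with `-param_file myparamfile`, as shown in the message, instead of passing the value on the command line via `-param`.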
[jira] Created: (PIG-832) Make import list configurable
Make import list configurable
-----------------------------

                 Key: PIG-832
                 URL: https://issues.apache.org/jira/browse/PIG-832
             Project: Pig
          Issue Type: Improvement
    Affects Versions: 0.2.0
            Reporter: Olga Natkovich
            Assignee: Olga Natkovich
             Fix For: 0.3.0


Currently, it is hardwired in PigContext.
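[Editorial sketch, not part of the original message: one way the improvement could look is to extend the hardwired default package list with entries read from a property. The property name `pig.import.list` and the class below are illustrative assumptions, not Pig's actual API; only the three default packages mirror what PigContext hardwires.]

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical sketch for PIG-832: keep the hardwired defaults, but
// let users append UDF package prefixes via a configuration property.
public class ImportListSketch {

    // Defaults resembling the list hardwired in PigContext.
    static final List<String> DEFAULTS = Arrays.asList(
            "", "org.apache.pig.builtin.", "org.apache.pig.impl.builtin.");

    // Returns the defaults plus any comma-separated packages found in
    // the (assumed) property "pig.import.list".
    static List<String> importList(Properties props) {
        List<String> result = new ArrayList<String>(DEFAULTS);
        String extra = props.getProperty("pig.import.list");
        if (extra != null) {
            for (String pkg : extra.split(",")) {
                String trimmed = pkg.trim();
                if (!trimmed.isEmpty()) {
                    result.add(trimmed);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("pig.import.list", "com.example.udfs.");
        System.out.println(importList(p));
    }
}
```

With no property set, callers would get exactly the hardwired defaults, so the change stays backward compatible.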
[jira] Updated: (PIG-830) Port Apache Log parsing piggybank contrib to Pig 0.2
     [ https://issues.apache.org/jira/browse/PIG-830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-830:
----------------------------------

    Status: Patch Available  (was: Open)

> Port Apache Log parsing piggybank contrib to Pig 0.2
> ----------------------------------------------------
>
>                 Key: PIG-830
>                 URL: https://issues.apache.org/jira/browse/PIG-830
>             Project: Pig
>          Issue Type: New Feature
>    Affects Versions: 0.2.0
>            Reporter: Dmitriy V. Ryaboy
>            Priority: Minor
>         Attachments: pig-830-v2.patch, pig-830-v3.patch, pig-830.patch, TEST-org.apache.pig.piggybank.test.storage.TestMyRegExLoader.txt
>
>
> The piggybank contribs (pig-472, pig-473, pig-474, pig-476, pig-486, pig-487, pig-488, pig-503, pig-509) got dropped after the types branch was merged in.
> They should be updated to work with the current APIs and added back into trunk.
[jira] Updated: (PIG-830) Port Apache Log parsing piggybank contrib to Pig 0.2
     [ https://issues.apache.org/jira/browse/PIG-830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-830:
----------------------------------

    Status: Open  (was: Patch Available)

trying to get Hudson to pick up the third patch.

> Port Apache Log parsing piggybank contrib to Pig 0.2
> ----------------------------------------------------
>
>                 Key: PIG-830
>                 URL: https://issues.apache.org/jira/browse/PIG-830
>
[jira] Commented: (PIG-564) Parameter Substitution using -param option does not seem to work when parameters contain special characters such as +,=,-,?,' "
     [ https://issues.apache.org/jira/browse/PIG-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716246#action_12716246 ]

Hudson commented on PIG-564:
----------------------------

Integrated in Pig-trunk #463 (See [http://hudson.zones.apache.org/hudson/job/Pig-trunk/463/])
    PIG-564: problem with parameter substitution and special characters (olgan)

> Parameter Substitution using -param option does not seem to work when
> parameters contain special characters such as +,=,-,?,' "
> --------------------------------------------------------------------
>
>                 Key: PIG-564
>                 URL: https://issues.apache.org/jira/browse/PIG-564
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.2.0
>            Reporter: Viraj Bhat
>            Assignee: Olga Natkovich
>         Attachments: PIG-564.patch
>