date:20091204


 [ 
https://issues.apache.org/jira/browse/PIG-653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gaurav Jain updated PIG-653:


Attachment: PIG-653.patch


Zebra changes for the proposed feature.

Please review at your earliest convenience.

 Make fieldsToRead work in loader
 

 Key: PIG-653
 URL: https://issues.apache.org/jira/browse/PIG-653
 Project: Pig
  Issue Type: New Feature
Reporter: Alan Gates
Assignee: Pradeep Kamath
 Attachments: PIG-653-2.comment, PIG-653-3-proposal.txt, PIG-653.patch


 Currently pig does not call the fieldsToRead function in LoadFunc, thus it 
 does not provide information to load functions on what fields are needed.  We 
 need to implement a visitor that determines (where possible) which fields in 
 a file will be used and relays that information to the load function.
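
As a rough illustration of the idea in this description (not Pig's actual implementation), the Python sketch below walks a toy plan, collects the column indices the script uses, and relays them to a loader. The names PruningLoader, fields_to_read, and collect_required_fields are hypothetical, and the plan is simplified to a list of column-index sets that all refer to the loader's own columns.

```python
class PruningLoader:
    """Hypothetical loader; fields_to_read only mimics the idea of fieldsToRead."""

    def __init__(self, delimiter="\t"):
        self.delimiter = delimiter
        self.required = None  # None means "read every field"

    def fields_to_read(self, required):
        # Remember which column indices the script actually needs.
        self.required = sorted(required)

    def load(self, lines):
        for line in lines:
            fields = line.rstrip("\n").split(self.delimiter)
            if self.required is None:
                yield tuple(fields)
            else:
                yield tuple(fields[i] for i in self.required)


def collect_required_fields(plan):
    """Union the column indices each operator in a linear plan reads.

    Returns None when an operator needs every column (for example '*'),
    which models the "where possible" caveat in the description.
    """
    required = set()
    for columns_used in plan:
        if columns_used is None:
            return None
        required |= set(columns_used)
    return required


# Toy plan: a filter reading column 0, then a projection of columns 0 and 2.
plan = [{0}, {0, 2}]
loader = PruningLoader()
needed = collect_required_fields(plan)
if needed is not None:
    loader.fields_to_read(needed)
rows = list(loader.load(["a\tb\tc", "d\te\tf"]))
print(rows)  # [('a', 'c'), ('d', 'f')]
```

With this arrangement the loader never materializes column 1, which is the saving the issue is after.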

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (PIG-1119) [zebra] group is a Pig preserved word, zebra needs to use other string for table's group information


 [ 
https://issues.apache.org/jira/browse/PIG-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gaurav Jain updated PIG-1119:
-

Attachment: PIG-1119.patch


Changes incorporated as part of code review feedback.

 [zebra] group is a Pig preserved word, zebra needs to use other string for 
 table's group information
 --

 Key: PIG-1119
 URL: https://issues.apache.org/jira/browse/PIG-1119
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Jing Huang
 Fix For: 0.6.0

 Attachments: PIG-1119.patch, PIG-1119.patch





[jira] Updated: (PIG-1119) [zebra] group is a Pig preserved word, zebra needs to use other string for table's group information


 [ 
https://issues.apache.org/jira/browse/PIG-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gaurav Jain updated PIG-1119:
-

Status: Open  (was: Patch Available)


Providing an updated version

 [zebra] group is a Pig preserved word, zebra needs to use other string for 
 table's group information
 --

 Key: PIG-1119
 URL: https://issues.apache.org/jira/browse/PIG-1119
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Jing Huang
 Fix For: 0.6.0

 Attachments: PIG-1119.patch, PIG-1119.patch





[jira] Updated: (PIG-653) Make fieldsToRead work in loader


 [ 
https://issues.apache.org/jira/browse/PIG-653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gaurav Jain updated PIG-653:


Status: Patch Available  (was: Open)

 Make fieldsToRead work in loader
 

 Key: PIG-653
 URL: https://issues.apache.org/jira/browse/PIG-653
 Project: Pig
  Issue Type: New Feature
Reporter: Alan Gates
Assignee: Pradeep Kamath
 Attachments: PIG-653-2.comment, PIG-653-3-proposal.txt, PIG-653.patch


 Currently pig does not call the fieldsToRead function in LoadFunc, thus it 
 does not provide information to load functions on what fields are needed.  We 
 need to implement a visitor that determines (where possible) which fields in 
 a file will be used and relays that information to the load function.


[jira] Updated: (PIG-1119) [zebra] group is a Pig preserved word, zebra needs to use other string for table's group information


 [ 
https://issues.apache.org/jira/browse/PIG-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gaurav Jain updated PIG-1119:
-

Status: Patch Available  (was: Open)

 [zebra] group is a Pig preserved word, zebra needs to use other string for 
 table's group information
 --

 Key: PIG-1119
 URL: https://issues.apache.org/jira/browse/PIG-1119
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Jing Huang
 Fix For: 0.6.0

 Attachments: PIG-1119.patch, PIG-1119.patch





[jira] Updated: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure


 [ 
https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriranjan Manjunath updated PIG-1105:
-

Attachment: PIG-1105.patch

 COUNT_STAR accumulate interface implementation cases failure
 

 Key: PIG-1105
 URL: https://issues.apache.org/jira/browse/PIG-1105
 Project: Pig
  Issue Type: Bug
  Components: impl
Reporter: Thejas M Nair
Assignee: Sriranjan Manjunath
 Fix For: 0.6.0

 Attachments: PIG-1105.1.patch, PIG-1105.patch


 COUNT_STAR.accumulate is calling sum(), which is supposed to be used by the 
 intermediate and final parts of the algebraic interface.
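
To make the distinction concrete, here is a minimal Python sketch of a counting aggregate with both code paths. It assumes, for illustration only, that the accumulate step receives raw tuples while the intermediate and final algebraic steps receive partial counts. The class CountStar and its method names are hypothetical stand-ins, not Pig's real classes.

```python
class CountStar:
    """Hypothetical aggregate exposing an accumulator and algebraic steps."""

    def __init__(self):
        self._running = 0

    @staticmethod
    def initial(bag):
        # Algebraic initial step: count the raw tuples in one chunk.
        return len(bag)

    @staticmethod
    def sum_partials(partials):
        # Algebraic intermediate/final step: add up partial counts.
        return sum(partials)

    def accumulate(self, bag):
        # Accumulator step: the bag holds raw tuples, so count them here.
        # Treating them as partial counts and summing would be a bug.
        self._running += len(bag)

    def get_value(self):
        return self._running


chunks = [[("a", 1), ("b", 2)], [("c", 3)]]

acc = CountStar()
for chunk in chunks:
    acc.accumulate(chunk)

algebraic = CountStar.sum_partials(CountStar.initial(c) for c in chunks)
print(acc.get_value(), algebraic)  # 3 3
```

Both paths should agree on the same input, which gives a simple consistency check for this kind of aggregate.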


Build failed in Hudson: Pig-trunk #636

2009-12-04 Thread Apache Hudson Server

See http://hudson.zones.apache.org/hudson/job/Pig-trunk/636/changes

Changes:

[yanz] PIG- Multiple Outputs Support (Gaurav Jain via yanz)

[olga] PIG-1084: Pig 0.6.0 Documentation improvements  (chandec via olgan)

[daijy] PIG-922: Logical optimizer: push up project

[gates] PIG-1068:  COGROUP fails with 'Type mismatch in key from map: expected 
org.apache.pig.impl.io.NullableText, recieved 
org.apache.pig.impl.io.NullableTuple'

[yanz] PIG-1122 Changed version number of pig dev core jar used in Zebra build 
from 0.6.0 to 0.7.0
to match Pig version number (yanz)

--
[...truncated 2701 lines...]
ivy-init-dirs:

ivy-probe-antlib:

ivy-init-antlib:

ivy-init:

ivy-buildJar:
[ivy:resolve] :: resolving dependencies :: 
org.apache.pig#Pig;2009-12-04_10-05-58
[ivy:resolve]   confs: [buildJar]
[ivy:resolve]   found com.jcraft#jsch;0.1.38 in maven2
[ivy:resolve]   found jline#jline;0.9.94 in maven2
[ivy:resolve]   found net.java.dev.javacc#javacc;4.2 in maven2
[ivy:resolve]   found junit#junit;4.5 in default
[ivy:resolve] :: resolution report :: resolve 88ms :: artifacts dl 5ms
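-------------------------------------------------------------
|          |            modules            ||   artifacts   |
|   conf   | number| search|dwnlded|evicted|| number|dwnlded|
-------------------------------------------------------------
| buildJar |   4   |   0   |   0   |   0   ||   4   |   0   |
-------------------------------------------------------------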
[ivy:retrieve] :: retrieving :: org.apache.pig#Pig
[ivy:retrieve]  confs: [buildJar]
[ivy:retrieve]  1 artifacts copied, 3 already retrieved (288kB/4ms)

buildJar:
 [echo] svnString 887139
  [jar] Building jar: 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/pig-2009-12-04_10-05-58.jar
 [copy] Copying 1 file to 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk

jarWithOutSvn:

findbugs:
[mkdir] Created dir: 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs
 [findbugs] Executing findbugs from ant task
 [findbugs] Running FindBugs...
 [findbugs] The following classes needed for analysis were missing:
 [findbugs]   com.jcraft.jsch.SocketFactory
 [findbugs]   com.jcraft.jsch.Logger
 [findbugs]   jline.Completor
 [findbugs]   com.jcraft.jsch.Session
 [findbugs]   com.jcraft.jsch.HostKeyRepository
 [findbugs]   com.jcraft.jsch.JSch
 [findbugs]   com.jcraft.jsch.UserInfo
 [findbugs]   jline.ConsoleReaderInputStream
 [findbugs]   com.jcraft.jsch.HostKey
 [findbugs]   jline.ConsoleReader
 [findbugs]   com.jcraft.jsch.ChannelExec
 [findbugs]   jline.History
 [findbugs]   com.jcraft.jsch.ChannelDirectTCPIP
 [findbugs]   com.jcraft.jsch.JSchException
 [findbugs]   com.jcraft.jsch.Channel
 [findbugs] Warnings generated: 20
 [findbugs] Missing classes: 16
 [findbugs] Calculating exit code...
 [findbugs] Setting 'missing class' flag (2)
 [findbugs] Setting 'bugs found' flag (1)
 [findbugs] Exit code set to: 3
 [findbugs] Java Result: 3
 [findbugs] Classes needed for analysis were missing
 [findbugs] Output saved to 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs/pig-findbugs-report.xml
 [xslt] Processing 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs/pig-findbugs-report.xml
 to 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs/pig-findbugs-report.html
 [xslt] Loading stylesheet 
/homes/gkesavan/tools/findbugs/latest/src/xsl/default.xsl

BUILD SUCCESSFUL
Total time: 2 minutes 52 seconds
+ mv build/pig-2009-12-04_10-05-58.tar.gz 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk
+ mv build/test/findbugs 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk
+ mv build/docs/api 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk
+ /homes/hudson/tools/ant/apache-ant-1.7.0/bin/ant clean
Buildfile: build.xml

clean:
   [delete] Deleting directory 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/src-gen
   [delete] Deleting directory 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/src/docs/build
   [delete] Deleting directory 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build
   [delete] Deleting directory 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/test/org/apache/pig/test/utils/dotGraph/parser

BUILD SUCCESSFUL
Total time: 0 seconds
+ /homes/hudson/tools/ant/apache-ant-1.7.0/bin/ant 
-Dtest.junit.output.format=xml -Dtest.output=yes 
-Dcheckstyle.home=/homes/hudson/tools/checkstyle/latest -Drun.clover=true 
-Dclover.home=/homes/hudson/tools/clover/latest clover test 
generate-clover-reports
Buildfile: build.xml

clover.setup:
[mkdir] Created dir: 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/clover/db
[clover-setup] Clover Version 2.4.3, built on

[jira] Updated: (PIG-1104) [zebra] Provide streaming support in Zebra.


 [ 
https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-1104:
--

Status: Patch Available  (was: Open)

 [zebra] Provide streaming support in Zebra.
 ---

 Key: PIG-1104
 URL: https://issues.apache.org/jira/browse/PIG-1104
 Project: Pig
  Issue Type: New Feature
Affects Versions: 0.4.0
Reporter: Chao Wang
Assignee: Chao Wang
 Fix For: 0.6.0, 0.7.0

 Attachments: PIG1104.patch


 Hadoop streaming is very popular among Hadoop users. The main attraction is 
 the simplicity of use. A user can write the application logic in any language 
 and process large amounts of data using Hadoop framework. As more people 
 start to use Zebra to store their data, we expect users would like to run 
 Hadoop streaming scripts to easily process Zebra tables. 
 The following lists a simple example of using Hadoop streaming to access 
 Zebra data. It loads data from foo table using Zebra's TableInputFormat and 
 then writes the data into output using default TextOutputFormat. 
 $ hadoop jar hadoop-streaming.jar -D mapred.reduce.tasks=0 -input foo -output 
 output -mapper 'cat' -inputformat 
 org.apache.hadoop.zebra.mapred.TableInputFormat 
 In more detail, Zebra uses Pig's DefaultTuple implementation of Tuple for its 
 records. Currently, when Zebra's TableInputFormat is used for input, the user 
 script sees each line as key_if_any\tTuple.toString(). We plan to generate a 
 CSV representation of our Pig tuples instead. To this end, we plan to do the 
 following: 
 1) Derive a subclass ZebraTuple from Pig's DefaultTuple class and override 
 its toString() method to present the data in CSV format. 
 2) On the Zebra side, change the tuple factory to create ZebraTuple 
 objects instead of DefaultTuple objects. 
 Note that we can only support streaming on the input side, that is, the 
 ability to use streaming to read data from Zebra tables. On the output side, 
 streaming support is not feasible: the streaming mapper or reducer only emits 
 Text\tText, so the output collector has no way of knowing how to convert this 
 to (BytesWritable,Tuple).
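
The plan above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: DefaultTuple here is a hypothetical stand-in whose parenthesized rendering is assumed, not Pig's real class, and streaming_line is an invented helper that builds the key, tab, tuple-text line described in the issue.

```python
import csv
import io


class DefaultTuple:
    """Hypothetical stand-in for Pig's DefaultTuple (not the real class)."""

    def __init__(self, fields):
        self.fields = list(fields)

    def __str__(self):
        # Parenthesized, comma-separated rendering, assumed for illustration.
        return "(" + ",".join(str(f) for f in self.fields) + ")"


class ZebraTuple(DefaultTuple):
    """Subclass that overrides string conversion to emit one CSV record."""

    def __str__(self):
        buf = io.StringIO()
        csv.writer(buf, lineterminator="").writerow(self.fields)
        return buf.getvalue()


def streaming_line(key, tup):
    # Line shape from the issue: optional key, a tab, then the tuple text.
    prefix = key + "\t" if key else ""
    return prefix + str(tup)


# Producer side: what the streaming script would receive on stdin.
line = streaming_line("k1", ZebraTuple(["john", 25, "a,b"]))

# Consumer side: a streaming mapper can split off the key and parse the CSV.
key, _, record = line.partition("\t")
fields = next(csv.reader([record]))
print(key, fields)
```

Using a real CSV writer rather than a plain join matters once a field contains a comma or a quote, since the mapper can then recover the original field boundaries.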


[jira] Commented: (PIG-653) Make fieldsToRead work in loader

[
https://issues.apache.org/jira/browse/PIG-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12785928#action_12785928
]

Hadoop QA commented on PIG-653:
---

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12426879/PIG-653.patch
against trunk revision 887049.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 97 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

-1 release audit. The applied patch generated 395 release audit warnings
(more than the trunk's current 368 warnings).

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/89/testReport/
Release audit warnings:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/89/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/89/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/89/console

This message is automatically generated.

Make fieldsToRead work in loader

Key: PIG-653
URL: https://issues.apache.org/jira/browse/PIG-653
Project: Pig
Issue Type: New Feature
Reporter: Alan Gates
Assignee: Pradeep Kamath
Attachments: PIG-653-2.comment, PIG-653-3-proposal.txt, PIG-653.patch

Currently pig does not call the fieldsToRead function in LoadFunc, thus it
does not provide information to load functions on what fields are needed. We
need to implement a visitor that determines (where possible) which fields in
a file will be used and relays that information to the load function.


[jira] Commented: (PIG-653) Make fieldsToRead work in loader


[ 
https://issues.apache.org/jira/browse/PIG-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12785936#action_12785936
 ] 

Yan Zhou commented on PIG-653:
--

The 27 release audit failures are from 25 Pig test scripts and 2 test data 
files. None of them are source files, so they should be ignored.

 Make fieldsToRead work in loader
 

 Key: PIG-653
 URL: https://issues.apache.org/jira/browse/PIG-653
 Project: Pig
  Issue Type: New Feature
Reporter: Alan Gates
Assignee: Pradeep Kamath
 Attachments: PIG-653-2.comment, PIG-653-3-proposal.txt, PIG-653.patch


 Currently pig does not call the fieldsToRead function in LoadFunc, thus it 
 does not provide information to load functions on what fields are needed.  We 
 need to implement a visitor that determines (where possible) which fields in 
 a file will be used and relays that information to the load function.


[jira] Commented: (PIG-653) Make fieldsToRead work in loader


[ 
https://issues.apache.org/jira/browse/PIG-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12785937#action_12785937
 ] 

Yan Zhou commented on PIG-653:
--

A typo in my last comment: it should have been 27 audit *warnings*, not *failures*.

 Make fieldsToRead work in loader
 

 Key: PIG-653
 URL: https://issues.apache.org/jira/browse/PIG-653
 Project: Pig
  Issue Type: New Feature
Reporter: Alan Gates
Assignee: Pradeep Kamath
 Attachments: PIG-653-2.comment, PIG-653-3-proposal.txt, PIG-653.patch


 Currently pig does not call the fieldsToRead function in LoadFunc, thus it 
 does not provide information to load functions on what fields are needed.  We 
 need to implement a visitor that determines (where possible) which fields in 
 a file will be used and relays that information to the load function.


[jira] Commented: (PIG-1119) [zebra] group is a Pig preserved word, zebra needs to use other string for table's group information


[ 
https://issues.apache.org/jira/browse/PIG-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12785939#action_12785939
 ] 

Yan Zhou commented on PIG-1119:
---

+1

 [zebra] group is a Pig preserved word, zebra needs to use other string for 
 table's group information
 --

 Key: PIG-1119
 URL: https://issues.apache.org/jira/browse/PIG-1119
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Jing Huang
 Fix For: 0.6.0

 Attachments: PIG-1119.patch, PIG-1119.patch





[jira] Updated: (PIG-1126) piggybank loaders need to update fieldsToRead function


 [ 
https://issues.apache.org/jira/browse/PIG-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1126:


Attachment: PIG-1126.patch

All unit tests passed. Please review. (This is for both the trunk and the 0.6 
branch.)

 piggybank loaders need to update fieldsToRead function
 --

 Key: PIG-1126
 URL: https://issues.apache.org/jira/browse/PIG-1126
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich
Assignee: Olga Natkovich
 Fix For: 0.6.0

 Attachments: PIG-1126.patch





[jira] Commented: (PIG-1126) piggybank loaders need to update fieldsToRead function


[ 
https://issues.apache.org/jira/browse/PIG-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786013#action_12786013
 ] 

Daniel Dai commented on PIG-1126:
-

+1

 piggybank loaders need to update fieldsToRead function
 --

 Key: PIG-1126
 URL: https://issues.apache.org/jira/browse/PIG-1126
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich
Assignee: Olga Natkovich
 Fix For: 0.6.0

 Attachments: PIG-1126.patch





[jira] Commented: (PIG-1119) [zebra] group is a Pig preserved word, zebra needs to use other string for table's group information

[
https://issues.apache.org/jira/browse/PIG-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786027#action_12786027
]

Hadoop QA commented on PIG-1119:

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12426881/PIG-1119.patch
against trunk revision 887049.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 39 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/90/testReport/
Findbugs warnings:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/90/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/90/console

This message is automatically generated.

[zebra] group is a Pig preserved word, zebra needs to use other string for
table's group information
--

Key: PIG-1119
URL: https://issues.apache.org/jira/browse/PIG-1119
Project: Pig
Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Jing Huang
Fix For: 0.6.0

Attachments: PIG-1119.patch, PIG-1119.patch


[jira] Commented: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure


[ 
https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786031#action_12786031
 ] 

Hadoop QA commented on PIG-1105:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12426887/PIG-1105.patch
  against trunk revision 887290.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/91/console

This message is automatically generated.

 COUNT_STAR accumulate interface implementation cases failure
 

 Key: PIG-1105
 URL: https://issues.apache.org/jira/browse/PIG-1105
 Project: Pig
  Issue Type: Bug
  Components: impl
Reporter: Thejas M Nair
Assignee: Sriranjan Manjunath
 Fix For: 0.6.0

 Attachments: PIG-1105.1.patch, PIG-1105.patch


 COUNT_STAR.accumulate is calling sum(), which is supposed to be used by the 
 intermediate and final parts of the algebraic interface.


[jira] Updated: (PIG-653) Make fieldsToRead work in loader


 [ 
https://issues.apache.org/jira/browse/PIG-653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-653:
---

Resolution: Duplicate
Status: Resolved  (was: Patch Available)

PIG-922

 Make fieldsToRead work in loader
 

 Key: PIG-653
 URL: https://issues.apache.org/jira/browse/PIG-653
 Project: Pig
  Issue Type: New Feature
Reporter: Alan Gates
Assignee: Pradeep Kamath
 Attachments: PIG-653-2.comment, PIG-653-3-proposal.txt, PIG-653.patch


 Currently pig does not call the fieldsToRead function in LoadFunc, thus it 
 does not provide information to load functions on what fields are needed.  We 
 need to implement a visitor that determines (where possible) which fields in 
 a file will be used and relays that information to the load function.


[jira] Commented: (PIG-653) Make fieldsToRead work in loader


[ 
https://issues.apache.org/jira/browse/PIG-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786046#action_12786046
 ] 

Yan Zhou commented on PIG-653:
--

Zebra changes committed to both trunk and the 0.6 branch.

 Make fieldsToRead work in loader
 

 Key: PIG-653
 URL: https://issues.apache.org/jira/browse/PIG-653
 Project: Pig
  Issue Type: New Feature
Reporter: Alan Gates
Assignee: Pradeep Kamath
 Attachments: PIG-653-2.comment, PIG-653-3-proposal.txt, PIG-653.patch


 Currently pig does not call the fieldsToRead function in LoadFunc, thus it 
 does not provide information to load functions on what fields are needed.  We 
 need to implement a visitor that determines (where possible) which fields in 
 a file will be used and relays that information to the load function.


[jira] Resolved: (PIG-1126) piggybank loaders need to update fieldsToRead function


 [ 
https://issues.apache.org/jira/browse/PIG-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich resolved PIG-1126.
-

Resolution: Fixed

patch committed to trunk and 0.6 branch

 piggybank loaders need to update fieldsToRead function
 --

 Key: PIG-1126
 URL: https://issues.apache.org/jira/browse/PIG-1126
 Project: Pig
  Issue Type: Bug
Reporter: Olga Natkovich
Assignee: Olga Natkovich
 Fix For: 0.6.0

 Attachments: PIG-1126.patch





[jira] Updated: (PIG-1086) Nested sort by * throw exception

2009-12-04 Thread Richard Ding (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1086:
--

Status: Patch Available  (was: Open)

 Nested sort by * throw exception
 

 Key: PIG-1086
 URL: https://issues.apache.org/jira/browse/PIG-1086
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.5.0
Reporter: Daniel Dai
Assignee: Richard Ding
 Attachments: PIG-1086.patch


 The following script fails:
 A = load '1.txt' as (a0, a1, a2);
 B = group A by a0;
 C = foreach B { D = order A by *; generate group, D;};
 explain C;
 Here is the stack:
 Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
 at java.util.ArrayList.get(ArrayList.java:324)
 at 
 org.apache.pig.impl.logicalLayer.schema.Schema.getField(Schema.java:752)
 at 
 org.apache.pig.impl.logicalLayer.LOSort.getSortInfo(LOSort.java:332)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:1365)
 at org.apache.pig.impl.logicalLayer.LOSort.visit(LOSort.java:176)
 at org.apache.pig.impl.logicalLayer.LOSort.visit(LOSort.java:43)
 at 
 org.apache.pig.impl.plan.DependencyOrderWalkerWOSeenChk.walk(DependencyOrderWalkerWOSeenChk.java:69)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:1274)
 at 
 org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:130)
 at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:45)
 at 
 org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:69)
 at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
 at 
 org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:234)
 at org.apache.pig.PigServer.compilePp(PigServer.java:864)
 at org.apache.pig.PigServer.explain(PigServer.java:583)
 ... 8 more


[jira] Created: (PIG-1127) Logical operator should contains individual copy of schema object

Logical operator should contains individual copy of schema object
-

 Key: PIG-1127
 URL: https://issues.apache.org/jira/browse/PIG-1127
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0


Currently some logical operators only contain a reference to the 
predecessor's schema object. These logical operators include: LOSplitOutput, 
LOLimit, LOSplit, LOFilter, LOSort, LODistinct, LOUnion. This was fine before 
because we did not change a schema object once it was set. Now, with the column 
pruner (PIG-922), we need to change individual schema objects, so this is no 
longer acceptable. For example, the following script fails:

{code}
a = load '1.txt' as (a0, a1:map[], a2);
b = foreach a generate a1;
c = limit b 10;
dump c;
{code}

We need to fix it.
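
The aliasing problem described here can be reproduced in miniature. The Python sketch below uses hypothetical Schema and Operator classes (not Pig's) and assumes a pruner that mutates a schema in place. It contrasts an operator that shares its predecessor's schema object with one that holds its own copy.

```python
import copy


class Schema:
    """Hypothetical schema: just an ordered list of field names."""

    def __init__(self, fields):
        self.fields = list(fields)

    def prune(self, keep):
        # Mutates this schema in place, as a column pruner might.
        self.fields = [f for f in self.fields if f in keep]


class Operator:
    """Hypothetical logical operator holding a schema."""

    def __init__(self, name, schema):
        self.name = name
        self.schema = schema


# Sharing: the limit operator points at its predecessor's schema object.
foreach_op = Operator("foreach", Schema(["a0", "a1", "a2"]))
limit_shared = Operator("limit", foreach_op.schema)
limit_shared.schema.prune({"a1"})
print(foreach_op.schema.fields)  # ['a1'], the predecessor changed too

# Copying: each operator owns its schema, so pruning stays local.
foreach_op2 = Operator("foreach", Schema(["a0", "a1", "a2"]))
limit_copy = Operator("limit", copy.deepcopy(foreach_op2.schema))
limit_copy.schema.prune({"a1"})
print(foreach_op2.schema.fields)  # ['a0', 'a1', 'a2']
```

Giving each operator its own copy trades a little memory for the guarantee that changing one operator's schema cannot silently alter another's.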


[jira] Updated: (PIG-1118) expression with aggregate functions returning null, with accumulate interface


 [ 
https://issues.apache.org/jira/browse/PIG-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1118:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

patch committed on trunk and branch 0.6

 expression with aggregate functions returning null, with accumulate interface
 -

 Key: PIG-1118
 URL: https://issues.apache.org/jira/browse/PIG-1118
 Project: Pig
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Ying He
 Fix For: 0.6.0

 Attachments: PIG_1118.patch


 The problem is in trunk. It works fine in the 0.6 branch.
 l = load '/tmp/students.txt' as (a : chararray,b : chararray,c : int);
 grunt> g = group l by 1;
 grunt> dump g;
 (1,{(asdfxc,M,23),(qwer,F,21),(uhsdf,M,34),(zxldf,M,21),(qwer,F,23),(oiue,M,54)})
 grunt> f = foreach g generate SUM(l.c), 1 + SUM(l.c) + SUM(l.c);
 grunt> dump f;
 (176L,)


[jira] Updated: (PIG-747) Logical to Physical Plan Translation fails when temporary alias are created within foreach


 [ 
https://issues.apache.org/jira/browse/PIG-747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-747:
---

Fix Version/s: (was: 0.6.0)
   0.7.0

Unlinking from 0.6.0 release. The change is too large to make this late.

 Logical to Physical Plan Translation fails when temporary alias are created 
 within foreach
 --

 Key: PIG-747
 URL: https://issues.apache.org/jira/browse/PIG-747
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Viraj Bhat
Assignee: Daniel Dai
 Fix For: 0.7.0

 Attachments: physicalplan.txt, physicalplanprob.pig, PIG-747-1.patch


 Consider the following Pig script, which calculates a new column F inside the 
 foreach:
 {code}
 A = load 'physicalplan.txt' as (col1,col2,col3);
 B = foreach A {
     D = col1/col2;
     E = col3/col2;
     F = E - (D*D);
     generate
         F as newcol;
 };
 dump B;
 {code}
 This gives the following error:
 ===
 Caused by: 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogicalToPhysicalTranslatorException:
  ERROR 2015: Invalid physical operators in the physical plan
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:377)
 at 
 org.apache.pig.impl.logicalLayer.LOMultiply.visit(LOMultiply.java:63)
 at 
 org.apache.pig.impl.logicalLayer.LOMultiply.visit(LOMultiply.java:29)
 at 
 org.apache.pig.impl.plan.DependencyOrderWalkerWOSeenChk.walk(DependencyOrderWalkerWOSeenChk.java:68)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:908)
 at 
 org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:122)
 at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:41)
 at 
 org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68)
 at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
 at 
 org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:246)
 ... 10 more
 Caused by: org.apache.pig.impl.plan.PlanException: ERROR 0: Attempt to give 
 operator of type 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Divide
  multiple outputs.  This operator does not support multiple outputs.
 at 
 org.apache.pig.impl.plan.OperatorPlan.connect(OperatorPlan.java:158)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhysicalPlan.connect(PhysicalPlan.java:89)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:373)
 ... 19 more
 ===
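
The inner exception suggests a plan that rejects a second outgoing edge from an operator. The Python sketch below is a guess at the mechanism, not Pig's code: it assumes that both operands of D*D resolve to the same Divide node, and the classes Plan, Node, and PlanError are hypothetical.

```python
class PlanError(Exception):
    """Hypothetical error raised when a plan rule is violated."""


class Node:
    def __init__(self, name, supports_multiple_outputs=False):
        self.name = name
        self.supports_multiple_outputs = supports_multiple_outputs


class Plan:
    """Hypothetical operator plan that enforces a single-output rule."""

    def __init__(self):
        self.successors = {}

    def connect(self, src, dst):
        outs = self.successors.setdefault(src, [])
        if outs and not src.supports_multiple_outputs:
            raise PlanError(
                "Attempt to give operator of type %s multiple outputs" % src.name
            )
        outs.append(dst)


# D = col1/col2 is translated once, then used twice in D*D.
plan = Plan()
divide = Node("Divide")
multiply = Node("Multiply")
plan.connect(divide, multiply)  # left operand
try:
    plan.connect(divide, multiply)  # right operand reuses the same node
    error = None
except PlanError as exc:
    error = str(exc)
print(error)

# With one Divide node per use, the single-output rule is satisfied.
plan2 = Plan()
left, right = Node("Divide"), Node("Divide")
plan2.connect(left, multiply)
plan2.connect(right, multiply)
```

Whether the real fix duplicates the sub-expression or relaxes the rule is not stated in this message. The second half only shows that separate nodes avoid the conflict in this toy model.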


[jira] Updated: (PIG-1124) Unable to set Custom Job Name using the -Dmapred.job.name parameter


 [ 
https://issues.apache.org/jira/browse/PIG-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1124:


Fix Version/s: (was: 0.6.0)
   0.7.0

 Unable to set Custom Job Name using the -Dmapred.job.name parameter
 ---

 Key: PIG-1124
 URL: https://issues.apache.org/jira/browse/PIG-1124
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.6.0
Reporter: Viraj Bhat
Priority: Minor
 Fix For: 0.7.0


 As a Hadoop user I want to control the job name for my analysis via the 
 command line, using the following construct:
 java -cp pig.jar:$HADOOP_HOME/conf -Dmapred.job.name=hadoop_junkie org.apache.pig.Main broken.pig
 -Dmapred.job.name should normally set my Hadoop job name, but somehow during 
 the formation of the job.xml in Pig this information is lost and the job name 
 turns out to be:
 PigLatin:broken.pig
 The current workaround seems to be setting it in the script itself, using the 
 following (or using parameter substitution):
 set job.name 'my job'
 Viraj
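
The behaviour requested in this report amounts to a simple precedence rule: a job name given on the command line should win, and the script-derived default should apply only when none was given. The following is a minimal standalone sketch of that rule, not Pig's implementation; the helper name `resolve_job_name` and the plain dictionary of properties are hypothetical.

```python
def resolve_job_name(properties, script_name):
    """Return the job name a user would expect (hypothetical helper)."""
    # A name supplied via -Dmapred.job.name should take precedence.
    user_name = properties.get("mapred.job.name")
    if user_name:
        return user_name
    # Otherwise fall back to the default form reported in this issue.
    return "PigLatin:" + script_name


print(resolve_job_name({"mapred.job.name": "hadoop_junkie"}, "broken.pig"))
print(resolve_job_name({}, "broken.pig"))
```

The first call reflects the desired outcome; the second reproduces the name that the report says is currently produced in every case.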


[jira] Updated: (PIG-1127) Logical operator should contains individual copy of schema object


 [ 
https://issues.apache.org/jira/browse/PIG-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1127:


Status: Patch Available  (was: Open)

 Logical operator should contains individual copy of schema object
 -

 Key: PIG-1127
 URL: https://issues.apache.org/jira/browse/PIG-1127
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-1127-1.patch


 Currently some logical operators only hold a reference to their 
 predecessor's schema object. These logical operators include: LOSplitOutput, 
 LOLimit, LOSplit, LOFilter, LOSort, LODistinct, LOUnion. This was fine 
 before, because we did not change a schema object once it was set. Now, with the 
 column pruner (PIG-922), we need to change individual schema objects, so sharing 
 is no longer acceptable. For example, the following script fails:
 {code}
 a = load '1.txt' as (a0, a1:map[], a2);
 b = foreach a generate a1;
 c = limit b 10;
 dump c;
 {code}
 We need to fix it.
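
The aliasing problem described here can be illustrated outside Pig. The sketch below is an assumption-laden stand-in: `LogicalOp` and `prune` are hypothetical names, and a schema is modelled as a plain list of field names. It shows that pruning an operator that shares its predecessor's schema object also changes the predecessor, while an operator holding its own copy can be pruned safely.

```python
import copy


class LogicalOp:
    """Stand-in for a logical operator; not Pig's actual class."""

    def __init__(self, name, schema):
        self.name = name
        self.schema = schema


def prune(op, keep):
    """Drop every field not in `keep`, mutating the schema in place."""
    op.schema[:] = [field for field in op.schema if field in keep]


# Case 1: the successor stores a reference to the predecessor's schema.
pred = LogicalOp("foreach", ["a0", "a1", "a2"])
shared = LogicalOp("limit", pred.schema)
prune(shared, {"a1"})  # also rewrites pred.schema, which is the bug

# Case 2: the successor stores its own copy of the schema.
pred2 = LogicalOp("foreach", ["a0", "a1", "a2"])
copied = LogicalOp("limit", copy.deepcopy(pred2.schema))
prune(copied, {"a1"})  # pred2.schema is left untouched

print(pred.schema, pred2.schema, copied.schema)
```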


[jira] Updated: (PIG-1127) Logical operator should contains individual copy of schema object


 [ 
https://issues.apache.org/jira/browse/PIG-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1127:


Attachment: PIG-1127-1.patch

 Logical operator should contains individual copy of schema object
 -

 Key: PIG-1127
 URL: https://issues.apache.org/jira/browse/PIG-1127
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-1127-1.patch


 Currently some logical operators only hold a reference to their 
 predecessor's schema object. These logical operators include: LOSplitOutput, 
 LOLimit, LOSplit, LOFilter, LOSort, LODistinct, LOUnion. This was fine 
 before, because we did not change a schema object once it was set. Now, with the 
 column pruner (PIG-922), we need to change individual schema objects, so sharing 
 is no longer acceptable. For example, the following script fails:
 {code}
 a = load '1.txt' as (a0, a1:map[], a2);
 b = foreach a generate a1;
 c = limit b 10;
 dump c;
 {code}
 We need to fix it.


[jira] Commented: (PIG-1104) [zebra] Provide streaming support in Zebra.

[
https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786188#action_12786188
]

Hadoop QA commented on PIG-1104:

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12426801/PIG1104.patch
against trunk revision 887290.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 8 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/92/testReport/
Findbugs warnings:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/92/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/92/console

This message is automatically generated.

[zebra] Provide streaming support in Zebra.
---

Key: PIG-1104
URL: https://issues.apache.org/jira/browse/PIG-1104
Project: Pig
Issue Type: New Feature
Affects Versions: 0.4.0
Reporter: Chao Wang
Assignee: Chao Wang
Fix For: 0.6.0, 0.7.0

Attachments: PIG1104.patch

Hadoop streaming is very popular among Hadoop users. The main attraction is
the simplicity of use. A user can write the application logic in any language
and process large amounts of data using Hadoop framework. As more people
start to use Zebra to store their data, we expect users would like to run
Hadoop streaming scripts to easily process Zebra tables.
The following lists a simple example of using Hadoop streaming to access
Zebra data. It loads data from foo table using Zebra's TableInputFormat and
then writes the data into output using default TextOutputFormat.
$ hadoop jar hadoop-streaming.jar -D mapred.reduce.tasks=0 -input foo -output
output -mapper 'cat' -inputformat
org.apache.hadoop.zebra.mapred.TableInputFormat
In more detail, Zebra uses Pig's DefaultTuple implementation of Tuple for its
records. Currently, when Zebra's TableInputFormat is used for input, the user
script sees each line as key_if_any\tTuple.toString(). We plan to
generate a CSV representation of our Pig tuples. To this end, we plan to
do the following:
1) Derive a subclass ZebraTuple from Pig's DefaultTuple class and override
its toString() method to present the data in CSV format.
2) On the Zebra side, the tuple factory should be changed to create ZebraTuple
objects instead of DefaultTuple objects.
Note that we can only support streaming on the input side, that is, the ability
to use streaming to read data from Zebra tables. On the output side, streaming
support is not feasible: the streaming mapper or reducer only emits
Text\tText, and the output collector has no way of knowing how to convert this
to (BytesWritable,Tuple).
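
The two-step plan above (subclass the tuple type and override its string form) can be sketched as follows. This is a language-neutral illustration in Python rather than the Java patch itself: both classes are stand-ins, the parenthesised default string form is only an assumption about what a streaming script currently sees, and the quoting rules come from Python's csv module, not from Zebra.

```python
import csv
import io


class DefaultTuple:
    """Stand-in for Pig's DefaultTuple; the string form here is assumed."""

    def __init__(self, fields):
        self.fields = list(fields)

    def __str__(self):
        return "(" + ",".join(str(f) for f in self.fields) + ")"


class ZebraTuple(DefaultTuple):
    """Same data, but rendered as one CSV record for streaming scripts."""

    def __str__(self):
        buf = io.StringIO()
        # An empty line terminator keeps the record on a single line.
        csv.writer(buf, lineterminator="").writerow(self.fields)
        return buf.getvalue()


print(str(DefaultTuple(["a", 1, "x,y"])))
print(str(ZebraTuple(["a", 1, "x,y"])))
```

Using a CSV writer rather than a plain join means that a field containing the delimiter is quoted, so a streaming script can split the line unambiguously.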


[jira] Updated: (PIG-1128) column pruning causing failure when foreach has user-specified schema


 [ 
https://issues.apache.org/jira/browse/PIG-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1128:


Fix Version/s: 0.6.0

 column pruning causing failure when foreach has user-specified schema
 -

 Key: PIG-1128
 URL: https://issues.apache.org/jira/browse/PIG-1128
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0
Reporter: Thejas M Nair
 Fix For: 0.6.0


 Issue is seen in 0.6.0 and trunk.
 grunt> l = load 'dummy.txt' as ( c1 : chararray,  c2 : int);
 grunt> f1 = foreach l generate c1 as c1 : chararray, c2 as c2 : int, 'CA' as state : chararray;
 grunt> f2 = foreach f1 generate c1 as c1 : chararray;
 grunt> explain f2;
 2009-12-04 13:11:19,010 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1022: Type mismatch merging schema prefix. Field Schema: chararray. Other Field Schema: c2: int
 (It does not matter whether the new schema uses new or different column names:)
 grunt> l = load 'dummy.txt' as ( c1 : chararray,  c2 : int);
 grunt> f1 = foreach l generate c1 as c11 : chararray, c2 as c22 : int, 'CA' as state : chararray;
 grunt> f2 = foreach f1 generate c11 as c111 : chararray;
 grunt> explain f2;
 2009-12-04 13:13:01,462 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1022: Type mismatch merging schema prefix. Field Schema: chararray. Other Field Schema: c22: int


Build failed in Hudson: Pig-trunk #637

2009-12-04 Thread Apache Hudson Server

See http://hudson.zones.apache.org/hudson/job/Pig-trunk/637/changes

Changes:

[olga] PIG-1118: expression with aggregate functions returning null, with 
accumulate
interface (yinghe via olgan)

[olga] PIG-1126: updated fieldsToRead function (olgan)

[yanz] PIG-653  Pig Projection Push Down (Gaurav Jain via yanz)

--
[...truncated 2733 lines...]
ivy-init-dirs:

ivy-probe-antlib:

ivy-init-antlib:

ivy-init:

ivy-buildJar:
[ivy:resolve] :: resolving dependencies :: 
org.apache.pig#Pig;2009-12-04_22-05-57
[ivy:resolve]   confs: [buildJar]
[ivy:resolve]   found com.jcraft#jsch;0.1.38 in maven2
[ivy:resolve]   found jline#jline;0.9.94 in maven2
[ivy:resolve]   found net.java.dev.javacc#javacc;4.2 in maven2
[ivy:resolve]   found junit#junit;4.5 in default
[ivy:resolve] :: resolution report :: resolve 76ms :: artifacts dl 4ms
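---------------------------------------------------------------------
|                  |            modules            ||   artifacts   |
|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
|     buildJar     |   4   |   0   |   0   |   0   ||   4   |   0   |
---------------------------------------------------------------------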
[ivy:retrieve] :: retrieving :: org.apache.pig#Pig
[ivy:retrieve]  confs: [buildJar]
[ivy:retrieve]  1 artifacts copied, 3 already retrieved (288kB/4ms)

buildJar:
 [echo] svnString 887379
  [jar] Building jar: 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/pig-2009-12-04_22-05-57.jar
 [copy] Copying 1 file to 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk

jarWithOutSvn:

findbugs:
[mkdir] Created dir: 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs
 [findbugs] Executing findbugs from ant task
 [findbugs] Running FindBugs...
 [findbugs] The following classes needed for analysis were missing:
 [findbugs]   com.jcraft.jsch.SocketFactory
 [findbugs]   com.jcraft.jsch.Logger
 [findbugs]   jline.Completor
 [findbugs]   com.jcraft.jsch.Session
 [findbugs]   com.jcraft.jsch.HostKeyRepository
 [findbugs]   com.jcraft.jsch.JSch
 [findbugs]   com.jcraft.jsch.UserInfo
 [findbugs]   jline.ConsoleReaderInputStream
 [findbugs]   com.jcraft.jsch.HostKey
 [findbugs]   jline.ConsoleReader
 [findbugs]   com.jcraft.jsch.ChannelExec
 [findbugs]   jline.History
 [findbugs]   com.jcraft.jsch.ChannelDirectTCPIP
 [findbugs]   com.jcraft.jsch.JSchException
 [findbugs]   com.jcraft.jsch.Channel
 [findbugs] Warnings generated: 20
 [findbugs] Missing classes: 16
 [findbugs] Calculating exit code...
 [findbugs] Setting 'missing class' flag (2)
 [findbugs] Setting 'bugs found' flag (1)
 [findbugs] Exit code set to: 3
 [findbugs] Java Result: 3
 [findbugs] Classes needed for analysis were missing
 [findbugs] Output saved to 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs/pig-findbugs-report.xml
 [xslt] Processing 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs/pig-findbugs-report.xml
 to 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/findbugs/pig-findbugs-report.html
 [xslt] Loading stylesheet 
/homes/gkesavan/tools/findbugs/latest/src/xsl/default.xsl

BUILD SUCCESSFUL
Total time: 2 minutes 53 seconds
+ mv build/pig-2009-12-04_22-05-57.tar.gz 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk
+ mv build/test/findbugs 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk
+ mv build/docs/api 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk
+ /homes/hudson/tools/ant/apache-ant-1.7.0/bin/ant clean
Buildfile: build.xml

clean:
   [delete] Deleting directory 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/src-gen
   [delete] Deleting directory 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/src/docs/build
   [delete] Deleting directory 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build
   [delete] Deleting directory 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/test/org/apache/pig/test/utils/dotGraph/parser

BUILD SUCCESSFUL
Total time: 0 seconds
+ /homes/hudson/tools/ant/apache-ant-1.7.0/bin/ant 
-Dtest.junit.output.format=xml -Dtest.output=yes 
-Dcheckstyle.home=/homes/hudson/tools/checkstyle/latest -Drun.clover=true 
-Dclover.home=/homes/hudson/tools/clover/latest clover test 
generate-clover-reports
Buildfile: build.xml

clover.setup:
[mkdir] Created dir: 
http://hudson.zones.apache.org/hudson/job/Pig-trunk/ws/trunk/build/test/clover/db
[clover-setup] Clover Version 2.4.3, built on March 09 2009 (build-756)
[clover-setup] Loaded from: /homes/hudson/tools/clover/latest/lib/clover.jar
[clover-setup] Clover: Open Source License registered to Apache.
[clover-setup] Clover is enabled with initstring

[jira] Updated: (PIG-1110) Handle compressed file formats -- Gz, BZip with the new proposal

2009-12-04 Thread Richard Ding (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1110:
--

Attachment: PIG-1110.patch

 Handle compressed file formats -- Gz, BZip with the new proposal
 

 Key: PIG-1110
 URL: https://issues.apache.org/jira/browse/PIG-1110
 Project: Pig
  Issue Type: Sub-task
Reporter: Richard Ding
Assignee: Richard Ding
 Attachments: PIG-1110.patch





[jira] Updated: (PIG-1110) Handle compressed file formats -- Gz, BZip with the new proposal

2009-12-04 Thread Richard Ding (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1110:
--

Release Note: For compressed BZip files, the load-store branch only 
supports file extension .bz2. It ignores the file extension .bz and treats 
those files as regular text files. This change is due to the new version of 
PigStorage which uses Hadoop's TextInputFormat as its InputFormat.
Hadoop Flags: [Incompatible change]
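
The extension rule in this release note can be illustrated with a small lookup. The sketch below is hypothetical: the function name `compression_for` and the returned labels are invented for illustration, and the `.gz` branch is an assumption drawn only from the issue title, since the note itself speaks only of `.bz2` and `.bz`.

```python
def compression_for(path):
    """Return an assumed codec label for a file name, or None for plain text."""
    if path.endswith(".bz2"):
        return "bzip2"
    if path.endswith(".gz"):
        # Assumption: gzip files are recognised by their usual extension.
        return "gzip"
    # Per the release note, .bz is ignored and read as regular text.
    return None


for name in ("data.bz2", "data.bz", "data.txt"):
    print(name, compression_for(name))
```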

 Handle compressed file formats -- Gz, BZip with the new proposal
 

 Key: PIG-1110
 URL: https://issues.apache.org/jira/browse/PIG-1110
 Project: Pig
  Issue Type: Sub-task
Reporter: Richard Ding
Assignee: Richard Ding
 Attachments: PIG-1110.patch





[jira] Created: (PIG-1129) Pig UDF doc: fieldsToRead function

2009-12-04 Thread Corinne Chandel (JIRA)

Pig UDF doc: fieldsToRead function 
---

 Key: PIG-1129
 URL: https://issues.apache.org/jira/browse/PIG-1129
 Project: Pig
  Issue Type: Task
  Components: documentation
Affects Versions: 0.6.0
Reporter: Corinne Chandel
Assignee: Corinne Chandel
Priority: Blocker
 Fix For: 0.6.0


Updated Pig UDF doc to include information about the fieldsToRead function.


[jira] Updated: (PIG-1129) Pig UDF doc: fieldsToRead function

2009-12-04 Thread Corinne Chandel (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-1129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Corinne Chandel updated PIG-1129:
-

Attachment: Pig-6-UDF.patch

Patch: Pig-6-UDF.patch


 Pig UDF doc: fieldsToRead function 
 ---

 Key: PIG-1129
 URL: https://issues.apache.org/jira/browse/PIG-1129
 Project: Pig
  Issue Type: Task
  Components: documentation
Affects Versions: 0.6.0
Reporter: Corinne Chandel
Assignee: Corinne Chandel
Priority: Blocker
 Fix For: 0.6.0

 Attachments: Pig-6-UDF.patch


 Updated Pig UDF doc to include information about the fieldsToRead function.


[jira] Updated: (PIG-1129) Pig UDF doc: fieldsToRead function

2009-12-04 Thread Corinne Chandel (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-1129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Corinne Chandel updated PIG-1129:
-

Status: Patch Available  (was: Open)

Apply patch to Pig trunk: http://svn.apache.org/repos/asf/hadoop/pig/trunk

Note: No new test code required; changes to documentation only.

 Pig UDF doc: fieldsToRead function 
 ---

 Key: PIG-1129
 URL: https://issues.apache.org/jira/browse/PIG-1129
 Project: Pig
  Issue Type: Task
  Components: documentation
Affects Versions: 0.6.0
Reporter: Corinne Chandel
Assignee: Corinne Chandel
Priority: Blocker
 Fix For: 0.6.0

 Attachments: Pig-6-UDF.patch


 Updated Pig UDF doc to include information about the fieldsToRead function.


[jira] Updated: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure


 [ 
https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriranjan Manjunath updated PIG-1105:
-

Attachment: (was: PIG-1105.patch)

 COUNT_STAR accumulate interface implementation cases failure
 

 Key: PIG-1105
 URL: https://issues.apache.org/jira/browse/PIG-1105
 Project: Pig
  Issue Type: Bug
  Components: impl
Reporter: Thejas M Nair
Assignee: Sriranjan Manjunath
 Fix For: 0.6.0

 Attachments: PIG-1105.1.patch


 COUNT_STAR.accumulate is calling sum() which is supposed to be used by 
 intermediate and final parts of algebraic interface.
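
The distinction drawn in this description can be sketched in isolation. In the hypothetical Python stand-in below (not Pig's COUNT_STAR class), `accumulate` receives batches of raw tuples and must count them, whereas `sum_partial_counts` combines counts that earlier stages have already produced, which is the job of the intermediate and final steps.

```python
class CountStar:
    """Stand-in accumulator that counts rows delivered in batches."""

    def __init__(self):
        self.count = 0

    def accumulate(self, bag):
        # Each element of the bag is a raw tuple, so count the elements.
        self.count += len(bag)

    def get_value(self):
        return self.count


def sum_partial_counts(partials):
    """Combine partial counts, as an intermediate or final step would."""
    return sum(partials)


counter = CountStar()
counter.accumulate([(5,), (7,)])
counter.accumulate([(9,)])
print(counter.get_value(), sum_partial_counts([2, 1]))
```

Both paths arrive at the same total here, but only because each is given the kind of input it expects: adding up the raw tuples' values as if they were partial counts would produce a sum of field values instead of a row count.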


[jira] Updated: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure


 [ 
https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriranjan Manjunath updated PIG-1105:
-

Status: Open  (was: Patch Available)

 COUNT_STAR accumulate interface implementation cases failure
 

 Key: PIG-1105
 URL: https://issues.apache.org/jira/browse/PIG-1105
 Project: Pig
  Issue Type: Bug
  Components: impl
Reporter: Thejas M Nair
Assignee: Sriranjan Manjunath
 Fix For: 0.6.0

 Attachments: PIG-1105.1.patch, PIG-1105.2.patch


 COUNT_STAR.accumulate is calling sum() which is supposed to be used by 
 intermediate and final parts of algebraic interface.


[jira] Updated: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure


 [ 
https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriranjan Manjunath updated PIG-1105:
-

Status: Patch Available  (was: Open)

 COUNT_STAR accumulate interface implementation cases failure
 

 Key: PIG-1105
 URL: https://issues.apache.org/jira/browse/PIG-1105
 Project: Pig
  Issue Type: Bug
  Components: impl
Reporter: Thejas M Nair
Assignee: Sriranjan Manjunath
 Fix For: 0.6.0

 Attachments: PIG-1105.1.patch, PIG-1105.2.patch


 COUNT_STAR.accumulate is calling sum() which is supposed to be used by 
 intermediate and final parts of algebraic interface.


[jira] Updated: (PIG-480) PERFORMANCE: Use identity mapper in a chain of M-R jobs

2009-12-04 Thread Ying He (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ying He updated PIG-480:


Status: Open  (was: Patch Available)

This patch conflicts with new code that was just checked in, which results 
in a compilation error.

 PERFORMANCE: Use identity mapper in a chain of M-R jobs
 ---

 Key: PIG-480
 URL: https://issues.apache.org/jira/browse/PIG-480
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.2.0
Reporter: Olga Natkovich
Assignee: Ying He
 Attachments: PIG_480.patch, PIG_480.patch


 For scripts that compile to two or more MR jobs, use the identity mapper 
 wherever possible in the second and subsequent MR jobs. The identity mapper is 
 about 50% faster than Pig's empty map job because it doesn't parse the data. 
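
A rough illustration of why skipping the parse helps: in the sketch below, both mappers emit exactly the same records, but one of them splits and re-joins every record on the way through. The function names and the tab-delimited record layout are assumptions made for this example and do not come from Pig.

```python
def identity_map(records):
    """Pass every record through untouched."""
    for record in records:
        yield record


def parsing_map(records, delimiter="\t"):
    """Parse each record into fields and serialise it again (extra work)."""
    for record in records:
        fields = record.split(delimiter)
        yield delimiter.join(fields)


rows = ["a\tb", "c\td"]
print(list(identity_map(rows)) == list(parsing_map(rows)))
```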


[jira] Updated: (PIG-480) PERFORMANCE: Use identity mapper in a chain of M-R jobs

2009-12-04 Thread Ying He (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ying He updated PIG-480:


Attachment: PIG_480.patch

Fixed the compilation error.

 PERFORMANCE: Use identity mapper in a chain of M-R jobs
 ---

 Key: PIG-480
 URL: https://issues.apache.org/jira/browse/PIG-480
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.2.0
Reporter: Olga Natkovich
Assignee: Ying He
 Attachments: PIG_480.patch, PIG_480.patch


 For scripts that compile to two or more MR jobs, use the identity mapper 
 wherever possible in the second and subsequent MR jobs. The identity mapper is 
 about 50% faster than Pig's empty map job because it doesn't parse the data. 


[jira] Updated: (PIG-480) PERFORMANCE: Use identity mapper in a chain of M-R jobs


 [ 
https://issues.apache.org/jira/browse/PIG-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-480:
---

Status: Patch Available  (was: Open)

 PERFORMANCE: Use identity mapper in a chain of M-R jobs
 ---

 Key: PIG-480
 URL: https://issues.apache.org/jira/browse/PIG-480
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.2.0
Reporter: Olga Natkovich
Assignee: Ying He
 Attachments: PIG_480.patch, PIG_480.patch


 For scripts that compile to two or more MR jobs, use the identity mapper 
 wherever possible in the second and subsequent MR jobs. The identity mapper is 
 about 50% faster than Pig's empty map job because it doesn't parse the data. 


[jira] Updated: (PIG-1104) [zebra] Provide streaming support in Zebra.

2009-12-04 Thread Chao Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Wang updated PIG-1104:
---

Attachment: PIG-1104.patch

 [zebra] Provide streaming support in Zebra.
 ---

 Key: PIG-1104
 URL: https://issues.apache.org/jira/browse/PIG-1104
 Project: Pig
  Issue Type: New Feature
Affects Versions: 0.4.0
Reporter: Chao Wang
Assignee: Chao Wang
 Fix For: 0.6.0, 0.7.0

 Attachments: PIG-1104.patch, PIG1104.patch


 Hadoop streaming is very popular among Hadoop users. The main attraction is 
 the simplicity of use. A user can write the application logic in any language 
 and process large amounts of data using Hadoop framework. As more people 
 start to use Zebra to store their data, we expect users would like to run 
 Hadoop streaming scripts to easily process Zebra tables. 
 The following lists a simple example of using Hadoop streaming to access 
 Zebra data. It loads data from foo table using Zebra's TableInputFormat and 
 then writes the data into output using default TextOutputFormat. 
 $ hadoop jar hadoop-streaming.jar -D mapred.reduce.tasks=0 -input foo -output 
 output -mapper 'cat' -inputformat 
 org.apache.hadoop.zebra.mapred.TableInputFormat 
 In more detail, Zebra uses Pig's DefaultTuple implementation of Tuple for its 
 records. Currently, when Zebra's TableInputFormat is used for input, the user 
 script sees each line as key_if_any\tTuple.toString(). We plan to 
 generate a CSV representation of our Pig tuples. To this end, we plan to 
 do the following: 
 1) Derive a subclass ZebraTuple from Pig's DefaultTuple class and override 
 its toString() method to present the data in CSV format. 
 2) On the Zebra side, the tuple factory should be changed to create ZebraTuple 
 objects instead of DefaultTuple objects. 
 Note that we can only support streaming on the input side, that is, the ability 
 to use streaming to read data from Zebra tables. On the output side, streaming 
 support is not feasible: the streaming mapper or reducer only emits 
 Text\tText, and the output collector has no way of knowing how to convert this 
 to (BytesWritable,Tuple).


[jira] Updated: (PIG-1104) [zebra] Provide streaming support in Zebra.

2009-12-04 Thread Chao Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Wang updated PIG-1104:
---

Attachment: (was: PIG1104.patch)

 [zebra] Provide streaming support in Zebra.
 ---

 Key: PIG-1104
 URL: https://issues.apache.org/jira/browse/PIG-1104
 Project: Pig
  Issue Type: New Feature
Affects Versions: 0.4.0
Reporter: Chao Wang
Assignee: Chao Wang
 Fix For: 0.6.0, 0.7.0

 Attachments: PIG-1104.patch


 Hadoop streaming is very popular among Hadoop users. The main attraction is 
 the simplicity of use. A user can write the application logic in any language 
 and process large amounts of data using Hadoop framework. As more people 
 start to use Zebra to store their data, we expect users would like to run 
 Hadoop streaming scripts to easily process Zebra tables. 
 The following lists a simple example of using Hadoop streaming to access 
 Zebra data. It loads data from foo table using Zebra's TableInputFormat and 
 then writes the data into output using default TextOutputFormat. 
 $ hadoop jar hadoop-streaming.jar -D mapred.reduce.tasks=0 -input foo -output 
 output -mapper 'cat' -inputformat 
 org.apache.hadoop.zebra.mapred.TableInputFormat 
 In more detail, Zebra uses Pig's DefaultTuple implementation of Tuple for its 
 records. Currently, when Zebra's TableInputFormat is used for input, the user 
 script sees each line as key_if_any\tTuple.toString(). We plan to 
 generate a CSV representation of our Pig tuples. To this end, we plan to 
 do the following: 
 1) Derive a subclass ZebraTuple from Pig's DefaultTuple class and override 
 its toString() method to present the data in CSV format. 
 2) On the Zebra side, the tuple factory should be changed to create ZebraTuple 
 objects instead of DefaultTuple objects. 
 Note that we can only support streaming on the input side, that is, the ability 
 to use streaming to read data from Zebra tables. On the output side, streaming 
 support is not feasible: the streaming mapper or reducer only emits 
 Text\tText, and the output collector has no way of knowing how to convert this 
 to (BytesWritable,Tuple).


[jira] Updated: (PIG-1104) [zebra] Provide streaming support in Zebra.

2009-12-04 Thread Chao Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Wang updated PIG-1104:
---

Status: Open  (was: Patch Available)

 [zebra] Provide streaming support in Zebra.
 ---

 Key: PIG-1104
 URL: https://issues.apache.org/jira/browse/PIG-1104
 Project: Pig
  Issue Type: New Feature
Affects Versions: 0.4.0
Reporter: Chao Wang
Assignee: Chao Wang
 Fix For: 0.6.0, 0.7.0

 Attachments: PIG-1104.patch


 Hadoop streaming is very popular among Hadoop users. The main attraction is 
 the simplicity of use. A user can write the application logic in any language 
 and process large amounts of data using Hadoop framework. As more people 
 start to use Zebra to store their data, we expect users would like to run 
 Hadoop streaming scripts to easily process Zebra tables. 
 The following lists a simple example of using Hadoop streaming to access 
 Zebra data. It loads data from foo table using Zebra's TableInputFormat and 
 then writes the data into output using default TextOutputFormat. 
 $ hadoop jar hadoop-streaming.jar -D mapred.reduce.tasks=0 -input foo -output 
 output -mapper 'cat' -inputformat 
 org.apache.hadoop.zebra.mapred.TableInputFormat 
 In more detail, Zebra uses Pig's DefaultTuple implementation of Tuple for its 
 records. Currently, when Zebra's TableInputFormat is used for input, the user 
 script sees each line as key_if_any\tTuple.toString(). We plan to 
 generate a CSV representation of our Pig tuples. To this end, we plan to 
 do the following: 
 1) Derive a subclass ZebraTuple from Pig's DefaultTuple class and override 
 its toString() method to present the data in CSV format. 
 2) On the Zebra side, the tuple factory should be changed to create ZebraTuple 
 objects instead of DefaultTuple objects. 
 Note that we can only support streaming on the input side, that is, the ability 
 to use streaming to read data from Zebra tables. On the output side, streaming 
 support is not feasible: the streaming mapper or reducer only emits 
 Text\tText, and the output collector has no way of knowing how to convert this 
 to (BytesWritable,Tuple).


[jira] Updated: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure


 [ 
https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriranjan Manjunath updated PIG-1105:
-

Status: Open  (was: Patch Available)

Cancelling since the patch does not have all the changes.

 COUNT_STAR accumulate interface implementation cases failure
 

 Key: PIG-1105
 URL: https://issues.apache.org/jira/browse/PIG-1105
 Project: Pig
  Issue Type: Bug
  Components: impl
Reporter: Thejas M Nair
Assignee: Sriranjan Manjunath
 Fix For: 0.6.0

 Attachments: PIG-1105.1.patch, PIG-1105.2.patch


 COUNT_STAR.accumulate is calling sum() which is supposed to be used by 
 intermediate and final parts of algebraic interface.


[jira] Updated: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure


 [ 
https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriranjan Manjunath updated PIG-1105:
-

Attachment: (was: PIG-1105.2.patch)

 COUNT_STAR accumulate interface implementation cases failure
 

 Key: PIG-1105
 URL: https://issues.apache.org/jira/browse/PIG-1105
 Project: Pig
  Issue Type: Bug
  Components: impl
Reporter: Thejas M Nair
Assignee: Sriranjan Manjunath
 Fix For: 0.6.0

 Attachments: PIG-1105.1.patch, PIG-1105.2.patch


 COUNT_STAR.accumulate is calling sum() which is supposed to be used by 
 intermediate and final parts of algebraic interface.


[jira] Updated: (PIG-1105) COUNT_STAR accumulate interface implementation cases failure


 [ 
https://issues.apache.org/jira/browse/PIG-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriranjan Manjunath updated PIG-1105:
-

Status: Patch Available  (was: Open)

 COUNT_STAR accumulate interface implementation cases failure
 

 Key: PIG-1105
 URL: https://issues.apache.org/jira/browse/PIG-1105
 Project: Pig
  Issue Type: Bug
  Components: impl
Reporter: Thejas M Nair
Assignee: Sriranjan Manjunath
 Fix For: 0.6.0

 Attachments: PIG-1105.1.patch, PIG-1105.2.patch


 COUNT_STAR.accumulate calls sum(), which is supposed to be used by the
 intermediate and final parts of the algebraic interface.


[jira] Commented: (PIG-1086) Nested sort by * throw exception


[ 
https://issues.apache.org/jira/browse/PIG-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786300#action_12786300
 ] 

Hadoop QA commented on PIG-1086:


+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12426934/PIG-1086.patch
  against trunk revision 887318.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/93/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/93/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/93/console

This message is automatically generated.

 Nested sort by * throw exception
 

 Key: PIG-1086
 URL: https://issues.apache.org/jira/browse/PIG-1086
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.5.0
Reporter: Daniel Dai
Assignee: Richard Ding
 Attachments: PIG-1086.patch


 The following script fails:
 A = load '1.txt' as (a0, a1, a2);
 B = group A by a0;
 C = foreach B { D = order A by *; generate group, D;};
 explain C;
 Here is the stack:
 Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
 at java.util.ArrayList.get(ArrayList.java:324)
 at org.apache.pig.impl.logicalLayer.schema.Schema.getField(Schema.java:752)
 at org.apache.pig.impl.logicalLayer.LOSort.getSortInfo(LOSort.java:332)
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:1365)
 at org.apache.pig.impl.logicalLayer.LOSort.visit(LOSort.java:176)
 at org.apache.pig.impl.logicalLayer.LOSort.visit(LOSort.java:43)
 at org.apache.pig.impl.plan.DependencyOrderWalkerWOSeenChk.walk(DependencyOrderWalkerWOSeenChk.java:69)
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:1274)
 at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:130)
 at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:45)
 at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:69)
 at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
 at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:234)
 at org.apache.pig.PigServer.compilePp(PigServer.java:864)
 at org.apache.pig.PigServer.explain(PigServer.java:583)
 ... 8 more


[jira] Assigned: (PIG-1128) column pruning causing failure when foreach has user-specified schema


 [ 
https://issues.apache.org/jira/browse/PIG-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai reassigned PIG-1128:
---

Assignee: Daniel Dai

 column pruning causing failure when foreach has user-specified schema
 -

 Key: PIG-1128
 URL: https://issues.apache.org/jira/browse/PIG-1128
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Fix For: 0.6.0


 The issue is seen in 0.6.0 and trunk.
 grunt> l = load 'dummy.txt' as ( c1 : chararray,  c2 : int);
 grunt> f1 = foreach l generate c1 as c1 : chararray, c2 as c2 : int, 'CA' as state : chararray;
 grunt> f2 = foreach f1 generate c1 as c1 : chararray;
 grunt> explain f2;
 2009-12-04 13:11:19,010 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 1022: Type mismatch merging schema prefix. Field Schema: chararray. Other 
 Field Schema: c2: int
 (It does not matter whether the new schema uses new or different column names.)
 grunt> l = load 'dummy.txt' as ( c1 : chararray,  c2 : int);
 grunt> f1 = foreach l generate c1 as c11 : chararray, c2 as c22 : int, 'CA' as state : chararray;
 grunt> f2 = foreach f1 generate c11 as c111 : chararray;
 grunt> explain f2;
 2009-12-04 13:13:01,462 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 1022: Type mismatch merging schema prefix. Field Schema: chararray. Other 
 Field Schema: c22: int


[jira] Updated: (PIG-1128) column pruning causing failure when foreach has user-specified schema


 [ 
https://issues.apache.org/jira/browse/PIG-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1128:


Status: Patch Available  (was: Open)

 column pruning causing failure when foreach has user-specified schema
 -

 Key: PIG-1128
 URL: https://issues.apache.org/jira/browse/PIG-1128
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-1128-1.patch


 The issue is seen in 0.6.0 and trunk.
 grunt> l = load 'dummy.txt' as ( c1 : chararray,  c2 : int);
 grunt> f1 = foreach l generate c1 as c1 : chararray, c2 as c2 : int, 'CA' as state : chararray;
 grunt> f2 = foreach f1 generate c1 as c1 : chararray;
 grunt> explain f2;
 2009-12-04 13:11:19,010 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 1022: Type mismatch merging schema prefix. Field Schema: chararray. Other 
 Field Schema: c2: int
 (It does not matter whether the new schema uses new or different column names.)
 grunt> l = load 'dummy.txt' as ( c1 : chararray,  c2 : int);
 grunt> f1 = foreach l generate c1 as c11 : chararray, c2 as c22 : int, 'CA' as state : chararray;
 grunt> f2 = foreach f1 generate c11 as c111 : chararray;
 grunt> explain f2;
 2009-12-04 13:13:01,462 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 1022: Type mismatch merging schema prefix. Field Schema: chararray. Other 
 Field Schema: c22: int


[jira] Updated: (PIG-1128) column pruning causing failure when foreach has user-specified schema


 [ 
https://issues.apache.org/jira/browse/PIG-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1128:


Attachment: PIG-1128-1.patch

 column pruning causing failure when foreach has user-specified schema
 -

 Key: PIG-1128
 URL: https://issues.apache.org/jira/browse/PIG-1128
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0, 0.7.0
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-1128-1.patch


 The issue is seen in 0.6.0 and trunk.
 grunt> l = load 'dummy.txt' as ( c1 : chararray,  c2 : int);
 grunt> f1 = foreach l generate c1 as c1 : chararray, c2 as c2 : int, 'CA' as state : chararray;
 grunt> f2 = foreach f1 generate c1 as c1 : chararray;
 grunt> explain f2;
 2009-12-04 13:11:19,010 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 1022: Type mismatch merging schema prefix. Field Schema: chararray. Other 
 Field Schema: c2: int
 (It does not matter whether the new schema uses new or different column names.)
 grunt> l = load 'dummy.txt' as ( c1 : chararray,  c2 : int);
 grunt> f1 = foreach l generate c1 as c11 : chararray, c2 as c22 : int, 'CA' as state : chararray;
 grunt> f2 = foreach f1 generate c11 as c111 : chararray;
 grunt> explain f2;
 2009-12-04 13:13:01,462 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 1022: Type mismatch merging schema prefix. Field Schema: chararray. Other 
 Field Schema: c22: int


[jira] Updated: (PIG-1127) Logical operator should contains individual copy of schema object


 [ 
https://issues.apache.org/jira/browse/PIG-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1127:


Attachment: PIG-1127-2.patch

 Logical operator should contains individual copy of schema object
 -

 Key: PIG-1127
 URL: https://issues.apache.org/jira/browse/PIG-1127
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-1127-1.patch, PIG-1127-2.patch


 Currently some logical operators only hold a reference to their 
 predecessor's schema object. These logical operators include: LOSplitOutput, 
 LOLimit, LOSplit, LOFilter, LOSort, LODistinct, LOUnion. This was fine 
 before, because we did not change a schema object once it was set. Now, with the 
 column pruner (PIG-922), we need to change individual schema objects, so sharing 
 a reference is no longer acceptable. For example, the following script fails:
 {code}
 a = load '1.txt' as (a0, a1:map[], a2);
 b = foreach a generate a1;
 c = limit b 10;
 dump c;
 {code}
 We need to fix it.
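
 The aliasing problem described above can be illustrated with a small, self-contained Java sketch. Everything here is hypothetical: OperatorSketch, its sharing/copying factories, and the list-of-names schema are simplifications for illustration and do not reflect the real LOLimit or Schema classes. It shows how pruning one operator's schema leaks into another operator that merely holds a reference, and how a per-operator copy avoids that.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical miniature operator; real Pig operators are not modelled. */
final class OperatorSketch {
    // Schema reduced to a list of column names for the sake of the example.
    final List<String> schema;

    OperatorSketch(List<String> schema) {
        this.schema = schema;
    }

    /** Pass-through operator that aliases its predecessor's schema object. */
    static OperatorSketch sharing(OperatorSketch predecessor) {
        return new OperatorSketch(predecessor.schema);
    }

    /** Pass-through operator that keeps its own copy of the schema. */
    static OperatorSketch copying(OperatorSketch predecessor) {
        return new OperatorSketch(new ArrayList<String>(predecessor.schema));
    }

    /** Pruning step that removes one column from this operator's schema. */
    void prune(String column) {
        schema.remove(column);
    }
}
```

 With a load-like operator over (a0, a1, a2), calling prune("a0") on it also shrinks the schema seen by an operator built with sharing(), while one built with copying() still sees all three columns.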


[jira] Updated: (PIG-1127) Logical operator should contains individual copy of schema object


 [ 
https://issues.apache.org/jira/browse/PIG-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1127:


Status: Open  (was: Patch Available)

 Logical operator should contains individual copy of schema object
 -

 Key: PIG-1127
 URL: https://issues.apache.org/jira/browse/PIG-1127
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-1127-1.patch, PIG-1127-2.patch


 Currently some logical operators only hold a reference to their 
 predecessor's schema object. These logical operators include: LOSplitOutput, 
 LOLimit, LOSplit, LOFilter, LOSort, LODistinct, LOUnion. This was fine 
 before, because we did not change a schema object once it was set. Now, with the 
 column pruner (PIG-922), we need to change individual schema objects, so sharing 
 a reference is no longer acceptable. For example, the following script fails:
 {code}
 a = load '1.txt' as (a0, a1:map[], a2);
 b = foreach a generate a1;
 c = limit b 10;
 dump c;
 {code}
 We need to fix it.


[jira] Commented: (PIG-1127) Logical operator should contains individual copy of schema object