[jira] [Updated] (PIG-1916) Nested cross

2011-07-14 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-1916: Attachment: PIG-1916_5.patch. Change the patch slightly to fix test-patch warnings.

[jira] [Resolved] (PIG-1916) Nested cross

2011-07-14 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai resolved PIG-1916. Resolution: Fixed. Release Note: Allow cross of two or more bags inside a foreach statement. For
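As a rough illustration of what the release note describes (not Pig's implementation), crossing bags inside a foreach amounts to a Cartesian product computed separately for each group. The Python sketch below assumes bags are plain lists of tuples; nested_cross and the sample data are hypothetical names used only for illustration.

```python
from itertools import product


def nested_cross(*bags):
    """Cartesian product of two or more bags (lists of tuples).

    Each output tuple concatenates one tuple from every input bag.
    Hypothetical helper for illustration, not part of Pig.
    """
    return [sum(combo, ()) for combo in product(*bags)]


# One entry per group key, holding the two bags to cross within that group.
grouped = {
    "k1": ([(1,), (2,)], [("a",), ("b",)]),
    "k2": ([(3,)], []),
}

# Analogue of running the cross inside a foreach over grouped data.
crossed = {key: nested_cross(*bags) for key, bags in grouped.items()}
```

An empty bag yields an empty cross for that group, which is the usual Cartesian-product behavior.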

Jenkins build is back to normal : Pig-trunk-commit #853

2011-07-14 Thread Apache Jenkins Server
See https://builds.apache.org/job/Pig-trunk-commit/853/changes

Build failed in Jenkins: Pig-trunk #1042

2011-07-14 Thread Apache Jenkins Server
See https://builds.apache.org/job/Pig-trunk/1042/changes Changes: [daijy] PIG-1916: Nested cross -- [...truncated 34926 lines...] [junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) [junit] at

[jira] [Commented] (PIG-1916) Nested cross

2011-07-14 Thread Zhijie Shen (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065184#comment-13065184 ] Zhijie Shen commented on PIG-1916: Thanks for your help, Daniel!

[jira] [Updated] (PIG-2001) DefaultTuple(List) constructor is inefficient, causes List.size() System.arraycopy() calls (though they are 0 byte copies), DefaultTuple(int) constructor is a bit misleadin

2011-07-14 Thread Thejas M Nair (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated PIG-2001: Resolution: Fixed; Status: Resolved (was: Patch Available). +1. Patch committed to trunk.

Re: Cubing in Pig

2011-07-14 Thread Gianmarco
If you want to add a new operator the right place to add the logic should be LogicalPlanBuilder. Just a question, are you sure this code is correct? I can't understand how it works. cubed = foreach rel generate flatten(CubeDimensions(a, b)); cube = foreach (group rel by $0) generate

[jira] [Updated] (PIG-2156) Limit/Sample with variable does not work if the expression starts with an integer/double

2011-07-14 Thread Thejas M Nair (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated PIG-2156: Resolution: Fixed; Hadoop Flags: [Reviewed]; Status: Resolved (was: Patch Available). +1

[jira] [Updated] (PIG-2143) Improvements for PigStorage

2011-07-14 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy updated PIG-2143: Status: Open (was: Patch Available)

[jira] [Updated] (PIG-2143) Improvements for PigStorage

2011-07-14 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy updated PIG-2143: Status: Patch Available (was: Open). Resubmitting patch.

[jira] [Updated] (PIG-2143) Improvements for PigStorage

2011-07-14 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy updated PIG-2143: Attachment: PIG-2143.2.diff. Thanks for the reviews. Uploading a patch that fixes the repeated

Re: Cubing in Pig

2011-07-14 Thread Dmitriy Ryaboy
I think showing the data at every step will help.
rel: (green,tall) (red,short) (green,short)
cubed: (green,tall) (green,) (,tall) (,) (red,short) (red,) (,short) (,) (green,short) (green,) (,short) (,)
cube: I did mess up typing the code in the email -- it should look more like this: cube =
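The rel-to-cubed step shown in the sample data can be sketched in Python. This is only an illustration of the expansion, not the CubeDimensions UDF itself: cube_dimensions is a hypothetical stand-in, None plays the role of the empty field, and counting rows per group is an assumed aggregate, since the message is cut off before the generate clause.

```python
from collections import Counter
from itertools import product


def cube_dimensions(row):
    """Expand one row into all 2**n combinations of its n dimensions.

    Each dimension is either kept or replaced by None (the "all" marker).
    Hypothetical stand-in for the CubeDimensions UDF in the thread.
    """
    choices = [(value, None) for value in row]
    return list(product(*choices))


rel = [("green", "tall"), ("red", "short"), ("green", "short")]

# Equivalent of: foreach rel generate flatten(CubeDimensions(a, b))
cubed = [combo for row in rel for combo in cube_dimensions(row)]

# Group the expanded rows and aggregate; COUNT is an assumption here.
cube = Counter(cubed)
```

With this data, cube[(None, None)] counts every input row, while cube[("green", None)] counts only the green ones.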

Re: Cubing in Pig

2011-07-14 Thread Jonathan Coveney
Dmitriy, a quick point on your approach... I assume that you meant to do, replacing rel with cubed? If you ran what you pasted, you don't actually make reference to the cubed that you output, which may have influenced run time. cubed = foreach rel generate flatten(CubeDimensions(a, b)); cube =

[jira] [Commented] (PIG-2143) Improvements for PigStorage

2011-07-14 Thread Raghu Angadi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065391#comment-13065391 ] Raghu Angadi commented on PIG-2143: PigStorageSchema is not setting -schema argument.

Re: Cubing in Pig

2011-07-14 Thread Dmitriy Ryaboy
Jon, I ran the right script, I just wrote out the wrong one in the email :-). I also compared results of both computations to ensure correctness. Arnab posted his slides: http://pdf.cx/44wrk My approach is the naive approach described in slides 11-17. D On Thu, Jul 14, 2011 at 11:54 AM,

[jira] [Commented] (PIG-2159) New logical plan uses incorrect class for SUM causing for ClassCastException

2011-07-14 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065444#comment-13065444 ] Daniel Dai commented on PIG-2159: The error is caused by schema generated by unionOnSchema,

[jira] [Created] (PIG-2163) Improve nested cross to stream one relation

2011-07-14 Thread Daniel Dai (JIRA)
Improve nested cross to stream one relation. Key: PIG-2163; URL: https://issues.apache.org/jira/browse/PIG-2163; Project: Pig; Issue Type: Improvement; Components: impl; Affects Versions:

[jira] [Updated] (PIG-2159) New logical plan uses incorrect class for SUM causing for ClassCastException

2011-07-14 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy updated PIG-2159: Priority: Blocker (was: Major). Sounds like a blocker for the 0.9 release, changing the

[jira] [Updated] (PIG-2115) Pig HBaseStorage configuration and setup issues

2011-07-14 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy updated PIG-2115: Status: Open (was: Patch Available)

[jira] [Commented] (PIG-2115) Pig HBaseStorage configuration and setup issues

2011-07-14 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065460#comment-13065460 ] Dmitriy V. Ryaboy commented on PIG-2115: Sorry for the long wait. This looks fine,

[jira] [Updated] (PIG-2114) Enhancements to PIG HBaseStorage Load Store Func with extra scan configurations

2011-07-14 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy updated PIG-2114: Status: Open (was: Patch Available). Canceling patch pending unit tests and other updates.

[jira] [Updated] (PIG-1946) HBaseStorage constructor syntax is error prone

2011-07-14 Thread Bill Graham (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-1946: Attachment: PIG-1946_2.patch. Attaching a new patch with the changes suggested. I've created a new unit

[jira] [Updated] (PIG-1946) HBaseStorage constructor syntax is error prone

2011-07-14 Thread Bill Graham (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-1946: Release Note: More friendly HBaseStorage constructor syntax. Status: Patch Available (was: Open)

Re: Cubing in Pig

2011-07-14 Thread Thejas Nair
+1 to what Gianmarco said about the place to do it. See sample_clause in LogicalPlanGenerator.g. I tried the expanded query (2 dimensions) with 0.8; it results in only 2 MR jobs. The 1st MR job does all the computation. The 2nd MR job just concats the outputs

Pig testing proposal

2011-07-14 Thread Alan Gates
I have posted a proposal for changes in Pig's testing that I would like to make. https://cwiki.apache.org/confluence/display/PIG/PigTestProposal Please take a look and provide feedback. Alan.

[jira] [Resolved] (PIG-2154) e2e test harness should be database agnostic

2011-07-14 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates resolved PIG-2154. Resolution: Won't Fix. Rather than switch Postgres for another DB we will switch the tests to use other

Re: Cubing in Pig

2011-07-14 Thread Dmitriy Ryaboy
In the dw world, using a single table and using null as an "all" marker is the standard thing to do. In my udf I actually allow an optional string to be passed to the constructor to denote "all" if null is a valid value... I'll post the udf shortly, it's a prerequisite to LOCube. I suspect the
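To see why an explicit marker helps when null is itself a valid value, here is a small Python sketch. It is not the UDF mentioned above (which had not been posted at this point in the thread); cube_dimensions_with_marker and the "*" marker are hypothetical, and None stands in for null.

```python
from itertools import product


def cube_dimensions_with_marker(row, all_marker=None):
    """Expand a row into every keep-or-roll-up combination of its fields.

    all_marker is the value written in place of a rolled-up dimension.
    Hypothetical illustration, not the UDF from the thread.
    """
    choices = [(value, all_marker) for value in row]
    return list(product(*choices))


# A row whose second dimension is legitimately null.
row = ("green", None)

# With null as the marker, "value is null" and "all values" collide.
ambiguous = cube_dimensions_with_marker(row)

# With an explicit marker, every combination stays distinguishable.
marked = cube_dimensions_with_marker(row, all_marker="*")
```

Here ambiguous contains only two distinct tuples out of four, whereas marked keeps all four distinct.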

[jira] [Created] (PIG-2165) Need a way to deal with params and param_file in embedded pig in python

2011-07-14 Thread Supreeth (JIRA)
Need a way to deal with params and param_file in embedded pig in python. Key: PIG-2165; URL: https://issues.apache.org/jira/browse/PIG-2165; Project: Pig; Issue Type: Bug

[jira] [Updated] (PIG-2159) New logical plan uses incorrect class for SUM causing for ClassCastException

2011-07-14 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-2159: Attachment: PIG-2159-1.patch

[jira] [Updated] (PIG-2165) Need a way to deal with params and param_file in embedded pig in python

2011-07-14 Thread Olga Natkovich (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-2165: Fix Version/s: 0.10

Re: Cubing in Pig

2011-07-14 Thread Thejas Nair
On 7/14/11 3:03 PM, Dmitriy Ryaboy wrote: In the dw world, using a single table and using null as an all marker is the standard thing to do But I imagine that in the dw world, the cube results would get stored in such a way that you can efficiently retrieve results of specific group-bys

Re: Pig testing proposal

2011-07-14 Thread Thejas Nair
On 7/14/11 2:39 PM, Alan Gates wrote: I have posted a proposal for changes in Pig's testing that I would like to make. https://cwiki.apache.org/confluence/display/PIG/PigTestProposal Please take a look and provide feedback. Alan. +1 for the proposal. -Thejas

Re: Pig testing proposal

2011-07-14 Thread Thejas Nair
I think having SQL as a way to generate benchmark has some value, and we should be open to having that option in e2e harness as well. But I don't see that as a blocker. In some cases, I would expect that writing an alternative pig-latin query to generate benchmark might not be easy, and there

[jira] [Updated] (PIG-2162) bin/pig should not modify user args

2011-07-14 Thread Raghu Angadi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raghu Angadi updated PIG-2162: Fix Version/s: 0.10; Affects Version/s: 0.8.0; Release Note: bin/pig handles args with

[jira] [Commented] (PIG-2115) Pig HBaseStorage configuration and setup issues

2011-07-14 Thread Greg Bowyer (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065619#comment-13065619 ] Greg Bowyer commented on PIG-2115: Fine point, do you want me to change the patch?

RE: Pig testing proposal

2011-07-14 Thread Olga Natkovich
+1 -Original Message- From: Alan Gates [mailto:ga...@hortonworks.com] Sent: Thursday, July 14, 2011 2:39 PM To: dev@pig.apache.org Subject: Pig testing proposal I have posted a proposal for changes in Pig's testing that I would like to make.

[jira] [Updated] (PIG-794) Use Avro serialization in Pig

2011-07-14 Thread Thejas M Nair (JIRA)
[ https://issues.apache.org/jira/browse/PIG-794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated PIG-794: Status: Open (was: Patch Available). Removing the patch-available state, as the issues noted in the jira

[jira] [Commented] (PIG-2115) Pig HBaseStorage configuration and setup issues

2011-07-14 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065634#comment-13065634 ] Dmitriy V. Ryaboy commented on PIG-2115: If you don't mind. I can do it if you are

[jira] [Updated] (PIG-2159) New logical plan uses incorrect class for SUM causing for ClassCastException

2011-07-14 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-2159: Attachment: PIG-2159-2.patch. Fix test failure on TestUnionOnSchemaSetter.

[jira] [Created] (PIG-2166) UDFs to flatten a bag

2011-07-14 Thread Daniel Dai (JIRA)
UDFs to flatten a bag. Key: PIG-2166; URL: https://issues.apache.org/jira/browse/PIG-2166; Project: Pig; Issue Type: Improvement; Reporter: Daniel Dai; Priority: Minor. Got several requests for a UDF

[jira] [Updated] (PIG-2143) Improvements for PigStorage

2011-07-14 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy updated PIG-2143: Status: Open (was: Patch Available). Canceling patch to incorporate the feedback.

[jira] [Updated] (PIG-2143) Improvements for PigStorage

2011-07-14 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy updated PIG-2143: Status: Patch Available (was: Open)

[jira] [Updated] (PIG-2143) Improvements for PigStorage

2011-07-14 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy updated PIG-2143: Attachment: PIG-2143.3.patch. Submitting a new version of the patch. * added a bunch of

[jira] [Updated] (PIG-2161) TOTUPLE should use no-copy tuple creation

2011-07-14 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy updated PIG-2161: Status: Patch Available (was: Open)

[jira] [Updated] (PIG-2161) TOTUPLE should use no-copy tuple creation

2011-07-14 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy updated PIG-2161: Attachment: pig_2161.patch. Attaching a trivial fix. It's worth noting there was an explicit

[jira] [Commented] (PIG-2165) Need a way to deal with params and param_file in embedded pig in python

2011-07-14 Thread Julien Le Dem (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065725#comment-13065725 ] Julien Le Dem commented on PIG-2165: I see two options: - One way would be to use the

[jira] [Updated] (PIG-2143) Improvements for PigStorage

2011-07-14 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy updated PIG-2143: Attachment: PIG-2143.4.patch. Recreated the patch, this time using git --no-prefix so it'll