Re: toJSON function for tuples, bags and strings, PIG-2641

2012-04-10 Thread Russell Jurney
So far this is not easy. On Mon, Apr 9, 2012 at 5:42 PM, Russell Jurney russell.jur...@gmail.com wrote: I see Jackson being used in the Mozilla stuff. It looks pretty straightforward. On Mon, Apr 9, 2012 at 5:38 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Jackson is your friend. On

[jira] [Commented] (PIG-2632) Create a SchemaTuple which generates efficient Tuples via code gen

2012-04-10 Thread Dmitriy V. Ryaboy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250473#comment-13250473 ] Dmitriy V. Ryaboy commented on PIG-2632: you know guys, at this point, I am starting

[jira] [Commented] (PIG-2632) Create a SchemaTuple which generates efficient Tuples via code gen

2012-04-10 Thread Dmitriy V. Ryaboy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250477#comment-13250477 ] Dmitriy V. Ryaboy commented on PIG-2632: Before Jon kills me: one reason we aren't

[jira] [Updated] (PIG-2643) Use bytecode generation to make a performance replacement for InvokeForLong, InvokeForString, etc

2012-04-10 Thread Jonathan Coveney (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Coveney updated PIG-2643: Attachment: PIG-2643-0.patch Oh, I also should add tests. Can leverage the tests Dmitriy used

[jira] [Commented] (PIG-2632) Create a SchemaTuple which generates efficient Tuples via code gen

2012-04-10 Thread Jonathan Coveney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250479#comment-13250479 ] Jonathan Coveney commented on PIG-2632: Haha, I won't kill anyone if we find a better

[jira] [Commented] (PIG-2632) Create a SchemaTuple which generates efficient Tuples via code gen

2012-04-10 Thread Jonathan Coveney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250481#comment-13250481 ] Jonathan Coveney commented on PIG-2632: One last note: implementing getfield(String)

Re: toJSON function for tuples, bags and strings, PIG-2641

2012-04-10 Thread Russell Jurney
Is there a way to get the field names in an EvalFunc? I am close to done but... no cigar :) I need these to finish. On Mon, Apr 9, 2012 at 11:03 PM, Russell Jurney russell.jur...@gmail.com wrote: So far this is not easy. On Mon, Apr 9, 2012 at 5:42 PM, Russell Jurney

[jira] [Updated] (PIG-2596) Jython UDF does not handle boolean output

2012-04-10 Thread Aniket Mokashi (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aniket Mokashi updated PIG-2596: Status: Patch Available (was: Open) Jython UDF does not handle boolean output

Re: toJSON function for tuples, bags and strings, PIG-2641

2012-04-10 Thread Russell Jurney
Followup question: would it be nice if JsonLoader inferred schemas when none is present, according to some defaults? On Tue, Apr 10, 2012 at 12:48 AM, Russell Jurney russell.jur...@gmail.com wrote: Is there a way to get the field names in an EvalFunc? I am close to done but... no cigar :) I

Fwd: AvroStorage/Avro Schema Question

2012-04-10 Thread Russell Jurney
Whoops, sorry to post to user. Scott Carey explains how to fix my ARRAY_ELEM problem. -- Forwarded message -- From: Scott Carey scottca...@apache.org Date: Mon, Apr 2, 2012 at 9:13 AM Subject: Re: AvroStorage/Avro Schema Question To: u...@avro.apache.org It appears as though

[jira] [Updated] (PIG-2627) Custom partitioner not set when POSplit is involved in Plan

2012-04-10 Thread Aniket Mokashi (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aniket Mokashi updated PIG-2627: Attachment: PIG-2627.patch Custom partitioner not set when POSplit is involved in Plan

[jira] [Updated] (PIG-2627) Custom partitioner not set when POSplit is involved in Plan

2012-04-10 Thread Aniket Mokashi (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aniket Mokashi updated PIG-2627: Status: Patch Available (was: Open) Custom partitioner not set when POSplit is involved in

[jira] [Commented] (PIG-2603) pigPackagesToSend must be configurable

2012-04-10 Thread Aniket Mokashi (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250542#comment-13250542 ] Aniket Mokashi commented on PIG-2603: how about pig.additional.jars?

[jira] [Created] (PIG-2644) Piggybank's HadoopJobHistoryLoader throws NPE when reading broken history file

2012-04-10 Thread Mathias Herberts (Created) (JIRA)
Piggybank's HadoopJobHistoryLoader throws NPE when reading broken history file -- Key: PIG-2644 URL: https://issues.apache.org/jira/browse/PIG-2644 Project: Pig

[jira] [Updated] (PIG-2644) Piggybank's HadoopJobHistoryLoader throws NPE when reading broken history file

2012-04-10 Thread Mathias Herberts (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mathias Herberts updated PIG-2644: Attachment: PIG-2644.patch The following patch adds nullity tests prior to accessing a value

[jira] [Commented] (PIG-2643) Use bytecode generation to make a performance replacement for InvokeForLong, InvokeForString, etc

2012-04-10 Thread Dmitriy V. Ryaboy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250703#comment-13250703 ] Dmitriy V. Ryaboy commented on PIG-2643: Jon, the problem with non-static methods

[jira] [Commented] (PIG-2632) Create a SchemaTuple which generates efficient Tuples via code gen

2012-04-10 Thread Dmitriy V. Ryaboy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250706#comment-13250706 ] Dmitriy V. Ryaboy commented on PIG-2632: that's a good point, actually. What would

Re: toJSON function for tuples, bags and strings, PIG-2641

2012-04-10 Thread Dmitriy Ryaboy
first question: you can do this when outputSchema() is called, as it's passed the input schema. IIRC, in trunk you have hooks to pass that info to the backend in a udf. second question: see discussion on JsonLoader jira.. short answer: non-trivial, no clear decision on what the most sensible

Re: toJSON function for tuples, bags and strings, PIG-2641

2012-04-10 Thread Russell Jurney
I forgot about UDFContext providing the schema, and the pig docs are out of date. Is no problem now. About default behavior for json, that would seem to be: tuples - objects, bags - arrays, integers - long, decimals - double, and configs for setting low precision to substitute int/float. Maps are
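
A minimal sketch of the defaults listed in the message above, assuming a JSON parser has already turned the input into plain Java values (Map for objects, List for arrays, boxed numbers). The class JsonDefaults, the method inferType and the lowPrecision flag are hypothetical names used only for illustration; they are not Pig or JsonLoader APIs. The returned strings are Pig type names. The message is cut off before it says how maps are handled, so every other kind of value is rejected here rather than guessed.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Pig: names a default Pig type for a parsed JSON value.
final class JsonDefaults {
    private JsonDefaults() {}

    // lowPrecision is an assumed config switch that substitutes int/float for long/double.
    static String inferType(Object value, boolean lowPrecision) {
        if (value instanceof Map) {
            return "tuple"; // JSON object
        }
        if (value instanceof List) {
            return "bag"; // JSON array
        }
        if (value instanceof Byte || value instanceof Short || value instanceof Integer
                || value instanceof Long || value instanceof BigInteger) {
            return lowPrecision ? "int" : "long"; // integers
        }
        if (value instanceof Number) {
            return lowPrecision ? "float" : "double"; // decimals
        }
        // Strings, booleans and nulls are not covered by the message; do not guess.
        throw new IllegalArgumentException("no default stated for: " + value);
    }

    public static void main(String[] args) {
        System.out.println(inferType(new HashMap<String, Object>(), false));
        System.out.println(inferType(new ArrayList<Object>(), false));
        System.out.println(inferType(7, false));
        System.out.println(inferType(1.5, true));
    }
}
```

Keeping the precision choice as a single flag mirrors the "configs" mentioned in the message; a real loader might well want separate settings for integers and decimals.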

[jira] [Commented] (PIG-2643) Use bytecode generation to make a performance replacement for InvokeForLong, InvokeForString, etc

2012-04-10 Thread Jonathan Coveney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250720#comment-13250720 ] Jonathan Coveney commented on PIG-2643: I'm not quite sure what you mean...could you

[jira] [Commented] (PIG-2643) Use bytecode generation to make a performance replacement for InvokeForLong, InvokeForString, etc

2012-04-10 Thread Jonathan Coveney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250745#comment-13250745 ] Jonathan Coveney commented on PIG-2643: I am thinking about this and what I _think_

[jira] [Commented] (PIG-2643) Use bytecode generation to make a performance replacement for InvokeForLong, InvokeForString, etc

2012-04-10 Thread Dmitriy V. Ryaboy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250804#comment-13250804 ] Dmitriy V. Ryaboy commented on PIG-2643: Not quite -- what I mean is that people

Re: toJSON function for tuples, bags and strings, PIG-2641

2012-04-10 Thread Dmitriy Ryaboy
Russ, I appreciate the passion, but let's drop fiery rhetoric in favor of technical discussion, yeah? :-) No one is against accessibility. The problem with making things simple is that it's really hard to make them simple. Without appropriate amount of forethought, you paint yourself into ugly

[jira] [Commented] (PIG-2632) Create a SchemaTuple which generates efficient Tuples via code gen

2012-04-10 Thread Scott Carey (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250831#comment-13250831 ] Scott Carey commented on PIG-2632: {quote} should a:int,b:int really generate a different

[jira] [Commented] (PIG-2632) Create a SchemaTuple which generates efficient Tuples via code gen

2012-04-10 Thread Jonathan Coveney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250837#comment-13250837 ] Jonathan Coveney commented on PIG-2632: I agree 100% that it's time to iron out

[jira] [Commented] (PIG-2632) Create a SchemaTuple which generates efficient Tuples via code gen

2012-04-10 Thread Dmitriy V. Ryaboy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250845#comment-13250845 ] Dmitriy V. Ryaboy commented on PIG-2632: remember to attribute the code to mahout

[jira] [Commented] (PIG-2642) StoreMetadata.storeSchema can't access files in the output directory (Hadoop 0.23)

2012-04-10 Thread Jonathan Eagles (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250852#comment-13250852 ] Jonathan Eagles commented on PIG-2642: Thomas, I have tested the patch and it is

[jira] [Commented] (PIG-2632) Create a SchemaTuple which generates efficient Tuples via code gen

2012-04-10 Thread Jonathan Coveney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250853#comment-13250853 ] Jonathan Coveney commented on PIG-2632: Ah, gotcha. As far as the NOTICE.txt, I

[jira] [Commented] (PIG-2643) Use bytecode generation to make a performance replacement for InvokeForLong, InvokeForString, etc

2012-04-10 Thread Jonathan Coveney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250854#comment-13250854 ] Jonathan Coveney commented on PIG-2643: Ah, ok. Well, first, since we have a more

[jira] [Commented] (PIG-2632) Create a SchemaTuple which generates efficient Tuples via code gen

2012-04-10 Thread Julien Le Dem (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250857#comment-13250857 ] Julien Le Dem commented on PIG-2632: By design Avro defines the binary format and the

[jira] [Commented] (PIG-2643) Use bytecode generation to make a performance replacement for InvokeForLong, InvokeForString, etc

2012-04-10 Thread Dmitriy V. Ryaboy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250987#comment-13250987 ] Dmitriy V. Ryaboy commented on PIG-2643: I'm a bit confused about the math keyword

[jira] [Commented] (PIG-2632) Create a SchemaTuple which generates efficient Tuples via code gen

2012-04-10 Thread Scott Carey (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250999#comment-13250999 ] Scott Carey commented on PIG-2632: Avro has a pure Java code generator that uses the

[jira] [Commented] (PIG-2643) Use bytecode generation to make a performance replacement for InvokeForLong, InvokeForString, etc

2012-04-10 Thread Jonathan Coveney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251336#comment-13251336 ] Jonathan Coveney commented on PIG-2643: I don't know that I like including all of the

[jira] [Commented] (PIG-2643) Use bytecode generation to make a performance replacement for InvokeForLong, InvokeForString, etc

2012-04-10 Thread Jonathan Coveney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251343#comment-13251343 ] Jonathan Coveney commented on PIG-2643: Also, as an aside, I think it'd be awesome to