[jira] Subscription: PIG patch available

2013-01-17 Thread jira
Issue Subscription
Filter: PIG patch available (34 issues)

Subscriber: pigdaily

Key Summary
PIG-3123 Simplify Logical Plans By Removing Unnecessary Identity Projections
https://issues.apache.org/jira/browse/PIG-3123
PIG-3122 Operators should not implicitly become reserved keywords
https://issues.apache.org/jira/browse/PIG-3122
PIG-3114 Duplicated macro name error when using pigunit
https://issues.apache.org/jira/browse/PIG-3114
PIG-3108 HBaseStorage returns empty maps when mixing wildcard- with other
columns
https://issues.apache.org/jira/browse/PIG-3108
PIG-3105 Fix TestJobSubmission unit test failure.
https://issues.apache.org/jira/browse/PIG-3105
PIG-3098 Add another test for the self join case
https://issues.apache.org/jira/browse/PIG-3098
PIG-3090 Introduce a syntax to be able to easily refer to the previously
defined relation
https://issues.apache.org/jira/browse/PIG-3090
PIG-3088 Add a builtin udf which removes prefixes
https://issues.apache.org/jira/browse/PIG-3088
PIG-3086 Allow A Prefix To Be Added To URIs In PigUnit Tests
https://issues.apache.org/jira/browse/PIG-3086
PIG-3083 Introduce new syntax that lets you project just the columns that
come from a given :: prefix
https://issues.apache.org/jira/browse/PIG-3083
PIG-3082 outputSchema of a UDF allows two usages when describing a Tuple
schema
https://issues.apache.org/jira/browse/PIG-3082
PIG-3078 Make a UDF that, given a string, returns just the columns prefixed
by that string
https://issues.apache.org/jira/browse/PIG-3078
PIG-3073 POUserFunc creating log spam for large scripts
https://issues.apache.org/jira/browse/PIG-3073
PIG-3069 Native Windows Compatibility for Pig E2E Tests and Harness
https://issues.apache.org/jira/browse/PIG-3069
PIG-3057 make readField protected to be able to override it if we extend
PigStorage
https://issues.apache.org/jira/browse/PIG-3057
PIG-3028 testGrunt dev test needs some command filters to run correctly
without cygwin
https://issues.apache.org/jira/browse/PIG-3028
PIG-3027 pigTest unit test needs a newline filter for comparisons of golden
multi-line
https://issues.apache.org/jira/browse/PIG-3027
PIG-3026 Pig checked-in baseline comparisons need a pre-filter to address
OS-specific newline differences
https://issues.apache.org/jira/browse/PIG-3026
PIG-3025 TestPruneColumn unit test - SimpleEchoStreamingCommand perl inline
script needs simplification
https://issues.apache.org/jira/browse/PIG-3025
PIG-3024 TestEmptyInputDir unit test - hadoop version detection logic is
brittle
https://issues.apache.org/jira/browse/PIG-3024
PIG-3015 Rewrite of AvroStorage
https://issues.apache.org/jira/browse/PIG-3015
PIG-3010 Allow UDF's to flatten themselves
https://issues.apache.org/jira/browse/PIG-3010
PIG-2959 Add a pig.cmd for Pig to run under Windows
https://issues.apache.org/jira/browse/PIG-2959
PIG-2955 Fix bunch of Pig e2e tests on Windows
https://issues.apache.org/jira/browse/PIG-2955
PIG-2878 Pig current releases lack a UDF equalIgnoreCase. This function
returns a Boolean value indicating whether string left is equal to string
right. This check is case insensitive.
https://issues.apache.org/jira/browse/PIG-2878
PIG-2873 Converting bin/pig shell script to python
https://issues.apache.org/jira/browse/PIG-2873
PIG-2834 MultiStorage requires unused constructor argument
https://issues.apache.org/jira/browse/PIG-2834
PIG-2661 Pig uses an extra job for loading data in Pigmix L9
https://issues.apache.org/jira/browse/PIG-2661
PIG-2645 PigSplit does not handle the case where SerializationFactory
returns null
https://issues.apache.org/jira/browse/PIG-2645
PIG-2507 Semicolon in parameters for UDF results in parsing error
https://issues.apache.org/jira/browse/PIG-2507
PIG-2417 Streaming UDFs - allow users to easily write UDFs in scripting
languages with no JVM implementation.
https://issues.apache.org/jira/browse/PIG-2417
PIG-2312 NPE when relation and column share the same name and used in Nested
Foreach
https://issues.apache.org/jira/browse/PIG-2312
PIG-1942 script UDF (jython) should utilize the intended output schema to
more directly convert Py objects to Pig objects
https://issues.apache.org/jira/browse/PIG-1942
PIG-1237 Piggybank MultiStorage - specify field to write in output
https://issues.apache.org/jira/browse/PIG-1237

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384


[jira] [Commented] (PIG-3090) Introduce a syntax to be able to easily refer to the previously defined relation

2013-01-17 Thread Russell Jurney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556786#comment-13556786
 ] 

Russell Jurney commented on PIG-3090:
-

But wait - does this work with grunt?

Russell Jurney http://datasyndrome.com




> Introduce a syntax to be able to easily refer to the previously defined 
> relation
> 
>
> Key: PIG-3090
> URL: https://issues.apache.org/jira/browse/PIG-3090
> Project: Pig
>  Issue Type: New Feature
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
> Fix For: 0.12
>
> Attachments: PIG-3090-0.patch
>
>
> Sometimes I feel like swimming with ANTLRs. This particular feature isn't too 
> hard to add... and supports syntax like this:
> {code}
> a = load 'thing' as (x:int);
> b = foreach @ generate x;
> c = foreach @ generate x;
> d = foreach @ generate x;
> {code}
> I have a patch, though I need to make sure it doesn't change anything (it 
> shouldn't) and I need to add tests.
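
As a rough illustration of the semantics described above (not the patch itself), the sketch below assumes that `@` always stands for the alias assigned by the immediately preceding statement. The class name PreviousAliasSketch and the string-based rewriting are hypothetical; the description suggests the real change lives in the ANTLR grammar rather than in text substitution.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: expands '@' to the alias defined by the previous statement. */
final class PreviousAliasSketch {

    static List<String> expand(List<String> statements) {
        List<String> out = new ArrayList<>();
        String previousAlias = null;
        for (String stmt : statements) {
            String expanded = stmt;
            if (previousAlias != null) {
                // Replace a standalone '@' token with the last alias seen.
                expanded = stmt.replace(" @ ", " " + previousAlias + " ");
            }
            out.add(expanded);
            int eq = stmt.indexOf('=');
            if (eq > 0) {
                // The text left of '=' is the alias this statement defines.
                previousAlias = stmt.substring(0, eq).trim();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> script = Arrays.asList(
            "a = load 'thing' as (x:int);",
            "b = foreach @ generate x;",
            "c = foreach @ generate x;");
        for (String line : expand(script)) {
            System.out.println(line);
        }
    }
}
```

Under that assumption, the second statement reads as `b = foreach a generate x;` and the third as `c = foreach b generate x;`.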

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (PIG-3090) Introduce a syntax to be able to easily refer to the previously defined relation

2013-01-17 Thread Russell Jurney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556784#comment-13556784
 ] 

Russell Jurney commented on PIG-3090:
-

Does this work with grunt? Maybe I'll review it.

> Introduce a syntax to be able to easily refer to the previously defined 
> relation
> 
>
> Key: PIG-3090
> URL: https://issues.apache.org/jira/browse/PIG-3090
> Project: Pig
>  Issue Type: New Feature
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
> Fix For: 0.12
>
> Attachments: PIG-3090-0.patch
>
>
> Sometimes I feel like swimming with ANTLRs. This particular feature isn't too 
> hard to add... and supports syntax like this:
> {code}
> a = load 'thing' as (x:int);
> b = foreach @ generate x;
> c = foreach @ generate x;
> d = foreach @ generate x;
> {code}
> I have a patch, though I need to make sure it doesn't change anything (it 
> shouldn't) and I need to add tests.



[jira] [Commented] (PIG-3015) Rewrite of AvroStorage

2013-01-17 Thread Russell Jurney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556725#comment-13556725
 ] 

Russell Jurney commented on PIG-3015:
-

Please note, Avro will have Boolean soon: AVRO-1229

> Rewrite of AvroStorage
> --
>
> Key: PIG-3015
> URL: https://issues.apache.org/jira/browse/PIG-3015
> Project: Pig
>  Issue Type: Improvement
>  Components: piggybank
>Reporter: Joseph Adler
>Assignee: Joseph Adler
> Attachments: bad.avro, good.avro, PIG-3015-2.patch, PIG-3015-3.patch, 
> PIG-3015-4.patch, PIG-3015-5.patch, TestInput.java, Test.java
>
>
> The current AvroStorage implementation has a lot of issues: it requires old 
> versions of Avro, it copies data much more than needed, and it's verbose and 
> complicated. (One pet peeve of mine is that old versions of Avro don't 
> support Snappy compression.)
> I rewrote AvroStorage from scratch to fix these issues. In early tests, the 
> new implementation is significantly faster, and the code is a lot simpler. 
> Rewriting AvroStorage also enabled me to implement support for Trevni (as 
> TrevniStorage).
> I'm opening this ticket to facilitate discussion while I figure out the best 
> way to contribute the changes back to Apache.



Re: Review Request: Add BigInteger and BigDecimal to Pig

2013-01-17 Thread Mathias Herberts

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9012/#review15469
---



src/org/apache/pig/backend/hadoop/HDataType.java


Missing 'break;'



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java


Please use braces after if for clarity.



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java


Please use braces after if for clarity.



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java


Please use braces after if for clarity.



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java


Please use braces after if for clarity.



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java


Please use braces after if for clarity.



src/org/apache/pig/backend/hadoop/hbase/HBaseBinaryConverter.java


Why not use BigInteger(byte[]) or the String rep? Same comment for 
BigDecimal and the String rep.
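
To make the suggestion concrete, here is a minimal sketch of the two round trips using only java.math. The class and method names are hypothetical and this is not the code in HBaseBinaryConverter; it only shows that BigInteger(byte[]) pairs with toByteArray(), and BigDecimal(String) pairs with toString().

```java
import java.math.BigDecimal;
import java.math.BigInteger;

/** Hypothetical helper illustrating the review comment; not Pig code. */
final class BigNumberBytesSketch {

    // BigInteger round trip through its two's-complement byte form.
    static byte[] toBytes(BigInteger value) {
        return value.toByteArray();
    }

    static BigInteger toBigInteger(byte[] bytes) {
        return new BigInteger(bytes);
    }

    // BigDecimal round trip through its string form, which preserves scale.
    static String toText(BigDecimal value) {
        return value.toString();
    }

    static BigDecimal toBigDecimal(String text) {
        return new BigDecimal(text);
    }

    public static void main(String[] args) {
        BigInteger i = new BigInteger("-123456789012345678901234567890");
        System.out.println(toBigInteger(toBytes(i)).equals(i));
        BigDecimal d = new BigDecimal("1.2300");
        System.out.println(toBigDecimal(toText(d)).equals(d));
    }
}
```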



src/org/apache/pig/builtin/TextLoader.java


Message should be 'conversion to BigInteger'



src/org/apache/pig/builtin/TextLoader.java


Message should be 'conversion to BigDecimal'



src/org/apache/pig/builtin/Utf8StorageConverter.java


Shouldn't the charset be specified in the calls to getBytes for both 
BigInteger and BigDecimal?
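
A small sketch of what specifying the charset could look like. The class name is hypothetical and this is not the Utf8StorageConverter code; UTF-8 is assumed here only because of the converter's name.

```java
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch; shows getBytes with an explicit charset. */
final class ExplicitCharsetSketch {

    static byte[] encode(BigDecimal value) {
        // With an explicit charset the bytes do not depend on the JVM default encoding.
        return value.toString().getBytes(StandardCharsets.UTF_8);
    }

    static BigDecimal decode(byte[] bytes) {
        // Decode with the same charset that was used to encode.
        return new BigDecimal(new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("3.14");
        System.out.println(decode(encode(d)).equals(d));
    }
}
```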



src/org/apache/pig/data/DataType.java


Indentation is messy here.



src/org/apache/pig/data/DataType.java


Messy indent.



src/org/apache/pig/data/DataType.java


Messy indent.



src/org/apache/pig/data/DataType.java


Should read 'BigDecimal' here and in next message.



src/org/apache/pig/data/DataType.java


Messy indent



src/org/apache/pig/data/DefaultTuple.java


I don't understand those two lines!



src/org/apache/pig/data/SizeUtil.java


I thought BigDecimal and BigInteger did not have the same size, cf 
http://javamoods.blogspot.fr/2009/03/how-big-is-bigdecimal.html



src/org/apache/pig/impl/util/StorageUtil.java


Shouldn't the charset be specified?



src/org/apache/pig/parser/LogicalPlanBuilder.java


Since suffixes are 'BI' and 'BD', we should strip the last two characters 
of the string, not only the last one.
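
A minimal sketch of the point about suffix length, assuming literals look like 123BI and 1.5BD. The class and method names are hypothetical and this is not the LogicalPlanBuilder code.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

/** Hypothetical sketch of parsing literals that carry a two-character suffix. */
final class BigLiteralSketch {

    static BigInteger parseBigInteger(String literal) {
        // "BI" is two characters long, so drop two characters, not one.
        return new BigInteger(literal.substring(0, literal.length() - 2));
    }

    static BigDecimal parseBigDecimal(String literal) {
        // Same reasoning for the "BD" suffix.
        return new BigDecimal(literal.substring(0, literal.length() - 2));
    }

    public static void main(String[] args) {
        System.out.println(parseBigInteger("123BI"));
        System.out.println(parseBigDecimal("1.5BD"));
    }
}
```

Stripping only one character would leave "123B", which the BigInteger(String) constructor rejects with a NumberFormatException.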


- Mathias Herberts


On Jan. 17, 2013, 9:02 p.m., Jonathan Coveney wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9012/
> ---
> 
> (Updated Jan. 17, 2013, 9:02 p.m.)
> 
> 
> Review request for pig, Alan Gates and Mathias Herberts.
> 
> 
> Description
> ---
> 
> This patch adds big integer and big decimal support to Pig. It could use more 
> tests, something I'd appreciate feedback on (but I wanted to make sure the 
> core implementation is good)
> 
> 
> This addresses bug PIG-2764.
> https://issues.apache.org/jira/browse/PIG-2764
> 
> 
> Diffs
> -
> 
>   .gitignore cc62d7d 
>   src/org/apache/pig/LoadCaster.java 574769b 
>   src/org/apache/pig/PigWarning.java 5de075f 
>   src/org/apache/pig/StoreCaster.java 5fe48de 
>   src/org/apache/pig/backend/hadoop/BigDecimalWritable.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/BigIntegerWritable.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/HDataType.java 84a56b8 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
>  96fba6b 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBigDecimalRawComparator.java
>  PRE-CREATION 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBigIntegerRawComparator.java
>  PRE-CREATION 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/partitioners/WeightedRangePartitioner.java
>  9749339 
>   
> src/org/apache/pi

[jira] [Commented] (PIG-3098) Add another test for the self join case

2013-01-17 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556692#comment-13556692
 ] 

Julien Le Dem commented on PIG-3098:


One minor comment regarding asserts:
{noformat}
  assertEquals(tuples.size(), out.size());
  for (Tuple t : out) {
    assertTrue(tuples.remove(t));
  }
  assertTrue(tuples.isEmpty());
{noformat}
If an assertion fails, it is not going to give much information.
Please add a message as the first parameter with some info:
{noformat}
  assertEquals("tuple count for " + out, tuples.size(), out.size());
  for (Tuple t : out) {
    assertTrue("existence of " + t, tuples.remove(t));
  }
  assertTrue("all tuples consumed in " + tuples, tuples.isEmpty());
{noformat}



> Add another test for the self join case
> ---
>
> Key: PIG-3098
> URL: https://issues.apache.org/jira/browse/PIG-3098
> Project: Pig
>  Issue Type: Bug
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
> Fix For: 0.12
>
> Attachments: PIG-3098-0.patch
>
>
> This adds a test to TestJoin that doesn't just make sure that self joins work 
> semantically in the parser, but also that it pulls the right data through. 
> Thought it'd be easier to just make a new JIRA than to reopen PIG-3020.



[jira] [Commented] (PIG-3082) outputSchema of a UDF allows two usages when describing a Tuple schema

2013-01-17 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556686#comment-13556686
 ] 

Julien Le Dem commented on PIG-3082:


Thanks for fixing Jon!
I find the error message a little confusing:
{noformat}
 throw new FrontendException("Given UDF returns an improper Schema. Should only "
     + "return Tuple, Bag, or a single item. Returns: " + udfSchema);
{noformat}
It should contain something along the lines of "... outputSchema should return 
a Schema containing a single Field ...".
Otherwise, it looks good to me.
Thanks

> outputSchema of a UDF allows two usages when describing a Tuple schema
> --
>
> Key: PIG-3082
> URL: https://issues.apache.org/jira/browse/PIG-3082
> Project: Pig
>  Issue Type: Bug
>Reporter: Julien Le Dem
>Assignee: Jonathan Coveney
> Fix For: 0.12
>
> Attachments: PIG-3082-0.patch
>
>
> When defining an evalfunc that returns a Tuple there are two ways you can 
> implement outputSchema().
> - The right way: return a schema that contains one Field that contains the 
> type and schema of the return type of the UDF
> - The unreliable way: return a schema that contains more than one field and 
> it will be understood as a tuple schema even though there is no type (which 
> is in Field class) to specify that. This is particularly deceitful when the 
> output schema is derived from the input schema and the outputted Tuple 
> sometimes contains only one field. In such cases Pig understands the output
> schema as a tuple only if there is more than one field. And sometimes it 
> works, sometimes it does not.
> We should at least issue a warning (backward compatibility) if not plain 
> throw an exception when the output schema contains more than one Field.



[jira] [Updated] (PIG-2764) Add a biginteger and bigdecimal type to pig

2013-01-17 Thread Jonathan Coveney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Coveney updated PIG-2764:
--

Attachment: PIG-2764-3.patch

Pulled master, put patch here and in RB: https://reviews.apache.org/r/9012/

> Add a biginteger and bigdecimal type to pig
> ---
>
> Key: PIG-2764
> URL: https://issues.apache.org/jira/browse/PIG-2764
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
> Attachments: fixedpoint.patch, PIG-2764-0.patch, PIG-2764-1.patch, 
> PIG-2764-2_nows.patch, PIG-2764-2.patch, PIG-2764-3.patch
>
>
> I think it would be useful for applications where precision is more important 
> than speed to have the option of using java's bigdecimal and biginteger types 
> natively.



Review Request: Add BigInteger and BigDecimal to Pig

2013-01-17 Thread Jonathan Coveney

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9012/
---

Review request for pig, Alan Gates and Mathias Herberts.


Description
---

This patch adds big integer and big decimal support to Pig. It could use more 
tests, something I'd appreciate feedback on (but I wanted to make sure the core 
implementation is good)


This addresses bug PIG-2764.
https://issues.apache.org/jira/browse/PIG-2764


Diffs
-

  .gitignore cc62d7d 
  src/org/apache/pig/LoadCaster.java 574769b 
  src/org/apache/pig/PigWarning.java 5de075f 
  src/org/apache/pig/StoreCaster.java 5fe48de 
  src/org/apache/pig/backend/hadoop/BigDecimalWritable.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/BigIntegerWritable.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/HDataType.java 84a56b8 
  
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
 96fba6b 
  
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBigDecimalRawComparator.java
 PRE-CREATION 
  
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBigIntegerRawComparator.java
 PRE-CREATION 
  
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/partitioners/WeightedRangePartitioner.java
 9749339 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java
 f40eb43 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Add.java
 c84b767 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/ConstantExpression.java
 db3840f 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Divide.java
 4656c28 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/EqualToExpr.java
 6683beb 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/ExpressionOperator.java
 2806336 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GTOrEqualToExpr.java
 d64a080 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GreaterThanExpr.java
 704d0b8 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LTOrEqualToExpr.java
 9dc929e 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LessThanExpr.java
 0320698 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Mod.java
 6819185 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Multiply.java
 7b57bed 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/NotEqualToExpr.java
 79a4461 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POBinCond.java
 08544d5 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java
 e8c2f2c 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POIsNull.java
 f20b839 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PONegative.java
 c076ae7 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java
 8887133 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserComparisonFunc.java
 479eb83 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java
 3c7e741 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Subtract.java
 79d4c73 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java
 bf2ba08 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLocalRearrange.java
 ddb25f1 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java
 aa11409 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPreCombinerLocalRearrange.java
 52401eb 
  
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSort.java
 ad33e7b 
  src/org/apache/pig/backend/hadoop/hbase/HBaseBinaryConverter.java 60a5899 
  src/org/apache/pig/backend/hadoop/hbase/HBaseStorage.java a6f4ea6 
  src/org/apache/pig/builtin/ABS.java 8a7c631 
  src/org/apache/pig/builtin/BigDecimalAbs.java PRE-CREATION 
  src/org/apache/pig/builtin/BigIntegerAbs.java PRE-CREATION 
  src/org/apache/pig/builtin/BinStorage.java 38b4492 
  src/org/apache/pig/builtin/TextLoader.java d5bcf02 
  src/org/apache/pig/builtin/Utf8StorageConverter.java da12ed6 
  src/org/apache/pig/data/BinInterSedes.java e851d8b 
  src/org/apache/pig/data/DataReaderWriter.java 37a162a 
  src/org/apache/pig/data/DataType.ja

[jira] [Commented] (PIG-3098) Add another test for the self join case

2013-01-17 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556528#comment-13556528
 ] 

Jonathan Coveney commented on PIG-3098:
---

Bump

> Add another test for the self join case
> ---
>
> Key: PIG-3098
> URL: https://issues.apache.org/jira/browse/PIG-3098
> Project: Pig
>  Issue Type: Bug
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
> Fix For: 0.12
>
> Attachments: PIG-3098-0.patch
>
>
> This adds a test to TestJoin that doesn't just make sure that self joins work 
> semantically in the parser, but also that it pulls the right data through. 
> Thought it'd be easier to just make a new JIRA than to reopen PIG-3020.



[jira] [Commented] (PIG-3078) Make a UDF that, given a string, returns just the columns prefixed by that string

2013-01-17 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556529#comment-13556529
 ] 

Jonathan Coveney commented on PIG-3078:
---

Bump

> Make a UDF that, given a string, returns just the columns prefixed by that 
> string
> -
>
> Key: PIG-3078
> URL: https://issues.apache.org/jira/browse/PIG-3078
> Project: Pig
>  Issue Type: Bug
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
> Fix For: 0.12
>
> Attachments: PIG-3078-0.patch, PIG-3078-1.patch
>
>
> This comes up fairly often, usually as the result of a join. Given that the 
> resulting schema has the column name prepended, a udf in the following form 
> could give just the columns from the desired relation:
> Pluck('relation_name', *)
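
To illustrate the idea only, the sketch below filters a list of field names by a relation prefix. It assumes joined columns are named relation::column; the class and method are hypothetical, work on plain strings rather than Pig schemas, and are not the Pluck implementation from the attached patches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: keeps only the field names that carry a given relation prefix. */
final class PluckSketch {

    static List<String> pluck(String relation, List<String> fieldNames) {
        // Assumed naming convention after a join: "relation::column".
        String prefix = relation + "::";
        List<String> kept = new ArrayList<>();
        for (String name : fieldNames) {
            if (name.startsWith(prefix)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> joined = Arrays.asList("a::x", "a::y", "b::x");
        System.out.println(pluck("a", joined));
    }
}
```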



[jira] [Updated] (PIG-3082) outputSchema of a UDF allows two usages when describing a Tuple schema

2013-01-17 Thread Jonathan Coveney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Coveney updated PIG-3082:
--

Fix Version/s: 0.12
   Status: Patch Available  (was: Open)

> outputSchema of a UDF allows two usages when describing a Tuple schema
> --
>
> Key: PIG-3082
> URL: https://issues.apache.org/jira/browse/PIG-3082
> Project: Pig
>  Issue Type: Bug
>Reporter: Julien Le Dem
>Assignee: Jonathan Coveney
> Fix For: 0.12
>
> Attachments: PIG-3082-0.patch
>
>
> When defining an evalfunc that returns a Tuple there are two ways you can 
> implement outputSchema().
> - The right way: return a schema that contains one Field that contains the 
> type and schema of the return type of the UDF
> - The unreliable way: return a schema that contains more than one field and 
> it will be understood as a tuple schema even though there is no type (which 
> is in Field class) to specify that. This is particularly deceitful when the 
> output schema is derived from the input schema and the outputted Tuple 
> sometimes contains only one field. In such cases Pig understands the output
> schema as a tuple only if there is more than one field. And sometimes it 
> works, sometimes it does not.
> We should at least issue a warning (backward compatibility) if not plain 
> throw an exception when the output schema contains more than one Field.



[jira] [Updated] (PIG-3090) Introduce a syntax to be able to easily refer to the previously defined relation

2013-01-17 Thread Jonathan Coveney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Coveney updated PIG-3090:
--

Fix Version/s: 0.12
   Status: Patch Available  (was: Open)

> Introduce a syntax to be able to easily refer to the previously defined 
> relation
> 
>
> Key: PIG-3090
> URL: https://issues.apache.org/jira/browse/PIG-3090
> Project: Pig
>  Issue Type: New Feature
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
> Fix For: 0.12
>
> Attachments: PIG-3090-0.patch
>
>
> Sometimes I feel like swimming with ANTLRs. This particular feature isn't too 
> hard to add... and supports syntax like this:
> {code}
> a = load 'thing' as (x:int);
> b = foreach @ generate x;
> c = foreach @ generate x;
> d = foreach @ generate x;
> {code}
> I have a patch, though I need to make sure it doesn't change anything (it 
> shouldn't) and I need to add tests.



[jira] [Commented] (PIG-2764) Add a biginteger and bigdecimal type to pig

2013-01-17 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556471#comment-13556471
 ] 

Alan Gates commented on PIG-2764:
-

I can review it.  I'll try to get to it tomorrow (1/18).

> Add a biginteger and bigdecimal type to pig
> ---
>
> Key: PIG-2764
> URL: https://issues.apache.org/jira/browse/PIG-2764
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
> Attachments: fixedpoint.patch, PIG-2764-0.patch, PIG-2764-1.patch, 
> PIG-2764-2_nows.patch, PIG-2764-2.patch
>
>
> I think it would be useful for applications where precision is more important 
> than speed to have the option of using java's bigdecimal and biginteger types 
> natively.



[jira] [Comment Edited] (PIG-3121) Optionally convert long to chararray in JsonStorage

2013-01-17 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556402#comment-13556402
 ] 

Alan Gates edited comment on PIG-3121 at 1/17/13 6:08 PM:
--

My concern is what you brought up: the problem here isn't JsonStorage.

One other option I'd like to point out is that you could extend JsonStorage
with a new class, CasterJsonStorage. The only method it would implement would
be putNext. In that method it could do the casts and then call
super.putNext(). This is hopefully lightweight enough from your viewpoint and
avoids pushing one-off features into JsonStorage.

  was (Author: alangates):
My concern is what you brought up, the problem here isn't JsonStorage.  

One other option I'd like to point out is that you could extend JsonStorage 
with a new class CasterJsonStorage.  The only method it would implement would 
be putNext.  In that method it could do the casts and then call 
super.putNext().  This is hopefully light weight enough from your viewpoint and 
avoids pushing one of features into JsonStorage.
  
> Optionally convert long to chararray in JsonStorage
> ---
>
> Key: PIG-3121
> URL: https://issues.apache.org/jira/browse/PIG-3121
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.10.0
>Reporter: Josh Levy
>
> I work with a data set that uses random longs (64 bit integers) as 
> identifiers.  Recently I've been accessing the data from Pig and using 
> JsonStorage to save records, that I then run through another script to get 
> JSON that I can feed into other tools.  One of the tools I use is broken in 
> the sense that it treats all numbers as 64 bit floating point, and it can't 
> faithfully reproduce most of the identifiers I pass it.  My workaround is to
> convert the identifiers to strings before they get to that tool.  
> If I provide a patch, is there interest in adding an option to JsonStorage 
> that tells it to serialize all longs as if they are strings? 
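
The precision problem described above can be checked with a short sketch. The class and method names are hypothetical; the code only shows that a 64-bit double cannot represent every long exactly, and that quoting the value as a string sidesteps the issue.

```java
import java.math.BigDecimal;

/** Hypothetical sketch: which longs survive a trip through a 64-bit double. */
final class LongAsDoubleSketch {

    static boolean survivesDouble(long id) {
        // new BigDecimal(double) is an exact conversion, so this comparison is exact.
        return new BigDecimal((double) id).compareTo(BigDecimal.valueOf(id)) == 0;
    }

    static String asJsonString(long id) {
        // Emit the identifier as a quoted JSON string instead of a JSON number.
        return "\"" + Long.toString(id) + "\"";
    }

    public static void main(String[] args) {
        long exact = 1L << 53;        // 9007199254740992, representable as a double
        long inexact = exact + 1;     // 9007199254740993, not representable
        System.out.println(survivesDouble(exact));
        System.out.println(survivesDouble(inexact));
        System.out.println(asJsonString(inexact));
    }
}
```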



[jira] [Commented] (PIG-3121) Optionally convert long to chararray in JsonStorage

2013-01-17 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556402#comment-13556402
 ] 

Alan Gates commented on PIG-3121:
-

My concern is what you brought up, the problem here isn't JsonStorage.  

One other option I'd like to point out is that you could extend JsonStorage 
with a new class CasterJsonStorage.  The only method it would implement would 
be putNext.  In that method it could do the casts and then call 
super.putNext().  This is hopefully light weight enough from your viewpoint and 
avoids pushing one of features into JsonStorage.
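
The subclassing pattern suggested above might look roughly like the sketch below. Both classes are stand-ins written so the example is self-contained: StorageStandIn is not Pig's JsonStorage, and its putNext signature (a list of fields rather than a Pig Tuple) is an assumption made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Stand-in for a storage class; NOT the real JsonStorage API. */
class StorageStandIn {
    final List<List<Object>> written = new ArrayList<>();

    void putNext(List<Object> record) {
        written.add(record);
    }
}

/** Hypothetical subclass: casts longs to strings, then delegates to the parent. */
class CastingStorageStandIn extends StorageStandIn {
    @Override
    void putNext(List<Object> record) {
        List<Object> converted = new ArrayList<>(record.size());
        for (Object field : record) {
            // Only Long fields are converted; everything else passes through unchanged.
            converted.add(field instanceof Long ? field.toString() : field);
        }
        super.putNext(converted);
    }

    public static void main(String[] args) {
        CastingStorageStandIn storage = new CastingStorageStandIn();
        storage.putNext(Arrays.<Object>asList(42L, "name", 7));
        System.out.println(storage.written);
    }
}
```

The point of the design is that the base class stays untouched and the casting logic lives entirely in the overriding method.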

> Optionally convert long to chararray in JsonStorage
> ---
>
> Key: PIG-3121
> URL: https://issues.apache.org/jira/browse/PIG-3121
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.10.0
>Reporter: Josh Levy
>
> I work with a data set that uses random longs (64 bit integers) as 
> identifiers.  Recently I've been accessing the data from Pig and using 
> JsonStorage to save records, that I then run through another script to get 
> JSON that I can feed into other tools.  One of the tools I use is broken in 
> the sense that it treats all numbers as 64 bit floating point, and it can't 
> faithfully reproduce most of the identifiers I pass it.  My workaround is to
> convert the identifiers to strings before they get to that tool.  
> If I provide a patch, is there interest in adding an option to JsonStorage 
> that tells it to serialize all longs as if they are strings? 
