[VOTE] Release Pig 0.12.0 (candidate 0)

2013-10-05 Thread Daniel Dai
Hi,

I have created a candidate build for Pig 0.12.0.

Keys used to sign the release are available at
http://svn.apache.org/viewvc/pig/trunk/KEYS?view=markup

Please download, test, and try it out:

http://people.apache.org/~daijy/pig-0.12.0-candidate-0/

Should we release this? Vote closes on EOD next Wed, Oct 9th.

Thanks,
Daniel



[jira] Subscription: PIG patch available

2013-10-05 Thread jira
Issue Subscription
Filter: PIG patch available (13 issues)

Subscriber: pigdaily

Key       Summary
PIG-3501  Initial implementation of TezJobControlCompiler
          https://issues.apache.org/jira/browse/PIG-3501
PIG-3500  Initial implementation of TezCompiler
          https://issues.apache.org/jira/browse/PIG-3500
PIG-3496  Propagate HBase 0.95 jars to the backend
          https://issues.apache.org/jira/browse/PIG-3496
PIG-3451  EvalFunc ctor reflection to determine value of type param T is brittle
          https://issues.apache.org/jira/browse/PIG-3451
PIG-3449  Move JobCreationException to org.apache.pig.backend.hadoop.executionengine
          https://issues.apache.org/jira/browse/PIG-3449
PIG-3441  Allow Pig to use default resources from Configuration objects
          https://issues.apache.org/jira/browse/PIG-3441
PIG-3388  No support for Regex for row filter in org.apache.pig.backend.hadoop.hbase.HBaseStorage
          https://issues.apache.org/jira/browse/PIG-3388
PIG-3347  Store invocation in local mode brings sire effect
          https://issues.apache.org/jira/browse/PIG-3347
PIG-3325  Adding a tuple to a bag is slow
          https://issues.apache.org/jira/browse/PIG-3325
PIG-3257  Add unique identifier UDF
          https://issues.apache.org/jira/browse/PIG-3257
PIG-3117  A debug mode in which pig does not delete temporary files
          https://issues.apache.org/jira/browse/PIG-3117
PIG-3088  Add a builtin udf which removes prefixes
          https://issues.apache.org/jira/browse/PIG-3088
PIG-3021  Split results missing records when there is null values in the column comparison
          https://issues.apache.org/jira/browse/PIG-3021

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384


[jira] [Updated] (PIG-3088) Add a builtin udf which removes prefixes

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3088:


Fix Version/s: (was: 0.12.0)
   0.13.0

> Add a builtin udf which removes prefixes
> 
>
> Key: PIG-3088
> URL: https://issues.apache.org/jira/browse/PIG-3088
> Project: Pig
>  Issue Type: New Feature
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
> Fix For: 0.13.0
>
> Attachments: PIG-3088-0.patch
>
>
> This is something that I always hear people complaining about. Note that this 
> depends on the FlattenOutput annotation.
> This UDF supports the following.
> {code}
> a = load 'a' as (x1, y1, z1);
> b = load 'a' as (x2, y2, z2);
> c = join a by x1, b by x2;
> describe c;
> --c: {a::x1: bytearray,a::y1: bytearray,a::z1: bytearray,b::x2: bytearray,b::y2: bytearray,b::z2: bytearray}
> d = foreach c generate RemovePrefix(*);
> describe d;
> --d: {x1: bytearray,y1: bytearray,z1: bytearray,x2: bytearray,y2: bytearray,z2: bytearray}
> {code}
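
The renaming shown in the describe output above can be sketched in plain Java. This is only an illustration of the alias transformation, not the code in PIG-3088-0.patch: the class and method names are hypothetical, and it works on alias strings rather than on Pig schema objects. It also assumes the stripped names do not collide, which the example in the issue avoids by using distinct field names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Pig: strips "relation::" qualifiers from aliases.
final class RemovePrefixSketch {

    // Returns the part of the alias after the last "::", or the alias itself
    // when it carries no prefix.
    static String stripPrefix(String alias) {
        int idx = alias.lastIndexOf("::");
        return idx < 0 ? alias : alias.substring(idx + 2);
    }

    // Applies stripPrefix to every alias, preserving field order.
    static List<String> stripAll(List<String> aliases) {
        List<String> out = new ArrayList<>();
        for (String alias : aliases) {
            out.add(stripPrefix(alias));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(stripAll(Arrays.asList("a::x1", "a::y1", "b::x2")));
    }
}
```

Taking the text after the last "::" means nested prefixes such as a::b::x1 also reduce to the bare field name.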



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (PIG-3127) Add e2e testing for BigInteger and BigDecimal data type

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3127:


Fix Version/s: (was: 0.12.0)
   0.13.0

> Add e2e testing for BigInteger and BigDecimal data type
> ---
>
> Key: PIG-3127
> URL: https://issues.apache.org/jira/browse/PIG-3127
> Project: Pig
>  Issue Type: Task
>Affects Versions: 0.12.0
>Reporter: Jonathan Coveney
>Assignee: Alan Gates
>Priority: Blocker
> Fix For: 0.13.0
>
>
> We need e2e test coverage for these new data types.





[jira] [Updated] (PIG-3257) Add unique identifier UDF

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3257:


Fix Version/s: (was: 0.12.0)
   0.13.0

> Add unique identifier UDF
> -
>
> Key: PIG-3257
> URL: https://issues.apache.org/jira/browse/PIG-3257
> Project: Pig
>  Issue Type: Improvement
>  Components: internal-udfs
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 0.13.0
>
> Attachments: PIG-3257.patch
>
>
> It would be good to have a Pig function to generate unique identifiers.





[jira] [Updated] (PIG-2830) Macros should work in Grunt

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2830:


Fix Version/s: (was: 0.12.0)
   0.13.0

> Macros should work in Grunt
> ---
>
> Key: PIG-2830
> URL: https://issues.apache.org/jira/browse/PIG-2830
> Project: Pig
>  Issue Type: Improvement
>  Components: grunt, parser
>Affects Versions: 0.10.0, 0.11, 0.10.1
>Reporter: Russell Jurney
>Priority: Critical
>  Labels: fun, grunt, happy, macro, pants
> Fix For: 0.13.0
>
>
> It would be very helpful in writing Pig scripts if Grunt could load and use 
> Macros in an interactive session.





[jira] [Updated] (PIG-3005) TestLargeFile#testOrderBy is failing

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3005:


Fix Version/s: (was: 0.12.0)
   0.13.0

> TestLargeFile#testOrderBy is failing
> 
>
> Key: PIG-3005
> URL: https://issues.apache.org/jira/browse/PIG-3005
> Project: Pig
>  Issue Type: Bug
> Environment: Mac OSX 10.6.8
>Reporter: Jonathan Coveney
> Fix For: 0.13.0
>
>
> When run locally, at least, this test is failing for me.
> Has anyone else noticed this failing?





[jira] [Updated] (PIG-2880) Pig current releases lack a UDF charAt.This UDF returns the char value at the specified index.

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2880:


Fix Version/s: (was: 0.12.0)
   0.13.0

> Pig current releases lack a UDF charAt.This UDF returns the char value at the 
> specified index.
> --
>
> Key: PIG-2880
> URL: https://issues.apache.org/jira/browse/PIG-2880
> Project: Pig
>  Issue Type: New Feature
>  Components: piggybank
>Reporter: Sabir Ayappalli
>  Labels: patch
> Fix For: 0.13.0
>
> Attachments: CharAt.java.patch
>
>
> Current Pig releases lack a charAt UDF. This UDF returns the char value at the
> specified index. An index ranges from 0 to length() - 1. The first char value
> of the sequence is at index 0, the next at index 1, and so on.
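
The indexing rule described above can be sketched in plain Java. This is not the code in CharAt.java.patch: the class name is hypothetical, and returning null for a null string or an out-of-range index is an assumption, since the description does not say how such inputs are handled.

```java
// Hypothetical stand-alone sketch of the charAt semantics; not the piggybank UDF.
final class CharAtSketch {

    // Returns the character at the given zero-based index as a String.
    // Assumption: null input or an index outside 0..length()-1 yields null.
    static String charAt(String s, int index) {
        if (s == null || index < 0 || index >= s.length()) {
            return null;
        }
        return String.valueOf(s.charAt(index));
    }

    public static void main(String[] args) {
        System.out.println(charAt("pig", 0));
        System.out.println(charAt("pig", 3));
    }
}
```

A real UDF would wrap this logic in an EvalFunc subclass and read its arguments from the input tuple.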





[jira] [Updated] (PIG-3164) Pig current releases lack a UDF endsWith.This UDF tests if a given string ends with the specified suffix.

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3164:


Fix Version/s: (was: 0.12.0)
   0.13.0

> Pig current releases lack a UDF endsWith.This UDF tests if a given string 
> ends with the specified suffix.
> -
>
> Key: PIG-3164
> URL: https://issues.apache.org/jira/browse/PIG-3164
> Project: Pig
>  Issue Type: New Feature
>  Components: piggybank
>Affects Versions: 0.10.0
>Reporter: Anuroopa George
>Assignee: Anuroopa George
> Fix For: 0.13.0
>
> Attachments: ENDSWITH.java.patch, ENDSWITH_updated.java
>
>
> Current Pig releases lack an endsWith UDF. This UDF tests whether a given
> string ends with the specified suffix. It returns true if the character
> sequence represented by the suffix argument is a suffix of the character
> sequence represented by the given string, and false otherwise. It also
> returns true if the given suffix is an empty string or is equal to the
> given string.
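
The behaviour described above matches java.lang.String#endsWith, so a plain Java sketch can delegate to it. This is not the code in ENDSWITH.java.patch: the class name is hypothetical, and returning null when either argument is null is an assumption that the description does not cover.

```java
// Hypothetical stand-alone sketch of the endsWith semantics; not the piggybank UDF.
final class EndsWithSketch {

    // Returns TRUE when s ends with suffix, including the empty-suffix and
    // equal-string cases. Assumption: a null argument yields null.
    static Boolean endsWith(String s, String suffix) {
        if (s == null || suffix == null) {
            return null;
        }
        return s.endsWith(suffix);
    }

    public static void main(String[] args) {
        System.out.println(endsWith("pig-0.12.0", ".0"));
        System.out.println(endsWith("pig-0.12.0", ""));
    }
}
```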





[jira] [Updated] (PIG-2981) add e2e tests for DateTime data type

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2981:


Fix Version/s: (was: 0.12.0)
   0.13.0

> add e2e tests for DateTime  data type
> -
>
> Key: PIG-2981
> URL: https://issues.apache.org/jira/browse/PIG-2981
> Project: Pig
>  Issue Type: Test
>Reporter: Thejas M Nair
>Assignee: Annie Lin
> Fix For: 0.13.0
>
>
> e2e tests for DateTime datatype need to be added.





[jira] [Updated] (PIG-3010) Allow UDF's to flatten themselves

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3010:


Fix Version/s: (was: 0.12.0)
   0.13.0

> Allow UDF's to flatten themselves
> -
>
> Key: PIG-3010
> URL: https://issues.apache.org/jira/browse/PIG-3010
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
> Fix For: 0.13.0
>
> Attachments: PIG-3010-0.patch, PIG-3010-1.patch, 
> PIG-3010-2_nowhitespace.patch, PIG-3010-2.patch, PIG-3010-3_nows.patch, 
> PIG-3010-3.patch, PIG-3010-4_nows.patch, PIG-3010-4.patch, 
> PIG-3010-5_nows.patch, PIG-3010-5.patch
>
>
> This is something I thought would be cool for a while, so I sat down and did 
> it because I think there are some useful debugging tools it'd help with.
> The idea is that if you attach an annotation to a UDF, the Tuple or DataBag 
> you output will be flattened. This is quite powerful. A very common pattern 
> is:
> a = foreach data generate Flatten(MyUdf(thing)) as (a,b,c);
> This would let you just do:
> a = foreach data generate MyUdf(thing);
> With the exact same result!





[jira] [Updated] (PIG-2315) Make as clause work in generate

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2315:


Fix Version/s: (was: 0.12.0)
   0.12.1

> Make as clause work in generate
> ---
>
> Key: PIG-2315
> URL: https://issues.apache.org/jira/browse/PIG-2315
> Project: Pig
>  Issue Type: Bug
>Reporter: Olga Natkovich
>Assignee: Gianmarco De Francisci Morales
> Fix For: 0.12.1
>
> Attachments: PIG-2315-1.patch, PIG-2315-1.patch
>
>
> Currently, the following syntax is accepted but ignored, which confuses
> users:
> A1 = foreach A1 generate a as a:chararray ;
> After this statement, a just retains its previous type.





[jira] [Updated] (PIG-2672) Optimize the use of DistributedCache

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2672:


Fix Version/s: (was: 0.12.0)
   0.13.0

> Optimize the use of DistributedCache
> 
>
> Key: PIG-2672
> URL: https://issues.apache.org/jira/browse/PIG-2672
> Project: Pig
>  Issue Type: Improvement
>Reporter: Rohini Palaniswamy
> Fix For: 0.13.0
>
> Attachments: PIG-2672.patch
>
>
> Pig currently copies jar files to a temporary location in HDFS and then adds
> them to DistributedCache for each job launched. This is inefficient in terms
> of:
>    * Space - The jars are distributed to task trackers for every job, taking
> up a lot of local temporary space on the tasktrackers.
>    * Performance - The jar distribution impacts the job launch time.





[jira] [Updated] (PIG-3325) Adding a tuple to a bag is slow

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3325:


Fix Version/s: (was: 0.12.0)
   0.12.1

> Adding a tuple to a bag is slow
> ---
>
> Key: PIG-3325
> URL: https://issues.apache.org/jira/browse/PIG-3325
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.11, 0.12.0, 0.11.1, 0.11.2
>Reporter: Mark Wagner
>Assignee: Dmitriy V. Ryaboy
>Priority: Critical
> Fix For: 0.12.1
>
> Attachments: PIG-3325.2.patch, PIG-3325.3.patch, PIG-3325.demo.patch, 
> PIG-3325.optimize.1.patch
>
>
> The time it takes to add a tuple to a bag has increased significantly, 
> causing some jobs to take about 50x longer compared to 0.10.1. I've tracked 
> this down to PIG-2923, which has made adding a tuple heavier weight (it now 
> includes some memory estimation).





[jira] [Updated] (PIG-3388) No support for Regex for row filter in org.apache.pig.backend.hadoop.hbase.HBaseStorage

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3388:


Fix Version/s: (was: 0.12.0)
   0.13.0

> No support for Regex for row filter in 
> org.apache.pig.backend.hadoop.hbase.HBaseStorage
> ---
>
> Key: PIG-3388
> URL: https://issues.apache.org/jira/browse/PIG-3388
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Affects Versions: 0.11, 0.11.1
>Reporter: vikram s
>Assignee: Lorand Bendig
> Fix For: 0.13.0
>
> Attachments: PIG-3388.patch
>
>
> Currently, the scan operation with a row filter supports gt, lt, gte, etc.
> However, there is no support for regular expressions.





[jira] [Updated] (PIG-3347) Store invocation in local mode brings sire effect

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3347:


Fix Version/s: (was: 0.12.0)
   0.12.1

> Store invocation in local mode brings sire effect
> -
>
> Key: PIG-3347
> URL: https://issues.apache.org/jira/browse/PIG-3347
> Project: Pig
>  Issue Type: Bug
>  Components: grunt
>Affects Versions: 0.11
> Environment: local mode
>Reporter: Sergey
>Assignee: Daniel Dai
> Fix For: 0.12.1
>
> Attachments: PIG-3347-1.patch
>
>
> The problem is that an intermediate 'store' invocation "changes" the final
> store output. It looks like it introduces some kind of side effect. We used
> 'local' mode to run the script.
> Here is the input data:
> 1
> 1
> Here is the script:
> {code}
> a = load 'test';
> a_group = group a by $0;
> b = foreach a_group {
>   a_distinct = distinct a.$0;
>   generate group, a_distinct;
> }
> --store b into 'b';
> c = filter b by SIZE(a_distinct) == 1;
> store c into 'out';
> {code}
> We expect the output to be:
> 1 1
> The actual output is an empty file.
> Uncomment the {code}--store b into 'b';{code} line and see the difference:
> you get the expected output.





[jira] [Updated] (PIG-3463) Pig should use hadoop local mode for small jobs

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3463:


Fix Version/s: (was: 0.12.0)
   0.13.0

> Pig should use hadoop local mode for small jobs
> ---
>
> Key: PIG-3463
> URL: https://issues.apache.org/jira/browse/PIG-3463
> Project: Pig
>  Issue Type: New Feature
>  Components: impl
>Affects Versions: 0.11.1
>Reporter: Aniket Mokashi
> Fix For: 0.13.0
>
>
> Pig should use Hadoop local mode for small jobs: few mappers, few reducers,
> and a few MB of data.





[jira] [Updated] (PIG-3451) EvalFunc ctor reflection to determine value of type param T is brittle

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3451:


Fix Version/s: (was: 0.12.0)
   0.13.0

> EvalFunc ctor reflection to determine value of type param T is brittle
> -
>
> Key: PIG-3451
> URL: https://issues.apache.org/jira/browse/PIG-3451
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.11.1
>Reporter: Andy Schlaikjer
>Assignee: Andy Schlaikjer
> Fix For: 0.13.0
>
> Attachments: PIG-3451-3.patch
>
>
> The {{EvalFunc}} base class has logic in its default ctor to attempt to 
> determine the runtime type of its type parameter {{T}}. This logic is brittle 
> when the derived class has type parameters of its own. For instance:
> {code}
> public static abstract class EvalFunc1<T> extends EvalFunc<T> {}
> public static abstract class EvalFunc2<T> extends EvalFunc1<T> {}
> public static class EvalFunc3 extends EvalFunc1<DataBag> { ... }
> {code}
> Here, {{EvalFunc3}} does specify concrete type {{DataBag}} for {{T}} of 
> {{EvalFunc}}, but the existing logic in the default ctor fails to identify 
> it.
> Here's a unit test which reproduces this failure:
> https://github.com/sagemintblue/pig/compare/apache:trunk...hazen/repro_eval_func_reflection_bug
> Here's the test with an update to {{EvalFunc}}'s logic which fixes the issue:
> https://github.com/sagemintblue/pig/compare/apache:trunk...hazen/fix_eval_func_reflection
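
One way to make this kind of lookup tolerate intermediate generic classes is to walk the superclass chain and carry type-variable bindings along. The sketch below illustrates that general technique with java.lang.reflect only. It is not the fix in PIG-3451-3.patch or in the linked branch: the classes Func, Func1, Func3 and Bag are hypothetical stand-ins for EvalFunc, its subclasses and DataBag, and the sketch only handles the base class's first type parameter.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: resolve the concrete class bound to a base class's
// first type parameter, even when intermediate classes are generic.
final class TypeParamSketch {

    static class Bag {}                                  // stand-in for DataBag
    static abstract class Func<T> {}                     // stand-in for EvalFunc
    static abstract class Func1<T> extends Func<T> {}
    static class Func3 extends Func1<Bag> {}

    // Returns the class bound to base's first type parameter as seen from
    // concrete, or null when it cannot be resolved.
    static Class<?> resolve(Class<?> concrete, Class<?> base) {
        Map<TypeVariable<?>, Type> bindings = new HashMap<>();
        Class<?> current = concrete;
        while (current != null && current != base) {
            Type sup = current.getGenericSuperclass();
            if (sup instanceof ParameterizedType) {
                ParameterizedType pt = (ParameterizedType) sup;
                TypeVariable<?>[] vars = ((Class<?>) pt.getRawType()).getTypeParameters();
                Type[] args = pt.getActualTypeArguments();
                for (int i = 0; i < vars.length; i++) {
                    Type arg = args[i];
                    // Replace a subclass's own type variable with what it was bound to.
                    if (arg instanceof TypeVariable && bindings.containsKey(arg)) {
                        arg = bindings.get(arg);
                    }
                    bindings.put(vars[i], arg);
                }
            }
            current = current.getSuperclass();
        }
        if (current == null || base.getTypeParameters().length == 0) {
            return null; // concrete does not extend base, or base is not generic
        }
        Type resolved = bindings.get(base.getTypeParameters()[0]);
        if (resolved instanceof Class) {
            return (Class<?>) resolved;
        }
        if (resolved instanceof ParameterizedType) {
            return (Class<?>) ((ParameterizedType) resolved).getRawType();
        }
        return null; // still an unbound type variable
    }

    public static void main(String[] args) {
        System.out.println(resolve(Func3.class, Func.class));
    }
}
```

Inspecting only the immediate generic superclass would see Func1<Bag> and miss that Func1's T is forwarded to Func's T. Carrying the bindings up the chain is what connects the two.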





[jira] [Updated] (PIG-3478) Make StreamingUDF work for Hadoop 2

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3478:


Fix Version/s: (was: 0.12.0)
   0.12.1

> Make StreamingUDF work for Hadoop 2
> ---
>
> Key: PIG-3478
> URL: https://issues.apache.org/jira/browse/PIG-3478
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Reporter: Daniel Dai
>Assignee: Jeremy Karn
> Fix For: 0.12.1
>
>
> PIG-2417 introduced Streaming UDF. However, it does not work under Hadoop 2.
> Both the unit tests and the e2e tests fail under Hadoop 2. We need to fix it.





[jira] [Updated] (PIG-3472) Pig should avoid replicated join if size is greater than configured limit

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3472:


Fix Version/s: (was: 0.12.0)
   0.13.0

> Pig should avoid replicated join if size is greater than configured limit
> -
>
> Key: PIG-3472
> URL: https://issues.apache.org/jira/browse/PIG-3472
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Affects Versions: 0.11.1
>Reporter: Aniket Mokashi
> Fix For: 0.13.0
>
>






[jira] [Updated] (PIG-3377) New AvroStorage throws NPE when storing untyped map/array/bag

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3377:


Fix Version/s: (was: 0.12.0)
   0.12.1

> New AvroStorage throws NPE when storing untyped map/array/bag
> -
>
> Key: PIG-3377
> URL: https://issues.apache.org/jira/browse/PIG-3377
> Project: Pig
>  Issue Type: Bug
>  Components: internal-udfs
>Reporter: Cheolsoo Park
>Assignee: Joseph Adler
> Fix For: 0.12.1
>
>
> The following example demonstrates the issue:
> {code}
> a = LOAD 'foo' AS (m:map[]);
> STORE a INTO 'bar' USING AvroStorage();
> {code}
> This fails with the following error:
> {code}
> java.lang.NullPointerException
> at org.apache.pig.impl.util.avro.AvroStorageSchemaConversionUtilities.resourceFieldSchemaToAvroSchema(AvroStorageSchemaConversionUtilities.java:462)
> at org.apache.pig.impl.util.avro.AvroStorageSchemaConversionUtilities.resourceSchemaToAvroSchema(AvroStorageSchemaConversionUtilities.java:335)
> at org.apache.pig.builtin.AvroStorage.checkSchema(AvroStorage.java:472)
> {code}
> Similarly, untyped bag causes the following error:
> {code}
> Caused by: java.lang.NullPointerException
> at org.apache.avro.Schema$ArraySchema.toJson(Schema.java:722)
> ...
> at org.apache.avro.Schema.getElementType(Schema.java:256)
> at org.apache.pig.builtin.AvroStorage.setOutputAvroSchema(AvroStorage.java:491)
> {code}
> The problem is that AvroStorage cannot derive the output schema from an
> untyped map/bag/tuple. When the type is not defined, it should be assumed to
> be bytearray.





[jira] [Updated] (PIG-3480) TFile-based tmpfile compression crashes in some cases

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3480:


Fix Version/s: (was: 0.12.0)
   0.12.1

> TFile-based tmpfile compression crashes in some cases
> -
>
> Key: PIG-3480
> URL: https://issues.apache.org/jira/browse/PIG-3480
> Project: Pig
>  Issue Type: Bug
>Reporter: Dmitriy V. Ryaboy
> Fix For: 0.12.1
>
> Attachments: PIG-3480.patch
>
>
> When pig tmpfile compression is on, some jobs fail inside core hadoop 
> internals.
> Suspect TFile is the problem, because an experiment in replacing TFile with 
> SequenceFile succeeded.





[jira] [Resolved] (PIG-3503) More document for Pig 0.12 new features

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-3503.
-

  Resolution: Fixed
Hadoop Flags: Reviewed

Patch committed to trunk and branch 0.12.

> More document for Pig 0.12 new features
> ---
>
> Key: PIG-3503
> URL: https://issues.apache.org/jira/browse/PIG-3503
> Project: Pig
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.12.0
>
> Attachments: PIG-3503-1.patch, PIG-3503-2.patch
>
>






[jira] [Updated] (PIG-3503) More document for Pig 0.12 new features

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3503:


Attachment: PIG-3503-2.patch

Addressing Thejas' review comment.

> More document for Pig 0.12 new features
> ---
>
> Key: PIG-3503
> URL: https://issues.apache.org/jira/browse/PIG-3503
> Project: Pig
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.12.0
>
> Attachments: PIG-3503-1.patch, PIG-3503-2.patch
>
>






[jira] [Commented] (PIG-3503) More document for Pig 0.12 new features

2013-10-05 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787319#comment-13787319
 ] 

Thejas M Nair commented on PIG-3503:


Everything else looks good. You can commit after the changes.


> More document for Pig 0.12 new features
> ---
>
> Key: PIG-3503
> URL: https://issues.apache.org/jira/browse/PIG-3503
> Project: Pig
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.12.0
>
> Attachments: PIG-3503-1.patch
>
>






[jira] [Commented] (PIG-3503) More document for Pig 0.12 new features

2013-10-05 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787318#comment-13787318
 ] 

Thejas M Nair commented on PIG-3503:


"If use set command without providing key/value pair, Pig print all the 
configurations and all system properties. " can be changed to 
"If set command is used without key/value pair argument, Pig prints all the 
configurations and system properties."

In perf.xml
In the example, should we use a load function that supports partition filter 
pushdown ? Otherwise, people might expect it to work with PigStorage.
Also, should the example in it without the filter statement be removed ? 
{code}
+
+A = LOAD 'input' as (dt, state, event);
+
{code}


> More document for Pig 0.12 new features
> ---
>
> Key: PIG-3503
> URL: https://issues.apache.org/jira/browse/PIG-3503
> Project: Pig
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.12.0
>
> Attachments: PIG-3503-1.patch
>
>






[jira] [Updated] (PIG-3503) More document for Pig 0.12 new features

2013-10-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3503:


Attachment: PIG-3503-1.patch

> More document for Pig 0.12 new features
> ---
>
> Key: PIG-3503
> URL: https://issues.apache.org/jira/browse/PIG-3503
> Project: Pig
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.12.0
>
> Attachments: PIG-3503-1.patch
>
>



