[jira] [Commented] (PIG-2620) Customizable Error Handling in Pig

2013-12-04 Thread Russell Jurney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839875#comment-13839875
 ] 

Russell Jurney commented on PIG-2620:
-

This ticket is the future of data processing. Who do we have to bribe to get 
this built?

> Customizable Error Handling in Pig
> --
>
> Key: PIG-2620
> URL: https://issues.apache.org/jira/browse/PIG-2620
> Project: Pig
>  Issue Type: New Feature
>Reporter: Dmitriy V. Ryaboy
>
> The current behavior of Pig when handling exceptions thrown by UDFs is to 
> fail and stop processing. We want to extend this behavior to let users have 
> finer-grained control over error handling.
> Depending on the use case, there are several options users would like to have:
> * Stop the execution and report an error
> * Ignore tuples that cause exceptions and log warnings
> * Ignore tuples that cause exceptions and redirect them to an error relation 
> (to enable statistics, debugging, ...)
> * Write their own error handler
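
To make the options in the description concrete, here is a minimal sketch in plain Java. It is not Pig's API and not a proposed patch: the `Policy` enum, the `Result` holder, and the `apply` method are hypothetical names invented for illustration, and a "UDF" is modelled as an ordinary `Function`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class ErrorPolicySketch {
    /** Hypothetical policies mirroring the first three options in the ticket. */
    enum Policy { FAIL, SKIP_AND_WARN, REDIRECT }

    /** Hypothetical holder: good results, redirected inputs, and a warning count. */
    static final class Result<I, O> {
        final List<O> output = new ArrayList<>();
        final List<I> errors = new ArrayList<>();
        int warnings = 0;
    }

    /** Applies the function to every input and handles exceptions per the policy. */
    static <I, O> Result<I, O> apply(List<I> input, Function<I, O> udf, Policy policy) {
        Result<I, O> result = new Result<>();
        for (I tuple : input) {
            try {
                result.output.add(udf.apply(tuple));
            } catch (RuntimeException e) {
                switch (policy) {
                    case FAIL:
                        throw e;                   // stop the execution and report the error
                    case SKIP_AND_WARN:
                        result.warnings++;         // drop the tuple, count a warning
                        break;
                    case REDIRECT:
                        result.errors.add(tuple);  // keep the bad tuple in an "error relation"
                        break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("1", "x", "3");
        Result<String, Integer> r = apply(input, Integer::parseInt, Policy.REDIRECT);
        System.out.println("output=" + r.output + " errors=" + r.errors);
    }
}
```

The fourth option (a user-written handler) would amount to replacing the `switch` with a callback supplied by the user.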



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (PIG-3611) Add order by string, descending order e2e tests

2013-12-04 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-3611.
-

  Resolution: Fixed
Hadoop Flags: Reviewed

Patch committed to tez branch.

> Add order by string, descending order e2e tests
> ---
>
> Key: PIG-3611
> URL: https://issues.apache.org/jira/browse/PIG-3611
> Project: Pig
>  Issue Type: Sub-task
>  Components: tez
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: tez-branch
>
> Attachments: PIG-3611-1.patch
>
>
> Order by string, descending order works with PIG-3527. We shall add e2e tests.





[jira] [Commented] (PIG-3611) Add order by string, descending order e2e tests

2013-12-04 Thread Cheolsoo Park (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839683#comment-13839683
 ] 

Cheolsoo Park commented on PIG-3611:


+1.

> Add order by string, descending order e2e tests
> ---
>
> Key: PIG-3611
> URL: https://issues.apache.org/jira/browse/PIG-3611
> Project: Pig
>  Issue Type: Sub-task
>  Components: tez
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: tez-branch
>
> Attachments: PIG-3611-1.patch
>
>
> Order by string, descending order works with PIG-3527. We shall add e2e tests.





[jira] [Updated] (PIG-3610) Fix e2e tests Operators_3, Operators_5

2013-12-04 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3610:


Description: Operators_3, Operators_5 failed after PIG-3565 checkin. We 
shall fix those.  (was: Operators_3, Operators_5, Join_1 failed after PIG-3565 
checkin. We shall fix those.)

> Fix e2e tests Operators_3, Operators_5
> --
>
> Key: PIG-3610
> URL: https://issues.apache.org/jira/browse/PIG-3610
> Project: Pig
>  Issue Type: Sub-task
>  Components: impl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: tez-branch
>
> Attachments: PIG-3610-1.patch
>
>
> Operators_3, Operators_5 failed after PIG-3565 checkin. We shall fix those.





[jira] [Updated] (PIG-3610) Fix e2e tests Operators_3, Operators_5

2013-12-04 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3610:


Summary: Fix e2e tests Operators_3, Operators_5  (was: Fix e2e tests 
Operators_3, Operators_5, Join_1)

> Fix e2e tests Operators_3, Operators_5
> --
>
> Key: PIG-3610
> URL: https://issues.apache.org/jira/browse/PIG-3610
> Project: Pig
>  Issue Type: Sub-task
>  Components: impl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: tez-branch
>
> Attachments: PIG-3610-1.patch
>
>
> Operators_3, Operators_5, Join_1 failed after PIG-3565 checkin. We shall fix 
> those.





[jira] [Updated] (PIG-3611) Add order by string, descending order e2e tests

2013-12-04 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3611:


Attachment: PIG-3611-1.patch

> Add order by string, descending order e2e tests
> ---
>
> Key: PIG-3611
> URL: https://issues.apache.org/jira/browse/PIG-3611
> Project: Pig
>  Issue Type: Sub-task
>  Components: tez
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: tez-branch
>
> Attachments: PIG-3611-1.patch
>
>
> Order by string, descending order works with PIG-3527. We shall add e2e tests.





[jira] Subscription: PIG patch available

2013-12-04 Thread jira
Issue Subscription
Filter: PIG patch available (8 issues)

Subscriber: pigdaily

Key         Summary
PIG-3592    Should not try to create success file for non-fs schemes like hbase
            https://issues.apache.org/jira/browse/PIG-3592
PIG-3587    add functionality for rolling over dates
            https://issues.apache.org/jira/browse/PIG-3587
PIG-3573    Provide StoreFunc and LoadFunc for Accumulo
            https://issues.apache.org/jira/browse/PIG-3573
PIG-3572    Fix all unit test for during build pig with Hadoop 2.X on Windows.
            https://issues.apache.org/jira/browse/PIG-3572
PIG-3453    Implement a Storm backend to Pig
            https://issues.apache.org/jira/browse/PIG-3453
PIG-3441    Allow Pig to use default resources from Configuration objects
            https://issues.apache.org/jira/browse/PIG-3441
PIG-3347    Store invocation brings side effect
            https://issues.apache.org/jira/browse/PIG-3347
PIG-2629    Wrong Usage of Scalar which is null causes high namenode operation
            https://issues.apache.org/jira/browse/PIG-2629

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384


[jira] [Commented] (PIG-2594) JsonLoader/JsonStorage does not work with boolean

2013-12-04 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839557#comment-13839557
 ] 

Rohini Palaniswamy commented on PIG-2594:
-

[~pgroudas],
  This patch has become outdated. Can you rebase and resubmit your patch? 
Please ensure that you click on "Submit Patch" next time after uploading a 
patch, as it puts it in the Patch Available list and ensures that someone takes 
a look and gets it committed.

> JsonLoader/JsonStorage does not work with boolean
> -
>
> Key: PIG-2594
> URL: https://issues.apache.org/jira/browse/PIG-2594
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.10.0
>Reporter: Daniel Dai
> Attachments: PIG-2594-2.patch, PIG-2594.patch, PIG-2594.patch
>
>
> The following script fails:
> {code}
> A = LOAD 'allscalar10k.json' using JsonLoader();
> store B into 'output';
> {code}
> Exception:
> java.io.IOException: Unknown type in input schema: 5
>   at org.apache.pig.builtin.JsonLoader.readField(JsonLoader.java:292)
>   at org.apache.pig.builtin.JsonLoader.getNext(JsonLoader.java:157)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:187)
>   at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
>   at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)





[jira] [Created] (PIG-3611) Add order by string, descending order e2e tests

2013-12-04 Thread Daniel Dai (JIRA)
Daniel Dai created PIG-3611:
---

 Summary: Add order by string, descending order e2e tests
 Key: PIG-3611
 URL: https://issues.apache.org/jira/browse/PIG-3611
 Project: Pig
  Issue Type: Sub-task
  Components: tez
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: tez-branch


Order by string, descending order works with PIG-3527. We shall add e2e tests.





[jira] [Resolved] (PIG-3610) Fix e2e tests Operators_3, Operators_5, Join_1

2013-12-04 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-3610.
-

  Resolution: Fixed
Hadoop Flags: Reviewed

Patch committed to tez branch. Thanks Cheolsoo for quick review!

> Fix e2e tests Operators_3, Operators_5, Join_1
> --
>
> Key: PIG-3610
> URL: https://issues.apache.org/jira/browse/PIG-3610
> Project: Pig
>  Issue Type: Sub-task
>  Components: impl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: tez-branch
>
> Attachments: PIG-3610-1.patch
>
>
> Operators_3, Operators_5, Join_1 failed after PIG-3565 checkin. We shall fix 
> those.





[jira] [Commented] (PIG-3610) Fix e2e tests Operators_3, Operators_5, Join_1

2013-12-04 Thread Cheolsoo Park (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839513#comment-13839513
 ] 

Cheolsoo Park commented on PIG-3610:


+1.

Join_1 still fails intermittently, but that's expected and not in the scope 
of this fix.

> Fix e2e tests Operators_3, Operators_5, Join_1
> --
>
> Key: PIG-3610
> URL: https://issues.apache.org/jira/browse/PIG-3610
> Project: Pig
>  Issue Type: Sub-task
>  Components: impl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: tez-branch
>
> Attachments: PIG-3610-1.patch
>
>
> Operators_3, Operators_5, Join_1 failed after PIG-3565 checkin. We shall fix 
> those.





[jira] [Updated] (PIG-3610) Fix e2e tests Operators_3, Operators_5, Join_1

2013-12-04 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3610:


Attachment: PIG-3610-1.patch

The problem is fixed if we set the partitioner on the edge instead of the 
vertex. Not sure why it worked before PIG-3565.

> Fix e2e tests Operators_3, Operators_5, Join_1
> --
>
> Key: PIG-3610
> URL: https://issues.apache.org/jira/browse/PIG-3610
> Project: Pig
>  Issue Type: Sub-task
>  Components: impl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: tez-branch
>
> Attachments: PIG-3610-1.patch
>
>
> Operators_3, Operators_5, Join_1 failed after PIG-3565 checkin. We shall fix 
> those.





[jira] [Created] (PIG-3610) Fix e2e tests Operators_3, Operators_5, Join_1

2013-12-04 Thread Daniel Dai (JIRA)
Daniel Dai created PIG-3610:
---

 Summary: Fix e2e tests Operators_3, Operators_5, Join_1
 Key: PIG-3610
 URL: https://issues.apache.org/jira/browse/PIG-3610
 Project: Pig
  Issue Type: Sub-task
  Components: impl
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: tez-branch


Operators_3, Operators_5, Join_1 failed after PIG-3565 checkin. We shall fix 
those.





[jira] [Created] (PIG-3609) ClassCastException when calling compareTo method on AvroBagWrapper 

2013-12-04 Thread Richard Ding (JIRA)
Richard Ding created PIG-3609:
-

 Summary: ClassCastException when calling compareTo method on 
AvroBagWrapper 
 Key: PIG-3609
 URL: https://issues.apache.org/jira/browse/PIG-3609
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.12.0
Reporter: Richard Ding
Priority: Minor


The following exception is thrown when calling the compareTo method on an 
AvroBagWrapper with another AvroBagWrapper object as the argument:

{code}
java.lang.ClassCastException: org.apache.pig.impl.util.avro.AvroBagWrapper incompatible with java.util.Collection
at org.apache.avro.generic.GenericData.compare(GenericData.java:786)
at org.apache.avro.generic.GenericData.compare(GenericData.java:760)
at org.apache.pig.impl.util.avro.AvroBagWrapper.compareTo(AvroBagWrapper.java:78)
{code}

Looking at the code, it compares objects with different types:

{code}
return GenericData.get().compare(theArray, o, theArray.getSchema());
{code}
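
One possible direction, sketched in plain Java without the Avro or Pig classes, is to unwrap the argument before comparing so that both sides have the same underlying type. `BagWrapper` and `compareLists` below are hypothetical stand-ins, not the actual AvroBagWrapper code, and this may not be how the issue ends up being fixed.

```java
import java.util.Arrays;
import java.util.List;

public class WrapperCompareSketch {
    /** Hypothetical stand-in for a wrapper around an underlying array-like value. */
    static final class BagWrapper implements Comparable<Object> {
        private final List<Integer> theArray;

        BagWrapper(List<Integer> theArray) {
            this.theArray = theArray;
        }

        @Override
        public int compareTo(Object o) {
            // Unwrap first, so a wrapper is never compared directly against a raw list.
            Object other = (o instanceof BagWrapper) ? ((BagWrapper) o).theArray : o;
            if (!(other instanceof List)) {
                throw new ClassCastException("Cannot compare with " + o.getClass().getName());
            }
            return compareLists(theArray, (List<?>) other);
        }
    }

    /** Element-by-element comparison; the shorter list sorts first on a tie. */
    static int compareLists(List<Integer> a, List<?> b) {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            int c = a.get(i).compareTo((Integer) b.get(i));
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.size(), b.size());
    }

    public static void main(String[] args) {
        BagWrapper left = new BagWrapper(Arrays.asList(1, 2));
        BagWrapper right = new BagWrapper(Arrays.asList(1, 3));
        System.out.println(left.compareTo(right) < 0);
    }
}
```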





[jira] [Created] (PIG-3608) ClassCastException when looking up a value from AvroMapWrapper using a Utf8 key

2013-12-04 Thread Richard Ding (JIRA)
Richard Ding created PIG-3608:
-

 Summary: ClassCastException when looking up a value from 
AvroMapWrapper using a Utf8 key
 Key: PIG-3608
 URL: https://issues.apache.org/jira/browse/PIG-3608
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.12.0
Reporter: Richard Ding
Priority: Minor


The following exception is thrown when looking up a value from an 
AvroMapWrapper using a Utf8 key:

{code}
java.lang.ClassCastException: org.apache.avro.util.Utf8 incompatible with java.lang.String
at org.apache.pig.impl.util.avro.AvroMapWrapper.get(AvroMapWrapper.java:80)
{code}

This is related to the change by PIG-3420.
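
As a rough illustration of the failure mode, the sketch below uses a `StringBuilder` as a stand-in for any non-String `CharSequence` key (Avro's Utf8 is one such type). The `lookup` helper is hypothetical and assumes the backing map is keyed by `String`; it is not the actual AvroMapWrapper code.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyLookupSketch {
    /**
     * Hypothetical lookup helper: normalizes CharSequence keys with toString()
     * instead of casting to String, which would throw ClassCastException
     * for a non-String key.
     */
    static Object lookup(Map<String, Object> map, Object key) {
        if (key instanceof CharSequence) {
            return map.get(key.toString());
        }
        return map.get(key);
    }

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<>();
        map.put("flight_id", 42);
        // A StringBuilder is a CharSequence but not a String.
        Object key = new StringBuilder("flight_id");
        System.out.println(lookup(map, key));
    }
}
```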





[jira] [Created] (PIG-3607) PigRecordReader should report progress for each inputsplit processed

2013-12-04 Thread Rohini Palaniswamy (JIRA)
Rohini Palaniswamy created PIG-3607:
---

 Summary: PigRecordReader should report progress for each 
inputsplit processed
 Key: PIG-3607
 URL: https://issues.apache.org/jira/browse/PIG-3607
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11.1
Reporter: Rohini Palaniswamy
Assignee: Rohini Palaniswamy


Currently progress() is called only when records are processed. In a case 
where there were a lot of empty input files, the task timed out and was killed 
because no progress was reported. Having too many empty input files is bad, 
but we still don't want the task to fail.
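
A minimal sketch of the idea in plain Java, without any Hadoop or Pig classes: progress is reported once per split before its records are read, so a run of empty splits still counts as progress. `ProgressReporter` and `readAll` are hypothetical names for illustration, not the PigRecordReader implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ProgressSketch {
    /** Hypothetical callback standing in for the framework's progress hook. */
    interface ProgressReporter {
        void progress();
    }

    /** Reads every split, reporting progress per split as well as per record. */
    static int readAll(List<List<String>> splits, ProgressReporter reporter) {
        int records = 0;
        for (List<String> split : splits) {
            reporter.progress(); // reported even when the split is empty
            for (String record : split) {
                records++;
                reporter.progress(); // existing behaviour: once per record
            }
        }
        return records;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        List<List<String>> splits = Arrays.asList(
                Collections.<String>emptyList(),
                Collections.<String>emptyList(),
                Arrays.asList("a", "b"));
        int records = readAll(splits, () -> calls[0]++);
        System.out.println("records=" + records + " progressCalls=" + calls[0]);
    }
}
```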





[jira] [Commented] (PIG-3406) JsonStorage generates NPE after successfully saving records

2013-12-04 Thread mcouper99 (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838924#comment-13838924
 ] 

mcouper99 commented on PIG-3406:


I'm not sure if this is related, but I ran into the issue this morning.  In 
looking at the generated schema, there does appear to be a 'null' name in the 
output:

{"fields":[
{"name":null,"type":15,"description":"autogenerated from Pig Field Schema","schema":null},
{"name":"group::valid_events::flight_id","type":15,"description":"autogenerated from Pig Field Schema","schema":null},
{"name":"imps","type":15,"description":"autogenerated from Pig Field Schema","schema":null},
{"name":"clicks","type":15,"description":"autogenerated from Pig Field Schema","schema":null}
],"version":0,"sortKeys":[],"sortKeyOrders":[]}

In this case, I suspect it has something to do with the fact that I'm using a 
derived field in a group-by that doesn't have an associated alias:

group events by ((long)epoch/360 * 3600, flight_id);


> JsonStorage generates NPE after successfully saving records
> ---
>
> Key: PIG-3406
> URL: https://issues.apache.org/jira/browse/PIG-3406
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.11.1
> Environment: OSX 10.8.4, running PigServer in local mode from a 
> Groovy script.
>Reporter: Kirk Stork
>
> Make a new PigServer("local") instance, process some queries and save an 
> alias using
> PigServer piggy = new PigServer("local")
> ... some queries..
> def job = piggy.store("c_observer_id", "obs.json", 
> 'org.apache.pig.builtin.JsonStorage');
> The local directory obs.json is created and populated with correct results.
> but then an NPE is thrown, apparently during some attempt to store the schema.
> HadoopVersion  PigVersion  UserId  StartedAt            FinishedAt           Features
> 1.1.0          0.11.1      kirk    2013-08-01 14:09:41  2013-08-01 14:09:42  GROUP_BY
> Success!
> Job Stats (time in seconds):
> JobId           Alias                            Feature            Outputs
> job_local_0001  c_observer_id,g_observer_id,obs  GROUP_BY,COMBINER  obs.json,
> Input(s):
> Successfully read records from: 
> "/Volumes/Work/work/combatxxi-acceptance-testing/pig-input/Ambush_Mine_RPG-7.cxxi/Replication_1_SIMKIT_CONGRUENTIAL/ObserveLogger.log"
> Output(s):
> Successfully stored records in: "obs.json"
> Job DAG:
> job_local_0001
> org.apache.pig.PigException: ERROR 1002: Unable to store alias c_observer_id
>   at org.apache.pig.PigServer.storeEx(PigServer.java:935)
>   at org.apache.pig.PigServer.store(PigServer.java:898)
>   at org.apache.pig.PigServer$store.call(Unknown Source)
>   at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
>   at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
>   at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:124)
>   at edu.nps.cxxi.testbench.lff33.ObserveLoggerService.oink(ObserveLoggerService.groovy:27)
>   at edu.nps.cxxi.testbench.lff33.ObserveLoggerService$oink.call(Unknown Source)
>   at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
>   at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
>   at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
>   at edu.nps.cxxi.testbench.lff33.ObserveLoggerServiceTests.testSomething(ObserveLoggerServiceTests.groovy:42)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>   at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>   at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
>   at org.junit.runners.ParentRunner.runChildren(Pa