[jira] [Commented] (PIG-2620) Customizable Error Handling in Pig
[ https://issues.apache.org/jira/browse/PIG-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839875#comment-13839875 ]

Russell Jurney commented on PIG-2620:
-------------------------------------

This ticket is the future of data processing. Who do we have to bribe to get this built?

> Customizable Error Handling in Pig
> ----------------------------------
>
>              Key: PIG-2620
>              URL: https://issues.apache.org/jira/browse/PIG-2620
>          Project: Pig
>       Issue Type: New Feature
>         Reporter: Dmitriy V. Ryaboy
>
> The current behavior of Pig when handling exceptions thrown by UDFs is to
> fail and stop processing. We want to extend this behavior to let users have
> finer-grained control over error handling.
> Depending on the use case, there are several options users would like to have:
> - Stop the execution and report an error
> - Ignore tuples that cause exceptions and log warnings
> - Ignore tuples that cause exceptions and redirect them to an error relation
>   (to enable statistics, debugging, ...)
> - Write their own error handler

--
This message was sent by Atlassian JIRA
(v6.1#6144)
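The first three options in the issue description can be illustrated with a small, self-contained sketch. This is not Pig code and does not reflect any design agreed on in PIG-2620: the names ErrorHandlingSketch, Policy, Result and apply are hypothetical, the "UDF" is a plain java.util.function.Function, and the "error relation" is just a list of rejected inputs. The fourth option (a user-written handler) is not covered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical illustration of the PIG-2620 options; not part of Pig's API. */
final class ErrorHandlingSketch {

    /** Assumed names for the first three options in the issue description. */
    enum Policy { FAIL, IGNORE_AND_WARN, REDIRECT }

    /** Outcome of one run: processed values, rejected inputs, and warnings. */
    static final class Result<I, O> {
        final List<O> output = new ArrayList<>();
        final List<I> errors = new ArrayList<>();       // stands in for an "error relation"
        final List<String> warnings = new ArrayList<>();
    }

    /** Applies the function to every input and handles failures per the policy. */
    static <I, O> Result<I, O> apply(List<I> input, Function<I, O> udf, Policy policy) {
        Result<I, O> result = new Result<>();
        for (I tuple : input) {
            try {
                result.output.add(udf.apply(tuple));
            } catch (RuntimeException e) {
                switch (policy) {
                    case FAIL:
                        throw e;                        // stop and report the error
                    case IGNORE_AND_WARN:
                        result.warnings.add("Skipped " + tuple + ": " + e);
                        break;
                    case REDIRECT:
                        result.errors.add(tuple);       // keep the bad input for later inspection
                        break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Function<String, Integer> parse = Integer::parseInt;
        Result<String, Integer> r =
                apply(Arrays.asList("1", "x", "3"), parse, Policy.REDIRECT);
        System.out.println(r.output + " / " + r.errors);
    }
}
```

With Policy.REDIRECT the unparseable input ends up in the errors list while the remaining inputs are still processed; with Policy.FAIL the first exception propagates, which matches the current behavior described in the issue.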
[jira] [Resolved] (PIG-3611) Add order by string, descending order e2e tests
[ https://issues.apache.org/jira/browse/PIG-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-3611.
-----------------------------

      Resolution: Fixed
    Hadoop Flags: Reviewed

Patch committed to tez branch.

> Add order by string, descending order e2e tests
> -----------------------------------------------
>
>              Key: PIG-3611
>              URL: https://issues.apache.org/jira/browse/PIG-3611
>          Project: Pig
>       Issue Type: Sub-task
>       Components: tez
>         Reporter: Daniel Dai
>         Assignee: Daniel Dai
>          Fix For: tez-branch
>
>      Attachments: PIG-3611-1.patch
>
>
> Order by string, descending order works with PIG-3527. We shall add e2e tests.
[jira] [Commented] (PIG-3611) Add order by string, descending order e2e tests
[ https://issues.apache.org/jira/browse/PIG-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839683#comment-13839683 ]

Cheolsoo Park commented on PIG-3611:
------------------------------------

+1.

> Add order by string, descending order e2e tests
> -----------------------------------------------
>
>              Key: PIG-3611
>              URL: https://issues.apache.org/jira/browse/PIG-3611
>          Project: Pig
>       Issue Type: Sub-task
>       Components: tez
>         Reporter: Daniel Dai
>         Assignee: Daniel Dai
>          Fix For: tez-branch
>
>      Attachments: PIG-3611-1.patch
>
>
> Order by string, descending order works with PIG-3527. We shall add e2e tests.
[jira] [Updated] (PIG-3610) Fix e2e tests Operators_3, Operators_5
[ https://issues.apache.org/jira/browse/PIG-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-3610:
----------------------------

    Description: Operators_3, Operators_5 failed after PIG-3565 checkin. We shall fix those.  (was: Operators_3, Operators_5, Join_1 failed after PIG-3565 checkin. We shall fix those.)

> Fix e2e tests Operators_3, Operators_5
> --------------------------------------
>
>              Key: PIG-3610
>              URL: https://issues.apache.org/jira/browse/PIG-3610
>          Project: Pig
>       Issue Type: Sub-task
>       Components: impl
>         Reporter: Daniel Dai
>         Assignee: Daniel Dai
>          Fix For: tez-branch
>
>      Attachments: PIG-3610-1.patch
>
>
> Operators_3, Operators_5 failed after PIG-3565 checkin. We shall fix those.
[jira] [Updated] (PIG-3610) Fix e2e tests Operators_3, Operators_5
[ https://issues.apache.org/jira/browse/PIG-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-3610:
----------------------------

    Summary: Fix e2e tests Operators_3, Operators_5  (was: Fix e2e tests Operators_3, Operators_5, Join_1)

> Fix e2e tests Operators_3, Operators_5
> --------------------------------------
>
>              Key: PIG-3610
>              URL: https://issues.apache.org/jira/browse/PIG-3610
>          Project: Pig
>       Issue Type: Sub-task
>       Components: impl
>         Reporter: Daniel Dai
>         Assignee: Daniel Dai
>          Fix For: tez-branch
>
>      Attachments: PIG-3610-1.patch
>
>
> Operators_3, Operators_5, Join_1 failed after PIG-3565 checkin. We shall fix
> those.
[jira] [Updated] (PIG-3611) Add order by string, descending order e2e tests
[ https://issues.apache.org/jira/browse/PIG-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-3611:
----------------------------

    Attachment: PIG-3611-1.patch

> Add order by string, descending order e2e tests
> -----------------------------------------------
>
>              Key: PIG-3611
>              URL: https://issues.apache.org/jira/browse/PIG-3611
>          Project: Pig
>       Issue Type: Sub-task
>       Components: tez
>         Reporter: Daniel Dai
>         Assignee: Daniel Dai
>          Fix For: tez-branch
>
>      Attachments: PIG-3611-1.patch
>
>
> Order by string, descending order works with PIG-3527. We shall add e2e tests.
[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (8 issues)

Subscriber: pigdaily

Key         Summary
PIG-3592    Should not try to create success file for non-fs schemes like hbase
            https://issues.apache.org/jira/browse/PIG-3592
PIG-3587    add functionality for rolling over dates
            https://issues.apache.org/jira/browse/PIG-3587
PIG-3573    Provide StoreFunc and LoadFunc for Accumulo
            https://issues.apache.org/jira/browse/PIG-3573
PIG-3572    Fix all unit test for during build pig with Hadoop 2.X on Windows.
            https://issues.apache.org/jira/browse/PIG-3572
PIG-3453    Implement a Storm backend to Pig
            https://issues.apache.org/jira/browse/PIG-3453
PIG-3441    Allow Pig to use default resources from Configuration objects
            https://issues.apache.org/jira/browse/PIG-3441
PIG-3347    Store invocation brings side effect
            https://issues.apache.org/jira/browse/PIG-3347
PIG-2629    Wrong Usage of Scalar which is null causes high namenode operation
            https://issues.apache.org/jira/browse/PIG-2629

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384
[jira] [Commented] (PIG-2594) JsonLoader/JsonStorage does not work with boolean
[ https://issues.apache.org/jira/browse/PIG-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839557#comment-13839557 ]

Rohini Palaniswamy commented on PIG-2594:
-----------------------------------------

[~pgroudas],
  This patch has become outdated. Can you rebase and resubmit your patch? Please make sure to click "Submit Patch" after uploading a patch next time: that puts the issue in the Patch Available list, so that someone takes a look and gets it committed.

> JsonLoader/JsonStorage does not work with boolean
> -------------------------------------------------
>
>              Key: PIG-2594
>              URL: https://issues.apache.org/jira/browse/PIG-2594
>          Project: Pig
>       Issue Type: Bug
>       Components: impl
> Affects Versions: 0.10.0
>         Reporter: Daniel Dai
>      Attachments: PIG-2594-2.patch, PIG-2594.patch, PIG-2594.patch
>
>
> The following script fails:
> {code}
> A = LOAD 'allscalar10k.json' using JsonLoader();
> store B into 'output';
> {code}
> Exception:
> java.io.IOException: Unknown type in input schema: 5
>     at org.apache.pig.builtin.JsonLoader.readField(JsonLoader.java:292)
>     at org.apache.pig.builtin.JsonLoader.getNext(JsonLoader.java:157)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:187)
>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
[jira] [Created] (PIG-3611) Add order by string, descending order e2e tests
Daniel Dai created PIG-3611:
----------------------------

         Summary: Add order by string, descending order e2e tests
             Key: PIG-3611
             URL: https://issues.apache.org/jira/browse/PIG-3611
         Project: Pig
      Issue Type: Sub-task
      Components: tez
        Reporter: Daniel Dai
        Assignee: Daniel Dai
         Fix For: tez-branch


Order by string, descending order works with PIG-3527. We shall add e2e tests.
[jira] [Resolved] (PIG-3610) Fix e2e tests Operators_3, Operators_5, Join_1
[ https://issues.apache.org/jira/browse/PIG-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-3610.
-----------------------------

      Resolution: Fixed
    Hadoop Flags: Reviewed

Patch committed to tez branch. Thanks Cheolsoo for quick review!

> Fix e2e tests Operators_3, Operators_5, Join_1
> ----------------------------------------------
>
>              Key: PIG-3610
>              URL: https://issues.apache.org/jira/browse/PIG-3610
>          Project: Pig
>       Issue Type: Sub-task
>       Components: impl
>         Reporter: Daniel Dai
>         Assignee: Daniel Dai
>          Fix For: tez-branch
>
>      Attachments: PIG-3610-1.patch
>
>
> Operators_3, Operators_5, Join_1 failed after PIG-3565 checkin. We shall fix
> those.
[jira] [Commented] (PIG-3610) Fix e2e tests Operators_3, Operators_5, Join_1
[ https://issues.apache.org/jira/browse/PIG-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839513#comment-13839513 ]

Cheolsoo Park commented on PIG-3610:
------------------------------------

+1.

Join_1 still fails intermittently, but that's expected and outside the scope of this fix.

> Fix e2e tests Operators_3, Operators_5, Join_1
> ----------------------------------------------
>
>              Key: PIG-3610
>              URL: https://issues.apache.org/jira/browse/PIG-3610
>          Project: Pig
>       Issue Type: Sub-task
>       Components: impl
>         Reporter: Daniel Dai
>         Assignee: Daniel Dai
>          Fix For: tez-branch
>
>      Attachments: PIG-3610-1.patch
>
>
> Operators_3, Operators_5, Join_1 failed after PIG-3565 checkin. We shall fix
> those.
[jira] [Updated] (PIG-3610) Fix e2e tests Operators_3, Operators_5, Join_1
[ https://issues.apache.org/jira/browse/PIG-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-3610:
----------------------------

    Attachment: PIG-3610-1.patch

The problem is fixed if we set the partitioner on the edge instead of the vertex. Not sure why it worked before PIG-3565.

> Fix e2e tests Operators_3, Operators_5, Join_1
> ----------------------------------------------
>
>              Key: PIG-3610
>              URL: https://issues.apache.org/jira/browse/PIG-3610
>          Project: Pig
>       Issue Type: Sub-task
>       Components: impl
>         Reporter: Daniel Dai
>         Assignee: Daniel Dai
>          Fix For: tez-branch
>
>      Attachments: PIG-3610-1.patch
>
>
> Operators_3, Operators_5, Join_1 failed after PIG-3565 checkin. We shall fix
> those.
[jira] [Created] (PIG-3610) Fix e2e tests Operators_3, Operators_5, Join_1
Daniel Dai created PIG-3610:
----------------------------

         Summary: Fix e2e tests Operators_3, Operators_5, Join_1
             Key: PIG-3610
             URL: https://issues.apache.org/jira/browse/PIG-3610
         Project: Pig
      Issue Type: Sub-task
      Components: impl
        Reporter: Daniel Dai
        Assignee: Daniel Dai
         Fix For: tez-branch


Operators_3, Operators_5, Join_1 failed after PIG-3565 checkin. We shall fix those.
[jira] [Created] (PIG-3609) ClassCastException when calling compareTo method on AvroBagWrapper
Richard Ding created PIG-3609:
------------------------------

         Summary: ClassCastException when calling compareTo method on AvroBagWrapper
             Key: PIG-3609
             URL: https://issues.apache.org/jira/browse/PIG-3609
         Project: Pig
      Issue Type: Bug
      Components: impl
Affects Versions: 0.12.0
        Reporter: Richard Ding
        Priority: Minor


Calling the compareTo method on an AvroBagWrapper with another AvroBagWrapper object as the argument throws the following exception:

{code}
java.lang.ClassCastException: org.apache.pig.impl.util.avro.AvroBagWrapper incompatible with java.util.Collection
    at org.apache.avro.generic.GenericData.compare(GenericData.java:786)
    at org.apache.avro.generic.GenericData.compare(GenericData.java:760)
    at org.apache.pig.impl.util.avro.AvroBagWrapper.compareTo(AvroBagWrapper.java:78)
{code}

Looking at the code, it compares objects of different types: the wrapped array on one side and the AvroBagWrapper itself on the other.

{code}
return GenericData.get().compare(theArray, o, theArray.getSchema());
{code}
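One possible direction, shown here as a self-contained sketch rather than as the actual patch, is to unwrap the argument before comparing so that both sides have the same type. BagWrapperSketch is a hypothetical stand-in for AvroBagWrapper: it wraps a plain List<Integer> instead of an Avro array, and a simple element-by-element comparison stands in for GenericData.compare.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a bag wrapper; not Pig's AvroBagWrapper. */
final class BagWrapperSketch implements Comparable<Object> {
    private final List<Integer> theArray;

    BagWrapperSketch(List<Integer> theArray) {
        this.theArray = theArray;
    }

    @Override
    public int compareTo(Object o) {
        // Unwrap another wrapper first, so a list is always compared with a list.
        Object unwrapped = (o instanceof BagWrapperSketch) ? ((BagWrapperSketch) o).theArray : o;
        @SuppressWarnings("unchecked")
        List<Integer> other = (List<Integer>) unwrapped;
        return compareLists(theArray, other);
    }

    /** Element-by-element comparison; when one list is a prefix of the other, the shorter sorts first. */
    private static int compareLists(List<Integer> a, List<Integer> b) {
        Iterator<Integer> x = a.iterator();
        Iterator<Integer> y = b.iterator();
        while (x.hasNext() && y.hasNext()) {
            int c = x.next().compareTo(y.next());
            if (c != 0) {
                return c;
            }
        }
        return Boolean.compare(x.hasNext(), y.hasNext());
    }

    public static void main(String[] args) {
        BagWrapperSketch a = new BagWrapperSketch(Arrays.asList(1, 2));
        BagWrapperSketch b = new BagWrapperSketch(Arrays.asList(1, 3));
        System.out.println(a.compareTo(b) < 0);                 // wrapper vs. wrapper
        System.out.println(a.compareTo(Arrays.asList(1, 2)));   // wrapper vs. raw list
    }
}
```

Without the unwrapping step, the cast to List in compareTo would fail for a wrapper argument in the same way the reported cast to java.util.Collection does.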
[jira] [Created] (PIG-3608) ClassCastException when looking up a value from AvroMapWrapper using a Utf8 key
Richard Ding created PIG-3608:
------------------------------

         Summary: ClassCastException when looking up a value from AvroMapWrapper using a Utf8 key
             Key: PIG-3608
             URL: https://issues.apache.org/jira/browse/PIG-3608
         Project: Pig
      Issue Type: Bug
      Components: impl
Affects Versions: 0.12.0
        Reporter: Richard Ding
        Priority: Minor


One got the following exception:

{code}
java.lang.ClassCastException: org.apache.avro.util.Utf8 incompatible with java.lang.String
    at org.apache.pig.impl.util.avro.AvroMapWrapper.get(AvroMapWrapper.java:80)
{code}

This is related to the change by PIG-3420.
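The exception indicates that a Utf8 key is being cast to java.lang.String. Avro's Utf8 implements CharSequence, so one way to avoid the cast is to normalize any CharSequence key with toString() before the lookup. The sketch below only illustrates that idea and is not the AvroMapWrapper code or the eventual patch: MapWrapperSketch is a hypothetical class, and a StringBuilder stands in for a Utf8 key so that the example needs nothing beyond the JDK.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a map wrapper; not Pig's AvroMapWrapper. */
final class MapWrapperSketch {
    private final Map<String, Object> inner = new HashMap<>();

    /** Stores the value under the key's string form. */
    void put(CharSequence key, Object value) {
        inner.put(key.toString(), value);
    }

    /** Looks up a value without assuming the key is a java.lang.String. */
    Object get(Object key) {
        // A direct (String) cast here would throw ClassCastException
        // for any CharSequence that is not a String.
        if (key instanceof CharSequence) {
            return inner.get(key.toString());
        }
        return null; // keys of other types cannot match
    }

    public static void main(String[] args) {
        MapWrapperSketch map = new MapWrapperSketch();
        map.put("flight_id", 42);
        // StringBuilder plays the role of a non-String CharSequence key.
        System.out.println(map.get(new StringBuilder("flight_id")));
        System.out.println(map.get("flight_id"));
    }
}
```

Both lookups return the same value because the key type no longer matters once it has been converted to a String.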
[jira] [Created] (PIG-3607) PigRecordReader should report progress for each inputsplit processed
Rohini Palaniswamy created PIG-3607:
------------------------------------

         Summary: PigRecordReader should report progress for each inputsplit processed
             Key: PIG-3607
             URL: https://issues.apache.org/jira/browse/PIG-3607
         Project: Pig
      Issue Type: Bug
Affects Versions: 0.11.1
        Reporter: Rohini Palaniswamy
        Assignee: Rohini Palaniswamy


Currently progress() is called only when records are processed. In a case where there were a lot of empty input files, the task timed out and was killed because no progress was reported. Too many empty input files are bad, but we still don't want to fail.
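The requested behavior can be sketched independently of Hadoop: report progress once whenever a new split is opened, in addition to once per record, so that a long run of empty splits still produces progress calls. Everything in the sketch is hypothetical (ProgressPerSplitSketch, readAll, splits modeled as lists of strings, progress modeled as a Runnable); it is not the PigRecordReader implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of per-split progress reporting; not PigRecordReader. */
final class ProgressPerSplitSketch {

    /**
     * Reads every record of every split and returns the number of records read.
     * The progress callback runs once per split and once per record.
     */
    static int readAll(List<List<String>> splits, Runnable progress) {
        int records = 0;
        for (List<String> split : splits) {
            progress.run();          // reported even when the split turns out to be empty
            for (String record : split) {
                records++;
                progress.run();      // the existing per-record report
            }
        }
        return records;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        List<List<String>> splits = Arrays.asList(
                Collections.<String>emptyList(),
                Collections.<String>emptyList(),
                Arrays.asList("a", "b"));
        int records = readAll(splits, () -> calls[0]++);
        System.out.println(records + " records, " + calls[0] + " progress calls");
    }
}
```

With per-record reporting only, the two empty splits in the example would produce no progress calls at all, which is the situation that led to the task timeout described above.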
[jira] [Commented] (PIG-3406) JsonStorage generates NPE after successfully saving records
[ https://issues.apache.org/jira/browse/PIG-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838924#comment-13838924 ]

mcouper99 commented on PIG-3406:
--------------------------------

I'm not sure if this is related, but I ran into the issue this morning. In looking at the generated schema, there does appear to be a 'null' name in the output:

{"fields":[{"name":null,"type":15,"description":"autogenerated from Pig Field Schema","schema":null},{"name":"group::valid_events::flight_id","type":15,"description":"autogenerated from Pig Field Schema","schema":null},{"name":"imps","type":15,"description":"autogenerated from Pig Field Schema","schema":null},{"name":"clicks","type":15,"description":"autogenerated from Pig Field Schema","schema":null}],"version":0,"sortKeys":[],"sortKeyOrders":[]}

In this case, I suspect it has something to do with the fact that I'm using a derived field in a group-by that doesn't have an associated alias:

group events by ((long)epoch/360 * 3600, flight_id);

> JsonStorage generates NPE after successfully saving records
> -----------------------------------------------------------
>
>              Key: PIG-3406
>              URL: https://issues.apache.org/jira/browse/PIG-3406
>          Project: Pig
>       Issue Type: Bug
> Affects Versions: 0.11.1
>      Environment: OSX 10.8.4, running PigServer in local mode from a
> Groovy script.
>         Reporter: Kirk Stork
>
> Make a new PigServer("local") instance, process some queries and save an
> alias using
> PigServer piggy = new PigServer("local")
> ... some queries..
> def job = piggy.store("c_observer_id", "obs.json",
> 'org.apache.pig.builtin.JsonStorage');
> The local directory obs.json is created and populated with correct results.
> but then an NPE is thrown, apparently during some attempt to store the schema.
> HadoopVersion  PigVersion  UserId  StartedAt            FinishedAt           Features
> 1.1.0          0.11.1      kirk    2013-08-01 14:09:41  2013-08-01 14:09:42  GROUP_BY
> Success!
> Job Stats (time in seconds):
> JobId           Alias                            Feature            Outputs
> job_local_0001  c_observer_id,g_observer_id,obs  GROUP_BY,COMBINER  obs.json,
> Input(s):
> Successfully read records from:
> "/Volumes/Work/work/combatxxi-acceptance-testing/pig-input/Ambush_Mine_RPG-7.cxxi/Replication_1_SIMKIT_CONGRUENTIAL/ObserveLogger.log"
> Output(s):
> Successfully stored records in: "obs.json"
> Job DAG:
> job_local_0001
> org.apache.pig.PigException: ERROR 1002: Unable to store alias c_observer_id
>     at org.apache.pig.PigServer.storeEx(PigServer.java:935)
>     at org.apache.pig.PigServer.store(PigServer.java:898)
>     at org.apache.pig.PigServer$store.call(Unknown Source)
>     at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
>     at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
>     at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:124)
>     at edu.nps.cxxi.testbench.lff33.ObserveLoggerService.oink(ObserveLoggerService.groovy:27)
>     at edu.nps.cxxi.testbench.lff33.ObserveLoggerService$oink.call(Unknown Source)
>     at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
>     at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
>     at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
>     at edu.nps.cxxi.testbench.lff33.ObserveLoggerServiceTests.testSomething(ObserveLoggerServiceTests.groovy:42)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
>     at org.junit.runners.ParentRunner.runChildren(Pa