[jira] [Commented] (PIG-2101) Registering a Python function in a directory other than the current working directory fails
[ https://issues.apache.org/jira/browse/PIG-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045599#comment-13045599 ]

Daniel Eklund commented on PIG-2101:

As per: http://mail-archives.apache.org/mod_mbox/pig-user/201106.mbox/browser

Although it is implicit in the title of this bug, deserialization on the back end fails for any Python file registered on the front end from anything other than a relative directory. The functionality should support all directory references: absolute, parent-relative (e.g. '../..'), and child-relative.

> Registering a Python function in a directory other than the current working directory fails
> ---
>
> Key: PIG-2101
> URL: https://issues.apache.org/jira/browse/PIG-2101
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.8.1
> Reporter: Alan Gates
>
> In MapReduce mode, if the register command references a directory other than
> the current one, executing the Python UDF on the backend fails with:
> Deserialization error: could not instantiate
> 'org.apache.pig.scripting.jython.JythonFunction' with arguments
> '[../udfs/python/production.py, production]'
> I assume it is using the path on the backend to try to locate the UDF.
> The script is:
> {code}
> register '../udfs/python/production.py' using jython as bballudfs;
> players = load 'baseball' as (name:chararray, team:chararray,
>     pos:bag{t:(p:chararray)}, bat:map[]);
> nonnull = filter players by bat#'slugging_percentage' is not null and
>     bat#'on_base_percentage' is not null;
> calcprod = foreach nonnull generate name, bballudfs.production(
>     (float)bat#'slugging_percentage',
>     (float)bat#'on_base_percentage');
> dump calcprod;
> {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
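The script quoted above calls bballudfs.production, but the report does not include the source of production.py. As a rough sketch only, such a Jython UDF file might look like the following. The formula (slugging percentage plus on-base percentage) and the schema string are assumptions made for illustration, not taken from the report. Pig's Jython support supplies the outputSchema decorator when the file is registered; the fallback definition exists only so the file can also be run as ordinary Python.

```python
# production.py: hypothetical sketch of the UDF registered in the script above.

try:
    outputSchema  # provided by Pig when loaded via "register ... using jython"
except NameError:
    # Stand-in used only when running outside Pig (assumption for local testing).
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("production:float")
def production(slugging_pct, onbase_pct):
    # Assumed formula: the sum of the two percentages.
    return slugging_pct + onbase_pct


if __name__ == "__main__":
    print(production(0.5, 0.4))
```

Running the file directly only exercises the arithmetic; it says nothing about whether Pig can locate the file on the backend, which is the subject of this issue.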
[jira] [Commented] (PIG-2101) Registering a Python function in a directory other than the current working directory fails
[ https://issues.apache.org/jira/browse/PIG-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13043982#comment-13043982 ]

Daniel Eklund commented on PIG-2101:

I can actually still get it to work with a relative directory that does not involve '..'. For instance:

Register 'test/simple.py' as myNamespace;

where test is a subdirectory of the working directory. But any path with '..' fails.

It would also be nice to add something to the documentation about NOT using absolute paths.
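The comments on PIG-2101 distinguish three kinds of register path: absolute, parent-relative (containing a leading '..' once normalized), and child-relative. As a small illustration, a helper along these lines could label a path with one of those three names, for example when building a test matrix for the bug. The function name classify_register_path is hypothetical and is not part of Pig.

```python
import posixpath


def classify_register_path(path):
    """Label a script path as 'absolute', 'parent-relative' or 'child-relative'."""
    if posixpath.isabs(path):
        return "absolute"
    # normpath collapses segments such as 'a/../..' so a leading '..' is reliable.
    first_segment = posixpath.normpath(path).split("/")[0]
    if first_segment == "..":
        return "parent-relative"
    return "child-relative"


if __name__ == "__main__":
    for candidate in ("test/simple.py",
                      "../udfs/python/production.py",
                      "/opt/udfs/production.py"):
        print(candidate, "->", classify_register_path(candidate))
```

Of the three kinds, the thread reports that child-relative paths work and that paths containing '..' fail.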
[jira] [Commented] (PIG-2087) Not able to project into 'group' tuple from FILTER
[ https://issues.apache.org/jira/browse/PIG-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042362#comment-13042362 ]

Daniel Eklund commented on PIG-2087:

I was trying to project immediately out of a STRSPLIT(). Something like this:

STRSPLIT(timestamp, ' ', 1).$0

See our conversation here: http://mail-archives.apache.org/mod_mbox/pig-user/201105.mbox/%3c4dd2e5fe.9040...@yahoo-inc.com%3E

Unfortunately, due to the circumstances of the client, I am stuck using version 0.8. If I get a free moment, I can check it out on 0.8.1.

The above issue was the only one that made me go back to the old planner, but I found so many other problems there (not just the one documented here) that I went back AGAIN to the new one and used a workaround for the STRSPLIT() projection.

Thanks.

> Not able to project into 'group' tuple from FILTER
> ---
>
> Key: PIG-2087
> URL: https://issues.apache.org/jira/browse/PIG-2087
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Daniel Eklund
> Priority: Minor
> Labels: pig
>
> GROUP creates the 'group' tuple, but subsequent FILTER statements cannot
> project into it without throwing ClassCastExceptions.
> Example:
> ---
> my_data = LOAD 'test.txt' using PigStorage(',')
>     as (name:chararray, age:int, eye_color:chararray, height:int);
> by_age_and_color = GROUP my_data BY (age, eye_color);
> OUT2 = FILTER by_age_and_color by group.age is not null;
> dump OUT2;
> -- I get a similar problem even if I do something like:
> OUT3 = FILTER by_age_and_color by group.age > 9;
> dump OUT3;
> - sample test.txt -
> ravi,33,blue,43
> brendan,33,green,53
> ravichandra,15,blue,43
> leonor,15,brown,46
> caeser,18,blue,23
> JCVD,,blue,23
> anthony,33,blue,46
> xavier,23,blue,13
> patrick,18,blue,33
> sang,33,brown,44
> ---
> java.lang.ClassCastException: java.lang.Integer cannot be cast to org.apache.pig.data.Tuple
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:392)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:138)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:276)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.GreaterThanExpr.getNext(GreaterThanExpr.java:72)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:256)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:236)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:646)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
[jira] [Commented] (PIG-2087) Not able to project into 'group' tuple from FILTER
[ https://issues.apache.org/jira/browse/PIG-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039290#comment-13039290 ]

Daniel Eklund commented on PIG-2087:

Cloudera could not reproduce it, but together we narrowed it down and found that it was a result of the -Dpig.usenewlogicalplan=false setting I was using (I was actually using "set pig.usenewlogicalplan false").

Unfortunately, I need this setting as a workaround for another issue, and since this won't be an issue in later releases, I'll let you decide whether it needs any priority at all.

Thanks.
[jira] [Commented] (PIG-2087) Not able to project into 'group' tuple from FILTER
[ https://issues.apache.org/jira/browse/PIG-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038718#comment-13038718 ]

Daniel Eklund commented on PIG-2087:

https://issues.cloudera.org/browse/DISTRO-238
[jira] [Commented] (PIG-2087) Not able to project into 'group' tuple from FILTER
[ https://issues.apache.org/jira/browse/PIG-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038122#comment-13038122 ]

Daniel Eklund commented on PIG-2087:

Apache Pig version 0.8.0-cdh3u0 (rexported)

It looks like this was introduced by the Cloudera version of 0.8.0. Apologies about that. I had assumed that their distribution followed the Apache release to a tee.
[jira] [Created] (PIG-2087) Not able to project into 'group' tuple from FILTER
Not able to project into 'group' tuple from FILTER
---

Key: PIG-2087
URL: https://issues.apache.org/jira/browse/PIG-2087
Project: Pig
Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Daniel Eklund
Priority: Minor

GROUP creates the 'group' tuple, but subsequent FILTER statements cannot project into it without throwing ClassCastExceptions.

Example:
---
my_data = LOAD 'test.txt' using PigStorage(',')
    as (name:chararray, age:int, eye_color:chararray, height:int);
by_age_and_color = GROUP my_data BY (age, eye_color);
OUT2 = FILTER by_age_and_color by group.age is not null;
dump OUT2;
-- I get a similar problem even if I do something like:
OUT3 = FILTER by_age_and_color by group.age > 9;
dump OUT3;
- sample test.txt -
ravi,33,blue,43
brendan,33,green,53
ravichandra,15,blue,43
leonor,15,brown,46
caeser,18,blue,23
JCVD,,blue,23
anthony,33,blue,46
xavier,23,blue,13
patrick,18,blue,33
sang,33,brown,44
---
java.lang.ClassCastException: java.lang.Integer cannot be cast to org.apache.pig.data.Tuple
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:392)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:138)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:276)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.GreaterThanExpr.getNext(GreaterThanExpr.java:72)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:256)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:236)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:646)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
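For reference, the result the two FILTER statements in the report are presumably meant to produce can be worked out by hand from the sample test.txt. The sketch below reproduces that reasoning in plain Python rather than Pig. It is only a model of the intended semantics (group on the (age, eye_color) pair, then keep groups whose age is non-null or greater than 9), and all function names in it are made up for the illustration.

```python
import csv
import io
from collections import defaultdict

# The sample test.txt from the report.
SAMPLE = """ravi,33,blue,43
brendan,33,green,53
ravichandra,15,blue,43
leonor,15,brown,46
caeser,18,blue,23
JCVD,,blue,23
anthony,33,blue,46
xavier,23,blue,13
patrick,18,blue,33
sang,33,brown,44
"""


def load(text):
    """Parse rows as (name, age, eye_color, height); an empty field becomes None."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        name, age, eye_color, height = record
        rows.append((name,
                     int(age) if age else None,
                     eye_color,
                     int(height) if height else None))
    return rows


def group_by_age_and_color(rows):
    """Model of: GROUP my_data BY (age, eye_color)."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row[1], row[2])].append(row)
    return dict(groups)


def filter_on_group_age(groups, predicate):
    """Model of: FILTER ... BY <predicate on group.age>."""
    return {key: bag for key, bag in groups.items() if predicate(key[0])}


groups = group_by_age_and_color(load(SAMPLE))
out2 = filter_on_group_age(groups, lambda age: age is not None)
# A comparison against a null age is treated here as not passing the filter.
out3 = filter_on_group_age(groups, lambda age: age is not None and age > 9)

if __name__ == "__main__":
    for key in sorted(out2, key=str):
        print(key, [row[0] for row in out2[key]])
```

Under this model the sample data forms eight groups, and both filters drop only the group whose age is missing (the JCVD row), leaving seven.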