[ https://issues.apache.org/jira/browse/PIG-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605749#comment-13605749 ]
Daniel Dai commented on PIG-3049:
---------------------------------
This syntax should be supported. The error stack is on the map side, and explain shows:
{code}
MapReduce node scope-25
Map Plan
a1: Local Rearrange[tuple]{tuple}(false) - scope-10
| |
| Project[int][1] - scope-11
|
|---a: New For Each(false,false)[bag] - scope-7
| |
| Cast[chararray] - scope-2
| |
| |---Project[bytearray][0] - scope-1
| |
| Cast[int] - scope-5
| |
| |---Project[bytearray][1] - scope-4
|
|---a: Load(file:///Users/daijy/pig/words_and_numbers:org.apache.pig.builtin.PigStorage) - scope-0--------
Reduce Plan
b: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-24
|
|---b: New For Each(false,false)[bag] - scope-23
| |
| Project[int][0] - scope-12
| |
| POUserFunc(org.apache.pig.builtin.LongSum)[long] - scope-16
| |
| |---Project[bag][0] - scope-15
| |
| |---RelationToExpressionProject[bag][*] - scope-14
| |
| |---a_bag: New For Each(false)[bag] - scope-20
| | |
| | Project[chararray][0] - scope-18
| |
| |---Project[bag][1] - scope-17
|
|---a1: Package[tuple]{int} - scope-9--------
Global sort: false
Secondary sort: true
{code}
The key type for Local Rearrange is wrong, so this is a bug.
Johnny, are you still working on it?
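For anyone who wants to inspect the plan themselves, a minimal sketch follows. It assumes the script and the words_and_numbers file from the issue description below, run from the Grunt shell; the exact invocation used to produce the output above is not stated in this comment. The only change is that dump is replaced by explain.
{code}
-- Same script as in the issue description, with dump replaced by explain.
a = load 'words_and_numbers' as (word:chararray, number:int);
b = foreach (group a by number) {
    a_bag = a.word;
    ord = order a_bag by word;
    generate group, ord;
}
-- Prints the logical, physical, and MapReduce plans for alias b
-- without running the job.
explain b;
{code}
The Map Plan section of the output is where the Local Rearrange key type can be checked.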
> Cannot sort on a bag in nested foreach
> --------------------------------------
>
> Key: PIG-3049
> URL: https://issues.apache.org/jira/browse/PIG-3049
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.11, 0.12
> Reporter: Jonathan Coveney
> Assignee: Johnny Zhang
> Fix For: 0.12
>
>
> The following script fails.
> {code}
> a = load 'words_and_numbers' as (word:chararray, number:int);
> b = foreach (group a by number) {
>     a_bag = a.word;
>     ord = order a_bag by word;
>     generate group, ord;
> }
> dump b;
> {code}
> On this data:
> {code}
> $ cat words_and_numbers
>
> hey 1
> hey 2
> you 3
> you 4
> I 5
> could 6
> {code}
> it throws the following error:
> {code}
> java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.Tuple
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:469)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:308)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:160)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:384)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:340)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:333)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
> {code}
> Is this a supported feature of Pig? Seems reasonable, just seems like
> something weird is going on.