Jeff,

As I mentioned, contentRatings is a bag of tuples. The first flatten() (which I
already do) is going to flatten the bag but not the tuple. It is possible to do
a second flatten() to flatten the tuple as well (which would make the syntax
you are suggesting possible), but like I said, I tried that too, and it still
builds the projection behind the scenes, which fails all the same.

The fix seems to be a trivial removal of the seemingly unnecessary (and
overlooked) cast to Tuple at line 389 of POProject.java.
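
To illustrate the kind of change I have in mind, here is a minimal sketch. It
is not the actual POProject code: SimpleTuple, projectUnchecked and
projectGuarded are made-up stand-ins, and passing a non-tuple value through
unchanged is only my assumption about what the operator should do.

```java
import java.util.Arrays;
import java.util.List;

public class ProjectCastSketch {
    // Hypothetical stand-in for a Pig tuple; NOT org.apache.pig.data.Tuple.
    static final class SimpleTuple {
        final List<Object> fields;
        SimpleTuple(Object... fields) { this.fields = Arrays.asList(fields); }
        Object get(int i) { return fields.get(i); }
    }

    // Unconditional cast: throws ClassCastException when the input is a
    // plain value such as an Integer rather than a tuple.
    static Object projectUnchecked(Object input, int column) {
        return ((SimpleTuple) input).get(column);
    }

    // Guarded variant: project a column only when the input really is a
    // tuple; otherwise hand the value back unchanged (an assumption).
    static Object projectGuarded(Object input, int column) {
        if (input instanceof SimpleTuple) {
            return ((SimpleTuple) input).get(column);
        }
        return input;
    }

    public static void main(String[] args) {
        Object tuple = new SimpleTuple(1, "PG");
        Object scalar = Integer.valueOf(1);

        System.out.println(projectGuarded(tuple, 0));
        System.out.println(projectGuarded(scalar, 0));
        try {
            projectUnchecked(scalar, 0);
        } catch (ClassCastException e) {
            System.out.println("unchecked cast failed: " + e.getClass().getName());
        }
    }
}
```

Whether the real operator can safely pass a non-tuple through like this is
exactly what the custom build would have to show.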

I guess I'll do a custom build and see if it works, as the bug is really
popping up a lot.

Thanks.
-d


On Mon, Jul 19, 2010 at 11:00 PM, Jeff Zhang <zjf...@gmail.com> wrote:

> Hi Dmitriy
>
> You can try this:
>
> IMP_F2 = foreach IMP_F1 generate ... , FLATTEN(contentRatings);
> IMP_F3 = filter IMP_F2 by vendorId==1
>
>
>
> On Tue, Jul 20, 2010 at 1:36 PM, Dmitriy Lyubimov <dlie...@gmail.com>
> wrote:
>
> > Isn't that what I am already doing?
> >
> > I actually tried flattening the bag and then flattening the tuple as well.
> > Internally it still seems to construct a tree that includes a projection,
> > and it gives the same error. Which of course makes sense: it builds the
> > visitor tree behind the scenes, which should contain a projection sooner
> > or later.
> >
> >
> > I actually stumbled upon the same error in a completely different context,
> > and I will probably try to patch it. There seems to be a clear error in the
> > logic in that place: it processes both tuples and non-tuples, but somehow
> > ignores the latter case and tries to cast to a Tuple anyway. Not sure why.
> > Perhaps the fix is just to remove the cast. But this is the second time
> > today this error has popped up for me, in different contexts.
> >
> > Thanks.
> >
> > -Dmitriy
> >
> > On Mon, Jul 19, 2010 at 10:03 PM, Jeff Zhang <zjf...@gmail.com> wrote:
> >
> > > It looks like a bug in Pig.
> > > I tried the following script:
> > >
> > > a = load 'data/a.txt' as (b:bag{t:tuple(f1:int,f2:int)});
> > > result = foreach a generate FLATTEN(b) as c;
> > > describe result;
> > >
> > > The output is
> > > result: {c: int,f2: int}
> > > Here c is treated as a single field of the tuple rather than as the tuple
> > > itself.
> > >
> > >
> > > On Tue, Jul 20, 2010 at 6:00 AM, Dmitriy Lyubimov <dlie...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I would greatly appreciate somebody's help with the following Pig error
> > > > during an MR job.
> > > >
> > > > All mappers fail with the following stack trace:
> > > >
> > > > java.lang.ClassCastException: java.lang.Integer cannot be cast to org.apache.pig.data.Tuple
> > > >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:389)
> > > >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POIsNull.getNext(POIsNull.java:152)
> > > >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PONot.getNext(PONot.java:71)
> > > >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POAnd.getNext(POAnd.java:67)
> > > >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
> > > >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:272)
> > > >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLimit.getNext(POLimit.java:85)
> > > >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:272)
> > > >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:255)
> > > >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:232)
> > > >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:227)
> > > >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:52)
> > > >        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> > > >        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> > > >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> > > >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > >
> > > > The Pig script fragment causing this is as follows:
> > > >
> > > > IMP_F2 = foreach IMP_F1 generate ... , FLATTEN(contentRatings) as contentRating;
> > > > IMP_F3 = filter IMP_F2 by contentRating is not null and contentRating.vendorId==1
> > > >
> > > > If I remove the IMP_F3 line, the job goes through, but adding the IMP_F3
> > > > filter causes this error.
> > > > describe IMP_F2 produces
> > > >
> > > > IMP_F2: {... ,contentRating: (vendorId: int, ... ), ... }
> > > >
> > > >
> > > > I also tried casts like 'filter by ... (int)(contentRating.vendorId)==1',
> > > > which did not change anything.
> > > >
> > > > Any ideas for a workaround are appreciated.
> > > >
> > > > Thanks in advance.
> > > > -Dmitriy
> > > >
> > >
> > >
> > >
> > > --
> > > Best Regards
> > >
> > > Jeff Zhang
> > >
> >
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>
