Are you sure it worked in MR mode? You should have got an error like: *Scalar has more than one row in the output. 1st : (xxxx), 2nd :(yyyy) (common cause: "JOIN" then "FOREACH ... GENERATE foo.bar" should be "foo::bar" )*

cd1.first == cd2.second should be written as cd1::first == cd2::second. Refer to
http://pig.apache.org/docs/r0.16.0/basic.html#disambiguate

On Sat, Jul 9, 2016 at 4:19 PM, Joel D <[email protected]> wrote:

> Hi,
>
> Below code work in pig MapReduce mode but doesn't in Tez. In the sense
> mstat should return 'matches' but returns nothing when executed in tez mode.
>
> cd1 = LOAD '/user/falcon/data/cd1.txt' USING PigStorage('\n') AS first: chararray;
> cd2 = LOAD '/user/falcon/data/cd1.txt' USING PigStorage('\n') AS second: chararray;
>
> combined = JOIN cd1 BY first FULL OUTER, cd2 BY second;
>
> mstat = FOREACH combined GENERATE (
> CASE
> WHEN cd1.first == cd2.second THEN 'matches'
> else 'mismatch'
> END
> ) as match_status;
>
> dump mstat;
>
> Suggestions please.
>
> Thanks,
> Joel
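
For reference, here is a minimal sketch of the quoted script with the join fields referenced through the :: disambiguation operator. The paths and aliases are copied from the original message; the parenthesised AS schemas and the comments are my additions, and I have not run this on Tez, so treat it as a starting point rather than a verified fix.

```pig
cd1 = LOAD '/user/falcon/data/cd1.txt' USING PigStorage('\n') AS (first:chararray);
cd2 = LOAD '/user/falcon/data/cd1.txt' USING PigStorage('\n') AS (second:chararray);

combined = JOIN cd1 BY first FULL OUTER, cd2 BY second;

-- After a JOIN, fields keep their source alias as a prefix: cd1::first, cd2::second.
-- Writing cd1.first instead is read as a scalar projection of the whole cd1 relation,
-- which is what triggers the "Scalar has more than one row" error.
mstat = FOREACH combined GENERATE (
    CASE
        WHEN cd1::first == cd2::second THEN 'matches'
        ELSE 'mismatch'
    END
) AS match_status;

DUMP mstat;
```

Running DESCRIBE combined; before the FOREACH shows the prefixed field names that the join actually produces.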
