I see that the JIRA is for unit tests and not e2e tests. Please use Util.checkQueryOutputsAfterSort(iter, expectedResults).
-Rohini

On Mon, Dec 22, 2014 at 6:39 PM, Rohini Palaniswamy <[email protected]> wrote:
>
> Usually I have been fixing these kinds of tests by adding an order by, as
> I did when I added new tests for Union for Tez. In this case you can add
> an order by after the distinct in the nested foreach.
>
> Daniel,
>    Any better suggestions?
>
> Regards,
> Rohini
>
>
> On Wed, Dec 17, 2014 at 10:38 PM, Zhang, Liyun <[email protected]>
> wrote:
>>
>> Hi all,
>>   I ran into a problem: the "group" operator gives different results on
>> different engines such as "spark" and "mapreduce" (PIG-4282
>> <https://issues.apache.org/jira/browse/PIG-4282>).
>>
>> groupdistinct.pig
>> A = load 'input1.txt' as (age:int,gpa:int);
>> B = group A by age;
>> C = foreach B {
>>     D = A.gpa;
>>     E = distinct D;
>>     generate group, MIN(E);
>> };
>> dump C;
>>
>> input1.txt is:
>> 10 89
>> 20 78
>> 10 68
>> 10 89
>> 20 92
>>
>> The mapreduce output is:
>> (10,68),(20,78)
>> The spark output is:
>> (20,78),(10,68)
>>
>> These two results differ because the order of the "group" field is not
>> the same.
>>
>> Is there any way to guarantee that the "group" field keeps the input
>> order when using the "group" operator in Pig?
>>
>>
>> Best regards
>> Zhang,Liyun
>>
>>
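
For reference, the idea behind an order-insensitive check like the one suggested at the top of this thread can be sketched in plain Java. This is only an illustration under assumptions: SortedCompareSketch and sameAfterSort are hypothetical names, rows are modelled as strings rather than Pig tuples, and it is not the code of Util.checkQueryOutputsAfterSort. It uses the two outputs quoted above as sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: compare two result sets while ignoring row order.
class SortedCompareSketch {

    // Returns true when both lists contain the same rows, regardless of order.
    // The inputs are copied first so the caller's lists are left untouched.
    static boolean sameAfterSort(List<String> actual, List<String> expected) {
        List<String> a = new ArrayList<>(actual);
        List<String> e = new ArrayList<>(expected);
        Collections.sort(a);
        Collections.sort(e);
        return a.equals(e);
    }

    public static void main(String[] args) {
        // Rows as reported in the thread for the two engines.
        List<String> mapreduce = Arrays.asList("(10,68)", "(20,78)");
        List<String> spark = Arrays.asList("(20,78)", "(10,68)");

        // Order-sensitive comparison fails: prints false.
        System.out.println(mapreduce.equals(spark));
        // Comparison after sorting succeeds: prints true.
        System.out.println(sameAfterSort(mapreduce, spark));
    }
}
```

Sorting both sides in the test keeps the script under test unchanged, so the test does not depend on the group order that a particular engine happens to produce.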
