[ https://issues.apache.org/jira/browse/PIG-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15984513#comment-15984513 ]
Adam Szita commented on PIG-5166:
---------------------------------
This is a floating-point precision problem. Although {{floatpostprocess}} is
turned on for this test case, it doesn't help much because it only considers 3
decimals.
Here are the "different" values produced by MR and Spark without float
post-processing. They are effectively the same, differing only in the last
digits of precision, but the e2e test framework fails to compare them properly:
{code:title=MR full prec}
bob davidson 2.3175
bob ichabod 1.8125
david xylophone 1.7865000000000002
rachel ellison 2.5294999999999996
zach ellison 2.4774999999999996
zach steinbeck 1.9785000000000004
zach white 1.6575
{code}
{code:title=Spark full prec}
bob davidson 2.3175000000000003
bob ichabod 1.8125000000000002
david xylophone 1.7864999999999998
rachel ellison 2.5295
zach ellison 2.4775
zach steinbeck 1.9785
zach white 1.6575000000000002
{code}
I propose an *extension of floatpostprocessor.pl* so that a test case can
define how many decimals to use before results are compared. The default can
stay at 3, as it is now, and a test case can override it by supplying, for
example, {{'decimals' => 6}}, as seen in [^PIG-5166.0.patch].
[~kellyzly] [~nkollar] let me know what you think
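To illustrate why the number of decimals matters, here is a minimal Python sketch that formats the full-precision values listed above to a fixed number of decimals and reports which rows still disagree. It is not the code of floatpostprocessor.pl (which is a Perl script) and not the contents of the patch; the names {{round_value}} and {{mismatches}} are hypothetical, and the {{decimals}} argument only mirrors the proposed option.

```python
# Illustrative sketch only: function names are hypothetical and this is
# not the actual floatpostprocessor.pl logic.

# Full-precision values copied from the MR and Spark outputs above.
MR = {
    "bob davidson": 2.3175,
    "bob ichabod": 1.8125,
    "david xylophone": 1.7865000000000002,
    "rachel ellison": 2.5294999999999996,
    "zach ellison": 2.4774999999999996,
    "zach steinbeck": 1.9785000000000004,
    "zach white": 1.6575,
}
SPARK = {
    "bob davidson": 2.3175000000000003,
    "bob ichabod": 1.8125000000000002,
    "david xylophone": 1.7864999999999998,
    "rachel ellison": 2.5295,
    "zach ellison": 2.4775,
    "zach steinbeck": 1.9785,
    "zach white": 1.6575000000000002,
}


def round_value(value: float, decimals: int = 3) -> str:
    """Format a float with a fixed number of decimals (default 3)."""
    return f"{value:.{decimals}f}"


def mismatches(left: dict, right: dict, decimals: int = 3) -> list:
    """Return the sorted keys whose formatted values differ."""
    return sorted(
        key
        for key in left
        if round_value(left[key], decimals) != round_value(right[key], decimals)
    )


if __name__ == "__main__":
    for decimals in (3, 6):
        print(decimals, mismatches(MR, SPARK, decimals))
```

The values sit right at a rounding boundary for 3 decimals (the fourth decimal is a 5), so a difference in the last bits of the double can flip the third decimal. With 6 decimals the boundary is far away and the two outputs format identically.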
> GroupAggFunc_9 is failing with spark exec type
> ----------------------------------------------
>
> Key: PIG-5166
> URL: https://issues.apache.org/jira/browse/PIG-5166
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: Nandor Kollar
> Assignee: Adam Szita
> Fix For: spark-branch
>
> Attachments: PIG-5166.0.patch
>
>
> results are different:
> {code}
> diff GroupAggFunc_9.out/out_sorted GroupAggFunc_9_benchmark.out/out_sorted
> 30c30
> < bob davidson 2.318
> ---
> > bob davidson 2.317
> 35c35
> < bob ichabod 1.813
> ---
> > bob ichabod 1.812
> 102c102
> < david xylophone 1.786
> ---
> > david xylophone 1.787
> 447c447
> < rachel ellison 2.530
> ---
> > rachel ellison 2.529
> 655c655
> < zach ellison 2.478
> ---
> > zach ellison 2.477
> 669c669
> < zach steinbeck 1.978
> ---
> > zach steinbeck 1.979
> 673c673
> < zach white 1.658
> ---
> > zach white 1.657
> {code}