[ 
https://issues.apache.org/jira/browse/ARROW-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324288#comment-17324288
 ] 

Andy Grove commented on ARROW-12334:
------------------------------------

I tracked this down and there are two separate bugs:

1. The plan contains a RepartitionExec, which is not compatible with 
Ballista and explodes the number of partitions (and likely causes incorrect 
results).
2. The query itself works fine and the final sort produces 2 rows, but the 
results are built by reading all of the intermediate results as well as the 
final output.
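
To illustrate the second bug, here is a minimal sketch in plain Rust of how 
reading intermediate results alongside the final output yields duplicate 
group-by keys. This is not Ballista code: StageOutput, collect_all_stages, 
collect_final_stage, rows_per_key and example_stages are hypothetical names, 
and the counts are made-up sample values.

```rust
use std::collections::BTreeMap;

/// One aggregate output row: (l_returnflag, l_linestatus, count).
type Row = (char, char, u64);

/// Hypothetical stand-in for the output of one query stage (not a Ballista type).
struct StageOutput {
    is_final: bool,
    rows: Vec<Row>,
}

/// Made-up sample data: two partial-aggregate outputs plus the final stage.
fn example_stages() -> Vec<StageOutput> {
    vec![
        StageOutput { is_final: false, rows: vec![('A', 'F', 3), ('N', 'O', 5)] },
        StageOutput { is_final: false, rows: vec![('A', 'F', 4), ('N', 'O', 1)] },
        StageOutput { is_final: true, rows: vec![('A', 'F', 7), ('N', 'O', 6)] },
    ]
}

/// Buggy behaviour: concatenates rows from every stage, intermediate ones included.
fn collect_all_stages(stages: &[StageOutput]) -> Vec<Row> {
    stages.iter().flat_map(|s| s.rows.iter().copied()).collect()
}

/// Intended behaviour: reads only the rows of the final stage.
fn collect_final_stage(stages: &[StageOutput]) -> Vec<Row> {
    stages
        .iter()
        .filter(|s| s.is_final)
        .flat_map(|s| s.rows.iter().copied())
        .collect()
}

/// Counts how many result rows share each group-by key.
fn rows_per_key(rows: &[Row]) -> BTreeMap<(char, char), usize> {
    let mut counts = BTreeMap::new();
    for &(flag, status, _) in rows {
        *counts.entry((flag, status)).or_insert(0) += 1;
    }
    counts
}
```

With this sample data, collect_all_stages returns six rows, three of them 
with the key ('A', 'F'), while collect_final_stage returns the expected two 
rows with one row per key.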

> [Rust] [Ballista] Aggregate queries producing incorrect results
> ---------------------------------------------------------------
>
>                 Key: ARROW-12334
>                 URL: https://issues.apache.org/jira/browse/ARROW-12334
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Rust - Ballista
>            Reporter: Andy Grove
>            Assignee: Andy Grove
>            Priority: Major
>
> I just ran benchmarks for the first time in a while and I see duplicate 
> entries for group by keys.
>  
> For example, query 1 has "group by l_returnflag, l_linestatus" and I see 
> multiple results with l_returnflag = 'A' and l_linestatus = 'F'.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
