[ https://issues.apache.org/jira/browse/ARROW-12334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324288#comment-17324288 ]
Andy Grove commented on ARROW-12334:
------------------------------------

I tracked this down and there are two separate bugs:

1. We are getting RepartitionExec in the plan which is not compatible with Ballista and explodes the number of partitions (and likely causes incorrect results)
2. The query actually works fine and the final sort produces 2 rows, but the results are created by reading all the intermediate results as well

> [Rust] [Ballista] Aggregate queries producing incorrect results
> ---------------------------------------------------------------
>
>                 Key: ARROW-12334
>                 URL: https://issues.apache.org/jira/browse/ARROW-12334
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Rust - Ballista
>            Reporter: Andy Grove
>            Assignee: Andy Grove
>            Priority: Major
>
> I just ran benchmarks for the first time in a while and I see duplicate
> entries for group by keys.
>
> For example, query 1 has "group by l_returnflag, l_linestatus" and I see
> multiple results with l_returnflag = 'A' and l_linestatus = 'F'.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)