alamb commented on code in PR #23182:
URL: https://github.com/apache/datafusion/pull/23182#discussion_r3483192005
##########
datafusion/physical-plan/src/aggregates/aggregate_hash_table/final_table.rs:
##########
@@ -55,30 +57,17 @@ impl AggregateHashTable<FinalMarker> {
) -> Result<Option<RecordBatch>> {
let output_schema = Arc::clone(&self.output_schema);
let batch_size = self.batch_size;
- match &mut self.state {
+ match std::mem::replace(&mut self.state,
AggregateHashTableState::Done) {
Review Comment:
I found it confusing here that self.state was note replaced in this function
(so it looks like self.state is lost)
i think what actually happens is that `emit_next_materialized_batch` sets it
I think it would be easier to understand what is happening if we inlined
emit_next_materialized_batch here, but maybe just a comment would be good
enough.
##########
datafusion/physical-plan/src/aggregates/aggregate_hash_table/common.rs:
##########
@@ -297,11 +300,53 @@ pub(super) struct AggregateHashTableBuffer {
}
pub(super) enum AggregateHashTableState {
+ /// Accumulating input rows into group keys and aggregate state.
Building(AggregateHashTableBuffer),
+ /// Emitting results directly from group keys and aggregate state.
Outputting(AggregateHashTableBuffer),
+ /// Materialize all the output results, and then incrementally output in
the `OutputtingMaterializedFinal` state.
+ ///
+ /// Note this is a temporary solution until the `GroupValues` issue is
solved:
Review Comment:
👍 it also keeps the existing behavior (of DataFusion 54)
##########
datafusion/physical-plan/src/aggregates/aggregate_hash_table/final_table.rs:
##########
@@ -55,30 +57,17 @@ impl AggregateHashTable<FinalMarker> {
) -> Result<Option<RecordBatch>> {
let output_schema = Arc::clone(&self.output_schema);
let batch_size = self.batch_size;
- match &mut self.state {
+ match std::mem::replace(&mut self.state,
AggregateHashTableState::Done) {
Review Comment:
I pushed a commit with a comment explanining this
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]