yjshen commented on PR #3009: URL: https://github.com/apache/arrow-datafusion/pull/3009#issuecomment-1204699701
The main reason for separating `row_aggregate` from `aggregate` is the intention to do in-place updates for row-based states. I assume we could store pointers in `RowLayout::WordAligned` for varlena states during accumulation, then finalize and inline them into the row state later, when we spill. UDAFs with `Box<dyn Any>` state require more; perhaps an extra serde would need to be provided. For the `median` state store, a `Map<value, value_occurence_count>` is likely to be more space-efficient, though it may require more computation than the current approach and would likely not work with arrow compute kernels.
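To make the `Map<value, value_occurence_count>` idea concrete, here is a minimal sketch in plain Rust. It is not DataFusion code: the `MedianCounts` type and its methods are hypothetical names, and it assumes `i64` inputs so the values can serve as ordered map keys (floating-point values would need an ordered wrapper). The state grows with the number of distinct values rather than the number of rows, at the cost of a map lookup per update and a scan to produce the result.

```rust
use std::collections::BTreeMap;

/// Hypothetical median state: each distinct value maps to its occurrence count.
#[derive(Default)]
struct MedianCounts {
    counts: BTreeMap<i64, u64>,
    total: u64,
}

impl MedianCounts {
    /// Record one input value.
    fn update(&mut self, value: i64) {
        *self.counts.entry(value).or_insert(0) += 1;
        self.total += 1;
    }

    /// Fold another partial state into this one.
    fn merge(&mut self, other: &MedianCounts) {
        for (value, count) in &other.counts {
            *self.counts.entry(*value).or_insert(0) += count;
        }
        self.total += other.total;
    }

    /// Value at the given zero-based rank in sorted order.
    fn value_at(&self, rank: u64) -> Option<i64> {
        let mut seen = 0u64;
        // BTreeMap iterates keys in ascending order.
        for (value, count) in &self.counts {
            seen += count;
            if rank < seen {
                return Some(*value);
            }
        }
        None
    }

    /// Median of all recorded values; averages the two middle values
    /// when the total count is even. Returns None for an empty state.
    fn median(&self) -> Option<f64> {
        if self.total == 0 {
            return None;
        }
        let hi = self.value_at(self.total / 2)? as f64;
        if self.total % 2 == 1 {
            Some(hi)
        } else {
            let lo = self.value_at(self.total / 2 - 1)? as f64;
            Some((lo + hi) / 2.0)
        }
    }
}
```

The final scan over the map is the extra computation mentioned above, and because the state is a map rather than a contiguous array, it cannot be handed to an array-based kernel directly.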
