LiaCastaneda opened a new pull request, #22996:
URL: https://github.com/apache/datafusion/pull/22996

   ## Which issue does this PR close?
   
   - Closes https://github.com/apache/datafusion/issues/22993
   
   
   ## Rationale for this change
   
   DataFusion has no way to aggregate key-value pairs into a map. `map_agg` is a 
common aggregate in other engines (Trino, Presto, Spark, and Postgres via 
`json_object_agg`), and Arrow already provides a native `MapArray` to back it. 
This PR adds `map_agg` to DataFusion.
   
   
   
   ## What changes are included in this PR?
   
   - New `map_agg(key, value)` aggregate function that collects key-value pairs 
into a `Map(K, V)`, one map per group.
   - `ORDER BY` support (`map_agg(key, value ORDER BY expr)`).
   
   ## Are these changes tested?
   
   Yes
   
   - Unit tests in `map_agg.rs` covering pair collection, NULL-key skipping, 
NULL-value retention, first-wins dedup, multi-partition merge, and the ORDER BY 
ASC/DESC paths.
   - sqllogictest cases in `map_agg.slt` exercising `GROUP BY` and `ORDER BY` 
through the full SQL plan.
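
The semantics listed in the unit tests above can be sketched in plain Rust, independent of Arrow. This is only an illustration under stated assumptions, not the code in `map_agg.rs`: the names `map_agg_sketch` and `merge_partials` are hypothetical, `Option` stands in for SQL NULL, and "earlier partition wins" on merge is assumed to follow from the first-wins rule.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical sketch: collect the (key, value) rows of ONE group into an
/// ordered list of map entries.
/// - rows whose key is `None` (NULL) are skipped
/// - `None` (NULL) values are retained
/// - on a duplicate key, the first occurrence wins
fn map_agg_sketch<K, V>(
    rows: impl IntoIterator<Item = (Option<K>, Option<V>)>,
) -> Vec<(K, Option<V>)>
where
    K: Eq + Hash + Clone,
{
    let mut seen = HashSet::new();
    let mut entries = Vec::new();
    for (key, value) in rows {
        if let Some(key) = key {
            // `insert` returns false when the key was already present,
            // so later duplicates are dropped.
            if seen.insert(key.clone()) {
                entries.push((key, value));
            }
        }
    }
    entries
}

/// Hypothetical sketch of a multi-partition merge: partial results are
/// concatenated in partition order and deduplicated with the same
/// first-wins rule (an assumption, not confirmed by the PR text).
fn merge_partials<K, V>(partials: Vec<Vec<(K, Option<V>)>>) -> Vec<(K, Option<V>)>
where
    K: Eq + Hash + Clone,
{
    map_agg_sketch(partials.into_iter().flatten().map(|(k, v)| (Some(k), v)))
}
```

In this sketch the input order decides which duplicate survives, so sorting the rows before calling `map_agg_sketch` changes the result. How the PR's `ORDER BY` path interacts with deduplication is not spelled out in this description.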
   
   ## Are there any user-facing changes?
   
   Yes -- adds a new `map_agg` aggregate function available from SQL. No 
breaking changes to existing APIs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

