Hi folks,

Red Arrow, the Ruby binding of Arrow GLib, implements grouped aggregation
features for RecordBatch and Table.  Because these features are written in
Ruby, they are too slow for large data sets.  We need to make them much
faster.

To speed them up, we should implement them in C++ and put them in Arrow
C++ instead of Red Arrow.

Is anyone working on implementing a group-by operation for RecordBatch and
Table in Arrow C++?  If no one is working on it, I would like to try it.

By the way, I found that the grouped aggregation feature is mentioned in
the design document for the Arrow C++ Query Engine.  Is the Query Engine,
rather than Arrow C++ Core, the right place to implement a group-by
operation?
