Hi folks, Red Arrow, the Ruby binding of Arrow GLib, implements grouped aggregation features for RecordBatch and Table. Because these features are written in Ruby, they are too slow for large data sets, so we need to make them much faster.
To make them faster, they should be written in C++ and live in Arrow C++ rather than in Red Arrow. Is anyone working on implementing a group-by operation for RecordBatch and Table in Arrow C++? If no one is, I would like to try it. By the way, I noticed that grouped aggregation is mentioned in the design document for the Arrow C++ Query Engine. Is the Query Engine, rather than Arrow C++ Core, the right place to implement a group-by operation?