Re: Support of more manipulation for Record Batch

2020-04-03 Thread Chengxin Ma
Hi Wes, Thank you for your answer. The projects you mentioned look very exciting. I will keep an eye on them. Kind Regards Chengxin ‐‐‐ Original Message ‐‐‐ On Thursday, April 2, 2020 5:46 PM, Wes McKinney wrote: > hi Chengxin, > > Yes, if you look

Re: Support of more manipulation for Record Batch

2020-04-02 Thread Wes McKinney
hi Chengxin, Yes, if you look at the JIRA tracker and look for past discussions on the mailing list, there are plans to develop comprehensive data manipulation and query processing capabilities in this project for use in Python, R, and any other language that binds to C++, including C/GLib and

Support of more manipulation for Record Batch

2020-04-02 Thread Chengxin Ma
Hi all, I am working on a distributed sorting program which runs on multiple computation nodes. In this sorting program, data is represented as pandas DataFrames and key operations are groupby, concat, and sort_values. For shuffling data among the computation nodes, the DataFrames are
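
For readers skimming the thread, the three pandas operations named in this message (groupby, concat, sort_values) can be sketched on a single machine as below. This is only an illustration under assumptions, not the poster's actual program: the helper names partition_by_range and distributed_sort, the "key" and "value" columns, and the boundary value 4 are all hypothetical, and the network shuffle between nodes is simulated with plain Python lists.

```python
import numpy as np
import pandas as pd


def partition_by_range(df, key, boundaries):
    """Split df into range partitions on `key` (hypothetical helper).

    A row goes to partition i, where i is the number of boundaries
    that are less than or equal to its key value.
    """
    ids = np.searchsorted(boundaries, df[key].to_numpy(), side="right")
    # groupby accepts an array of group labels with one entry per row.
    return {int(pid): part for pid, part in df.groupby(ids)}


def distributed_sort(node_frames, key, boundaries):
    """Simulate the shuffle-then-sort pattern with in-memory lists."""
    n_parts = len(boundaries) + 1
    inboxes = [[] for _ in range(n_parts)]

    # "Shuffle": each node sends every partition to its destination.
    for df in node_frames:
        for pid, part in partition_by_range(df, key, boundaries).items():
            inboxes[pid].append(part)

    # Each destination concatenates what it received and sorts locally.
    result = []
    for parts in inboxes:
        if parts:
            merged = pd.concat(parts, ignore_index=True)
            result.append(merged.sort_values(key).reset_index(drop=True))
        else:
            result.append(pd.DataFrame(columns=node_frames[0].columns))
    return result


# Two simulated nodes, each holding an unsorted DataFrame.
node_a = pd.DataFrame({"key": [5, 1, 9], "value": ["e", "a", "i"]})
node_b = pd.DataFrame({"key": [3, 8, 2], "value": ["c", "h", "b"]})

sorted_parts = distributed_sort([node_a, node_b], "key", boundaries=[4])
for pid, part in enumerate(sorted_parts):
    print(pid, part["key"].tolist())
```

Because the partitions cover disjoint key ranges, reading them in partition order gives a globally sorted result. In a real deployment the list-based inboxes would be replaced by whatever transport moves the data between nodes.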