Within Arrow C++, that is the only way I am aware of.  You might be able to
use DuckDB instead.  It should be able to scan Parquet files directly.
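
In case it helps, here is a rough sketch of what the join / project / filter
could look like through DuckDB's Python package. I have not run this against
your data, so treat it as a starting point. The file names and the
`l.score > 0.5` predicate are placeholders I made up, and the helper names
(`sql_quote`, `build_join_query`, `run_join`) are hypothetical. It assumes
the `duckdb` package is installed and that the key column is named UUID in
every file.

```python
from typing import Sequence


def sql_quote(text: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def build_join_query(
    left_path: str,
    right_paths: Sequence[str],
    key: str = "UUID",
    predicate: str = "TRUE",
) -> str:
    """Build a DuckDB query that joins on `key`, drops `key`, and filters.

    `predicate` is inserted verbatim as the WHERE clause, so it must be
    trusted SQL (it stands in for the custom filtering step).
    """
    right_list = ", ".join(sql_quote(p) for p in right_paths)
    return (
        # Project: keep every column except the join key.
        f'SELECT l.* EXCLUDE ("{key}"), r.* EXCLUDE ("{key}") '
        # Scan the single file and the list of other files.
        f"FROM read_parquet({sql_quote(left_path)}) AS l "
        f"JOIN read_parquet([{right_list}]) AS r "
        f'ON l."{key}" = r."{key}" '
        f"WHERE {predicate}"
    )


def run_join(
    left_path: str,
    right_paths: Sequence[str],
    key: str = "UUID",
    predicate: str = "TRUE",
):
    """Run the query in an in-memory DuckDB and return a pyarrow Table."""
    import duckdb  # third-party; imported here so the builder has no deps

    con = duckdb.connect()
    try:
        query = build_join_query(left_path, right_paths, key, predicate)
        return con.execute(query).fetch_arrow_table()
    finally:
        con.close()
```

Calling `run_join("left.parquet", ["r1.parquet", "r2.parquet"],
predicate="l.score > 0.5")` would hand back an Arrow table, so the result
can go straight back into the rest of an Arrow-based program.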

Is this the same program that you shared before?  Were you able to figure
out threading?  Can you create a JIRA with some sample input files and a
reproducible example?

On Wed, Sep 14, 2022 at 5:14 PM 1057445597 <1057445...@qq.com> wrote:

> Acero performs poorly, and core dumps occur frequently!
>
> In the scenario I'm working on, I read one Parquet file and then
> several other Parquet files. All of these files share a column name
> (UUID). I need to join (by UUID), project (remove UUID), and filter (some
> custom filtering) the results of the two reads. I found that only Acero
> could be used to do the join, but when I tested it, its performance was
> very poor and very unstable, and it often core dumped. Is there another
> way, or at least another way to do a join?
>
>
> ------------------------------
> 1057445597
> 1057445...@qq.com
>
