Re: [Parquet][Python, C++]Seg fault using new dataset api; filters not work with old dataset api

Alenka Frim Thu, 14 Apr 2022 07:48:28 -0700

>
> I assume the new implementation is for reading? Like when writing a
> Parquet file we can still change the row group size. The seg fault
> comes from reading, where we do not need to pass in row group size as
> parameters.
>


Oh sorry, I misunderstood!

> For the filtering case, yes filtering is only supported in the new
> dataset API, however, both the dataset api and read_table api can pass
> in filters and set use_legacy_dataset=True, which in turn causes the
> error above. There should be logic in the code to handle such a case
> instead of printing errors. Or it should be noted in the doc.
>

That is correct! Created a JIRA for this:
https://issues.apache.org/jira/browse/ARROW-16199
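
For illustration, the kind of check asked for in the quoted message could
look roughly like the sketch below. The function name `check_read_options`
and its error text are hypothetical and are not taken from pyarrow; only the
`filters` and `use_legacy_dataset` parameter names come from this thread.

```python
def check_read_options(filters=None, use_legacy_dataset=False):
    """Hypothetical guard, not pyarrow code.

    Rejects the combination discussed in this thread (filters together
    with the legacy dataset path) using a clear, early error message.
    """
    if filters is not None and use_legacy_dataset:
        raise ValueError(
            "filters are only supported with the new dataset API; "
            "pass use_legacy_dataset=False or drop the filters argument"
        )
    # Any other combination is accepted by this sketch.
    return True
```

Calling `check_read_options(filters=[("a", ">", 1)], use_legacy_dataset=True)`
raises `ValueError` up front, so the user sees the unsupported combination
named directly.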
