Hi Shawn,

I expect this is the default because Parquet comes from the Hadoop
ecosystem, where the HDFS block size is usually set to 64 MB. Why would
you need a different default? You can set row_group_size to whatever
fits your use case best, right?
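
For example, a minimal sketch (the file name and the 100,000-row group
size are just illustrative; you would substitute your own table and
values):

    import pyarrow as pa
    import pyarrow.parquet as pq

    # A small example table; replace with your real data.
    table = pa.table({"id": pa.array(range(1_000_000), type=pa.int64())})

    # Pass an explicit row_group_size instead of relying on the large
    # default, so row-group statistics cover finer slices of the data.
    pq.write_table(table, "example.parquet", row_group_size=100_000)

You can then check the resulting layout with
pq.ParquetFile("example.parquet").metadata, which reports the number of
row groups the file was written with.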

Marnix



On Tue, Feb 22, 2022 at 1:42 PM Shawn Zeng <[email protected]> wrote:

> To clarify, I am referring to pyarrow.parquet.write_table.
>
> Shawn Zeng <[email protected]> wrote on Tue, Feb 22, 2022 at 20:40:
>
>> Hi,
>>
>> The default row_group_size is very large, which means a table with
>> fewer than 64M rows will not get the benefit of row-group-level
>> statistics. What is the reason for this? Do you plan to change the default?
>>
>> Thanks,
>> Shawn
>>
>
