[
https://issues.apache.org/jira/browse/PARQUET-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325205#comment-17325205
]
Mika Ristimäki commented on PARQUET-2030:
-----------------------------------------
That is what I thought too, so I first tried:
{code}
ExampleParquetWriter.builder(file)
    .config("page.size.row.check.min", "1")
    .config("page.size.row.check.max", "1")
{code}
I also tried prefixing the keys with "parquet.":
{code}
ExampleParquetWriter.builder(file)
    .config("parquet.page.size.row.check.min", "1")
    .config("parquet.page.size.row.check.max", "1")
{code}
But neither worked. I also tried to trace how the ParquetWriter.Builder
"conf" instance variable is converted into the InternalParquetRecordWriter
"props" instance variable (which is used to initialize
"recordCountForNextMemCheck"), but I could not find such a code path.
Of course, I may be fixing the wrong thing: instead of exposing these
configs, perhaps I should fix how ParquetWriter.Builder uses "conf".
Or am I using the ParquetWriter.Builder.config method incorrectly?
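
A minimal sketch of the situation suspected above, using hypothetical stand-in classes (ConfVsPropsSketch, Builder, Props, and the two "with..." setters are illustrative names, not the real parquet-mr code, and the default values are placeholders). It assumes the builder stores config() keys in one place and builds "props" only from typed settings, so string keys never reach the row check limits:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; NOT the real parquet-mr classes.
public class ConfVsPropsSketch {

    /** Stand-in for the typed properties object ("props") that the writer reads. */
    static final class Props {
        final int minRowCountForPageSizeCheck;
        final int maxRowCountForPageSizeCheck;

        Props(int min, int max) {
            this.minRowCountForPageSizeCheck = min;
            this.maxRowCountForPageSizeCheck = max;
        }
    }

    /** Stand-in for a builder that keeps string "conf" and typed settings apart. */
    static final class Builder {
        private final Map<String, String> conf = new HashMap<>();
        // Placeholder defaults, assumed for this sketch only.
        private int minRowCheck = 100;
        private int maxRowCheck = 10000;

        Builder config(String key, String value) {
            conf.put(key, value); // stored, but never consulted by buildProps()
            return this;
        }

        // Hypothetical typed setters: the kind of method the issue asks to expose.
        Builder withMinRowCountForPageSizeCheck(int min) {
            this.minRowCheck = min;
            return this;
        }

        Builder withMaxRowCountForPageSizeCheck(int max) {
            this.maxRowCheck = max;
            return this;
        }

        Props buildProps() {
            // Only the typed fields are used; "conf" has no path into Props.
            return new Props(minRowCheck, maxRowCheck);
        }
    }

    public static void main(String[] args) {
        Props viaConfig = new Builder()
            .config("page.size.row.check.min", "1")
            .config("page.size.row.check.max", "1")
            .buildProps();
        Props viaSetters = new Builder()
            .withMinRowCountForPageSizeCheck(1)
            .withMaxRowCountForPageSizeCheck(1)
            .buildProps();
        System.out.println("via config():  min=" + viaConfig.minRowCountForPageSizeCheck
            + " max=" + viaConfig.maxRowCountForPageSizeCheck);
        System.out.println("via setters:   min=" + viaSetters.minRowCountForPageSizeCheck
            + " max=" + viaSetters.maxRowCountForPageSizeCheck);
    }
}
```

If the real builder behaves like this model, the config() calls are silently ignored for these two settings, and only dedicated builder methods (or a fix that reads "conf" when building "props") would make them take effect.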
> Expose page size row check configurations to ParquetWriter.Builder
> ------------------------------------------------------------------
>
> Key: PARQUET-2030
> URL: https://issues.apache.org/jira/browse/PARQUET-2030
> Project: Parquet
> Issue Type: Improvement
> Reporter: Mika Ristimäki
> Priority: Minor
>
> PARQUET-1920 makes it possible to configure "page.size.row.check.min" and
> "page.size.row.check.max". But those configurations are not exposed to
> "org.apache.parquet.hadoop.ParquetWriter.Builder".