[ 
https://issues.apache.org/jira/browse/PARQUET-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325205#comment-17325205
 ] 

Mika Ristimäki commented on PARQUET-2030:
-----------------------------------------

That is what I thought too, so I first tried:

{code}
ExampleParquetWriter.builder(file)
  .config("page.size.row.check.min", "1")
  .config("page.size.row.check.max", "1")
{code}

I also tried prefixing the keys with "parquet.":
{code}
ExampleParquetWriter.builder(file)
  .config("parquet.page.size.row.check.min", "1")
  .config("parquet.page.size.row.check.max", "1")
{code}

But neither worked. I also tried to trace how the ParquetWriter.Builder 
"conf" instance variable is converted into the InternalParquetRecordWriter 
"props" instance variable (which is used to initialize the 
"recordCountForNextMemCheck" ivar), but I could not find such a code path.
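
To make the suspected gap concrete, here is a toy model of it. This is not Parquet code: ToyBuilder, ToyProps, withMinRowCheck, withMaxRowCheck and the default values are all hypothetical stand-ins, and the sketch only assumes the behaviour described above, namely that string keys passed to config() are stored but never read when the writer's "props" are built.

```java
import java.util.HashMap;
import java.util.Map;

public class RowCheckConfigSketch {

    /** Hypothetical stand-in for the writer's "props"; not a real Parquet class. */
    static final class ToyProps {
        final int minRowCheck;
        final int maxRowCheck;

        ToyProps(int minRowCheck, int maxRowCheck) {
            this.minRowCheck = minRowCheck;
            this.maxRowCheck = maxRowCheck;
        }
    }

    /** Hypothetical stand-in for a writer builder that also holds a string-keyed "conf". */
    static final class ToyBuilder {
        private final Map<String, String> conf = new HashMap<>();
        private int minRowCheck = 100;    // illustrative default, assumed for this sketch
        private int maxRowCheck = 10000;  // illustrative default, assumed for this sketch

        /** Stores the key/value pair, mirroring a generic config(key, value) method. */
        ToyBuilder config(String key, String value) {
            conf.put(key, value);
            return this;
        }

        /** Typed setter: the only path that reaches the props in this model. */
        ToyBuilder withMinRowCheck(int value) {
            this.minRowCheck = value;
            return this;
        }

        /** Typed setter: the only path that reaches the props in this model. */
        ToyBuilder withMaxRowCheck(int value) {
            this.maxRowCheck = value;
            return this;
        }

        /** Builds props from the typed fields only; "conf" is never consulted here. */
        ToyProps buildProps() {
            return new ToyProps(minRowCheck, maxRowCheck);
        }
    }

    public static void main(String[] args) {
        ToyProps viaConfig = new ToyBuilder()
                .config("page.size.row.check.min", "1")
                .config("page.size.row.check.max", "1")
                .buildProps();
        // The string keys were stored but ignored, so the defaults survive.
        System.out.println(viaConfig.minRowCheck + " " + viaConfig.maxRowCheck);

        ToyProps viaSetters = new ToyBuilder()
                .withMinRowCheck(1)
                .withMaxRowCheck(1)
                .buildProps();
        // The typed setters do reach the props.
        System.out.println(viaSetters.minRowCheck + " " + viaSetters.maxRowCheck);
    }
}
```

If the real builder behaves like this model, then either typed setters have to be exposed on the builder (what this issue proposes) or buildProps() has to start reading the keys from "conf".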

Of course, I may be fixing the wrong thing: perhaps, instead of exposing these 
configs, I should fix how ParquetWriter.Builder uses "conf".

Or am I using the ParquetWriter.Builder.config method incorrectly?

> Expose page size row check configurations to ParquetWriter.Builder
> ------------------------------------------------------------------
>
>                 Key: PARQUET-2030
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2030
>             Project: Parquet
>          Issue Type: Improvement
>            Reporter: Mika Ristimäki
>            Priority: Minor
>
> PARQUET-1920 makes it possible to configure "page.size.row.check.min" and 
> "page.size.row.check.max". But those configurations are not exposed to 
> "org.apache.parquet.hadoop.ParquetWriter.Builder".


