amousavigourabi commented on code in PR #1157:
URL: https://github.com/apache/parquet-mr/pull/1157#discussion_r1341182236


##########
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetWriter.java:
##########
@@ -358,9 +358,29 @@ public abstract static class Builder<T, SELF extends 
Builder<T, SELF>> {
     private long rowGroupSize = DEFAULT_BLOCK_SIZE;
     private int maxPaddingSize = MAX_PADDING_SIZE_DEFAULT;
     private boolean enableValidation = DEFAULT_IS_VALIDATING_ENABLED;
-    private ParquetProperties.Builder encodingPropsBuilder =
+    private final ParquetProperties.Builder encodingPropsBuilder =
         ParquetProperties.builder();
 
+    private boolean isPageSizeSet = false;

Review Comment:
   I agree that these booleans aren't amazing. Optional fields would not solve 
the issue of making withXXX methods always override the configuration, next to 
this, Optionals shouldn't be used as fields IMO, as that is [not their intended 
use](https://mail.openjdk.org/pipermail/jdk8-dev/2013-September/003274.html). 
Maybe an approach with a Set would work better?



##########
parquet-hadoop/src/test/java/org/apache/parquet/filter2/dictionarylevel/DictionaryFilterTest.java:
##########
@@ -195,10 +200,28 @@ private static void prepareFile(WriterVersion version, 
Path file) throws IOExcep
     writeData(f, writer);
   }
 
+  private static void prepareFileWithConf(WriterVersion version, Path file) 
throws IOException {
+    Configuration configuration = new Configuration();

Review Comment:
   These configurations are used to build the `encodingProps` ParquetProperties 
within ParquetWriter's Builder, which are used elsewhere as well. While the 
usage of Configuration in this way may be discouraged, the current situation of 
settings sometimes not being picked up by the ParquetWriter is inconsistent 
with both ParquetReader behaviour and user expectations.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to