bashir2 commented on a change in pull request #14227:
URL: https://github.com/apache/beam/pull/14227#discussion_r595570559
##########
File path:
sdks/java/io/parquet/src/main/java/org/apache/beam/sdk/io/parquet/ParquetIO.java
##########
@@ -1054,6 +1054,7 @@ public static Sink sink(Schema schema) {
return new AutoValue_ParquetIO_Sink.Builder()
.setJsonSchema(schema.toString())
.setCompressionCodec(CompressionCodecName.SNAPPY)
+ .setRowGroupSize(0)
Review comment:
I thought about this some more and decided to go with your original
suggestion. I now think it is actually not a bad idea to expose a little of the
complexity inside `ParquetWriter` here, to signal to the user that
`rowGroupSize` is also used for the block-size setting (and there is a
comment as well, so that should be fine).
PTAL.