ConeyLiu commented on code in PR #1173:
URL: https://github.com/apache/parquet-mr/pull/1173#discussion_r1374573076


##########
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileWriter.java:
##########
@@ -988,6 +988,21 @@ void writeColumnChunk(ColumnDescriptor descriptor,
     endColumn();
   }
 
+  /**
+   * Overwrite the column total statistics. This special used when the column total statistics
+   * is known while all the page statistics are invalid, for example when rewriting the column.
+   *
+   * @param totalStatistics the column total statistics
+   * @throws IOException if there is an error while writing
+   */
+  public void endColumn(Statistics<?> totalStatistics) throws IOException {
+    Preconditions.checkArgument(totalStatistics != null, "Column total statistics can not be null");
+    currentStatistics = totalStatistics;
+    // Invalid the ColumnIndex
+    columnIndexBuilder = ColumnIndexBuilder.getNoOpBuilder();

Review Comment:
   We must invalidate the column index here, because the overwritten column
   statistics may no longer match the column index already built from the pages.
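
   A minimal sketch of why a stale index is unsafe. The names `PageRange` and
   `pagesToRead` are hypothetical and are not part of parquet-mr; the sketch only
   assumes that a reader uses per-page min/max values to skip pages, and that a
   missing index forces it to scan every page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ColumnIndexMismatchSketch {
    /** Hypothetical per-page min/max pair, standing in for one column index entry. */
    public static final class PageRange {
        final long min;
        final long max;

        public PageRange(long min, long max) {
            this.min = min;
            this.max = max;
        }
    }

    /**
     * Returns the page ordinals a reader would scan for "value == target".
     * A null index models "no column index": every page must be scanned.
     */
    public static List<Integer> pagesToRead(List<PageRange> index, int pageCount, long target) {
        List<Integer> pages = new ArrayList<>();
        for (int i = 0; i < pageCount; i++) {
            boolean mayContain = index == null
                || (index.get(i).min <= target && target <= index.get(i).max);
            if (mayContain) {
                pages.add(i);
            }
        }
        return pages;
    }

    public static void main(String[] args) {
        // Index built from page statistics that no longer describe the data.
        List<PageRange> staleIndex = Arrays.asList(new PageRange(0, 9), new PageRange(10, 19));

        // Assume the rewritten column actually holds the value 42 in some page.
        System.out.println(pagesToRead(staleIndex, 2, 42)); // stale index skips every page
        System.out.println(pagesToRead(null, 2, 42));       // no index: all pages are scanned
    }
}
```

   With the stale index the reader skips pages that may hold matching rows, while
   dropping the index only costs extra scanning. That is the trade-off behind
   switching to the no-op builder.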



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.