Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2024-02-21 Thread via GitHub


stream2000 commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1956177100

   > A runtime error will be introduced here. Passing `com.codahale.metrics.Meter` as a parameter to Flink code will cause a `NoSuchMethodError` unless the Flink code is also shaded, but then the bundle jar will be too large.
   
   @zhuanshenbsj1 Sorry for the inconvenience. I have already fixed it in #10723, PTAL.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2024-02-04 Thread via GitHub


zhuanshenbsj1 commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1926124497

   A runtime error will be introduced here. Passing `com.codahale.metrics.Meter` as a parameter to Flink code will cause a `NoSuchMethodError` unless the Flink code is also shaded, but then the bundle jar will be too large. @stream2000 @danny0405
   (attached image: https://github.com/apache/hudi/assets/34104400/42f1557a-356d-4d07-997d-06347ccfab39)
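
The mismatch described above can be reproduced in plain Java. This is only a sketch under stated assumptions: `Original.Meter` and `Relocated.Meter` are hypothetical stand-ins for the unshaded `com.codahale.metrics.Meter` and for whatever relocated name a shaded bundle gives that class, and `MeterWrapper` stands in for an unshaded Flink class compiled against the original type. It uses reflection, so the failure shows up as a `NoSuchMethodException`; at JVM link time the equivalent failure would be a `NoSuchMethodError`.

```java
// Hypothetical names throughout; this is not Hudi or Flink code.
class RelocationMismatchDemo {
  // Stand-in for the unshaded package com.codahale.metrics.
  static class Original {
    static class Meter {}
  }

  // Stand-in for the relocated copy of the same package inside a shaded bundle.
  static class Relocated {
    static class Meter {}
  }

  // Stand-in for an unshaded framework class compiled against Original.Meter.
  static class MeterWrapper {
    MeterWrapper(Original.Meter meter) {}
  }

  /** Returns true if MeterWrapper declares a constructor taking exactly this type. */
  static boolean hasConstructorFor(Class<?> parameterType) {
    try {
      MeterWrapper.class.getDeclaredConstructor(parameterType);
      return true;
    } catch (NoSuchMethodException e) {
      // Same simple name, different fully qualified type: no matching descriptor.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("Original.Meter accepted:  " + hasConstructorFor(Original.Meter.class));
    System.out.println("Relocated.Meter accepted: " + hasConstructorFor(Relocated.Meter.class));
  }
}
```

The lookup succeeds only for the type the wrapper was compiled against, which is why a relocated metric type cannot be handed to code that was not relocated along with it.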
   





Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-16 Thread via GitHub


danny0405 merged PR #9118:
URL: https://github.com/apache/hudi/pull/9118





Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-16 Thread via GitHub


stream2000 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1361435492


##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bulk/BulkInsertWriterHelper.java:
##
@@ -69,14 +71,21 @@ public class BulkInsertWriterHelper {
   @Nullable
   protected final RowDataKeyGen keyGen;
 
+  protected final Option appendWriteMetrics;
+
   public BulkInsertWriterHelper(Configuration conf, HoodieTable hoodieTable, HoodieWriteConfig writeConfig,
                                 String instantTime, int taskPartitionId, long totalSubtaskNum, long taskEpochId, RowType rowType) {
-    this(conf, hoodieTable, writeConfig, instantTime, taskPartitionId, totalSubtaskNum, taskEpochId, rowType, false);
+    this(conf, hoodieTable, writeConfig, instantTime, taskPartitionId, totalSubtaskNum, taskEpochId, rowType, false, null);
+  }
+
+  public BulkInsertWriterHelper(Configuration conf, HoodieTable hoodieTable, HoodieWriteConfig writeConfig,
+                                String instantTime, int taskPartitionId, long taskId, long taskEpochId, RowType rowType, boolean preserveHoodieMetadata) {
+    this(conf, hoodieTable, writeConfig, instantTime, taskPartitionId, taskId, taskEpochId, rowType, preserveHoodieMetadata, null);
   }
 
   public BulkInsertWriterHelper(Configuration conf, HoodieTable hoodieTable, HoodieWriteConfig writeConfig,
                                 String instantTime, int taskPartitionId, long totalSubtaskNum, long taskEpochId, RowType rowType,
-                                boolean preserveHoodieMetadata) {
+                                boolean preserveHoodieMetadata, FlinkStreamWriteMetrics metrics) {

Review Comment:
   Sure.  PTAL~ 






Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-16 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1764497532

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * 6ee9b9fcbbf2ed709f2a5c12829ce43dee92f0e2 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20346)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-16 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1764204966

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * 8738f767895a396e88c163a8d785d5e92277fad8 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20333)
 
   * 6ee9b9fcbbf2ed709f2a5c12829ce43dee92f0e2 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20346)
 
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-16 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1764190201

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * 8738f767895a396e88c163a8d785d5e92277fad8 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20333)
 
   * 6ee9b9fcbbf2ed709f2a5c12829ce43dee92f0e2 UNKNOWN
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-16 Thread via GitHub


danny0405 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1360403842


##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bulk/BulkInsertWriterHelper.java:
##
@@ -69,14 +71,21 @@ public class BulkInsertWriterHelper {
   @Nullable
   protected final RowDataKeyGen keyGen;
 
+  protected final Option appendWriteMetrics;
+
   public BulkInsertWriterHelper(Configuration conf, HoodieTable hoodieTable, HoodieWriteConfig writeConfig,
                                 String instantTime, int taskPartitionId, long totalSubtaskNum, long taskEpochId, RowType rowType) {
-    this(conf, hoodieTable, writeConfig, instantTime, taskPartitionId, totalSubtaskNum, taskEpochId, rowType, false);
+    this(conf, hoodieTable, writeConfig, instantTime, taskPartitionId, totalSubtaskNum, taskEpochId, rowType, false, null);
+  }
+
+  public BulkInsertWriterHelper(Configuration conf, HoodieTable hoodieTable, HoodieWriteConfig writeConfig,
+                                String instantTime, int taskPartitionId, long taskId, long taskEpochId, RowType rowType, boolean preserveHoodieMetadata) {
+    this(conf, hoodieTable, writeConfig, instantTime, taskPartitionId, taskId, taskEpochId, rowType, preserveHoodieMetadata, null);
   }
 
   public BulkInsertWriterHelper(Configuration conf, HoodieTable hoodieTable, HoodieWriteConfig writeConfig,
                                 String instantTime, int taskPartitionId, long totalSubtaskNum, long taskEpochId, RowType rowType,
-                                boolean preserveHoodieMetadata) {
+                                boolean preserveHoodieMetadata, FlinkStreamWriteMetrics metrics) {

Review Comment:
   Can we use `Option<FlinkStreamWriteMetrics> metrics` as the param instead?
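
A minimal sketch of what this suggestion amounts to, under stated assumptions: `java.util.Optional` is used here as a stand-in for Hudi's own `Option` class, and `WriterHelperSketch` and `WriteMetrics` are hypothetical names rather than the PR's classes. The point is that delegating constructors pass an empty optional instead of `null`, and call sites need no null check.

```java
import java.util.Optional;

// Hypothetical stand-in for the writer helper; not the PR's actual class.
class WriterHelperSketch {
  // Hypothetical stand-in for FlinkStreamWriteMetrics.
  static class WriteMetrics {
    private long writtenRecords;

    void markRecordWritten() {
      writtenRecords++;
    }

    long getWrittenRecords() {
      return writtenRecords;
    }
  }

  private final Optional<WriteMetrics> metrics;

  // Convenience constructor: callers without metrics never pass null.
  WriterHelperSketch() {
    this(Optional.empty());
  }

  WriterHelperSketch(Optional<WriteMetrics> metrics) {
    this.metrics = metrics;
  }

  void write(String record) {
    // ... the actual write of the record would happen here ...
    metrics.ifPresent(WriteMetrics::markRecordWritten);
  }

  public static void main(String[] args) {
    WriteMetrics m = new WriteMetrics();
    WriterHelperSketch withMetrics = new WriterHelperSketch(Optional.of(m));
    withMetrics.write("a");
    withMetrics.write("b");
    System.out.println(m.getWrittenRecords());

    // Without metrics the same call is a no-op on the metrics side.
    new WriterHelperSketch().write("c");
  }
}
```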






Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-15 Thread via GitHub


stream2000 commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1763604724

   @danny0405 Hi Danny, could we merge this PR, since the test failures are just unstable (flaky) cases? Thanks~ 





Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-13 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1761999841

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * 8738f767895a396e88c163a8d785d5e92277fad8 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20333)
 
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-13 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1761619670

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * 68bf4bccc6fcac8c2f96e4223bab206992af314b Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19686)
 
   * 8738f767895a396e88c163a8d785d5e92277fad8 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20333)
 
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-13 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1761552441

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * 68bf4bccc6fcac8c2f96e4223bab206992af314b Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19686)
 
   * 8738f767895a396e88c163a8d785d5e92277fad8 UNKNOWN
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-13 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1761205902

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * 68bf4bccc6fcac8c2f96e4223bab206992af314b Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19686)
 
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-13 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1761146764

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * c930542593c9ff008a3d5cee08be617fc7fbfcb0 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20319)
 
   * 68bf4bccc6fcac8c2f96e4223bab206992af314b UNKNOWN
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-13 Thread via GitHub


stream2000 closed pull request #9118: [HUDI-2141] Support flink stream write metrics
URL: https://github.com/apache/hudi/pull/9118





Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-13 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1761044227

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * c930542593c9ff008a3d5cee08be617fc7fbfcb0 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20319)
 
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-12 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1760681513

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * 4b4d15361b3096e27cc3c54c3347c2cf9224c895 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20301)
 
   * c930542593c9ff008a3d5cee08be617fc7fbfcb0 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20319)
 
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-12 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1760676179

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * 4b4d15361b3096e27cc3c54c3347c2cf9224c895 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20301)
 
   * c930542593c9ff008a3d5cee08be617fc7fbfcb0 UNKNOWN
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-12 Thread via GitHub


stream2000 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1357668260


##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkStreamWriteMetrics.java:
##
@@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.metrics;
+
+import org.apache.hudi.sink.common.AbstractStreamWriteFunction;
+
+import com.codahale.metrics.SlidingWindowReservoir;
+import org.apache.flink.dropwizard.metrics.DropwizardHistogramWrapper;
+import org.apache.flink.dropwizard.metrics.DropwizardMeterWrapper;
+import org.apache.flink.metrics.Histogram;
+import org.apache.flink.metrics.Meter;
+import org.apache.flink.metrics.MetricGroup;
+
+/**
+ * Metrics for flink stream write (including append write, normal/bucket stream write etc.).
+ * Used in subclasses of {@link AbstractStreamWriteFunction}.
+ */
+public class FlinkStreamWriteMetrics extends HoodieFlinkMetrics {
+  private static final String DATA_FLUSH_KEY = "data_flush";
+  private static final String FILE_FLUSH_KEY = "file_flush";
+  private static final String HANDLE_CREATION_KEY = "handle_creation";
+
+  /**
+   * Flush data costs during checkpoint.
+   */
+  private long dataFlushCosts;
+
+  /**
+   * Number of records written in during a checkpoint window.
+   */
+  protected long writtenRecords;
+
+  /**
+   * Current write buffer size in StreamWriteFunction.
+   */
+  private long writeBufferedSize;
+
+  /**
+   * Total costs for closing write handles during a checkpoint window.
+   */
+  private long fileFlushTotalCosts;
+
+  /**
+   * Number of handles opened during a checkpoint window. Increased with partition number/bucket number etc.
+   */
+  private long numOfOpenHandle;
+
+  /**
+   * Number of files written during a checkpoint window.
+   */
+  private long numOfFilesWritten;
+
+  /**
+   * Number of records written per seconds.
+   */
+  protected final Meter recordWrittenPerSecond;
+
+  /**
+   * Number of write handle switches per seconds.
+   */
+  private final Meter handleSwitchPerSecond;
+
+  /**
+   * Cost of write handle creation.
+   */
+  private final Histogram handleCreationCosts;
+
+  /**
+   * Cost of a file flush.
+   */
+  private final Histogram fileFlushCost;
+
+  public FlinkStreamWriteMetrics(MetricGroup metricGroup) {
+    super(metricGroup);
+    this.recordWrittenPerSecond = new DropwizardMeterWrapper(new com.codahale.metrics.Meter());
+    this.handleSwitchPerSecond = new DropwizardMeterWrapper(new com.codahale.metrics.Meter());
+    this.handleCreationCosts = new DropwizardHistogramWrapper(new com.codahale.metrics.Histogram(new SlidingWindowReservoir(100)));
+    this.fileFlushCost = new DropwizardHistogramWrapper(new com.codahale.metrics.Histogram(new SlidingWindowReservoir(100)));
+  }
+
+  @Override
+  public void registerMetrics() {
+    metricGroup.meter("recordWrittenPerSecond", recordWrittenPerSecond);
+    metricGroup.gauge("currentCommitWrittenRecords", () -> writtenRecords);
+    metricGroup.gauge("dataFlushCosts", () -> dataFlushCosts);
+    metricGroup.gauge("writeBufferedSize", () -> writeBufferedSize);
+
+    metricGroup.gauge("fileFlushTotalCosts", () -> fileFlushTotalCosts);
+    metricGroup.gauge("numOfFilesWritten", () -> numOfFilesWritten);
+    metricGroup.gauge("numOfOpenHandle", () -> numOfOpenHandle);
+
+    metricGroup.meter("handleSwitchPerSecond", handleSwitchPerSecond);
+
+    metricGroup.histogram("handleCreationCosts", handleCreationCosts);
+    metricGroup.histogram("handleCloseCosts", fileFlushCost);
+  }

Review Comment:
   Sure, also checked the javadoc. Thanks for pointing this out~ 
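
For readers unfamiliar with the `SlidingWindowReservoir(100)` used for both histograms in the quoted constructor: the idea is that statistics are computed over only the most recent N samples. The following plain-Java sketch illustrates that idea; it is not Dropwizard's implementation, and the class name `SlidingWindowSketch` and its methods are hypothetical.

```java
import java.util.Arrays;

// Illustrative fixed-size sliding window over the most recent samples.
class SlidingWindowSketch {
  private final long[] window;
  private long count; // total number of samples ever recorded

  SlidingWindowSketch(int size) {
    this.window = new long[size];
  }

  // Overwrites the oldest slot once the window is full.
  void update(long value) {
    window[(int) (count % window.length)] = value;
    count++;
  }

  // Number of samples currently held (at most the window size).
  int size() {
    return (int) Math.min(count, window.length);
  }

  long max() {
    return Arrays.stream(window, 0, size()).max().orElse(0L);
  }

  double mean() {
    return Arrays.stream(window, 0, size()).average().orElse(0.0);
  }

  public static void main(String[] args) {
    SlidingWindowSketch costs = new SlidingWindowSketch(3);
    for (long v : new long[] {10, 20, 30, 40}) {
      costs.update(v);
    }
    // The oldest sample (10) has been evicted; 20, 30 and 40 remain.
    System.out.println(costs.size() + " " + costs.max() + " " + costs.mean());
  }
}
```

With a window of 100, a cost histogram of this kind reflects only the latest 100 handle creations or file flushes, so old outliers age out instead of skewing the reported values indefinitely.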






Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-12 Thread via GitHub


danny0405 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1357583122


##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkStreamWriteMetrics.java:
##
@@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.metrics;
+
+import org.apache.hudi.sink.common.AbstractStreamWriteFunction;
+
+import com.codahale.metrics.SlidingWindowReservoir;
+import org.apache.flink.dropwizard.metrics.DropwizardHistogramWrapper;
+import org.apache.flink.dropwizard.metrics.DropwizardMeterWrapper;
+import org.apache.flink.metrics.Histogram;
+import org.apache.flink.metrics.Meter;
+import org.apache.flink.metrics.MetricGroup;
+
+/**
+ * Metrics for flink stream write (including append write, normal/bucket stream write etc.).
+ * Used in subclasses of {@link AbstractStreamWriteFunction}.
+ */
+public class FlinkStreamWriteMetrics extends HoodieFlinkMetrics {
+  private static final String DATA_FLUSH_KEY = "data_flush";
+  private static final String FILE_FLUSH_KEY = "file_flush";
+  private static final String HANDLE_CREATION_KEY = "handle_creation";
+
+  /**
+   * Flush data costs during checkpoint.
+   */
+  private long dataFlushCosts;
+
+  /**
+   * Number of records written in during a checkpoint window.
+   */
+  protected long writtenRecords;
+
+  /**
+   * Current write buffer size in StreamWriteFunction.
+   */
+  private long writeBufferedSize;
+
+  /**
+   * Total costs for closing write handles during a checkpoint window.
+   */
+  private long fileFlushTotalCosts;
+
+  /**
+   * Number of handles opened during a checkpoint window. Increased with partition number/bucket number etc.
+   */
+  private long numOfOpenHandle;
+
+  /**
+   * Number of files written during a checkpoint window.
+   */
+  private long numOfFilesWritten;
+
+  /**
+   * Number of records written per seconds.
+   */
+  protected final Meter recordWrittenPerSecond;
+
+  /**
+   * Number of write handle switches per seconds.
+   */
+  private final Meter handleSwitchPerSecond;
+
+  /**
+   * Cost of write handle creation.
+   */
+  private final Histogram handleCreationCosts;
+
+  /**
+   * Cost of a file flush.
+   */
+  private final Histogram fileFlushCost;
+
+  public FlinkStreamWriteMetrics(MetricGroup metricGroup) {
+    super(metricGroup);
+    this.recordWrittenPerSecond = new DropwizardMeterWrapper(new com.codahale.metrics.Meter());
+    this.handleSwitchPerSecond = new DropwizardMeterWrapper(new com.codahale.metrics.Meter());
+    this.handleCreationCosts = new DropwizardHistogramWrapper(new com.codahale.metrics.Histogram(new SlidingWindowReservoir(100)));
+    this.fileFlushCost = new DropwizardHistogramWrapper(new com.codahale.metrics.Histogram(new SlidingWindowReservoir(100)));
+  }
+
+  @Override
+  public void registerMetrics() {
+    metricGroup.meter("recordWrittenPerSecond", recordWrittenPerSecond);
+    metricGroup.gauge("currentCommitWrittenRecords", () -> writtenRecords);
+    metricGroup.gauge("dataFlushCosts", () -> dataFlushCosts);
+    metricGroup.gauge("writeBufferedSize", () -> writeBufferedSize);
+
+    metricGroup.gauge("fileFlushTotalCosts", () -> fileFlushTotalCosts);
+    metricGroup.gauge("numOfFilesWritten", () -> numOfFilesWritten);
+    metricGroup.gauge("numOfOpenHandle", () -> numOfOpenHandle);
+
+    metricGroup.meter("handleSwitchPerSecond", handleSwitchPerSecond);
+
+    metricGroup.histogram("handleCreationCosts", handleCreationCosts);
+    metricGroup.histogram("handleCloseCosts", fileFlushCost);
+  }

Review Comment:
   handleCloseCosts -> fileFlushCosts?






Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-12 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1759430102

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * 4b4d15361b3096e27cc3c54c3347c2cf9224c895 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20301)
 
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-12 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1759269289

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * 4b4d15361b3096e27cc3c54c3347c2cf9224c895 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20301)
 
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-12 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1759253106

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * 4b4d15361b3096e27cc3c54c3347c2cf9224c895 UNKNOWN
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-12 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1759172124

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-12 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1759157812

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-12 Thread via GitHub


danny0405 commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1759104229

   @hudi-bot run azure





Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-11 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1758820147

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-11 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1758813978

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * ee3bbd6595f8a69ecaf53d9ac2b445533958832c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20291)
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-11 Thread via GitHub


stream2000 commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1758786265

   @danny0405 Hi Danny, I have addressed all comments, PTAL.





Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-11 Thread via GitHub


stream2000 commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1758785514

   @hudi-bot run azure





Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-11 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757979131

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * ee3bbd6595f8a69ecaf53d9ac2b445533958832c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20291)
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-11 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757957453

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * ee3bbd6595f8a69ecaf53d9ac2b445533958832c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20291)
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-11 Thread via GitHub


stream2000 commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757863389

   @hudi-bot run azure





Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-11 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757850550

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * ee3bbd6595f8a69ecaf53d9ac2b445533958832c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20291)
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-11 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757505901

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * 8d732e29104fbde138b6ab3fe6df8fb63e10ab07 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20286)
   * ee3bbd6595f8a69ecaf53d9ac2b445533958832c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20291)
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-11 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757486587

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * 8d732e29104fbde138b6ab3fe6df8fb63e10ab07 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20286)
   * ee3bbd6595f8a69ecaf53d9ac2b445533958832c UNKNOWN
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-11 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757344904

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * 8d732e29104fbde138b6ab3fe6df8fb63e10ab07 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20286)
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-11 Thread via GitHub


danny0405 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1354583732


##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteFunction.java:
##
@@ -488,11 +505,24 @@ private void flushRemaining(boolean endInput) {
 this.writeStatuses.addAll(writeStatus);
 // blocks flushing until the coordinator starts a new instant
 this.confirming = true;
+
+writeMetrics.endFlushing();
+writeMetrics.resetAfterCommit();
+  }
+
+  private void registerMetrics() {
+MetricGroup metrics = getRuntimeContext().getMetricGroup();
+writeMetrics = new FlinkStreamWriteMetrics(metrics);
+writeMetrics.registerMetrics();
   }
 
   protected List writeBucket(String instant, DataBucket bucket, List records) {
 bucket.preWrite(records);
-return writeFunction.apply(records, instant);
+writeMetrics.startHandleClose();
+List statuses = writeFunction.apply(records, instant);

Review Comment:
   Maybe we just rename `checkpointFlush` -> `dataFlush` and `singleFileFlush` -> `fileFlush`?
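
   A minimal sketch of how the two timers might read after this rename. `FlushTimerSketch`, its method names, and the injected nanosecond clock are hypothetical and only illustrate the naming; this is not the PR's `FlinkStreamWriteMetrics`. It assumes a data flush spans one whole flush of the buffered data and a file flush covers a single file write inside it.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical illustration of the suggested names; not the class from the PR.
final class FlushTimerSketch {
  // Injected clock (assumption) so the timers can be exercised deterministically.
  private final LongSupplier nanoClock;
  private boolean dataFlushRunning;
  private boolean fileFlushRunning;
  private long dataFlushStartNanos;
  private long fileFlushStartNanos;
  private long dataFlushCostsMs;
  private long fileFlushCostsMs;

  FlushTimerSketch(LongSupplier nanoClock) {
    this.nanoClock = nanoClock;
  }

  // Marks the start of flushing all buffered data.
  void startDataFlush() {
    dataFlushStartNanos = nanoClock.getAsLong();
    dataFlushRunning = true;
  }

  // Records the duration of the data flush; a no-op if no flush was started.
  void endDataFlush() {
    if (dataFlushRunning) {
      dataFlushCostsMs = TimeUnit.NANOSECONDS.toMillis(nanoClock.getAsLong() - dataFlushStartNanos);
      dataFlushRunning = false;
    }
  }

  // Marks the start of writing a single file.
  void startFileFlush() {
    fileFlushStartNanos = nanoClock.getAsLong();
    fileFlushRunning = true;
  }

  // Records the duration of the most recent single-file flush.
  void endFileFlush() {
    if (fileFlushRunning) {
      fileFlushCostsMs = TimeUnit.NANOSECONDS.toMillis(nanoClock.getAsLong() - fileFlushStartNanos);
      fileFlushRunning = false;
    }
  }

  long getDataFlushCostsMs() {
    return dataFlushCostsMs;
  }

  long getFileFlushCostsMs() {
    return fileFlushCostsMs;
  }
}
```

   Outside of tests the clock would simply be `System::nanoTime`; a boolean flag is used instead of a sentinel start value because `System.nanoTime()` may legitimately return negative numbers.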






Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-11 Thread via GitHub


danny0405 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1354581902


##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/append/AppendWriteFunction.java:
##
@@ -155,5 +168,15 @@ private void flushData(boolean endInput) {
 this.writeStatuses.addAll(writeStatus);
 // blocks flushing until the coordinator starts a new instant
 this.confirming = true;
+
+writeMetrics.endCheckpointFlushing();
+LOG.info("Flushing costs: {} ms", writeMetrics.getCheckpointFlushCosts());
+writeMetrics.resetAfterCommit();

Review Comment:
   We'd better avoid logging on every data flush.
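
   One possible way to act on this, sketched under assumptions: keep the flush cost available as a metric value and only log when a flush is unusually slow, with routine flushes demoted to a debug-style level. `FlushCostReporterSketch` and the threshold parameter are hypothetical, and `java.util.logging` is used only to keep the example self-contained; the thread does not say which approach the PR finally took.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; not part of the PR.
final class FlushCostReporterSketch {
  private static final Logger LOG = Logger.getLogger(FlushCostReporterSketch.class.getName());

  // Flushes at or above this cost are logged as warnings (assumed policy).
  private final long slowFlushThresholdMs;
  private long lastFlushCostsMs;
  private long slowFlushCount;

  FlushCostReporterSketch(long slowFlushThresholdMs) {
    this.slowFlushThresholdMs = slowFlushThresholdMs;
  }

  // Always records the cost for the metric reporter; logs only when it is worth the noise.
  void report(long costsMs) {
    lastFlushCostsMs = costsMs;
    if (costsMs >= slowFlushThresholdMs) {
      slowFlushCount++;
      LOG.log(Level.WARNING, "Slow data flush: {0} ms", costsMs);
    } else if (LOG.isLoggable(Level.FINE)) {
      // Routine flushes stay silent unless fine-grained logging is enabled.
      LOG.log(Level.FINE, "Data flush costs: {0} ms", costsMs);
    }
  }

  long getLastFlushCostsMs() {
    return lastFlushCostsMs;
  }

  long getSlowFlushCount() {
    return slowFlushCount;
  }
}
```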






Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-11 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757038861

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * a9b387e611bdc9c492a27c6adffe2bf74662be96 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19956)
   * 8d732e29104fbde138b6ab3fe6df8fb63e10ab07 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20286)
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-11 Thread via GitHub


hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757025319

   
   ## CI report:
   
   * f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
   * c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
   * a9b387e611bdc9c492a27c6adffe2bf74662be96 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19956)
   * 8d732e29104fbde138b6ab3fe6df8fb63e10ab07 UNKNOWN
   
   



Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-11 Thread via GitHub


stream2000 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1354278561


##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteFunction.java:
##
@@ -385,16 +393,24 @@ private String getBucketID(HoodieRecord record) {
* @param value HoodieRecord
*/
   protected void bufferRecord(HoodieRecord value) {
+writeMetrics.markRecordIn();
 final String bucketID = getBucketID(value);
 
 DataBucket bucket = this.buckets.computeIfAbsent(bucketID,
-k -> new DataBucket(this.config.getDouble(FlinkOptions.WRITE_BATCH_SIZE), value));
+k -> {
+  // create a new bucket and update metrics
+  writeMetrics.increaseNumOfOpenHandle();
+  return new DataBucket(this.config.getDouble(FlinkOptions.WRITE_BATCH_SIZE), value);

Review Comment:
   Yes, for stream write we only create the handle when flushing buckets. 
To avoid confusion, the metric is removed here, since for stream write the 
`numOfFilesWritten` metric is enough.



##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteFunction.java:
##
@@ -488,11 +505,24 @@ private void flushRemaining(boolean endInput) {
 this.writeStatuses.addAll(writeStatus);
 // blocks flushing until the coordinator starts a new instant
 this.confirming = true;
+
+writeMetrics.endFlushing();
+writeMetrics.resetAfterCommit();
+  }
+
+  private void registerMetrics() {
+MetricGroup metrics = getRuntimeContext().getMetricGroup();
+writeMetrics = new FlinkStreamWriteMetrics(metrics);
+writeMetrics.registerMetrics();
   }
 
   protected List writeBucket(String instant, DataBucket bucket, List records) {
 bucket.preWrite(records);
-return writeFunction.apply(records, instant);
+writeMetrics.startHandleClose();
+List statuses = writeFunction.apply(records, instant);

Review Comment:
   This name comes from the append write handle close, which is actually a file 
flush. Renamed to `singleFileFlush`, which makes sense for both stream write and 
append write.






Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-10 Thread via GitHub


danny0405 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1353905841


##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteFunction.java:
##
@@ -385,16 +393,24 @@ private String getBucketID(HoodieRecord record) {
* @param value HoodieRecord
*/
   protected void bufferRecord(HoodieRecord value) {
+writeMetrics.markRecordIn();
 final String bucketID = getBucketID(value);
 
 DataBucket bucket = this.buckets.computeIfAbsent(bucketID,
-k -> new DataBucket(this.config.getDouble(FlinkOptions.WRITE_BATCH_SIZE), value));
+k -> {
+  // create a new bucket and update metrics
+  writeMetrics.increaseNumOfOpenHandle();
+  return new DataBucket(this.config.getDouble(FlinkOptions.WRITE_BATCH_SIZE), value);

Review Comment:
   This is just an in-memory bucket; it does not open a file handle.
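
   The distinction can be illustrated with a small sketch, under heavy simplification: buffering a record only creates or reuses an in-memory bucket, so nothing file-related should be counted there, and a file counter moves only when buckets are flushed. `BucketBufferSketch` is hypothetical, uses plain strings in place of `HoodieRecord`, and assumes one file per non-empty bucket per flush, which the real writer does not guarantee. The counter name `numOfFilesWritten` is borrowed from the reply in this thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simplification of the buffering path; not the PR's StreamWriteFunction.
final class BucketBufferSketch {
  private final Map<String, List<String>> buckets = new HashMap<>();
  private long numOfRecordsIn;
  private long numOfFilesWritten;

  // Buffering only touches memory: no handle is opened, so no handle metric is updated.
  void bufferRecord(String bucketId, String record) {
    numOfRecordsIn++;
    buckets.computeIfAbsent(bucketId, k -> new ArrayList<>()).add(record);
  }

  // Flushing is where files come into play (assumed: one file per non-empty bucket).
  void flushAll() {
    for (List<String> records : buckets.values()) {
      if (!records.isEmpty()) {
        numOfFilesWritten++;
      }
    }
    buckets.clear();
  }

  int bufferedBuckets() {
    return buckets.size();
  }

  long getNumOfRecordsIn() {
    return numOfRecordsIn;
  }

  long getNumOfFilesWritten() {
    return numOfFilesWritten;
  }
}
```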






Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]

2023-10-10 Thread via GitHub


danny0405 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1353905559


##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteFunction.java:
##
@@ -488,11 +505,24 @@ private void flushRemaining(boolean endInput) {
 this.writeStatuses.addAll(writeStatus);
 // blocks flushing until the coordinator starts a new instant
 this.confirming = true;
+
+writeMetrics.endFlushing();
+writeMetrics.resetAfterCommit();
+  }
+
+  private void registerMetrics() {
+MetricGroup metrics = getRuntimeContext().getMetricGroup();
+writeMetrics = new FlinkStreamWriteMetrics(metrics);
+writeMetrics.registerMetrics();
   }
 
   protected List writeBucket(String instant, DataBucket bucket, List records) {
 bucket.preWrite(records);
-return writeFunction.apply(records, instant);
+writeMetrics.startHandleClose();
+List statuses = writeFunction.apply(records, instant);

Review Comment:
   What does "handle close" mean here?


