Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
stream2000 commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1956177100

> A runtime error will be introduced here. Passing com.codahale.metrics.Meter as a parameter to flink code will cause a NoSuchMethodError, unless the flink code is also shaded, but the bundle-jar will be too large

@zhuanshenbsj1 Sorry for the inconvenience introduced, I have already fixed it in #10723, PTAL.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
zhuanshenbsj1 commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1926124497

A runtime error will be introduced here. Passing com.codahale.metrics.Meter as a parameter to flink code will cause a NoSuchMethodError, unless the flink code is also shaded, but the bundle-jar will be too large. @stream2000 @danny0405

https://github.com/apache/hudi/assets/34104400/42f1557a-356d-4d07-997d-06347ccfab39
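For readers following the shading problem reported above: when a bundle jar relocates `com.codahale.metrics` but leaves the Flink classes untouched, the bundled bytecode presumably looks for a `DropwizardMeterWrapper` constructor that takes the relocated `Meter` type, while the Flink class on the cluster only offers the one taking the original type, hence the `NoSuchMethodError`. Below is a minimal sketch of one way to sidestep this, by implementing the meter contract without any Dropwizard type in a signature. It is not necessarily what #10723 does. `MeterLike` is a locally declared stand-in that mirrors the four methods of `org.apache.flink.metrics.Meter` so the sketch compiles without Flink on the classpath; `PlainMeter` and its injected nano clock are hypothetical names.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.function.LongSupplier;

// Stand-in for org.apache.flink.metrics.Meter (assumption: declared locally
// so the sketch compiles without Flink on the classpath).
interface MeterLike {
  void markEvent();

  void markEvent(long n);

  double getRate();

  long getCount();
}

// Hypothetical meter that exposes no Dropwizard type, so relocating
// com.codahale.metrics in a bundle jar cannot change its signatures.
final class PlainMeter implements MeterLike {
  private final LongAdder count = new LongAdder();
  private final LongSupplier nanoClock;
  private final long startNanos;

  // The clock is injected so the rate can be checked deterministically.
  PlainMeter(LongSupplier nanoClock) {
    this.nanoClock = nanoClock;
    this.startNanos = nanoClock.getAsLong();
  }

  @Override
  public void markEvent() {
    count.increment();
  }

  @Override
  public void markEvent(long n) {
    count.add(n);
  }

  @Override
  public long getCount() {
    return count.sum();
  }

  // Average events per second since the meter was created.
  @Override
  public double getRate() {
    long elapsedNanos = nanoClock.getAsLong() - startNanos;
    return elapsedNanos <= 0 ? 0.0 : count.sum() * 1e9 / elapsedNanos;
  }
}

final class PlainMeterDemo {
  public static void main(String[] args) {
    PlainMeter meter = new PlainMeter(System::nanoTime);
    meter.markEvent(5);
    meter.markEvent();
    System.out.println("count=" + meter.getCount() + " rate=" + meter.getRate());
  }
}
```

The trade-off is that this simple meter reports a lifetime average rather than the exponentially weighted rate a Dropwizard meter provides.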
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
danny0405 merged PR #9118:
URL: https://github.com/apache/hudi/pull/9118
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
stream2000 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1361435492

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bulk/BulkInsertWriterHelper.java: ##
@@ -69,14 +71,21 @@ public class BulkInsertWriterHelper {
   @Nullable
   protected final RowDataKeyGen keyGen;
 
+  protected final Option appendWriteMetrics;
+
   public BulkInsertWriterHelper(Configuration conf, HoodieTable hoodieTable, HoodieWriteConfig writeConfig,
       String instantTime, int taskPartitionId, long totalSubtaskNum, long taskEpochId, RowType rowType) {
-    this(conf, hoodieTable, writeConfig, instantTime, taskPartitionId, totalSubtaskNum, taskEpochId, rowType, false);
+    this(conf, hoodieTable, writeConfig, instantTime, taskPartitionId, totalSubtaskNum, taskEpochId, rowType, false, null);
+  }
+
+  public BulkInsertWriterHelper(Configuration conf, HoodieTable hoodieTable, HoodieWriteConfig writeConfig,
+      String instantTime, int taskPartitionId, long taskId, long taskEpochId, RowType rowType, boolean preserveHoodieMetadata) {
+    this(conf, hoodieTable, writeConfig, instantTime, taskPartitionId, taskId, taskEpochId, rowType, preserveHoodieMetadata, null);
   }
 
   public BulkInsertWriterHelper(Configuration conf, HoodieTable hoodieTable, HoodieWriteConfig writeConfig,
       String instantTime, int taskPartitionId, long totalSubtaskNum, long taskEpochId, RowType rowType,
-      boolean preserveHoodieMetadata) {
+      boolean preserveHoodieMetadata, FlinkStreamWriteMetrics metrics) {

Review Comment:
Sure. PTAL~
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1764497532

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* 6ee9b9fcbbf2ed709f2a5c12829ce43dee92f0e2 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20346)

Bot commands
@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1764204966

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* 8738f767895a396e88c163a8d785d5e92277fad8 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20333)
* 6ee9b9fcbbf2ed709f2a5c12829ce43dee92f0e2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20346)
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1764190201

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* 8738f767895a396e88c163a8d785d5e92277fad8 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20333)
* 6ee9b9fcbbf2ed709f2a5c12829ce43dee92f0e2 UNKNOWN
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
danny0405 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1360403842

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bulk/BulkInsertWriterHelper.java: ##
@@ -69,14 +71,21 @@ public class BulkInsertWriterHelper {
   @Nullable
   protected final RowDataKeyGen keyGen;
 
+  protected final Option appendWriteMetrics;
+
   public BulkInsertWriterHelper(Configuration conf, HoodieTable hoodieTable, HoodieWriteConfig writeConfig,
       String instantTime, int taskPartitionId, long totalSubtaskNum, long taskEpochId, RowType rowType) {
-    this(conf, hoodieTable, writeConfig, instantTime, taskPartitionId, totalSubtaskNum, taskEpochId, rowType, false);
+    this(conf, hoodieTable, writeConfig, instantTime, taskPartitionId, totalSubtaskNum, taskEpochId, rowType, false, null);
+  }
+
+  public BulkInsertWriterHelper(Configuration conf, HoodieTable hoodieTable, HoodieWriteConfig writeConfig,
+      String instantTime, int taskPartitionId, long taskId, long taskEpochId, RowType rowType, boolean preserveHoodieMetadata) {
+    this(conf, hoodieTable, writeConfig, instantTime, taskPartitionId, taskId, taskEpochId, rowType, preserveHoodieMetadata, null);
   }
 
   public BulkInsertWriterHelper(Configuration conf, HoodieTable hoodieTable, HoodieWriteConfig writeConfig,
       String instantTime, int taskPartitionId, long totalSubtaskNum, long taskEpochId, RowType rowType,
-      boolean preserveHoodieMetadata) {
+      boolean preserveHoodieMetadata, FlinkStreamWriteMetrics metrics) {

Review Comment:
Can we use `Option<FlinkStreamWriteMetrics> metrics` as param instead?
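The suggestion above can be illustrated with a small, self-contained sketch. It uses `java.util.Optional` as a stand-in for Hudi's `Option` type, and `WriteMetrics` and `WriterHelper` are hypothetical simplified classes, not the ones from the PR. The idea is that the delegating constructor passes an empty value rather than `null`, so the write path needs no null check.

```java
import java.util.Optional;

// Hypothetical, simplified metrics holder (not the PR's FlinkStreamWriteMetrics).
final class WriteMetrics {
  private long writtenRecords;

  void markRecordWritten() {
    writtenRecords++;
  }

  long getWrittenRecords() {
    return writtenRecords;
  }
}

// Hypothetical, simplified writer helper (not the PR's BulkInsertWriterHelper).
final class WriterHelper {
  // Assumption: java.util.Optional stands in for Hudi's Option here.
  private final Optional<WriteMetrics> metrics;
  private long rowsWritten;

  // Callers that do not care about metrics delegate with an empty value, not null.
  WriterHelper() {
    this(Optional.empty());
  }

  WriterHelper(Optional<WriteMetrics> metrics) {
    this.metrics = metrics;
  }

  void write(String row) {
    rowsWritten++;
    // Metrics are updated only when present; no null check is needed.
    metrics.ifPresent(WriteMetrics::markRecordWritten);
  }

  long rowsWritten() {
    return rowsWritten;
  }
}

final class WriterHelperDemo {
  public static void main(String[] args) {
    WriteMetrics metrics = new WriteMetrics();
    WriterHelper helper = new WriterHelper(Optional.of(metrics));
    helper.write("row-1");
    helper.write("row-2");
    System.out.println("records counted: " + metrics.getWrittenRecords());
  }
}
```

Compared with a nullable parameter, the optional type makes the "no metrics" case visible in the constructor signature.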
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
stream2000 commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1763604724

@danny0405 Hi Danny, could we merge this PR, since the test failures are just unstable cases? Thanks~
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1761999841

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* 8738f767895a396e88c163a8d785d5e92277fad8 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20333)
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1761619670

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* 68bf4bccc6fcac8c2f96e4223bab206992af314b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19686)
* 8738f767895a396e88c163a8d785d5e92277fad8 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20333)
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1761552441

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* 68bf4bccc6fcac8c2f96e4223bab206992af314b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19686)
* 8738f767895a396e88c163a8d785d5e92277fad8 UNKNOWN
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1761205902

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* 68bf4bccc6fcac8c2f96e4223bab206992af314b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19686)
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1761146764

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* c930542593c9ff008a3d5cee08be617fc7fbfcb0 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20319)
* 68bf4bccc6fcac8c2f96e4223bab206992af314b UNKNOWN
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
stream2000 closed pull request #9118: [HUDI-2141] Support flink stream write metrics
URL: https://github.com/apache/hudi/pull/9118
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1761044227

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* c930542593c9ff008a3d5cee08be617fc7fbfcb0 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20319)
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1760681513

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* 4b4d15361b3096e27cc3c54c3347c2cf9224c895 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20301)
* c930542593c9ff008a3d5cee08be617fc7fbfcb0 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20319)
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1760676179

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* 4b4d15361b3096e27cc3c54c3347c2cf9224c895 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20301)
* c930542593c9ff008a3d5cee08be617fc7fbfcb0 UNKNOWN
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
stream2000 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1357668260

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkStreamWriteMetrics.java: ##
@@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.metrics;
+
+import org.apache.hudi.sink.common.AbstractStreamWriteFunction;
+
+import com.codahale.metrics.SlidingWindowReservoir;
+import org.apache.flink.dropwizard.metrics.DropwizardHistogramWrapper;
+import org.apache.flink.dropwizard.metrics.DropwizardMeterWrapper;
+import org.apache.flink.metrics.Histogram;
+import org.apache.flink.metrics.Meter;
+import org.apache.flink.metrics.MetricGroup;
+
+/**
+ * Metrics for flink stream write (including append write, normal/bucket stream write etc.).
+ * Used in subclasses of {@link AbstractStreamWriteFunction}.
+ */
+public class FlinkStreamWriteMetrics extends HoodieFlinkMetrics {
+  private static final String DATA_FLUSH_KEY = "data_flush";
+  private static final String FILE_FLUSH_KEY = "file_flush";
+  private static final String HANDLE_CREATION_KEY = "handle_creation";
+
+  /**
+   * Flush data costs during checkpoint.
+   */
+  private long dataFlushCosts;
+
+  /**
+   * Number of records written in during a checkpoint window.
+   */
+  protected long writtenRecords;
+
+  /**
+   * Current write buffer size in StreamWriteFunction.
+   */
+  private long writeBufferedSize;
+
+  /**
+   * Total costs for closing write handles during a checkpoint window.
+   */
+  private long fileFlushTotalCosts;
+
+  /**
+   * Number of handles opened during a checkpoint window. Increased with partition number/bucket number etc.
+   */
+  private long numOfOpenHandle;
+
+  /**
+   * Number of files written during a checkpoint window.
+   */
+  private long numOfFilesWritten;
+
+  /**
+   * Number of records written per seconds.
+   */
+  protected final Meter recordWrittenPerSecond;
+
+  /**
+   * Number of write handle switches per seconds.
+   */
+  private final Meter handleSwitchPerSecond;
+
+  /**
+   * Cost of write handle creation.
+   */
+  private final Histogram handleCreationCosts;
+
+  /**
+   * Cost of a file flush.
+   */
+  private final Histogram fileFlushCost;
+
+  public FlinkStreamWriteMetrics(MetricGroup metricGroup) {
+    super(metricGroup);
+    this.recordWrittenPerSecond = new DropwizardMeterWrapper(new com.codahale.metrics.Meter());
+    this.handleSwitchPerSecond = new DropwizardMeterWrapper(new com.codahale.metrics.Meter());
+    this.handleCreationCosts = new DropwizardHistogramWrapper(new com.codahale.metrics.Histogram(new SlidingWindowReservoir(100)));
+    this.fileFlushCost = new DropwizardHistogramWrapper(new com.codahale.metrics.Histogram(new SlidingWindowReservoir(100)));
+  }
+
+  @Override
+  public void registerMetrics() {
+    metricGroup.meter("recordWrittenPerSecond", recordWrittenPerSecond);
+    metricGroup.gauge("currentCommitWrittenRecords", () -> writtenRecords);
+    metricGroup.gauge("dataFlushCosts", () -> dataFlushCosts);
+    metricGroup.gauge("writeBufferedSize", () -> writeBufferedSize);
+
+    metricGroup.gauge("fileFlushTotalCosts", () -> fileFlushTotalCosts);
+    metricGroup.gauge("numOfFilesWritten", () -> numOfFilesWritten);
+    metricGroup.gauge("numOfOpenHandle", () -> numOfOpenHandle);
+
+    metricGroup.meter("handleSwitchPerSecond", handleSwitchPerSecond);
+
+    metricGroup.histogram("handleCreationCosts", handleCreationCosts);
+    metricGroup.histogram("handleCloseCosts", fileFlushCost);
+  }

Review Comment:
Sure, also checked the javadoc. Thanks for pointing this out~
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
danny0405 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1357583122

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkStreamWriteMetrics.java: ##
@@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.metrics;
+
+import org.apache.hudi.sink.common.AbstractStreamWriteFunction;
+
+import com.codahale.metrics.SlidingWindowReservoir;
+import org.apache.flink.dropwizard.metrics.DropwizardHistogramWrapper;
+import org.apache.flink.dropwizard.metrics.DropwizardMeterWrapper;
+import org.apache.flink.metrics.Histogram;
+import org.apache.flink.metrics.Meter;
+import org.apache.flink.metrics.MetricGroup;
+
+/**
+ * Metrics for flink stream write (including append write, normal/bucket stream write etc.).
+ * Used in subclasses of {@link AbstractStreamWriteFunction}.
+ */
+public class FlinkStreamWriteMetrics extends HoodieFlinkMetrics {
+  private static final String DATA_FLUSH_KEY = "data_flush";
+  private static final String FILE_FLUSH_KEY = "file_flush";
+  private static final String HANDLE_CREATION_KEY = "handle_creation";
+
+  /**
+   * Flush data costs during checkpoint.
+   */
+  private long dataFlushCosts;
+
+  /**
+   * Number of records written in during a checkpoint window.
+   */
+  protected long writtenRecords;
+
+  /**
+   * Current write buffer size in StreamWriteFunction.
+   */
+  private long writeBufferedSize;
+
+  /**
+   * Total costs for closing write handles during a checkpoint window.
+   */
+  private long fileFlushTotalCosts;
+
+  /**
+   * Number of handles opened during a checkpoint window. Increased with partition number/bucket number etc.
+   */
+  private long numOfOpenHandle;
+
+  /**
+   * Number of files written during a checkpoint window.
+   */
+  private long numOfFilesWritten;
+
+  /**
+   * Number of records written per seconds.
+   */
+  protected final Meter recordWrittenPerSecond;
+
+  /**
+   * Number of write handle switches per seconds.
+   */
+  private final Meter handleSwitchPerSecond;
+
+  /**
+   * Cost of write handle creation.
+   */
+  private final Histogram handleCreationCosts;
+
+  /**
+   * Cost of a file flush.
+   */
+  private final Histogram fileFlushCost;
+
+  public FlinkStreamWriteMetrics(MetricGroup metricGroup) {
+    super(metricGroup);
+    this.recordWrittenPerSecond = new DropwizardMeterWrapper(new com.codahale.metrics.Meter());
+    this.handleSwitchPerSecond = new DropwizardMeterWrapper(new com.codahale.metrics.Meter());
+    this.handleCreationCosts = new DropwizardHistogramWrapper(new com.codahale.metrics.Histogram(new SlidingWindowReservoir(100)));
+    this.fileFlushCost = new DropwizardHistogramWrapper(new com.codahale.metrics.Histogram(new SlidingWindowReservoir(100)));
+  }
+
+  @Override
+  public void registerMetrics() {
+    metricGroup.meter("recordWrittenPerSecond", recordWrittenPerSecond);
+    metricGroup.gauge("currentCommitWrittenRecords", () -> writtenRecords);
+    metricGroup.gauge("dataFlushCosts", () -> dataFlushCosts);
+    metricGroup.gauge("writeBufferedSize", () -> writeBufferedSize);
+
+    metricGroup.gauge("fileFlushTotalCosts", () -> fileFlushTotalCosts);
+    metricGroup.gauge("numOfFilesWritten", () -> numOfFilesWritten);
+    metricGroup.gauge("numOfOpenHandle", () -> numOfOpenHandle);
+
+    metricGroup.meter("handleSwitchPerSecond", handleSwitchPerSecond);
+
+    metricGroup.histogram("handleCreationCosts", handleCreationCosts);
+    metricGroup.histogram("handleCloseCosts", fileFlushCost);
+  }

Review Comment:
handleCloseCosts -> fileFlushCosts?
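As background for the two histograms in the quoted constructor: Dropwizard's `SlidingWindowReservoir(100)` keeps only the most recent 100 measurements, so the handle-creation and file-flush histograms describe recent behaviour rather than the whole job lifetime. The sketch below shows that idea with a plain ring buffer. `LastNWindow` is a hypothetical class written for illustration only; it is not Dropwizard's implementation and is not thread-safe.

```java
// Hypothetical illustration of a "last N measurements" window.
final class LastNWindow {
  private final long[] values;
  private long count; // total number of updates seen so far

  LastNWindow(int size) {
    this.values = new long[size];
  }

  // Overwrites the oldest slot once the window is full.
  void update(long value) {
    values[(int) (count++ % values.length)] = value;
  }

  // Number of measurements currently retained.
  int size() {
    return (int) Math.min(count, values.length);
  }

  // Largest retained measurement, or 0 when the window is empty.
  long max() {
    long max = 0;
    for (int i = 0; i < size(); i++) {
      max = i == 0 ? values[i] : Math.max(max, values[i]);
    }
    return max;
  }

  // Mean of the retained measurements, or 0.0 when the window is empty.
  double mean() {
    int n = size();
    if (n == 0) {
      return 0.0;
    }
    long sum = 0;
    for (int i = 0; i < n; i++) {
      sum += values[i];
    }
    return (double) sum / n;
  }
}

final class LastNWindowDemo {
  public static void main(String[] args) {
    LastNWindow flushCosts = new LastNWindow(3);
    for (long costMs : new long[] {10, 20, 30, 40}) {
      flushCosts.update(costMs);
    }
    // The first measurement (10) has been evicted at this point.
    System.out.println("size=" + flushCosts.size() + " mean=" + flushCosts.mean());
  }
}
```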
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1759430102

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* 4b4d15361b3096e27cc3c54c3347c2cf9224c895 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20301)
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1759269289

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* 4b4d15361b3096e27cc3c54c3347c2cf9224c895 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20301)
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1759253106

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* 4b4d15361b3096e27cc3c54c3347c2cf9224c895 UNKNOWN
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1759172124

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1759157812

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
danny0405 commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1759104229

@hudi-bot run azure
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1758820147

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1758813978

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* ee3bbd6595f8a69ecaf53d9ac2b445533958832c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20291)
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
stream2000 commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1758786265

@danny0405 Hi Danny, I have addressed all the comments, PTAL.
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
stream2000 commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1758785514

@hudi-bot run azure
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757979131

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* ee3bbd6595f8a69ecaf53d9ac2b445533958832c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20291)

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757957453

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* ee3bbd6595f8a69ecaf53d9ac2b445533958832c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20291)

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
stream2000 commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757863389

@hudi-bot run azure
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757850550

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* ee3bbd6595f8a69ecaf53d9ac2b445533958832c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20291)

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757505901

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* 8d732e29104fbde138b6ab3fe6df8fb63e10ab07 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20286)
* ee3bbd6595f8a69ecaf53d9ac2b445533958832c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20291)

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757486587

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* 8d732e29104fbde138b6ab3fe6df8fb63e10ab07 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20286)
* ee3bbd6595f8a69ecaf53d9ac2b445533958832c UNKNOWN

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757344904

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* 8d732e29104fbde138b6ab3fe6df8fb63e10ab07 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20286)

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
danny0405 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1354583732

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteFunction.java: ##

@@ -488,11 +505,24 @@ private void flushRemaining(boolean endInput) {
     this.writeStatuses.addAll(writeStatus);
     // blocks flushing until the coordinator starts a new instant
     this.confirming = true;
+
+    writeMetrics.endFlushing();
+    writeMetrics.resetAfterCommit();
+  }
+
+  private void registerMetrics() {
+    MetricGroup metrics = getRuntimeContext().getMetricGroup();
+    writeMetrics = new FlinkStreamWriteMetrics(metrics);
+    writeMetrics.registerMetrics();
   }
   protected List writeBucket(String instant, DataBucket bucket, List records) {
     bucket.preWrite(records);
-    return writeFunction.apply(records, instant);
+    writeMetrics.startHandleClose();
+    List statuses = writeFunction.apply(records, instant);

Review Comment:
Maybe we just rename `checkpointFlush` -> `dataFlush` and `singleFileFlush` -> `fileFlush`?
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
danny0405 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1354581902

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/append/AppendWriteFunction.java: ##

@@ -155,5 +168,15 @@ private void flushData(boolean endInput) {
     this.writeStatuses.addAll(writeStatus);
     // blocks flushing until the coordinator starts a new instant
     this.confirming = true;
+
+    writeMetrics.endCheckpointFlushing();
+    LOG.info("Flushing costs: {} ms", writeMetrics.getCheckpointFlushCosts());
+    writeMetrics.resetAfterCommit();

Review Comment:
We had better avoid logging on every data flush.
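One way to read the comment above: the flush cost can be exposed as a polled value instead of being written to the log on every flush. The following is a minimal sketch of that idea under stated assumptions. `DataFlushCostSketch`, its method names, and the injected clock are hypothetical and are not the PR's `FlinkStreamWriteMetrics`; a plain `LongSupplier` stands in for whatever gauge type the metrics system provides.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper for illustration; not the class added by the PR. */
public class DataFlushCostSketch {
  // Injected clock (milliseconds) so the sketch can be exercised deterministically.
  private final LongSupplier clockMs;
  private long flushStartMs = -1L;
  private volatile long lastFlushCostMs = 0L;

  public DataFlushCostSketch(LongSupplier clockMs) {
    this.clockMs = clockMs;
  }

  /** Marks the beginning of a data flush. */
  public void startDataFlush() {
    flushStartMs = clockMs.getAsLong();
  }

  /** Records the elapsed time of the flush; does nothing if no flush was started. */
  public void endDataFlush() {
    if (flushStartMs >= 0) {
      lastFlushCostMs = clockMs.getAsLong() - flushStartMs;
      flushStartMs = -1L;
    }
  }

  /** Value a metrics reporter could poll, instead of a log line per flush. */
  public LongSupplier flushCostGauge() {
    return () -> lastFlushCostMs;
  }
}
```

With this shape the write path only updates a field, and the reporting frequency is decided by whoever polls the supplier.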
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757038861

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* a9b387e611bdc9c492a27c6adffe2bf74662be96 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19956)
* 8d732e29104fbde138b6ab3fe6df8fb63e10ab07 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20286)

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
hudi-bot commented on PR #9118:
URL: https://github.com/apache/hudi/pull/9118#issuecomment-1757025319

## CI report:

* f6d7dd97c73898206da91b17144326a7dbbffae8 UNKNOWN
* c62db1fdf94ee2c1f9b9e539f7a4b1bb866beb7e UNKNOWN
* a9b387e611bdc9c492a27c6adffe2bf74662be96 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19956)
* 8d732e29104fbde138b6ab3fe6df8fb63e10ab07 UNKNOWN

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
stream2000 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1354278561

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteFunction.java: ##

@@ -385,16 +393,24 @@ private String getBucketID(HoodieRecord record) {
    * @param value HoodieRecord
    */
   protected void bufferRecord(HoodieRecord value) {
+    writeMetrics.markRecordIn();
     final String bucketID = getBucketID(value);
     DataBucket bucket = this.buckets.computeIfAbsent(bucketID,
-        k -> new DataBucket(this.config.getDouble(FlinkOptions.WRITE_BATCH_SIZE), value));
+        k -> {
+          // create a new bucket and update metrics
+          writeMetrics.increaseNumOfOpenHandle();
+          return new DataBucket(this.config.getDouble(FlinkOptions.WRITE_BATCH_SIZE), value);

Review Comment:
Yes, for stream write we only create the handle when flushing buckets. To avoid confusion, removed the metric here, since for stream write the `numOfFilesWritten` metric is enough.

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteFunction.java: ##

@@ -488,11 +505,24 @@ private void flushRemaining(boolean endInput) {
     this.writeStatuses.addAll(writeStatus);
     // blocks flushing until the coordinator starts a new instant
     this.confirming = true;
+
+    writeMetrics.endFlushing();
+    writeMetrics.resetAfterCommit();
+  }
+
+  private void registerMetrics() {
+    MetricGroup metrics = getRuntimeContext().getMetricGroup();
+    writeMetrics = new FlinkStreamWriteMetrics(metrics);
+    writeMetrics.registerMetrics();
   }
   protected List writeBucket(String instant, DataBucket bucket, List records) {
     bucket.preWrite(records);
-    return writeFunction.apply(records, instant);
+    writeMetrics.startHandleClose();
+    List statuses = writeFunction.apply(records, instant);

Review Comment:
This name comes from the append write handle close, which is actually a file flush. Renamed it to `singleFileFlush`, which makes sense for both stream write and append write.
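The per-file timing discussed in this reply can be sketched as a wrapper around the write function. This is an illustrative sketch only: `FileFlushTimingSketch`, the generic types, and the injected clock are assumptions made for the example, and the PR's actual metric class and method names may differ.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.LongSupplier;

/** Hypothetical wrapper for illustration; not the PR's implementation. */
public class FileFlushTimingSketch<R, S> {
  // Stand-in for the write function: (records, instant) -> write statuses.
  private final BiFunction<List<R>, String, List<S>> writeFunction;
  // Injected clock in milliseconds, so timing can be tested deterministically.
  private final LongSupplier clockMs;
  private long totalFileFlushCostMs = 0L;

  public FileFlushTimingSketch(BiFunction<List<R>, String, List<S>> writeFunction,
                               LongSupplier clockMs) {
    this.writeFunction = writeFunction;
    this.clockMs = clockMs;
  }

  /** Writes one bucket and adds the elapsed time to the running total. */
  public List<S> writeBucket(String instant, List<R> records) {
    long start = clockMs.getAsLong();
    try {
      return writeFunction.apply(records, instant);
    } finally {
      // Recorded even if the write throws, so failed flushes are still timed.
      totalFileFlushCostMs += clockMs.getAsLong() - start;
    }
  }

  public long getTotalFileFlushCostMs() {
    return totalFileFlushCostMs;
  }
}
```

Because the timing wraps the write call itself, the same measurement applies whether the caller is a stream write or an append write, which is the point of the neutral name.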
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
danny0405 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1353905841

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteFunction.java: ##

@@ -385,16 +393,24 @@ private String getBucketID(HoodieRecord record) {
    * @param value HoodieRecord
    */
   protected void bufferRecord(HoodieRecord value) {
+    writeMetrics.markRecordIn();
     final String bucketID = getBucketID(value);
     DataBucket bucket = this.buckets.computeIfAbsent(bucketID,
-        k -> new DataBucket(this.config.getDouble(FlinkOptions.WRITE_BATCH_SIZE), value));
+        k -> {
+          // create a new bucket and update metrics
+          writeMetrics.increaseNumOfOpenHandle();
+          return new DataBucket(this.config.getDouble(FlinkOptions.WRITE_BATCH_SIZE), value);

Review Comment:
This is just an in-memory bucket; it does not open a file handle.
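The distinction raised here can be illustrated with a small model: buffering a record only allocates an in-memory bucket, while a file-related counter should move at flush time. This is a sketch under assumptions: `BucketBufferSketch` is hypothetical, records are plain strings, and it assumes one file is written per non-empty bucket flush, which may not match the real writer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model for illustration; not Hudi's StreamWriteFunction. */
public class BucketBufferSketch {
  private final Map<String, List<String>> buckets = new HashMap<>();
  private long numOfFilesWritten = 0L;

  /** Buffers a record; creating a bucket here allocates memory only. */
  public void bufferRecord(String bucketId, String record) {
    buckets.computeIfAbsent(bucketId, k -> new ArrayList<>()).add(record);
  }

  /** Flushes every bucket; this is where a real writer would open and close handles. */
  public void flushAll() {
    for (List<String> records : buckets.values()) {
      if (!records.isEmpty()) {
        // Assumption: one file per flushed bucket.
        numOfFilesWritten++;
      }
    }
    buckets.clear();
  }

  public int bufferedBucketCount() {
    return buckets.size();
  }

  public long getNumOfFilesWritten() {
    return numOfFilesWritten;
  }
}
```

In this model an "open handle" counter incremented inside `computeIfAbsent` would report buckets, not handles, which is why counting at flush time is the less misleading choice.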
Re: [PR] [HUDI-2141] Support flink stream write metrics [hudi]
danny0405 commented on code in PR #9118:
URL: https://github.com/apache/hudi/pull/9118#discussion_r1353905559

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteFunction.java: ##

@@ -488,11 +505,24 @@ private void flushRemaining(boolean endInput) {
     this.writeStatuses.addAll(writeStatus);
     // blocks flushing until the coordinator starts a new instant
     this.confirming = true;
+
+    writeMetrics.endFlushing();
+    writeMetrics.resetAfterCommit();
+  }
+
+  private void registerMetrics() {
+    MetricGroup metrics = getRuntimeContext().getMetricGroup();
+    writeMetrics = new FlinkStreamWriteMetrics(metrics);
+    writeMetrics.registerMetrics();
   }
   protected List writeBucket(String instant, DataBucket bucket, List records) {
     bucket.preWrite(records);
-    return writeFunction.apply(records, instant);
+    writeMetrics.startHandleClose();
+    List statuses = writeFunction.apply(records, instant);

Review Comment:
What does "handle close" mean here?