[
https://issues.apache.org/jira/browse/HUDI-5183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ethan Guo updated HUDI-5183:
----------------------------
Component/s: cli
> CloudWatch metrics won't work for CLI
> -------------------------------------
>
> Key: HUDI-5183
> URL: https://issues.apache.org/jira/browse/HUDI-5183
> Project: Apache Hudi
> Issue Type: Bug
> Components: cli
> Reporter: Shawn Chang
> Priority: Major
>
> This appears to have been broken since this commit:
> [https://github.com/apache/hudi/commit/9797fdfbb27ca8f5f06875ad958b597becc27a8d].
> The commit makes the metrics prefix configurable instead of using the table
> name directly, as was done earlier. For the metadata table, the prefix turns
> out empty when metrics are published, because the inference mechanism does
> not find the table name in the HoodieConfig it looks up.
>
> This breaks publishing metrics to CloudWatch, since CloudWatch rejects a
> Dimension with an empty value.
>
> Exception:
>
> {code:java}
> 22/03/07 22:25:07 ERROR CloudWatchReporter: Error reporting metrics to
> CloudWatch. The data in this CloudWatch request may have been discarded, and
> not made it to CloudWatch.
> java.util.concurrent.ExecutionException:
> com.amazonaws.services.cloudwatch.model.MissingRequiredParameterException:
> The parameter MetricData.member.1.Dimensions.member.1.Value is required.
> (Service: AmazonCloudWatch; Status Code: 400; Error Code: MissingParameter;
> Request ID: 9b7c60fe-c872-4b92-8518-02bbe5e5e5b9; Proxy: null)
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> ~[?:1.8.0_322]
> at java.util.concurrent.FutureTask.get(FutureTask.java:206) ~[?:1.8.0_322]
> at
> org.apache.hudi.aws.cloudwatch.CloudWatchReporter.report(CloudWatchReporter.java:234)
> ~[hudi-spark3-bundle_2.12-0.10.1-amzn-0-SNAPSHOT.jar:0.10.1-amzn-0-SNAPSHOT]
> at
> org.apache.hudi.aws.cloudwatch.CloudWatchReporter.report(CloudWatchReporter.java:212)
> ~[hudi-spark3-bundle_2.12-0.10.1-amzn-0-SNAPSHOT.jar:0.10.1-amzn-0-SNAPSHOT]
> at
> org.apache.hudi.com.codahale.metrics.ScheduledReporter.report(ScheduledReporter.java:237)
> ~[hudi-spark3-bundle_2.12-0.10.1-amzn-0-SNAPSHOT.jar:0.10.1-amzn-0-SNAPSHOT]
> at
> org.apache.hudi.com.codahale.metrics.ScheduledReporter.lambda$start$0(ScheduledReporter.java:177)
> ~[hudi-spark3-bundle_2.12-0.10.1-amzn-0-SNAPSHOT.jar:0.10.1-amzn-0-SNAPSHOT]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> [?:1.8.0_322]
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> [?:1.8.0_322]
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> [?:1.8.0_322]
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> [?:1.8.0_322]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [?:1.8.0_322]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [?:1.8.0_322]
> at java.lang.Thread.run(Thread.java:750) [?:1.8.0_322]
> Caused by:
> com.amazonaws.services.cloudwatch.model.MissingRequiredParameterException:
> The parameter MetricData.member.1.Dimensions.member.1.Value is required.
> (Service: AmazonCloudWatch; Status Code: 400; Error Code: MissingParameter;
> Request ID: 9b7c60fe-c872-4b92-8518-02bbe5e5e5b9; Proxy: null)
> at
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1862)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1415)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1384)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1154)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:811)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at
> com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at
> com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.doInvoke(AmazonCloudWatchClient.java:3084)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at
> com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.invoke(AmazonCloudWatchClient.java:3051)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at
> com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.invoke(AmazonCloudWatchClient.java:3040)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at
> com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.executePutMetricData(AmazonCloudWatchClient.java:2559)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at
> com.amazonaws.services.cloudwatch.AmazonCloudWatchAsyncClient$30.call(AmazonCloudWatchAsyncClient.java:1314)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at
> com.amazonaws.services.cloudwatch.AmazonCloudWatchAsyncClient$30.call(AmazonCloudWatchAsyncClient.java:1308)
> ~[aws-java-sdk-bundle-1.12.31.jar:?]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_322]
> ... 3 more {code}
>
>
>
> Reproduction:
> # Create a Hudi table; any table works
> # Start hudi-cli with metrics turned on (hoodie.metrics.on=true) and the
> metrics reporter type set to CloudWatch (hoodie.metrics.reporter.type=CLOUDWATCH)
> # Run commands like `cleans run` or `cluster schedule`
>
> Thoughts:
> - It would be easy to unblock users by passing the table name to sparkArgs in
> classes like `CleansCommand`. Ideally, though, `HoodieWriteConfig` would set
> default values such as the table name automatically by referring to
> `HoodieTableConfig`.
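
The fallback suggested in the Thoughts above could look roughly like the sketch below. This is only an illustration under stated assumptions, not Hudi's implementation: `TableNameFallback` and `resolveTableName` are hypothetical names, both configs are modeled as plain maps rather than `HoodieWriteConfig` and `HoodieTableConfig`, and the key `hoodie.table.name` is assumed to hold the table name in both. It prefers the name from the write config, falls back to the table config, and fails fast instead of letting an empty value reach the CloudWatch Dimension.

```java
import java.util.Map;

// Hypothetical helper, not part of Hudi: it only illustrates the fallback idea.
final class TableNameFallback {
    // Assumed property key; the real configs are modeled as plain maps here.
    static final String TABLE_NAME_KEY = "hoodie.table.name";

    private TableNameFallback() {}

    /**
     * Returns the table name from the write properties if present and non-blank,
     * otherwise from the table properties. Throws if neither has a usable value,
     * so an empty name never becomes a metrics prefix or Dimension value.
     */
    static String resolveTableName(Map<String, String> writeProps,
                                   Map<String, String> tableProps) {
        String fromWrite = writeProps.get(TABLE_NAME_KEY);
        if (fromWrite != null && !fromWrite.trim().isEmpty()) {
            return fromWrite.trim();
        }
        String fromTable = tableProps.get(TABLE_NAME_KEY);
        if (fromTable != null && !fromTable.trim().isEmpty()) {
            return fromTable.trim();
        }
        throw new IllegalStateException(
            "No table name found under '" + TABLE_NAME_KEY + "'");
    }
}
```

With a check of this kind in one place, commands such as `cleans run` would not each need to pass the table name through sparkArgs, and a missing name would surface as a clear configuration error rather than a rejected CloudWatch request.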