darenwkt opened a new pull request, #202:
URL: https://github.com/apache/flink-connector-aws/pull/202

   
   ## Purpose of the change
   
   Implement the Amazon CloudWatch Metric Sink Connector, which includes the following:
   - DataStream and Table API support
   - Integration tests with a LocalStack container
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
   #### Unit Test
   - Added unit tests
   
   #### IT Test - Added integration tests for end-to-end deployment
   - Ran the IT tests locally
   ```
   mvn clean verify -Prun-end-to-end-tests -DdistDir=/Users/darenwkt/Downloads/flink-1.19.2
   ```
   - Verified that the tests pass for both DataStream and Table API
   ```
   [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 71.197 s - in org.apache.flink.connector.cloudwatch.table.test.CloudWatchTableAPIITCase
   [INFO] Running org.apache.flink.connector.cloudwatch.sink.test.CloudWatchSinkITCase
   [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 34.422 s - in org.apache.flink.connector.cloudwatch.sink.test.CloudWatchSinkITCase
   [INFO] 
   [INFO] Results:
   [INFO] 
   [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0
   ```
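   
   For reviewers unfamiliar with the pattern, the sketch below shows roughly how a LocalStack-based IT case can be wired up with Testcontainers and the AWS SDK v2. It is illustrative only: the class name, image tag, and verification step are assumptions made for this description, not the PR's actual test code.
   
   ```
   import org.testcontainers.containers.localstack.LocalStackContainer;
   import org.testcontainers.utility.DockerImageName;
   
   import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
   import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
   import software.amazon.awssdk.regions.Region;
   import software.amazon.awssdk.services.cloudwatch.CloudWatchClient;
   import software.amazon.awssdk.services.cloudwatch.model.ListMetricsRequest;
   
   // Illustrative harness only; not the PR's actual test code.
   public class CloudWatchLocalStackSketch {
       public static void main(String[] args) {
           try (LocalStackContainer localstack =
                   new LocalStackContainer(DockerImageName.parse("localstack/localstack"))
                           .withServices(LocalStackContainer.Service.CLOUDWATCH)) {
               localstack.start();
   
               // Point an AWS SDK v2 client at the container instead of real AWS.
               try (CloudWatchClient client = CloudWatchClient.builder()
                       .endpointOverride(localstack.getEndpointOverride(LocalStackContainer.Service.CLOUDWATCH))
                       .region(Region.of(localstack.getRegion()))
                       .credentialsProvider(StaticCredentialsProvider.create(AwsBasicCredentials.create(
                               localstack.getAccessKey(), localstack.getSecretKey())))
                       .build()) {
   
                   // ... run the Flink job under test against the same endpoint, then verify:
                   System.out.println(client.listMetrics(
                           ListMetricsRequest.builder().namespace("CloudWatchSinkTest").build()).metrics());
               }
           }
       }
   }
   ```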
   #### DataStream API Manual Test - Manually verified by running the CloudWatch connector on a local Flink cluster.
   
   - Created the Flink job code
   ```
   StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
   final Map<String, Properties> applicationParameters = loadApplicationProperties(env);
   
   env.fromSource(createDataGeneratorSource(applicationParameters.get("DataGen")),
                   WatermarkStrategy.noWatermarks(), "DataGen")
           .uid("data-gen")
           .setParallelism(1)
           .disableChaining()
           .sinkTo(CloudWatchSink.<TemperatureSample>builder()
                   .setNamespace("CloudWatchSinkTest")
                   .setCloudWatchClientProperties(applicationParameters.get("CloudWatchSink"))
                   .setElementConverter(new MetricWriteRequestElementConverter<>() {
                       @Override
                       public MetricWriteRequest apply(TemperatureSample value, SinkWriter.Context context) {
                           return MetricWriteRequest.builder()
                                   .withMetricName("temperature")
                                   .addDimension("roomId", "testRoomId")
                                   .addDimension("sensorId", "testSensorId")
                                   .addValue(value.getTemperature())
                                   .addCount(1d)
                                   .withTimestamp(Instant.ofEpochMilli(value.getTimestamp()))
                                   .build();
                       }
                   })
                   .build())
           .uid("cloudwatch-sink").name("CloudWatchSink");
   
   env.enableCheckpointing(5000);
   
   env.execute("CloudWatch sink example");
   ```
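   
   The snippet above references two helpers that are not shown. Below is a minimal sketch of what they could look like, using Flink's built-in `DataGeneratorSource` from flink-connector-datagen; the `TemperatureSample` fields and the generator settings are illustrative assumptions, not part of this PR.
   
   ```
   import org.apache.flink.api.common.typeinfo.TypeInformation;
   import org.apache.flink.api.connector.source.util.ratelimit.RateLimiterStrategy;
   import org.apache.flink.connector.datagen.source.DataGeneratorSource;
   import org.apache.flink.connector.datagen.source.GeneratorFunction;
   
   import java.util.Properties;
   import java.util.concurrent.ThreadLocalRandom;
   
   // Hypothetical event type; field names are assumptions, not part of this PR.
   public class TemperatureSample {
       public double temperature;
       public long timestamp;
   
       public TemperatureSample() {}
   
       public TemperatureSample(double temperature, long timestamp) {
           this.temperature = temperature;
           this.timestamp = timestamp;
       }
   
       public double getTemperature() { return temperature; }
       public long getTimestamp() { return timestamp; }
   }
   
   // In the job class: emits one random sample per second.
   private static DataGeneratorSource<TemperatureSample> createDataGeneratorSource(Properties props) {
       return new DataGeneratorSource<>(
               (GeneratorFunction<Long, TemperatureSample>) index -> new TemperatureSample(
                       20 + ThreadLocalRandom.current().nextDouble(10), System.currentTimeMillis()),
               Long.MAX_VALUE,
               RateLimiterStrategy.perSecond(1),
               TypeInformation.of(TemperatureSample.class));
   }
   ```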
   
   - Built the jar and submitted the job
   ```
   ./bin/flink run cloudwatch-connector-example-1.0.jar
   Job has been submitted with JobID f417eccc9d9204604ec51eeed24196e0
   ```
   
   - Verified job is running and checkpointing successfully
   ![Screenshot 2025-04-21 at 21 29 12](https://github.com/user-attachments/assets/bba8ceb5-ff22-43a7-a3d0-d87cedb26319)
   
   - Verified CloudWatch received the metric successfully
   ![Screenshot 2025-04-21 at 21 31 09](https://github.com/user-attachments/assets/ba75dfd3-f2d0-499e-be88-b45218238d50)
   
   
   
   #### Table API Manual Test - Manually verified by running the CloudWatch connector on a local Flink cluster.
   
   - Built the SQL jar and started the sql-client with it
   ```
   ./bin/sql-client.sh --jar flink-sql-connector-cloudwatch-5.1-SNAPSHOT.jar
   ```
   
   - Created Sink Table
   ```
   Flink SQL>
   > CREATE TABLE CloudWatchTable
   > (
   >     `my_metric_name`   STRING,
   >     `my_dim` STRING,
   >     `sample_value` DOUBLE,
   >     `sample_count` DOUBLE,
   >     `unit`             STRING,
   >     `storage_res`      INT,
   >     `stats_max` DOUBLE,
   >     `stats_min` DOUBLE,
   >     `stats_sum` DOUBLE,
   >     `stats_count` DOUBLE
   > )
   >     WITH (
   >         'connector' = 'cloudwatch',
   >         'aws.region' = 'us-east-1',
   >         'metric.namespace' = 'cw_connector_namespace',
   >         'metric.name.key' = 'my_metric_name',
   >         'metric.dimension.keys' = 'my_dim',
   >         'metric.value.key' = 'sample_value',
   >         'metric.count.key' = 'sample_count',
   >         'metric.unit.key' = 'unit',
   >         'metric.storage-resolution.key' = 'storage_res',
   >         'metric.statistic.max.key' = 'stats_max',
   >         'metric.statistic.min.key' = 'stats_min',
   >         'metric.statistic.sum.key' = 'stats_sum',
   >         'metric.statistic.sample-count.key' = 'stats_count',
   >         'sink.invalid-metric.retry-mode' = 'RETRY'
   >         );
   [INFO] Execute statement succeed.
   ```
   
   - Inserted a sample row into the table
   ```
   Flink SQL> INSERT INTO CloudWatchTable VALUES ('test_metric', 'dim_1', 123, 1, 'Seconds', 60, 999, 1, 10, 11);
   [INFO] Submitting SQL update statement to the cluster...
   [INFO] SQL update statement has been successfully submitted to the cluster:
   Job ID: 90d19a75e7326575b4ea3f4d883091a7
   ```
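   
   For intuition, the inserted row roughly corresponds to the following CloudWatch `PutMetricData` call, sketched against the plain AWS SDK v2 rather than the connector's internals. The column-to-field mapping shown in the comments is inferred from the table options above and is an assumption, not a trace of the actual code path.
   
   ```
   import software.amazon.awssdk.services.cloudwatch.CloudWatchClient;
   import software.amazon.awssdk.services.cloudwatch.model.Dimension;
   import software.amazon.awssdk.services.cloudwatch.model.MetricDatum;
   import software.amazon.awssdk.services.cloudwatch.model.PutMetricDataRequest;
   import software.amazon.awssdk.services.cloudwatch.model.StandardUnit;
   
   public class PutMetricDataSketch {
       public static void main(String[] args) {
           try (CloudWatchClient cw = CloudWatchClient.create()) {
               cw.putMetricData(PutMetricDataRequest.builder()
                       .namespace("cw_connector_namespace")  // 'metric.namespace'
                       .metricData(MetricDatum.builder()
                               .metricName("test_metric")    // my_metric_name
                               .dimensions(Dimension.builder()
                                       .name("my_dim").value("dim_1").build())
                               .values(123.0)                // sample_value
                               .counts(1.0)                  // sample_count
                               .unit(StandardUnit.SECONDS)   // unit
                               .storageResolution(60)        // storage_res
                               .build())
                       .build());
           }
           // The stats_max/min/sum/count columns would populate
           // MetricDatum.statisticValues(StatisticSet...) instead, since a single
           // datum carries either raw values or a statistic set, not both.
       }
   }
   ```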
   
   - Verified job entered FINISHED state successfully
   ![Screenshot 2025-04-21 at 21 09 59](https://github.com/user-attachments/assets/d6b39933-8ddb-4e87-ad63-491f624d5547)
   
   - Verified CloudWatch received the metric successfully
   ![Screenshot 2025-04-21 at 21 10 25](https://github.com/user-attachments/assets/04494336-93a9-47f5-b852-1fb26333fec7)
   
   
   ## Significant changes
   *(Please check any boxes [x] if the answer is "yes". You can first publish 
the PR and check them afterwards, for convenience.)*
   - [ ] Dependencies have been added or upgraded
   - [ ] Public API has been changed (Public API is any class annotated with 
`@Public(Evolving)`)
   - [ ] Serializers have been changed
   - [ ] New feature has been introduced
     - If yes, how is this documented? (not applicable / docs / JavaDocs / not 
documented)
   

