[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-09-08 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601805#comment-17601805
 ] 

Danny Cranmer commented on FLINK-24229:
---

Hey [~Gusev], I missed the notification for this and actually only saw it 
yesterday. Apologies. I will start looking right away,

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.17.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-09-08 Thread Yuri Gusev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601786#comment-17601786
 ] 

Yuri Gusev commented on FLINK-24229:


Hi [~dannycranmer],

We opened a PR to the new repo a while ago. Do you have time to look into it?

Best regards,
Yuri

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.17.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-08-04 Thread Yuri Gusev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575236#comment-17575236
 ] 

Yuri Gusev commented on FLINK-24229:


Hi Danny, 

Thank you for creating the repo, we'll open a PR there. Sorry for the delay, I 
haven't looked into it for a while.

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-07-28 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17572640#comment-17572640
 ] 

Danny Cranmer commented on FLINK-24229:
---

[~Gusev] the FLIP has been accepted and the new connector repository has been 
created [1]. Can you please open a PR against the new repo? Thanks

[1] https://github.com/apache/flink-connector-dynamodb

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-07-15 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17567340#comment-17567340
 ] 

Danny Cranmer commented on FLINK-24229:
---

I have published the FLIP [1], I look forward to your feedback.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-252%3A+Amazon+DynamoDB+Sink+Connector

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-07-15 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17567247#comment-17567247
 ] 

Danny Cranmer commented on FLINK-24229:
---

[~prasaanth07] unfortunately I cannot commit to anything yet, the connector 
still needs to be approved by the Flink community. 

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-07-14 Thread Prasaanth (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566981#comment-17566981
 ] 

Prasaanth commented on FLINK-24229:
---

[~dannycranmer] : That is excellent, glad to know we may get the DynamoDB 
connector released independent of Flink releases lifecycle. Would there be an 
ETA for the connector release you have in mind? 

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-07-12 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566180#comment-17566180
 ] 

Danny Cranmer commented on FLINK-24229:
---

[~prasaanth07] Flink connectors are being moved to new repositories with 
dedicated release lifecycles and independent versioning strategies. I plan on 
proposing to target Flink 1.15+ with this connector, so once released you can 
start using it right away on older Flink versions.

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-07-12 Thread Prasaanth (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566154#comment-17566154
 ] 

Prasaanth commented on FLINK-24229:
---

Hi all, excited to see Flink supporting a DynamoDB sink natively. Is 1.17 the 
Flink version where we can expect this? Looks like Flink 1.16 has a release 
freeze on the 25th of July this year. 

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-07-08 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564342#comment-17564342
 ] 

Danny Cranmer commented on FLINK-24229:
---

[~Gusev] I think we certainly need to hide the {{ElementConverter}}, since the 
DDB sink needs to serialise and deserialise records for checkpointing. If you 
expose the {{ElementConverter}} this would allow users to create objects that 
are incompatible with the checkpoint serde. But generally speaking, yes let's 
move discussion to the new repo

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-07-08 Thread Yuri Gusev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564322#comment-17564322
 ] 

Yuri Gusev commented on FLINK-24229:


Ok thanks, we had few open questions in progress (like hiding or not 
ElementConverter). But it may be better to start fresh on a new PR

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-07-08 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564293#comment-17564293
 ] 

Danny Cranmer commented on FLINK-24229:
---

[~Gusev] we will need to move the PR since it will be a different repository

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-07-08 Thread Yuri Gusev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564286#comment-17564286
 ] 

Yuri Gusev commented on FLINK-24229:


Hi [~dannycranmer],

Sure happy to do this once new repo is available. Would the review discussion 
in the current PR or shall we take a fresh look in the new repo already?




> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-07-08 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564283#comment-17564283
 ] 

Danny Cranmer commented on FLINK-24229:
---

Hey [~Gusev], I have the FLIP ready and currently under internal review. I hope 
to publish it soon. Since we will be creating a new connector repository for 
this, once it is approved and created we will need to port your code to the new 
repository. Will you be able to do this?

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-07-08 Thread Yuri Gusev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17564281#comment-17564281
 ] 

Yuri Gusev commented on FLINK-24229:


Hi [~dannycranmer] & Team any updates on when this could be moved forward. I'll 
move on with merging functionality to hide element converter for now.

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-06-02 Thread Yuri Gusev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17545434#comment-17545434
 ] 

Yuri Gusev commented on FLINK-24229:


Thanks [~dannycranmer]. 

We fixed most of the comments, there is also a PR to hide ElementConverter for 
review separately,but we can merge into this one.

One last missing part is client creation, I'll try to fix it soon. But most of 
it available for re-review. Would be nice to get it out soon before too many 
changes again, we re-wrote it couple of times already :)

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-05-26 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17542386#comment-17542386
 ] 

Danny Cranmer commented on FLINK-24229:
---

[~martijnvisser] makes sense. Thanks I will do this

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-05-25 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17541981#comment-17541981
 ] 

Martijn Visser commented on FLINK-24229:


[~dannycranmer] [~Gusev] Please be aware that we can't add new connectors to 
Flink's repository anymore, since we've agreed to move out connectors to their 
own individual repositories. For this addition, we should have a small FLIP 
(see https://cwiki.apache.org/confluence/display/FLINK/FLIP+Connector+Template) 
and a vote, so we can create apache/flink-connector-dynamodb and host the code 
over there. 

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-05-19 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539597#comment-17539597
 ] 

Danny Cranmer commented on FLINK-24229:
---

[~Gusev] apologies for the delay in reviewing your latest commits. Just an 
update to say I plan on working on this for the Flink 1.16 deadline.

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-03-11 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504973#comment-17504973
 ] 

Danny Cranmer commented on FLINK-24229:
---

I was planning on reviewing this against master once the 1.15 branch had been 
cut. I can take a first pass next week if that works [~Gusev]?

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-03-11 Thread Zichen Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504939#comment-17504939
 ] 

Zichen Liu commented on FLINK-24229:


Would you mind elaborating a little on why the changes recommended on the 
sample PR would impact sink performance? I only had a nitty 
[comment|[https://github.com/YuriGusev/flink/commit/de83346dc24367ff1101ae13934dfcf1fc649de2#r68475400].]
 [~dannycranmer] might be better placed to review.

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-03-11 Thread Yuri Gusev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504922#comment-17504922
 ] 

Yuri Gusev commented on FLINK-24229:


Ok thank you [~martijnvisser] and [~CrynetLogistics] .

There were some changes on the master and no there are conflicts so I'll rebase 
our job on top of the latest changes.

[~CrynetLogistics]  would you mind taking a look at the question we had in the 
PR about elementConverter being hidden. He have a sample PR ([see Nir's 
comment|http://example.com]https://github.com/apache/flink/pull/18518#discussion_r799306670)
 on what it would take to hide elementConverter. But that might impact 
performance of the sink quite a lot.

We will wait with the documentation ticket until we agree with the reviewers on 
the ElementConverter changes. 

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-03-11 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504912#comment-17504912
 ] 

Martijn Visser commented on FLINK-24229:


Most of the committers are currently fixing bugs and doing release testing due 
to the upcoming 1.15 release. After that, I'm sure there will be more reviewers 
available. 

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-03-11 Thread Zichen Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504909#comment-17504909
 ] 

Zichen Liu commented on FLINK-24229:


Maybe [~arvid] ?

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-03-11 Thread Yuri Gusev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504870#comment-17504870
 ] 

Yuri Gusev commented on FLINK-24229:


Do you know who else can we ask for reviews? [~dannycranmer] [~CrynetLogistics]

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Yuri Gusev
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-02-09 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17489602#comment-17489602
 ] 

Danny Cranmer commented on FLINK-24229:
---

[~Gusev] unfortunately I will not have the capacity to review this PR before 
1.15 freeze. We have a bunch of outstanding work to do on KDS/KDF. I will 
update the 1.15 project plan to reflect this, maybe another committer can take 
a look?

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-02-04 Thread Danny Cranmer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17486873#comment-17486873
 ] 

Danny Cranmer commented on FLINK-24229:
---

[~Gusev] I plan to start on this review next week. I will do my best to get it 
in before the 1.15 feature freeze

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-02-02 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17485638#comment-17485638
 ] 

Martijn Visser commented on FLINK-24229:


[~Gusev] It depends if [~dannycranmer] has time to review it and merge it 
before the release branch is cut. 

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-02-02 Thread Yuri Gusev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17485631#comment-17485631
 ] 

Yuri Gusev commented on FLINK-24229:


Hi [~MartijnVisser] ,

Do you think we can merge this before the release? The PR is ready and got 
several reviews already. Is there anything missing?

For documentation I created a separate ticket, will try to finish up this in 
next two days (waited for reviews to happen).

 

 

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-01-26 Thread Zichen Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17482452#comment-17482452
 ] 

Zichen Liu commented on FLINK-24229:


(y)

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-01-26 Thread Yuri Gusev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17482364#comment-17482364
 ] 

Yuri Gusev commented on FLINK-24229:


Pull request is created [~CrynetLogistics], would appreciate your review.

FYI [~nir.tsruya] 

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-01-25 Thread Zichen Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17481801#comment-17481801
 ] 

Zichen Liu commented on FLINK-24229:


Hi [~Gusev] 

Thank you so much for the contribution. I will be available and ready to help 
with & review the pull request.

Can't wait. I really hope we can make it for the release.

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-01-24 Thread Yuri Gusev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17481375#comment-17481375
 ] 

Yuri Gusev commented on FLINK-24229:


Hi [~MartijnVisser] [~CrynetLogistics] ,

I'm sorry for the delay with this. Implementation is ready, we've just finished 
integration tests as well. Doing last refinements to add javadoc and rebase and 
will open PR hopefully tomorrow.

Hope we are still in time to make it into release.

 

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-01-23 Thread Nir Tsruya (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17480586#comment-17480586
 ] 

Nir Tsruya commented on FLINK-24229:


Hey [~CrynetLogistics] , I created a pull request for the fatal exception 
handler [https://github.com/apache/flink/pull/18449]

 

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-01-17 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17477293#comment-17477293
 ] 

Martijn Visser commented on FLINK-24229:


[~CrynetLogistics] Whatever is preferred by those involved, I have no 
preference :)

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-01-16 Thread Nir Tsruya (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476760#comment-17476760
 ] 

Nir Tsruya commented on FLINK-24229:


[~CrynetLogistics] I created this Jira 
https://issues.apache.org/jira/browse/FLINK-25661 

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-01-14 Thread Zichen Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476288#comment-17476288
 ] 

Zichen Liu commented on FLINK-24229:


[~MartijnVisser] [~Gusev] I'm working to getting the firehose sink in before 
the deadline. But I'm very happy to give a hand on the dynamo db pull request 
review.

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-01-14 Thread Zichen Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476287#comment-17476287
 ] 

Zichen Liu commented on FLINK-24229:


[~nir.tsruya] I think you're right. I was going to tag on any modifications 
needed to `aws-kinesis-data-stream-connector` since I added it not long ago, 
but probably a new Jira might be better.

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-01-14 Thread Nir Tsruya (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476067#comment-17476067
 ] 

Nir Tsruya commented on FLINK-24229:


[~CrynetLogistics] the AsyncSinkWriter is already being used by the 
aws-kinesis-data-stream-connector, and any change in the base class would have 
to involve changes in the implementations. Is it still considered a hotfix?

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2022-01-05 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17469494#comment-17469494
 ] 

Martijn Visser commented on FLINK-24229:


[~CrynetLogistics] [~Gusev] For your info, the plan is that the release-1.15 
branch will cut on February 6th, per 
https://cwiki.apache.org/confluence/display/FLINK/1.15+Release

Would be great if we could get it in, do you think that's feasible? Anything 
that can be done to help out with?

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2021-12-23 Thread Zichen Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17464647#comment-17464647
 ] 

Zichen Liu commented on FLINK-24229:


Hi [~Gusev] 

First, my apologies for not responding earlier :'( It's been quite a hectic 
ride until now.

Thanks for the suggestion, I think it will be a good refactor. Should be all 
good for it to live in a separate PR (that way we can keep them small :)), 
probably good to tag it with [hotfix] so that you don't need to have a Jira for 
it.

Regards

 

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2021-12-06 Thread Yuri Gusev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17453963#comment-17453963
 ] 

Yuri Gusev commented on FLINK-24229:


Now I think I understood your first comment. :) I think moving definition of 
fatalExceptionCons out of constructor into a separate method or allowing to 
pass the fatal error consumer to the constructor would do. Shall we do this 
refactor on the main PR for the AsyncSinkWriter [~CrynetLogistics] ? (or if it 
is merged already we can do in this PR)

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2021-11-29 Thread Yuri Gusev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450520#comment-17450520
 ] 

Yuri Gusev commented on FLINK-24229:



We are working on it, implementation is almost there, will submit a PR for 
review may be this/start of the next week. In our previous implementation we 
deduplicated entries during aggregation of a batch, but now we need to move 
this to the writer itself, because you are doing batching in the 
AsyncSinkWriter.

I'm not sure I followed your answer on the fatal exception behaviour.

What we would like to do is to allow user to define what to do in the end (for 
example after all retries towards DynamoDB for the current batch). 

This is how we achieved it in the old implementation: 
[WriteRequestFailureHandler.java|https://github.com/YuriGusev/flink/blob/FLINK-16504_dynamodb_connector_rebased/flink-connectors/flink-connector-dynamodb/src/main/java/org/apache/flink/streaming/connectors/dynamodb/WriteRequestFailureHandler.java],
 
[DynamoDbSink.java|https://github.com/YuriGusev/flink/blob/8787c343b615602c989fa793e0f4687ef40e530c/flink-connectors/flink-connector-dynamodb/src/main/java/org/apache/flink/streaming/connectors/dynamodb/DynamoDbSink.java#L355]

At the moment it will send the failed records back to the queue and retry 
again, or on fatalException it will propagate the error and stop the 
application.


> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2021-11-24 Thread Zichen Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17448591#comment-17448591
 ] 

Zichen Liu commented on FLINK-24229:


hey [~Gusev] 

Hope you're well. Was wondering how the DynamoDB sink was going and if you 
wanted to discuss anything about it?

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2021-11-12 Thread Zichen Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17442731#comment-17442731
 ] 

Zichen Liu commented on FLINK-24229:


Hey [~nir.tsruya] 

Glad to hear!

I guess users can currently customize fatal exception behaviour in the async 
callback of `submitRequestEntries`, but this improvement would refactor out 
this handling into a separate method. I guess this is the main advantage you're 
interested in.

Also, it would give the user the possibility of handling these in the mailbox 
thread (though I believe the behaviour would end up being the same).

 

 

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2021-11-12 Thread Nir Tsruya (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17442712#comment-17442712
 ] 

Nir Tsruya commented on FLINK-24229:


hey [~CrynetLogistics] 

We are working on the AsyncSinkWriter implementation, and I can see that there 
are 2 consumers, the 
[fatalExceptionCons|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/AsyncSinkWriter.java#L136]
 and 
[requestResult|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/AsyncSinkWriter.java#L304]
 . As I understand the requestResult consumer will simply add failed write 
requests to the queue for retry. and the fatalExceptions consumer is used when 
a fata error that should not be retried is thrown and so it will simply throw 
the error.

We are interested in adding a functionality that will allow to configure the 
behavior of the fatalExceptions. For example, in case of fatal error, publish 
the record to a DLQ. preferably such behavior would be configurable by the user 
of the AsyncSink.

What do you think?

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2021-11-10 Thread Zichen Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17441623#comment-17441623
 ] 

Zichen Liu commented on FLINK-24229:


Hi [~Gusev] 

Sorry for the delay, everything related to the base sink has been merged.

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2021-11-05 Thread Zichen Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17439185#comment-17439185
 ] 

Zichen Liu commented on FLINK-24229:


Hi [~Gusev] 

Sadly it may be possible that I won't be able to get the PR merged today... 
sorry about that. If you need a reference for what the changes will look like 
(mostly minor), please use the file changes in the PR as a guide.

 

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2021-11-04 Thread Zichen Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438824#comment-17438824
 ] 

Zichen Liu commented on FLINK-24229:


Hey [~Gusev]

That's great, thanks for letting me know. Hope you had a good vacation!! 

I'm going to try to get changes for the base sink though by tomorrow, so as to 
not have the API move by the time you start work. 
[https://github.com/apache/flink/pull/17687]  I'm close to getting it in as the 
reviews for this change have mostly taken place on a separate PR page.

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2021-11-03 Thread Yuri Gusev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437945#comment-17437945
 ] 

Yuri Gusev commented on FLINK-24229:


Hi [~CrynetLogistics], sorry for not getting back earlier, I was on a vacation 
and just returned.
If it is December I hope we can catch this.

I will be working on it this Friday, will give an update next week.



> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2021-10-29 Thread Zichen Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435986#comment-17435986
 ] 

Zichen Liu commented on FLINK-24229:


Hi [~Gusev],

Just wanted to check in with you to see how it's going with the adoption of the 
Async Sink base? Is there anything I can do to help?

Also not sure if you got a response to your question about 1.15 release, but I 
heard it was December time?

Regards

Zichen

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2021-10-12 Thread Zichen Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427606#comment-17427606
 ] 

Zichen Liu commented on FLINK-24229:


Great to hear, thank you indeed.

I'm not sure what the deadline for the 1.15 release is, perhaps, 
[~dannycranmer] would know?

Regards

Zichen

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2021-10-11 Thread Yuri Gusev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17427308#comment-17427308
 ] 

Yuri Gusev commented on FLINK-24229:


Yes, I think it should not be hard to adopt it to the new async sink base. 

Thanks [~CrynetLogistics], then will try to do it and come back for 
input/reviews. 

Do you know when is the deadline for 1.15 release?

Best regards,
Yuri Gusev

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2021-10-08 Thread Zichen Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426280#comment-17426280
 ] 

Zichen Liu commented on FLINK-24229:


Hi [~Gusev],

Apologies for the late reply, just returned recently. Glad to see there is 
demand for a DynamoDB Sink for Flink!

I have completed the implementation of the generic Async Sink and have merged 
it in at FLINK-24041. However I have not yet started the implementation of the 
DynamoDB sink at all. Your implementation and tests looks detailed, and I don't 
think it will be much work to adapt it to use the Async Sink base.

Would you like to give this a go? I would be happy to help in the effort.

Regards

Zichen

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-24229) [FLIP-171] DynamoDB implementation of Async Sink

2021-10-06 Thread Yuri Gusev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424872#comment-17424872
 ] 

Yuri Gusev commented on FLINK-24229:


Hi [~CrynetLogistics],

Did you start working on this implementation? We delivered dynamodb sink 
implementation here https://issues.apache.org/jira/browse/FLINK-16504, but it 
uses the old approach for independent sink functions. 

What is your plan on this ticket? Can we join the implementation 
efforts/reviews?

Best regards,
Yuri

> [FLIP-171] DynamoDB implementation of Async Sink
> 
>
> Key: FLINK-24229
> URL: https://issues.apache.org/jira/browse/FLINK-24229
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Zichen Liu
>Assignee: Zichen Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> h2. Motivation
> *User stories:*
>  As a Flink user, I’d like to use DynamoDB as sink for my data pipeline.
> *Scope:*
>  * Implement an asynchronous sink for DynamoDB by inheriting the 
> AsyncSinkBase class. The implementation can for now reside in its own module 
> in flink-connectors.
>  * Implement an asynchornous sink writer for DynamoDB by extending the 
> AsyncSinkWriter. The implementation must deal with failed requests and retry 
> them using the {{requeueFailedRequestEntry}} method. If possible, the 
> implementation should batch multiple requests (PutRecordsRequestEntry 
> objects) to Firehose for increased throughput. The implemented Sink Writer 
> will be used by the Sink class that will be created as part of this story.
>  * Java / code-level docs.
>  * End to end testing: add tests that hits a real AWS instance. (How to best 
> donate resources to the Flink project to allow this to happen?)
> h2. References
> More details to be found 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)