Ravi created FLINK-22830:
----------------------------

             Summary: DataSet API to allow collecting records and doing a bulk upload to the DB
                 Key: FLINK-22830
                 URL: https://issues.apache.org/jira/browse/FLINK-22830
             Project: Flink
          Issue Type: Improvement
            Reporter: Ravi


Experts,

 

I am trying to perform an ETL operation on a large data set as batch 
processing. My requirement is to extract the data, transform it, and then save 
it to MongoDB. I am using Apache Flink, but the performance is very slow because 
I am doing a MongoDB update for each row.

Is there any way to sink records in bulk so that performance improves? For 
example, after all the transformations are done, we could do a bulk update on 
MongoDB. We could aggregate all the records and finally sink them to the DB, 
similar to a stream pipeline like [.aggregate().sink({bulk update})]. Please refer to 
[https://stackoverflow.com/questions/67717964/java-apache-flink-batch-processing-performance-for-bulk-mongodb-update]
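
For reference, this is roughly the kind of workaround I have in mind today: a custom RichOutputFormat that buffers records and calls insertMany once per batch instead of updating row by row. This is only a sketch; the connection URI, database name, collection name, and batch size below are placeholders.

{code:java}
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.apache.flink.api.common.io.RichOutputFormat;
import org.apache.flink.configuration.Configuration;
import org.bson.Document;

import java.util.ArrayList;
import java.util.List;

/** Buffers transformed records and writes them to MongoDB in batches. */
public class MongoBulkOutputFormat extends RichOutputFormat<Document> {

    private static final int BATCH_SIZE = 1_000; // placeholder, tune to the workload

    private transient MongoClient client;
    private transient MongoCollection<Document> collection;
    private transient List<Document> buffer;

    @Override
    public void configure(Configuration parameters) {
        // no-op; connection settings could also be read from the configuration here
    }

    @Override
    public void open(int taskNumber, int numTasks) {
        client = MongoClients.create("mongodb://localhost:27017"); // placeholder URI
        collection = client.getDatabase("mydb").getCollection("mycollection"); // placeholders
        buffer = new ArrayList<>(BATCH_SIZE);
    }

    @Override
    public void writeRecord(Document record) {
        buffer.add(record);
        if (buffer.size() >= BATCH_SIZE) {
            flush();
        }
    }

    private void flush() {
        if (!buffer.isEmpty()) {
            collection.insertMany(buffer); // one round trip per batch instead of per row
            buffer.clear();
        }
    }

    @Override
    public void close() {
        flush(); // write any remaining buffered records
        if (client != null) {
            client.close();
        }
    }
}
{code}

It would then be used at the end of the batch job, e.g. {{transformedDataSet.output(new MongoBulkOutputFormat());}}. Having first-class support for this kind of bulk sink in the DataSet API would avoid everyone writing their own version.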
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
