Hi Jingsong Lee,
Thanks for the details. Were you able to achieve end-to-end exactly once 
support with Mongo? 
Also, for doing any intermittent reads from Mongo (Kafka -> process event -> 
lookup Mongo -> enhance event -> Sink to Mongo), I am thinking of using Async 
IO 
(https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/asyncio.html).
 In your implementation, did you have a need to use this API?
RegardsVijay

    On Sunday, October 13, 2019, 08:11:06 PM PDT, JingsongLee 
<lzljs3620...@aliyun.com.INVALID> wrote:  
 
 Hi vijay:

I developed an append stream sink for Mongo internally, which writes data in 
batches according to configureable batch size, and also provides asynchronous 
flush. But only insert not update or upsert. It's a good job. It's been working 
very well for a long time. (Throughput depends mainly on MongoServer)

I've investigated Mongo's API, use "UpdateOptions.upsert" to achieve upsert 
functionality, which should be similar to ElasticsearchUpsertTableSink.

About reading, the main problem is how to support distributed multiple 
parallelism reads, need to get the splitKeys of MongoShard to build the 
filtering conditions to split mongo source to multiple parallelism. (I think 
beam MongoDbIO is a very good example)

Best,
Jingsong Lee


------------------------------------------------------------------
From:Vijay Srinivasaraghavan <vijikar...@yahoo.com.INVALID>
Send Time:2019年10月13日(星期日) 11:52
To:Dev <dev@flink.apache.org>
Subject:Mongo Connector

Hello,
Do we know how much of support we have for Mongo? The documentation page is 
pointing to a connector repo that was very old (last updated 5 years ago) and 
looks like that was just a sample code to showcase the integration.
https://ci.apache.org/projects/flink/flink-docs-stable/dev/batch/connectors.html#access-mongodb
I am planning to build a pipeline that involves heavy use of Mongo (reads as 
well as bulk upserts). Trying to understand if anyone has used Mongo in the 
pipeline and would like to share some of their experience?

Are there any known limitations and gotchas?
Appreciate any inputs.
RegardsVijay  

Reply via email to