Is Splittable DoFn suitable for fetch data from a socket server?

2018-09-18 Thread flyisland
Hi Gurus, I'm trying to create an IO connector to fetch data from a socket server in Beam. I'm new to Beam, but according to this blog post <https://beam.apache.org/blog/2017/08/16/splittable-do-fn.html>, SDF seems to be the recommended way to implement an IO connector now. This in-house built

Re: Is Splittable DoFn suitable for fetch data from a socket server?

2018-09-19 Thread Lukasz Cwik
Before getting into what you could use and the current state of SplittableDoFn and its supported features, I was wondering what reliability guarantees the socket server has around messages. Is receiving and processing each message reliably important, or is it ok to drop messages when things f

Re: Is Splittable DoFn suitable for fetch data from a socket server?

2018-09-19 Thread flyisland
Hi Lukasz, This socket server is like an MQTT server: it has queues inside it, and the client should ack each message.
> Is receiving and processing each message reliably important or is it ok to drop messages when things fail?
A: Reliability is important; no messages should be lost.
> Is there a me

Re: Is Splittable DoFn suitable for fetch data from a socket server?

2018-09-20 Thread Lukasz Cwik
Are duplicate messages ok? Can you ack messages from a different client or are messages sticky to a single client (e.g. if one client loses connection, when it reconnects can it ack messages it received or are those messages automatically replayed)? UnboundedSources are the only current "source"

Re: Is Splittable DoFn suitable for fetch data from a socket server?

2018-09-20 Thread flyisland
Hi Lukasz, With the API we currently provide, messages cannot be acked from a different client. The server will re-send messages to the reconnected client if those messages were not acked. So there will be duplicate messages, but each carries a "redeliver times" property in the header. Thanks for your help

Re: Is Splittable DoFn suitable for fetch data from a socket server?

2018-09-20 Thread Reuven Lax
Is there information in the message that can be used as an id for deduplication?

Re: Is Splittable DoFn suitable for fetch data from a socket server?

2018-09-20 Thread flyisland
Hi Reuven, There is no explicit ID in the message itself, and whether any information in it can be used as an ID depends on the use case.
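
To illustrate the deduplication question above, here is a minimal plain-Python sketch, not Beam code. It assumes each use case can supply its own key function; `content_key` (a SHA-256 digest of the payload) and `deduplicate` are hypothetical names introduced only for this example. Hashing the payload would also drop two legitimately identical messages, so it is only one possible choice.

```python
import hashlib
from typing import Callable, Iterable, Iterator, Set


def content_key(payload: bytes) -> str:
    """Hypothetical default key: SHA-256 digest of the raw payload bytes."""
    return hashlib.sha256(payload).hexdigest()


def deduplicate(
    messages: Iterable[bytes],
    key_fn: Callable[[bytes], str] = content_key,
) -> Iterator[bytes]:
    """Yield each message whose key has not been seen before.

    Assumption: the set of seen keys fits in memory. A real unbounded
    stream would need to expire old keys (for example, by time window).
    """
    seen: Set[str] = set()
    for payload in messages:
        key = key_fn(payload)
        if key in seen:
            continue  # redelivered message: skip it
        seen.add(key)
        yield payload
```

A use case with a natural identifier (say, an order number embedded in the payload) would pass its own `key_fn` instead of relying on the content hash.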

Re: Is Splittable DoFn suitable for fetch data from a socket server?

2018-09-21 Thread Raghu Angadi
> This in-house built socket server could accept multiple clients, but only send messages to the first-connected client, and will send messages to the second client if the first one disconnected.
The server sending messages only to the first client connection is quite critical. Even if you use Source API

Re: Is Splittable DoFn suitable for fetch data from a socket server?

2018-09-22 Thread flyisland
Hi Raghu, Thanks very much! Yes, I think I should focus on the UnboundedSource API for now.

Re: Is Splittable DoFn suitable for fetch data from a socket server?

2018-10-03 Thread flyisland
Hi Raghu,
> Assuming you need to ack on the same connection that served the records, finalize() functionality in UnboundedSource API is important case. You can use UnboundedSource API for now.
I have a new question now: where should I keep the connection for the later ack action? The MqttIO/JmsIO
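
One way to picture the "ack on finalize" pattern discussed in this thread is the plain-Python model below. It is a sketch under stated assumptions, not Beam's API: `FakeConnection`, `Reader`, `CheckpointMark`, and the integer delivery handles are all hypothetical. In the Beam Java SDK the corresponding hooks are `UnboundedReader.getCheckpointMark()` and `UnboundedSource.CheckpointMark.finalizeCheckpoint()`. The idea shown here is that the reader owns the connection, and each checkpoint mark keeps a reference to it plus the handles delivered since the previous checkpoint.

```python
from typing import List


class FakeConnection:
    """Hypothetical stand-in for the socket client; records acked handles."""

    def __init__(self) -> None:
        self.acked: List[int] = []

    def ack(self, handle: int) -> None:
        self.acked.append(handle)


class CheckpointMark:
    """Holds the handles to ack, plus the connection that delivered them."""

    def __init__(self, connection: FakeConnection, pending: List[int]) -> None:
        self._connection = connection
        self._pending = list(pending)

    def finalize(self) -> None:
        # Called once the runner has durably committed the records,
        # so acking here cannot lose data. Safe to call more than once.
        for handle in self._pending:
            self._connection.ack(handle)
        self._pending.clear()


class Reader:
    """Owns the connection and tracks handles not yet checkpointed."""

    def __init__(self, connection: FakeConnection) -> None:
        self._connection = connection
        self._unacked: List[int] = []

    def on_message(self, handle: int) -> None:
        self._unacked.append(handle)

    def checkpoint(self) -> CheckpointMark:
        # Hand the current batch to a new mark and start a fresh batch.
        mark = CheckpointMark(self._connection, self._unacked)
        self._unacked = []
        return mark
```

In a real runner the checkpoint mark may be serialized, and a live connection cannot be; a connector would presumably keep the connection reference out of the encoded state and treat finalization as best effort, relying on the server's redelivery for anything left unacked.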