Re: kafka connect questions
Thanks Per for the feedback. My comments below.

1) There is a load balancer in front of the cluster.

2) Kafka does run in cluster mode. I was referring to the producer, which doesn't.

Furthermore, the data sources are not heterogeneous and we cannot install anything on them. All we can do is have them push the data to the cluster via the load balancer. We thought that a TCP listener connector running in cluster mode might be a viable solution. We could add a converter that converts the protobuf messages to the format we need to write to the Kafka topic.

On Wed, Jul 5, 2017 at 3:22 AM, Sönke Liebau wrote:

> Hi Clay,
>
> I agree with Gwen in thinking that you might want to take a second look at streaming protobuf data to Kafka and then having connectors read that from Kafka. To address the issues in order:
>
> 1. You say you have hundreds of machines sending data, but if you run a connector that is not tied to a single IP address, you'd basically need to update all these data senders with the new IP address whenever the connector moves around in the cluster. The only way around that I can come up with is to have a load balancer with a single IP pointed at all machines in the cluster, performing regular health checks to find out where the job is currently running (similar to what Mesos does), but that would not be interruption-free.
>
> 2. Kafka does run in a cluster mode, is HA and very scalable. It was arguably built to do the exact job that you are describing here. The downside is that you would need to change your data senders, which might not be possible; I don't know. Perhaps you could implement a tiny tool that reads from TCP and forwards the message to Kafka (Logstash might be an option, not sure). To make this HA and scalable, just deploy more than one of these jobs and put a load balancer in front of them to distribute requests across all instances. This is a very similar architecture to what you wanted to do with Connect, but without the issue of jobs moving around in the cluster, which would create unnecessary complexity.
>
> Just my 2 cents, but hope it helps :)
Re: kafka connect questions
Hi Clay,

I agree with Gwen in thinking that you might want to take a second look at streaming protobuf data to Kafka and then having connectors read that from Kafka. To address the issues in order:

1. You say you have hundreds of machines sending data, but if you run a connector that is not tied to a single IP address, you'd basically need to update all these data senders with the new IP address whenever the connector moves around in the cluster. The only way around that I can come up with is to have a load balancer with a single IP pointed at all machines in the cluster, performing regular health checks to find out where the job is currently running (similar to what Mesos does), but that would not be interruption-free.

2. Kafka does run in a cluster mode, is HA and very scalable. It was arguably built to do the exact job that you are describing here. The downside is that you would need to change your data senders, which might not be possible; I don't know. Perhaps you could implement a tiny tool that reads from TCP and forwards the message to Kafka (Logstash might be an option, not sure). To make this HA and scalable, just deploy more than one of these jobs and put a load balancer in front of them to distribute requests across all instances. This is a very similar architecture to what you wanted to do with Connect, but without the issue of jobs moving around in the cluster, which would create unnecessary complexity.

Just my 2 cents, but hope it helps :)

On Wed, Jul 5, 2017 at 6:09 AM, Clay Teahouse wrote:

> Hello Gwen,
>
> Thanks for the reply. My comments/answers inline.
>
> 1. Connectors that listen on sockets typically run in stand-alone mode, so they can be tied to a specific machine (in distributed mode, connectors can move around).
>
> [Clay:] Even if the connectors move around, they can still listen to a specific port on the node in the cluster, right? The data will be sent to the cluster of connectors from hundreds of data sources.
>
> 2. Why do you need a connector? Why not just use Kafka producer to send protobuf directly to Kafka?
>
> [Clay:] I have hundreds of data sources which push the data to the connectors. I do need the connectors to run in a cluster mode, for HA and scalability.

--
Sönke Liebau
Partner
Tel. +49 179 7940878
OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
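For illustration, the "tiny tool that reads from TCP and forwards the message to Kafka" could look roughly like the Python sketch below. It is only a sketch under assumptions the thread does not confirm: the framing is taken to be a 4-byte big-endian length prefix, `publish` is a hypothetical callback standing in for a Kafka producer's send call bound to a topic, and the port 9000 is arbitrary. Several copies of such a process behind the load balancer would give the HA setup described above.

```python
import asyncio
import struct
from typing import Callable

# Assumed framing: each message is preceded by a 4-byte big-endian length.
HEADER = struct.Struct(">I")


def make_handler(publish: Callable[[bytes], None]):
    """Build a connection handler that hands each framed message to publish."""

    async def handle(reader, writer) -> None:
        try:
            while True:
                header = await reader.readexactly(HEADER.size)
                (length,) = HEADER.unpack(header)
                payload = await reader.readexactly(length)
                # publish is a placeholder, e.g. a producer send bound to a topic.
                publish(payload)
        except (asyncio.IncompleteReadError, ConnectionResetError):
            pass  # the sender closed or dropped the connection
        finally:
            writer.close()

    return handle


async def serve(publish: Callable[[bytes], None],
                host: str = "0.0.0.0", port: int = 9000) -> None:
    """Accept TCP connections and forward every framed message."""
    server = await asyncio.start_server(make_handler(publish), host, port)
    async with server:
        await server.serve_forever()
```

Running `asyncio.run(serve(print))` would print each received payload instead of forwarding it, which is a convenient way to check the framing before wiring in a real producer.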
Re: kafka connect questions
Well, I guess there is one: https://github.com/dhanuka84/kafka-connect-tcp

Maybe you can use or build on top of that.

On 05/07/17 05:45, Gwen Shapira wrote:

> I don't remember seeing one. There is no reason not to write one (let us know if you do, so we can put it on the connector hub!).
>
> Few things:
>
> 1. Connectors that listen on sockets typically run in stand-alone mode, so they can be tied to a specific machine (in distributed mode, connectors can move around).
>
> 2. Why do you need a connector? Why not just use Kafka producer to send protobuf directly to Kafka?
>
> Gwen
Re: kafka connect questions
Hello Gwen,

Thanks for the reply. My comments/answers inline.

1. Connectors that listen on sockets typically run in stand-alone mode, so they can be tied to a specific machine (in distributed mode, connectors can move around).

[Clay:] Even if the connectors move around, they can still listen to a specific port on the node in the cluster, right? The data will be sent to the cluster of connectors from hundreds of data sources.

2. Why do you need a connector? Why not just use Kafka producer to send protobuf directly to Kafka?

[Clay:] I have hundreds of data sources which push the data to the connectors. I do need the connectors to run in a cluster mode, for HA and scalability.

On Tue, Jul 4, 2017 at 10:45 PM, Gwen Shapira wrote:

> I don't remember seeing one. There is no reason not to write one (let us know if you do, so we can put it on the connector hub!).
>
> Few things:
>
> 1. Connectors that listen on sockets typically run in stand-alone mode, so they can be tied to a specific machine (in distributed mode, connectors can move around).
>
> 2. Why do you need a connector? Why not just use Kafka producer to send protobuf directly to Kafka?
>
> Gwen
Re: kafka connect questions
I don't remember seeing one. There is no reason not to write one (let us know if you do, so we can put it on the connector hub!).

Few things:

1. Connectors that listen on sockets typically run in stand-alone mode, so they can be tied to a specific machine (in distributed mode, connectors can move around).

2. Why do you need a connector? Why not just use Kafka producer to send protobuf directly to Kafka?

Gwen

On Tue, Jul 4, 2017 at 9:02 AM Clay Teahouse wrote:

> Hello All,
>
> I'd appreciate your help with the following questions.
>
> 1) Is there kafka connect for listening to tcp sockets?
>
> 2) If so, can the messages be in protobuf, with each message prefixed with the length of the message?
>
> thanks
> Clay
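As a rough illustration of the second point, sending already-serialized protobuf bytes straight to Kafka needs very little code on the producing side. The sketch below is an assumption-laden example, not something from the thread: `publish_all` is a hypothetical helper, and `producer` is any object with `send(topic, value=..., key=...)` and `flush()` methods, which is the shape of kafka-python's `KafkaProducer`. Keying each record by a source identifier is also an assumption; with a key-hashing partitioner it keeps one source's messages on one partition.

```python
from typing import Iterable, Tuple


def publish_all(producer, topic: str,
                records: Iterable[Tuple[str, bytes]]) -> int:
    """Send (source_id, serialized_protobuf) pairs to topic, keyed by source.

    The payload is passed through unchanged, so consumers receive the
    original protobuf bytes and can decode them with the same schema.
    """
    count = 0
    for source_id, payload in records:
        producer.send(topic, value=payload, key=source_id.encode("utf-8"))
        count += 1
    # Block until buffered records have been handed to the brokers.
    producer.flush()
    return count
```

With kafka-python installed, a call might look like `publish_all(KafkaProducer(bootstrap_servers="broker:9092"), "events", records)`, where the broker address and topic name are placeholders.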
kafka connect questions
Hello All,

I'd appreciate your help with the following questions.

1) Is there kafka connect for listening to tcp sockets?

2) If so, can the messages be in protobuf, with each message prefixed with the length of the message?

thanks
Clay
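To make the second question concrete, here is a minimal Python sketch of length-prefixed framing over a byte stream. The message does not say how the length is encoded, so the 4-byte big-endian prefix is an assumption (protobuf's own delimited helpers use a varint prefix instead), and `frame` and `split_frames` are hypothetical names. The payloads are treated as opaque bytes; decoding them as protobuf would happen afterwards.

```python
import struct
from typing import List, Tuple

# Assumed prefix format: 4-byte big-endian unsigned length.
HEADER = struct.Struct(">I")


def frame(payload: bytes) -> bytes:
    """Prefix one serialized message with its length."""
    return HEADER.pack(len(payload)) + payload


def split_frames(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Return the complete messages in buffer plus any trailing partial bytes."""
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # the body has not fully arrived yet
        messages.append(buffer[offset + HEADER.size:end])
        offset = end
    return messages, buffer[offset:]
```

A receiver would append each newly read chunk to the leftover bytes from the previous call and run `split_frames` again, since TCP does not preserve message boundaries.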