Re: kafka connect questions

2017-07-05 Thread Clay Teahouse
Thanks Per for the feedback. My comments below.

1) There is a load balancer in front of the cluster.
2) Kafka does run in cluster mode. I was referring to the producer, which
doesn't. Furthermore, the data sources are not heterogeneous and we cannot
install anything on them. All we can do is have them push the data to the
cluster via the load balancer. We thought a TCP listener connector that runs
in cluster mode might be a viable solution. We could add a converter that
converts the protobuf messages to the format we need to write to the Kafka
topic.



Re: kafka connect questions

2017-07-05 Thread Sönke Liebau
Hi Clay,

I agree with Gwen in thinking that you might want to take a second look at
streaming protobuf data to Kafka and then having connectors read that from
Kafka. To address the issues in order:

1. You say you have hundreds of machines sending data, but if you run a
connector that is not tied to a single IP address, you'd basically need to
update all these data senders with the new IP address whenever the connector
moves around in the cluster. The only way around that I can come up with
is to have a load balancer with a single IP pointed at all machines in the
cluster, performing regular health checks to find out where the job is
currently running (similar to what Mesos does), but that would not be
interruption-free.

2. Kafka does run in cluster mode, is HA, and is very scalable. It was
arguably built to do the exact job that you are describing here. The
downside is that you would need to change your data senders, which might
not be possible; I do not know. Perhaps you could implement a tiny
tool that reads from TCP and forwards the messages to Kafka (Logstash might
be an option, not sure). To make this HA and scalable, just deploy more than
one of these jobs and put a load balancer in front of them to distribute
requests across all instances. This is a very similar architecture to what
you wanted to do with Connect, but without the issue of jobs moving around in
the cluster, which would create unnecessary complexity.
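
A minimal sketch of such a "tiny tool", for illustration only. It assumes
each message is prefixed with a 4-byte big-endian length (the thread does not
say which prefix format the senders use), and it takes the Kafka write as a
plain callable named send, so the names handle_client, serve, MAX_FRAME and
send are all hypothetical:

```python
import asyncio
import struct

MAX_FRAME = 1 << 20  # assumption: reject frames larger than 1 MiB


async def handle_client(reader, writer, send):
    """Read length-prefixed frames from one TCP client and hand each payload to send()."""
    try:
        while True:
            header = await reader.readexactly(4)      # assumed 4-byte big-endian length
            (length,) = struct.unpack(">I", header)
            if length > MAX_FRAME:
                break                                 # drop clients that send oversized frames
            payload = await reader.readexactly(length)
            send(payload)                             # e.g. hand the raw bytes to a Kafka producer
    except asyncio.IncompleteReadError:
        pass                                          # client disconnected (possibly mid-frame)
    finally:
        writer.close()


async def serve(host, port, send):
    """Start a TCP listener that forwards every received frame to send()."""
    async def on_connect(reader, writer):
        await handle_client(reader, writer, send)

    return await asyncio.start_server(on_connect, host, port)
```

With the kafka-python client, for example, send could be something like
lambda payload: producer.send("some-topic", value=payload), where producer is
a KafkaProducer and the topic name is a placeholder. Because each instance
keeps no state of its own, several copies can sit behind the load balancer.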

Just my 2 cents, but hope it helps :)





-- 
Sönke Liebau
Partner
Tel. +49 179 7940878
OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany


Re: kafka connect questions

2017-07-05 Thread Per Steffensen

Well I guess there is one: https://github.com/dhanuka84/kafka-connect-tcp
Maybe you can use or build on top of that.






Re: kafka connect questions

2017-07-04 Thread Clay Teahouse
Hello Gwen,

Thanks for the reply. My comments/answers inline.

1. Connectors that listen on sockets typically run in stand-alone mode, so
they can be tied to a specific machine (in distributed mode, connectors can
move around).
[Clay:] Even if the connectors move around, they can still listen to a
specific port on the node in the cluster, right? The data will be sent to
the cluster of connectors from hundreds of data sources.
2. Why do you need a connector? Why not just use a Kafka producer to send
protobuf directly to Kafka?

[Clay:] I have hundreds of data sources that push data to the
connectors. I do need the connectors to run in cluster mode, for HA and
scalability.





Re: kafka connect questions

2017-07-04 Thread Gwen Shapira
I don't remember seeing one. There is no reason not to write one (let us
know if you do, so we can put it on the connector hub!).

A few things:
1. Connectors that listen on sockets typically run in stand-alone mode, so
they can be tied to a specific machine (in distributed mode, connectors can
move around).
2. Why do you need a connector? Why not just use a Kafka producer to send
protobuf directly to Kafka?

Gwen


kafka connect questions

2017-07-04 Thread Clay Teahouse
Hello All,

I'd appreciate your help with the following questions.

1) Is there a Kafka Connect connector for listening on TCP sockets?

2) If so, can the messages be in protobuf, with each message prefixed with
the length of the message?

thanks
Clay
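
For readers unfamiliar with length-prefixed framing, here is a small sketch
of what question 2 describes. The prefix format is an assumption: a 4-byte
big-endian length is used here, while protobuf's own delimited helpers (for
example writeDelimitedTo in Java) use a varint prefix instead. The function
names frame and split_frames are made up for this example:

```python
import struct


def frame(payload: bytes) -> bytes:
    """Prefix a serialized message with its length (assumed 4-byte big-endian)."""
    return struct.pack(">I", len(payload)) + payload


def split_frames(buffer: bytes):
    """Split a receive buffer into complete payloads plus any leftover partial frame.

    Returns (payloads, leftover); leftover should be prepended to the next chunk read.
    """
    payloads = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from(">I", buffer, offset)
        end = offset + 4 + length
        if end > len(buffer):
            break                      # frame not fully received yet
        payloads.append(buffer[offset + 4:end])
        offset = end
    return payloads, buffer[offset:]
```

Each payload returned by split_frames would be the raw protobuf bytes, which
could then be decoded with the generated message class (ParseFromString in
the Python protobuf library) or written to Kafka unchanged.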