Marc,

Thanks for your response. Here are more details on the problem.
As I already mentioned in the previous post, here is our expected data flow:

    logs -> HAProxy -> {new layer} -> Kafka cluster

The 'new layer' should receive logs as HTTP requests from HAProxy and produce the same logs to Kafka without loss. The options that seem to be available are:

1. Flume: It has an HTTP source & Kafka sink, but the documentation says the HTTP source is not for production use.
2. Kafka REST Proxy: Though this seems fine, it adds another dependency on Schema Registry servers to validate the schema, which would then also have to be used by the consumers.
3. Custom plugin to handle this functionality: Though the functionality itself seems simple, the scalability, reliability, and maintenance concerns would be larger (a rough sketch is at the bottom of this mail).

Thanks
Hemanth

-----Original Message-----
From: Marc Bollinger [mailto:m...@lumoslabs.com]
Sent: Thursday, August 27, 2015 4:39 AM
To: users@kafka.apache.org
Cc: dev-subscr...@kafka.apache.org
Subject: Re: Http Kafka producer

I'm actually also really interested in this... I had a chat about this on the distributed systems Slack's <http://dist-sys.slack.com> Kafka channel a few days ago, but we're not much further than griping about the problem.

We're basically migrating an existing event system, one which packed messages into files, waited for a time-or-space threshold to be crossed, then dealt with distribution in terms of files. Basically, we'd like to keep a lot of those semantics: we can acknowledge success on the app server as soon as we've flushed to disk and rely on the filesystem for durability, and total order across the system doesn't matter, as the HTTP PUTs sending the messages are load balanced across many app servers. We also can tolerate [very] long downstream event system outages, because... we're ultimately just writing sequentially to disk, per process (I should mention that this part is in Rails, which means we're dealing largely in terms of processes, not threads).

RocksDB was mentioned in the discussion, but after spending exactly 5 minutes researching that solution, it seems like the dead simplest setup on an app server in terms of moving parts (multiple processes writing, one process reading/forwarding to Kafka) wouldn't work well with RocksDB. Although now that I'm looking at it more, it looks like they're working on a MySQL storage engine?

Anyway, yeah, I'd love some discussion on this, or war stories of migration to Kafka from other event systems (F/OSS or... bespoke).

On Wed, Aug 26, 2015 at 3:45 PM, Hemanth Abbina <heman...@eiqnetworks.com> wrote:
> Hi,
>
> Our application receives events through a HAProxy server on HTTPS,
> which should be forwarded and stored to a Kafka cluster.
>
> What should be the best option for this?
> This layer should receive events from HAProxy & produce them to the Kafka
> cluster, in a reliable and efficient way (and should scale horizontally).
>
> Please suggest.
>
> --regards
> Hemanth
>
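
PS: For illustration only, here is a rough sketch of what option 3 (a small custom HTTP-to-Kafka bridge) could look like, using the plain Java producer API and the JDK's built-in HTTP server. The topic name "raw-logs", port 8080, the thread-pool size, and the broker addresses are placeholders I made up, not details from our setup; treat this as a sketch under those assumptions, not a production implementation.

import com.sun.net.httpserver.HttpServer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

import java.net.InetSocketAddress;
import java.util.Properties;
import java.util.concurrent.Executors;

public class HttpToKafkaBridge {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder broker list -- replace with the real cluster addresses.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka1:9092,kafka2:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        // "Without loss" settings: wait for all in-sync replicas and retry on transient errors.
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.toString(Integer.MAX_VALUE));

        KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);

        // HAProxy would point its backend at this port (placeholder).
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.setExecutor(Executors.newFixedThreadPool(8)); // handle concurrent requests

        server.createContext("/logs", exchange -> {
            byte[] body = exchange.getRequestBody().readAllBytes();
            try {
                // Block until Kafka acknowledges, so HAProxy only sees success for persisted events.
                producer.send(new ProducerRecord<>("raw-logs", body)).get();
                exchange.sendResponseHeaders(204, -1);
            } catch (Exception e) {
                exchange.sendResponseHeaders(503, -1); // let HAProxy retry or fail over
            } finally {
                exchange.close();
            }
        });
        server.start();
    }
}

Blocking on producer.send(...).get() per request keeps the "no loss" guarantee simple (the response is only sent once Kafka has acknowledged), but it limits throughput; at real volume you would need asynchronous sends with proper error handling, plus the operational work of running and scaling several of these bridge instances behind HAProxy -- which is exactly the maintenance concern mentioned in option 3.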