syslog-ng producer for kafka

2015-09-08 Thread Szabó, István
Hi, syslog-ng (https://syslog-ng.org/) is one of the most widely used open source log collection tools, capable of filtering, classifying, parsing log data and forwarding it to a wide variety of destinations. In its most recent release (3.7.1

Re: Add the hard drive problem

2015-06-20 Thread István
Kafka is not concerned about your disks, and it does not do anything lower-level than writing files to a folder. That means the best way to add more capacity to your servers is to stop the service, add the drives, create a new volume and data folder, copy over the data from the previous location, and mount the new

Re: java.io.IOException: Too many open files error

2015-01-15 Thread István
Hi Sa Li, Depending on your system, that configuration entry needs to be modified. The first parameter after the insert is the username that you use to run Kafka. It might be your own username or something else; in the following example it is called kafkauser. On top of that I also like to use

Re: How many partition can one single machine handle in Kafka?

2014-10-23 Thread István
This is actually a very vague statement and does not cover every use case. Having a RAID10 array of 6x250G SSDs is very different from having 4x1T spinning drives. In my experience, rebuilding a RAID10 array that has several smaller SSDs is hardly noticeable from the service point of view,

Re: How many partition can one single machine handle in Kafka?

2014-10-23 Thread István
RAID has nothing to do with the overall availability of your system; it only increases the per-node reliability. Regards, Istvan On Wed, Oct 22, 2014 at 11:01 AM, Gwen Shapira gshap...@cloudera.com wrote: RAID-10? Interesting choice for a system where the data is already replicated

Re: Sizing Cluster

2014-10-21 Thread István
One thing that you have to keep in mind is that moving 10T between nodes takes a long time. If you have a node failure and you need to rebuild (resync) the data, your system is going to be vulnerable to a second node failure. You could mitigate this by using RAID. I think generally speaking
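
To put a rough number on "a long time": a minimal back-of-the-envelope sketch, assuming the 10T is 10 * 10^12 bytes and that the resync can use the full line rate of the network link. The link speeds, the `efficiency` parameter, and the function name `resync_hours` are illustrative assumptions, not values from the thread; real resync traffic usually competes with production load and will be slower.

```python
def resync_hours(data_bytes: float, link_bits_per_s: float, efficiency: float = 1.0) -> float:
    """Hours needed to copy data_bytes over a link.

    efficiency is the assumed fraction of the line rate actually
    available to the resync (1.0 = the whole link, an optimistic case).
    """
    if data_bytes < 0 or link_bits_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid input")
    bytes_per_s = link_bits_per_s / 8 * efficiency
    return data_bytes / bytes_per_s / 3600


TEN_TB = 10 * 10**12  # assumption: decimal terabytes

if __name__ == "__main__":
    # Assumed link speeds, for illustration only.
    for name, rate in [("1 Gbit/s", 1e9), ("10 Gbit/s", 1e10)]:
        print(f"{name}: {resync_hours(TEN_TB, rate):.1f} h")
```

Under these assumptions the node stays exposed to a second failure for roughly a day on a 1 Gbit/s link, which is the window the message is warning about.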

Re: Sizing Cluster

2014-10-21 Thread István
of the message bus at any given time. I agree with your assessment though that having 3 nodes is a more durable configuration, but was hoping others could explain how they calculate capacity and scaling issues on their storage subsystems. Cheers, -pete On 10/21/14 11:28, István wrote: One thing

Using MirrorMaker to move some data between clusters

2014-10-07 Thread István
Hi all, I have just a quick question. I was wondering if MirrorMaker is the right solution for this task: I need to move a subset of the data from a production cluster to a test cluster. My original idea was to set the log.retention.minutes to a low value like 12 hours so I could keep

Dynamic partitioning

2014-09-12 Thread István
Hi all, My understanding is that with 0.8.1.x you can manually change the number of partitions on the broker, and this change is going to be picked up by the producers and consumers (high level). kafka-topics.sh --alter --zookeeper zk.net:2181/stream --topic test --partitions 3 Is that the
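
For scripting this, a minimal sketch that assembles the same command as an argument list. The helper name `alter_partitions_cmd` is hypothetical; the flags are only the ones shown in the message above. The command is built but not executed here, and note that Kafka only allows the partition count of a topic to be increased, never reduced.

```python
from typing import List


def alter_partitions_cmd(zookeeper: str, topic: str, partitions: int) -> List[str]:
    """Build the kafka-topics.sh --alter invocation quoted in the message.

    Hypothetical helper: it only assembles the arguments, it does not
    contact ZooKeeper or check the topic's current partition count.
    """
    if partitions < 1:
        raise ValueError("partitions must be a positive integer")
    return [
        "kafka-topics.sh",
        "--alter",
        "--zookeeper", zookeeper,
        "--topic", topic,
        "--partitions", str(partitions),
    ]


if __name__ == "__main__":
    # Same values as in the message; pass the list to subprocess.run to execute it.
    print(" ".join(alter_partitions_cmd("zk.net:2181/stream", "test", 3)))
```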

Re: Question on Kafka .8 / Logstash client

2014-06-13 Thread István
I am not sure what you are trying to achieve, but here is this: https://github.com/joekiller/logstash-kafka On Thu, Jun 12, 2014 at 11:06 AM, Sutanu Das sd2...@att.com wrote: Hi Kafka Users, 1. Is there a plugin/client for Logstash for Kafka 0.8? 2. Is there any example of

[ANN] New Clojure library for Kafka

2014-06-12 Thread István
Hi all, I was working with Kafka the other day and I wanted to use my favorite language, but the official driver was rather cryptic and there was too much black magic going on, so I decided to rewrite it (violating the No. 1 rule of software development from Joel Spolsky). Anyways, it is short and