Re: Kafka Subscriptions

2015-09-20 Thread Daniel Compton
See http://kafka.apache.org/contact.html to do this; you'll need to do it yourself. On Mon, Sep 21, 2015 at 10:43 AM Murari Goswami wrote: > Hi, > > Can you please add to the subscriptions list. > > -- > Thanks, > Murari Goswami > -- Daniel

Re: Why Apache is scalable?

2015-08-09 Thread Daniel Compton
Hi Lucky Your question at the moment is a bit vague, and not particularly well posed. These kinds of questions can normally be answered best by a quick google search and reading the docs, rather than asking the list. Kafka provides excellent docs at

Re: Cache Memory Kafka Process

2015-07-27 Thread Daniel Compton
http://www.linuxatemyram.com may be a helpful resource to explain this better. On Tue, 28 Jul 2015 at 5:32 AM Ewen Cheslack-Postava e...@confluent.io wrote: Having the OS cache the data in Kafka's log files is useful since it means that data doesn't need to be read back from disk when consumed.

Re: [Announcement] Hermes - pub / sub broker built on top of Kafka

2015-05-25 Thread Daniel Compton
is to push event to Kafka, as only this way we can assure our customers that the event will be delivered (and is stored reliably). Thus, we do not plan on making any distributed storage here - Kafka is our storage. Adam 2015-05-20 11:49 GMT+02:00 Daniel Compton daniel.compton.li

Re: Architecture for multiple consumers of a given message

2015-05-24 Thread Daniel Compton
Hi Warren If you're using the high level consumer, then you can just have multiple consumer groups (one for each purpose), and run 1 consumer thread per consumer group. On Mon, 25 May 2015 at 8:43 am Warren Henning warren.henn...@gmail.com wrote: I'm working on a simple web application where I

Re: Optimal number of partitions for topic

2015-05-21 Thread Daniel Compton
a few simple formulas. More... View on blog.confluent.io Naidu Saladi From: Daniel Compton daniel.compton.li...@gmail.com To: users@kafka.apache.org Sent: Wednesday, May 20, 2015 8:21 PM Subject: Re: Optimal number of partitions

Re: [Announcement] Hermes - pub / sub broker built on top of Kafka

2015-05-20 Thread Daniel Compton
Hi Adam Firstly, thanks for open sourcing this, it looks like a great tool and I can imagine a lot of people will find it very useful. I had a few thoughts reading the docs. I may have misunderstood things but it seems that your goal of meeting a strict SLA conflicts with your goal of

Re: Optimal number of partitions for topic

2015-05-20 Thread Daniel Compton
One of the beautiful things about Kafka is that it uses the disk and OS disk caching really efficiently. Because Kafka writes messages to a contiguous log, it needs very little seek time to move the write head to the next point. Similarly for reading, if the consumers are mostly up to date with

Re: I want to prescribe

2015-05-02 Thread Daniel Compton
Here is the place. On Sun, 3 May 2015 at 11:30 am Samuel Measho meash...@gmail.com wrote: Hi I have some questions regarding kafka, where do we post it? Thanks, Samuel

Re: Data replication and zero data loss

2015-04-30 Thread Daniel Compton
When we evaluated MirrorMaker last year we didn't find any risk of data loss, only duplicate messages in the case of a network partition. Did you discover data loss in your tests, or were you just looking at the docs? On Fri, 1 May 2015 at 4:31 pm Jiangjie Qin j...@linkedin.com.invalid wrote:

Re: New and old producers partition messages differently

2015-04-26 Thread Daniel Compton
I would support a configuration flag to be added in the short term, say until 0.9. In the long term, hashcode may change out from underneath people anyway, so delaying moving to Murmur for too long is likely to still end up in pain. Leaving that configuration around long term increases code and

Re: Fw: How to measure performance of Mirror Maker

2015-04-21 Thread Daniel Compton
From memory, MirrorMaker is just using Kafka Producers and Consumers to send the data from one DC to the other. So the meaningful performance metric I would be looking at is how far behind your mirror queues are from your source queues. Your other performance metrics are going to be very dependent
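A minimal sketch of the "how far behind is the mirror" metric described above. It is not MirrorMaker code: the function name and the offset dictionaries are hypothetical, and it assumes both topics started empty with no duplicates, so that end offsets on the two clusters are directly comparable (in practice they often are not, and message counts would be compared instead).

```python
def mirror_lag(source_end_offsets, mirror_end_offsets):
    """Return per-partition lag: messages in the source not yet in the mirror.

    Both arguments map partition id -> end offset (hypothetical inputs,
    gathered however your monitoring collects them).
    """
    lag = {}
    for partition, source_offset in source_end_offsets.items():
        # A partition missing from the mirror is treated as fully behind.
        mirror_offset = mirror_end_offsets.get(partition, 0)
        # Clamp at zero so a mirror that is ahead does not report negative lag.
        lag[partition] = max(source_offset - mirror_offset, 0)
    return lag


source = {0: 1500, 1: 1420, 2: 1610}
mirror = {0: 1500, 1: 1390}
lag = mirror_lag(source, mirror)
total_lag = sum(lag.values())
```

Tracking the total over time shows whether the mirror is catching up or falling further behind, which is usually more informative than a single reading.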

Re: Kafka Monitoring using JMX

2015-04-20 Thread Daniel Compton
Hi Naidu You'll need to escape the with a \ in the Mbean names. I've run across this too and it was a pain. It can get a bit tricky if you're doing it in code because you need to account for double escapes and so forth. This is a bug in the version of Metrics that Kafka is using. There is a

Re: One or multiple instances of MM to aggregate kafka data to one hadoop

2015-01-28 Thread Daniel Compton
Hi Mingjie I would recommend the first option of running one mirrormaker instance pulling from multiple DC's. A single MM instance will be able to make more efficient use of the machine resources in two ways: 1. You will only have to run one process which will be able to be allocated the full

Re: TTL changes - Are they retroactive?

2014-11-20 Thread Daniel Compton
Hi Parag Just to expand on Jun’s comment, log retention and deletion is at the segment level, not the message level. Because it’s at the segment level I would avoid using the term TTL, as that would normally be applied to individual items. Every log.retention.check.interval.ms (default 5
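A simplified model of segment-level retention, to illustrate why "TTL" is a misleading term here. This is a sketch, not Kafka's implementation: the `Segment` class and `expired_segments` function are hypothetical, and it assumes a segment becomes eligible for deletion once its last write is older than the retention period.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    base_offset: int       # offset of the first message in the segment
    last_modified_ms: int  # time of the most recent write to the segment


def expired_segments(segments, now_ms, retention_ms):
    """Return the segments that a retention check would delete.

    A segment is removed as a whole, so an old message survives for as
    long as the newest message in the same segment is within retention.
    """
    return [s for s in segments if now_ms - s.last_modified_ms > retention_ms]


segments = [Segment(0, 0), Segment(100, 1500), Segment(200, 1900)]
expired = expired_segments(segments, now_ms=2000, retention_ms=1000)
expired_offsets = [s.base_offset for s in expired]
```

In this model the messages at offsets 100 to 199 are all kept, even if the earliest of them was written long before the retention cutoff, because the segment holding them was last written recently.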

Re: Elastsic Scaling

2014-11-20 Thread Daniel Compton
While it’s good to plan ahead for growth, Kafka will still let you add more partitions to a topic https://kafka.apache.org/081/ops.html#basic_ops_modify_topic. This will rebalance the hashing if you are partitioning by your key, and consumers will probably end up with different partitions, but
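A small sketch of why adding partitions changes where keyed messages land. It uses CRC32 as a stand-in hash purely for illustration (it is not Kafka's partitioner), and the key names and partition counts are made up.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition by hashing it modulo the partition count."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


keys = [f"user-{i}" for i in range(1000)]

# Keys whose partition changes when the topic grows from 4 to 6 partitions.
moved = [k for k in keys if partition_for(k, 4) != partition_for(k, 6)]
moved_fraction = len(moved) / len(keys)
```

With any hash-modulo scheme, a large share of keys map to a different partition after the count changes, so per-key ordering only holds within the messages written under one partition count.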

Re: Aeron, a high throughput, low latency messaging system

2014-11-19 Thread Daniel Compton
latency for throughput to great effect but may be something that Aeron protocol would never support because of its focus on latency first. Cheers, Roger On Mon, Nov 17, 2014 at 11:12 AM, Daniel Compton d...@danielcompton.net wrote: Recently there has been some interesting discussion

Aeron, a high throughput, low latency messaging system

2014-11-17 Thread Daniel Compton
Recently there has been some interesting discussion on the Mechanical Sympathy mailing list about Martin Thompson’s new extremely high performance, extremely low latency, open source messaging system called Aeron. For those of you that don’t know, Martin Thompson is an expert on wringing as

Re: Programmatic Kafka version detection/extraction?

2014-11-15 Thread Daniel Compton
This has been covered in passing in the preceding threads but I'd like to make it explicit, can we have a command line option/script for getting the Kafka version (probably with Scala version too)? I ran into this recently, where we wanted to verify that the right version of Kafka had been

Re: MBeans, dashes, underscores, and KAFKA-1481

2014-10-19 Thread Daniel Compton
I'm pretty sure that the quotes are a side effect of using Metrics 2.x. When part of an Mbean name has certain characters, then that part will be wrapped in quotes. This is fixed in Metrics 3. -- Daniel On 18/10/2014, at 10:03 am, Rajasekar Elango rela...@salesforce.com wrote: +1 on

Re: kafka java api (written in 100% clojure)

2014-10-13 Thread Daniel Compton
Hi Gerrit Thanks for your contribution, I'm sure everyone here appreciates it, especially Clojure developers like myself. I do have one question: what are the guarantees you offer to users of your library under failures, particularly when Redis fails? -- Daniel On 13/10/2014, at 10:22 am,

Re: kafka docker

2014-09-30 Thread Daniel Compton
Hi Joe What's the story for persisting data with Docker? Do you use a data volume or do you just start fresh every time you start the Docker instance? Daniel. On 1/10/2014, at 7:13 am, Buntu Dev buntu...@gmail.com wrote: Thanks Joe.. seems quite handy. Is there a 'Kafka-HDFS with Camus'

Re: multi-node and multi-broker kafka cluster setup

2014-09-30 Thread Daniel Compton
Hi Sa While it's possible to run multiple brokers on a single machine, I would be interested to hear why you would want to. Kafka is very efficient and can use all of the system resources under load. Running multiple brokers would increase zookeeper load, force resource sharing between the

Re: Disactivating Yammer Metrics Monitoring

2014-09-17 Thread Daniel Compton
Hi Francois I didn't quite understand how you've set up your metrics reporting. Are you using the https://github.com/criteo/kafka-ganglia metrics reporter? If so then you should be able to adjust the config to exclude the metrics you don't want, with kafka.ganglia.metrics.exclude.regex. On 18

Re: Using kafka in non million users environment

2014-08-19 Thread Daniel Compton
Hi Justin It sounds like Kafka could be a good fit for your environment. Are you able to tell us more about the kinds of applications you will be running? Daniel. On 19/08/2014, at 10:53 am, Justin Maltat justin.mal...@gmail.com wrote: Hello, I'm managing a study to explore

Re: kafka consumer fail over

2014-08-03 Thread Daniel Compton
Hi Weide The consumer rebalancing algorithm is deterministic. In your failure scenario, when A comes back up again, the consumer threads will rebalance. This will give you the initial consumer configuration at the start of the test. I'm unsure whether the partitions are balanced round robin,

Re: request.required.acks=-1 under high data volume

2014-07-21 Thread Daniel Compton
In the docs for 0.8.1.1, there are only three options for request.required.acks https://kafka.apache.org/documentation.html#producerconfigs, {-1, 0, 1}. How is request.required.acks=3 a valid configuration property? Am I reading it incorrectly or are the docs out of date? On 18 July 2014 06:25,

Durably storing messages in Kafka

2014-07-15 Thread Daniel Compton
I think I know the answer to this already but I wanted to check my assumptions before proceeding. We are using Kafka as a queueing mechanism for receiving messages from stateless producers. We are operating in a legal framework where we can never lose a committed message, but we can reject a

Re: Reg Kafka Replication

2014-06-30 Thread Daniel Compton
Hi Balasubramanian Why the (topics/partition) combination which has broker with id 0 in their replication list does not find a new broker and replicate the messages? Is this the intended behavior of Kafka ? Do you mean, why does Broker 0 stay in the replication set for partitions when it

Kafka producer performance test sending 0x0 byte messages

2014-06-30 Thread Daniel Compton
Hi folks I was doing some performance testing using the built in Kafka performance tester and it seems like it sends messages of size n bytes but with all bytes having the value 0x0. Is that correct? Reading the source seemed to indicate that too but I'm not a Scala developer so I could be

Stopping thread on consumer timeout

2014-06-30 Thread Daniel Compton
I'm doing some testing to reconcile the results of mirror maker replication between two Kafka clusters across an unreliable (Internet) link using Clojure. In this case, we run our production tests, wait for MM replication to finish, then drain the topics on both sides of the network and compare

Re: Kafka producer performance test sending 0x0 byte messages

2014-06-30 Thread Daniel Compton
On Mon, Jun 30, 2014 at 2:24 AM, Daniel Compton d...@danielcompton.net wrote: Hi folks I was doing some performance testing using the built in Kafka performance tester and it seems like it sends messages of size n bytes but with all bytes having the value 0x0. Is that correct? Reading

Re: Intercept broker operation in Kafka

2014-06-24 Thread Daniel Compton
Hi Ravi You’ve probably seen this already but I thought I’d point it out just in case: https://kafka.apache.org/documentation.html#monitoring. In our case we are using https://github.com/pingles/kafka-riemann-reporter to send metrics to Riemann but you could get the metrics through JMX to send

How does number of partitions affect sequential disk IO

2014-06-24 Thread Daniel Compton
I’ve been reading the Kafka docs and one thing that I’m having trouble understanding is how partitions affect sequential disk IO. One of the reasons Kafka is so fast is that you can do lots of sequential IO with read-ahead cache and all of that goodness. However, if your broker is responsible

Re: How does number of partitions affect sequential disk IO

2014-06-24 Thread Daniel Compton
that you are reading mostly sequentially and your consumers are keeping up. On 6/24/14 3:58 AM, Daniel Compton d...@danielcompton.net wrote: I've been reading the Kafka docs and one thing that I'm having trouble understanding is how partitions affect sequential disk IO. One of the reasons

Re: How does number of partitions affect sequential disk IO

2014-06-24 Thread Daniel Compton
. Depending on your volume, that might not be enough. On 6/24/14 6:44 AM, Daniel Compton d...@danielcompton.net wrote: Good point. We've only got two disks per node and two topics so I was planning to have one disk/partition. Our workload is very write heavy so I'm mostly concerned about

Re: Design for Kafka

2014-06-23 Thread Daniel Compton
Hi there One architecture pattern could be to have a number of app servers to receive the messages from the devices and send them to the Kafka cluster. This would let you handle authentication and authorisation from these devices, I don’t know your exact scenario but I don’t think you’d want

Re: MirrorMaker documentation suggestions

2014-06-18 Thread Daniel Compton
:39 pm, Guozhang Wang wrote: Thanks Daniel for the findings, please feel free to update the wiki. Guozhang On Tue, Jun 17, 2014 at 9:56 PM, Daniel Compton d...@danielcompton.net (mailto:d...@danielcompton.net) wrote: Hi I was following the instructions for Kafka mirroring

Re: MirrorMaker documentation suggestions

2014-06-18 Thread Daniel Compton
Hi Jun Thanks, I was able to update the wiki. Daniel. On Thursday, 19 June 2014 at 2:45 am, Jun Rao wrote: I just granted you the wiki permission. Could you give it a try? Thanks, Jun On Tue, Jun 17, 2014 at 11:30 PM, Daniel Compton d...@danielcompton.net (mailto:d

MirrorMaker documentation suggestions

2014-06-17 Thread Daniel Compton
Hi I was following the instructions for Kafka mirroring and had two suggestions for improving the documentation at https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=27846330: 1. Move “Note that the --zkconnect argument should point to the source cluster's ZooKeeper...” above the