understanding partition key

2015-02-11 Thread Gary Ogden
I'm trying to understand how the partition key works and whether I need to specify a partition key for my topics or not. What happens if I don't specify a PK and I have more than one consumer that wants all messages in a topic for a certain period of time? Will those consumers get all the message

Re: understanding partition key

2015-02-11 Thread Zijing Guo
The partition key is a producer-level concept: if you have multiple partitions for a single topic, you can pass in a key with the KeyedMessage object and, based on the configured partitioner.class, it will return a partition number for the producer, and the producer will find the leader for that partition. I d
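To illustrate the key-to-partition step described above, here is a minimal sketch in plain Python. It does not use any Kafka client; `partition_for` is a hypothetical helper, and `zlib.crc32` is only a stand-in for whatever hash the configured partitioner actually applies. The point is just that a stable hash taken modulo the partition count sends the same key to the same partition every time.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index with a stable hash (hypothetical helper)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # crc32 is an assumption for illustration, not the hash Kafka itself uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Made-up keys; the first and third are identical on purpose.
keys = ["sensor-1", "sensor-2", "sensor-1"]
assignments = [partition_for(k, 4) for k in keys]
```

With a single partition every key maps to partition 0, so the key only starts to matter once the topic has more than one partition.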

Re: understanding partition key

2015-02-12 Thread Gary Ogden
So it's not possible to have 1 topic with 1 partition and many consumers of that topic? My intention is to have a topic with many consumers, but each consumer needs to be able to have access to all the messages in that topic.

Re: understanding partition key

2015-02-12 Thread David McNelis
Gary, That is certainly a valid use case. What Zijing was saying is that you can only have 1 consumer per consumer application per partition. I think that what it boils down to is how you want your information grouped inside your timeframes. For example, if you want to have everything for a spe
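A toy model may help with the "1 consumer per consumer application per partition" point. The sketch below is plain Python, not a Kafka API: `SinglePartitionLog` and the group names are made up for illustration. It assumes each consumer application (group) keeps its own read offset into the same partition, which is why several independent applications can each see every message in a one-partition topic.

```python
class SinglePartitionLog:
    """Toy stand-in for one topic partition: an append-only list (not a Kafka API)."""

    def __init__(self):
        self._messages = []
        self._offsets = {}  # consumer group name -> next offset to read

    def append(self, message):
        self._messages.append(message)

    def poll(self, group):
        """Return every message this group has not seen yet and advance its offset."""
        start = self._offsets.get(group, 0)
        batch = self._messages[start:]
        self._offsets[group] = len(self._messages)
        return batch


log = SinglePartitionLog()
for event in ["e1", "e2", "e3"]:
    log.append(event)

# Two hypothetical consumer applications reading the same partition.
billing_view = log.poll("billing")
alerting_view = log.poll("alerting")
```

Both groups receive all three events, because reading does not remove anything from the log; each group only moves its own offset forward.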

Re: understanding partition key

2015-02-12 Thread Gary Ogden
Thanks David. Whether Kafka is the right choice is exactly what I'm trying to determine. Everything I want to do with these events is time based. Store them in the topic for 24 hours. Read from the topics and get data for a time period (last hour, last 8 hours, etc.). This reading from the topics c

Re: understanding partition key

2015-02-12 Thread David McNelis
In our setup we deal with a similar situation (lots of time-series data that we have to aggregate in a number of different ways). Our approach is to push all of our data to a central producer stack; that stack then submits data to different topics, depending on a set of predetermined rules. For a

Re: understanding partition key

2015-02-12 Thread Todd Snyder

Re: understanding partition key

2015-02-12 Thread Gary Ogden
Thanks again David. So what kind of latencies are you experiencing with this? If I wanted to act upon certain events in this and send out alarms (email, SMS, etc.), what kind of delays are you seeing by the time you're able to process them? It seems if you were to create an alarm topic, and dump ale

Re: understanding partition key

2015-02-12 Thread David McNelis
I'm going to go a bit in reverse for your questions. We built a RESTful API to push data to so that we could submit things from multiple sources that aren't necessarily things that our team would maintain, as well as validate that data before we send it off to a topic. As for consumers... we expec

Re: understanding partition key

2015-02-12 Thread Gary Ogden
Thanks. I was under the impression that you can't process the data as each record comes into the topic? That Kafka doesn't support pushing messages to a consumer like a traditional message queue?

Re: understanding partition key

2015-02-12 Thread David McNelis
Sorry... I probably didn't word that well. When I say each message, that's assuming you're going out to poll for new messages on the topic, i.e. if you have a latency tolerance of 1 second, then you'd never want to go more than 1 second without polling for new messages.
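The polling idea above can be sketched roughly as follows. This is plain Python with an in-memory queue standing in for the topic; `poll_loop`, `fetch_new`, and the message names are hypothetical, and a real consumer would fetch from the broker instead. The assumption being illustrated is that the sleep between polls is kept below your latency tolerance.

```python
import time
from collections import deque


def poll_loop(fetch, handle, poll_interval_s, max_polls):
    """Poll for new messages max_polls times, sleeping poll_interval_s between polls."""
    handled = 0
    for _ in range(max_polls):
        for message in fetch():
            handle(message)
            handled += 1
        # Keep this interval below the latency you can tolerate.
        time.sleep(poll_interval_s)
    return handled


# In-memory stand-in for messages waiting on the topic (illustrative only).
pending = deque(["alarm-1", "alarm-2"])


def fetch_new():
    """Drain and return whatever has arrived since the last poll."""
    batch = list(pending)
    pending.clear()
    return batch


seen = []
count = poll_loop(fetch_new, seen.append, poll_interval_s=0.01, max_polls=3)
```

In this model the worst-case delay for a message is roughly the poll interval plus the time spent fetching and handling the previous batch, so a 1-second tolerance calls for an interval comfortably under 1 second.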