Gary,

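That is certainly a valid use case.  What Zijing was saying is that you can
only have one consumer per consumer application (a consumer group, in Kafka
terms) per partition.  Separate consumer applications can each read every
message in the topic, even if it has only one partition.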
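
To make that concrete, here is a rough Python sketch of the rule.  It does
not use the Kafka client at all; the function and the group and consumer
names are made up for illustration, and the round-robin assignment is my
simplification rather than how Kafka actually rebalances.

```python
def assign_partitions(partitions, groups):
    """Assign every partition to exactly one consumer within each group.

    partitions: list of partition ids, e.g. [0] or [0, 1, 2]
    groups:     dict of group name -> list of consumer names (hypothetical)
    Returns a dict of group name -> {partition id: consumer name}.
    """
    assignment = {}
    for group, consumers in groups.items():
        # Each group is handled independently, so every group ends up
        # covering all partitions (and therefore sees all messages).
        assignment[group] = {
            partition: consumers[i % len(consumers)]
            for i, partition in enumerate(partitions)
        }
    return assignment


# One topic with one partition, read by two separate applications.
result = assign_partitions(
    partitions=[0],
    groups={"reports": ["r1", "r2"], "alerts": ["a1"]},
)
```

With a single partition, each group still gets partition 0, but only one
consumer inside each group is given it, so the second consumer in "reports"
would sit idle.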

I think that what it boils down to is how you want your information grouped
inside your timeframes.  For example, if you want to have everything for a
specific user, then you could use that as your partition key, ensuring that
any data for that user is processed by the same consumer (in however many
consumer applications you opt to run).
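
As a sketch of why keying by user works: the producer maps each key to a
partition, and equal keys always map to the same one.  The CRC32 hash below
is only a stand-in that I picked for the example; it is not the hash Kafka's
own partitioner uses, and partition_for is a hypothetical helper.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a string key to a partition id in [0, num_partitions).

    Uses CRC32 purely as a deterministic stand-in hash (an assumption
    for this sketch, not Kafka's real partitioner).
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Every event for the same user id goes to the same partition, so one
# consumer per consumer application sees that user's events in order.
first = partition_for("user-42", 3)
second = partition_for("user-42", 3)
```

Different users may share a partition, but one user's events are never
split across partitions as long as the partition count stays the same.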

The second point that I think Zijing was getting at was whether or not your
proposed use case makes sense for Kafka.  If your goal is to do
time-interval batch processing (versus N-record batches), then why use
Kafka for it?  Why not use something more adept at batch processing?  For
example, if you're using HBase you can use Pig jobs that would read only
the records created between specific timestamps.

David

On Thu, Feb 12, 2015 at 7:44 AM, Gary Ogden <gog...@gmail.com> wrote:

> So it's not possible to have 1 topic with 1 partition and many consumers of
> that topic?
>
> My intention is to have a topic with many consumers, but each consumer
> needs to be able to have access to all the messages in that topic.
>
> On 11 February 2015 at 20:42, Zijing Guo <alter...@yahoo.com.invalid>
> wrote:
>
> > Partition key is on producer level, that if you have multiple partitions
> > for a single topic, then you can pass in a key for the KeyedMessage
> > object, and base on different partition.class, it will return a partition
> > number for the producer, and producer will find the leader for that
> > partition. I don't know how kafka could handle time series case, but
> > depends on how many partitions for that topic. If you only have 1
> > partition, then you don't need to worry about order at all, since each
> > consumer group can only allow 1 consumer instance to consume that data.
> > if you have multiple partitions (say 3 for example), then you can fire up
> > 3 consumer instances under the same consumer group, and each will only
> > consume 1 partition's data. if order in each partition matters, then you
> > need to do some work on the producer side.
> >
> > Hope this helps
> > Edwin
> >
> >      On Wednesday, February 11, 2015 3:14 PM, Gary Ogden <gog...@gmail.com>
> > wrote:
> >
> >
> >  I'm trying to understand how the partition key works and whether I need
> > to specify a partition key for my topics or not.  What happens if I don't
> > specify a PK and I have more than one consumer that wants all messages in
> > a topic for a certain period of time? Will those consumers get all the
> > messages, but they just may not be ordered correctly?
> >
> > The current scenario is that we will have events going into a topic based
> > on customer and the data will remain in the topic for 24 hours. We will
> > then have multiple consumers reading messages from that topic. They will
> > want to be able to get them out over a time range (could be last hour,
> > last 8 hours etc).
> >
> > So if I specify the same PK for each subscriber, then each consumer will
> > get all messages in the correct order?  If I don't specify the PK or use
> > a random one, will each consumer still get all the messages but they just
> > won't be ordered correctly?
> >
> >
> >
> >
>
