Understood. Thanks.
On Wed, May 4, 2016 at 3:47 PM, Wesley Chow wrote:
We don’t do this on the Kafka side, but for a different system that has similar
distribution problems we manually maintain a map of “hot” keys. On the Kafka
side, we distribute keys with an even distribution in our largest volume topic,
and then squash the data and repartition based on a skewed
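
A minimal sketch of the "hot" key map idea, for illustration only. It is not Wesley's system: the table contents, the function names, and the choice of CRC32 as the hash are all assumptions. Keys listed in the map are spread over several partitions; every other key stays on a single partition.

```python
import zlib

# Hypothetical, manually maintained table: hot key -> number of partitions
# that key's traffic should be spread over.
HOT_KEYS = {"client-42": 8}


def stable_hash(key):
    # CRC32 is used only because it is stable across processes;
    # Kafka's default partitioner uses a different hash.
    return zlib.crc32(key.encode("utf-8"))


def choose_partition(key, num_partitions, counter, hot_keys=HOT_KEYS):
    """Return a partition for `key`.

    `counter` is any per-producer message counter; for a hot key it
    rotates the record across that key's group of partitions.
    """
    base = stable_hash(key) % num_partitions
    spread = hot_keys.get(key, 1)  # non-hot keys use exactly one partition
    return (base + counter % spread) % num_partitions
```

The trade-off is that a consumer interested in a hot key has to read all of that key's partitions, and ordering for that key only holds within each partition.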
Yeah, fixed slicing may help. I'll put more thought into this.
You had mentioned that you didn't put a custom partitioner into production.
Would you mind sharing how you worked around this currently?
Srikanth
On Tue, May 3, 2016 at 5:43 PM, Wesley Chow wrote:
>
>
> -Original Message-
> From: Srikanth [mailto:srikanth...@gmail.com]
> Sent: Tuesday, May 03, 2016 1:57 PM
> To: users@kafka.apache.org
> Subject: Re: Hash partition of key with skew
>
> Upload to S3 is partitioned by the "key" field, i.e., one folder per key. It
> does offset management to make sure offset commit is in sync with S3 upload.
We do this in several spots and I wish we had built our system in such a way
that we could just open source it. I’m sure many people have
Subject: Re: Hash partition of key with skew
So, there are a few consumers. One is a Spark Streaming job where we can do a
partitionBy(key) and take a slight hit.
There are two consumers which are just Java apps, with multiple instances
running in Marathon.
One consumer reads records, does basic checks
What will you do with the messages as you pull them off of Kafka?
-Dave
-Original Message-
From: Srikanth [mailto:srikanth...@gmail.com]
Sent: Tuesday, May 03, 2016 12:12 PM
To: users@kafka.apache.org
Subject: Re: Hash partition of key with skew
Jens,
Thanks for the link. That is something to consider. Of course it has downsides
too
-Original Message-
From: Wesley Chow [mailto:w...@chartbeat.com]
Sent: Tuesday, May 03, 2016 10:51 AM
To: users@kafka.apache.org
Subject: Re: Hash partition of key with skew
I’m not t
-Original Message-
From: Wesley Chow [mailto:w...@chartbeat.com]
Sent: Tuesday, May 03, 2016 9:51 AM
To: users@kafka.apache.org
Subject: Re: Hash partition of key with skew
I’ve come up with a couple solutions since we too have a power law
distribution. However, we have not put anything into practice.
Fixed Slicing
One simple thing to do is to take each key and slice it into some fixed number
of partitions. So your function might be:
(hash(key) % num) + (hash(key
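
The formula above is cut off in the archive, so the following does not reconstruct it. It is a minimal sketch of one way fixed slicing could look, under these assumptions: each key owns `num_slices` consecutive partitions starting at its hash, and the producer picks one of them at random per record. The function names and the use of CRC32 are hypothetical.

```python
import random
import zlib


def slice_partitions(key, num_slices, num_partitions):
    """Return the fixed set of partitions that `key` may be written to."""
    start = zlib.crc32(key.encode("utf-8")) % num_partitions
    return [(start + i) % num_partitions for i in range(num_slices)]


def pick_partition(key, num_slices, num_partitions, rng=random):
    """Pick one of the key's slices at random for a single record."""
    return rng.choice(slice_partitions(key, num_slices, num_partitions))
```

With `num_slices` fixed for all keys, a heavy key is capped at roughly 1/num_slices of its volume per partition, and a reader of that key needs exactly the partitions returned by `slice_partitions`.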
Hi,
Not sure if this helps, but the way Loggly seems to do it is to have a
separate topic for "noisy neighbors". See [1].
[1] https://www.loggly.com/blog/loggly-loves-apache-kafka-use-unbreakable-messaging-better-log-management/
Cheers,
Jens
On Wed, Apr 27, 2016 at 9:11 PM Srikanth wrote:
Hello,
Is there a recommendation for handling producer side partitioning based on
a key with skew?
We want to partition on something like clientId. Problem is, this key has
a non-uniform distribution.
It's equally likely to see a key with 3k occurrences/day vs 100k/day vs
65 million/day.
Cardinality of