Re: Recommended max number of topics (and data separation)

2018-02-01 Thread Ted Yu
After a brief search, I found KAFKA-6469.
FYI

 Original message 
From: Andrey Falko
Date: 2/1/18 5:28 PM (GMT-08:00)
To: users@kafka.apache.org
Subject: Re: Recommended max number of topics (and data separation)
Indeed David, I confirmed that I can't push my clusters to more than
72k topics with default ZooKeeper settings. Once I get to that
quantity, leader election never happens for new topics. Additionally,
if I kill one of the brokers, the topics don't get their leaders
re-elected, and it is impossible to trigger re-election due to the
ClientCnxn "Packet len4194494 is out of range" exception or the
"Failed to start preferred replica election" error. It seems like
there should be a couple of JIRA items for starters:
1) Don't let the number of topics exceed what jute.maxbuffer will bear,
i.e. send a clear error message back to users saying the limit has been
reached and no more topics can be created.
2) Kafka should support more than 72k topics (ideally without requiring
users to mess with jute.maxbuffer).

Is anyone aware of JIRA tickets that might already cover the above?

On Wed, Jan 31, 2018 at 8:35 AM, David Espinosa  wrote:
> I used:
> -Djute.maxbuffer=50111000
> and the gain was that I could increase the number of topics from 70k to
> 100k :P
>
> 2018-01-30 23:25 GMT+01:00 Andrey Falko :
>
>> On Tue, Jan 30, 2018 at 1:38 PM, David Espinosa  wrote:
>> > Hi Andrey,
>> > My topics are replicated with a replication factor equal to the number
>> > of nodes, 3 in this test.
>> > Didn't know about KIP-227.
>> > The problems I see at 70k topics coming from ZK are related to any
>> > operation where ZK has to retrieve topic metadata. Just listing topics
>> > at 50k or 60k, you will experience a big delay in the response. I have
>> > no more details about these problems, but it is easy to reproduce the
>> > latency in the topic-list request.
>>
>> AFAIK Kafka doesn't do a full topic listing from ZK as part of normal
>> operations. If your consumer/producer code depends on doing --describe,
>> then that would be a problem, but I think that can be worked around.
>> Based on my profiling data so far, while things are working in
>> non-failure mode, none of the ZK functions pop up as "hot methods".
>>
>> > Thanks for pointing me to this parameter, vm.max_map_count; it wasn't
>> > on my radar. Could you tell me what value you use?
>>
>> I set it to the max allowable on Amzn Linux: vm.max_map_count=1215752192
>>
>> > Regarding topic naming, it's the other way around: I think the longer
>> > the topic names are, the sooner jute.maxbuffer overflows.
>>
>> I see; what value(s) have you tried, and how much gain did you see?
>>
>> > David
>> >
>> >
>> > 2018-01-30 4:40 GMT+01:00 Andrey Falko :
>> >
>> >> On Sun, Jan 28, 2018 at 8:45 AM, David Espinosa  wrote:
>> >> > Hi Monty,
>> >> >
>> >> > I'm also planning to use a large number of topics in Kafka, so I
>> >> > recently ran a test on a 3-node Kafka cluster where I created 100k
>> >> > topics with one partition, and sent 1M messages in total.
>> >>
>> >> Are your topic partitions replicated?
>> >>
>> >> > These are my conclusions:
>> >> >
>> >> >    - There is no limitation in Kafka itself regarding the number of
>> >> >    topics, but there is in ZooKeeper and in the system where the
>> >> >    Kafka nodes are allocated.
>> >>
>> >> There are also the problems being addressed in KIP-227:
>> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-227%3A+Introduce+Incremental+FetchRequests+to+Increase+Partition+Scalability
>> >>
>> >> >    - ZooKeeper will start having problems at around 70k topics; these
>> >> >    can be worked around by modifying a JVM buffer parameter
>> >> >    (-Djute.maxbuffer), but performance is reduced.
>> >>
>> >> What kind of problems do you see at 70k topics? If performance is
>> >> reduced by modifying jute.maxbuffer, won't that affect the performance
>> >> of Kafka in terms of how long it takes to recover from broker failure,
>> >> create/delete topics, and produce and consume?
>> >>
>> >> >    - Open file descriptors of the system are equivalent to [number of
>> >> >    topics] x [number of partitions per topic]. Set to 128k in my test
>> >> >    to avoid problems.

Re: Recommended max number of topics (and data separation)

2018-02-01 Thread Andrey Falko
Indeed David, I confirmed that I can't push my clusters to more than
72k topics with default ZooKeeper settings. Once I get to that
quantity, leader election never happens for new topics. Additionally,
if I kill one of the brokers, the topics don't get their leaders
re-elected, and it is impossible to trigger re-election due to the
ClientCnxn "Packet len4194494 is out of range" exception or the
"Failed to start preferred replica election" error. It seems like
there should be a couple of JIRA items for starters:
1) Don't let the number of topics exceed what jute.maxbuffer will bear,
i.e. send a clear error message back to users saying the limit has been
reached and no more topics can be created.
2) Kafka should support more than 72k topics (ideally without requiring
users to mess with jute.maxbuffer).

Is anyone aware of JIRA tickets that might already cover the above?
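
For a rough sense of where that "Packet len4194494 is out of range" number
comes from: the ZooKeeper client defaults jute.maxbuffer to 4 MB (4194304
bytes), and the failing packet is just past it. A back-of-the-envelope
sketch (plain Java, not ZooKeeper internals; the 4-byte-per-entry framing
is an approximation) of when a znode child list the size of /brokers/topics
blows past that limit:

import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class JuteBufferEstimate {
    // The ZooKeeper *client* default for jute.maxbuffer is 4194304 bytes;
    // the rejected packet above (len 4194494) is just over it.
    static final long DEFAULT_CLIENT_LIMIT = 4L * 1024 * 1024;

    // Approximate GetChildren response size: a 4-byte length prefix per
    // child name plus the UTF-8 bytes of the name itself.
    static long estimateChildListBytes(List<String> names) {
        long total = 0;
        for (String name : names) {
            total += 4 + name.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            names.add(String.format("some-topic-name-%024d", i)); // ~40 chars
        }
        long approx = estimateChildListBytes(names);
        // Real responses carry extra metadata, so overflow arrives earlier.
        System.out.printf("~%d bytes vs %d -> %s%n", approx,
                DEFAULT_CLIENT_LIMIT,
                approx > DEFAULT_CLIENT_LIMIT ? "overflow" : "fits");
    }
}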

On Wed, Jan 31, 2018 at 8:35 AM, David Espinosa  wrote:
> I used:
> -Djute.maxbuffer=50111000
> and the gain was that I could increase the number of topics from 70k to
> 100k :P
>
> 2018-01-30 23:25 GMT+01:00 Andrey Falko :
>
>> On Tue, Jan 30, 2018 at 1:38 PM, David Espinosa  wrote:
>> > Hi Andrey,
>> > My topics are replicated with a replication factor equal to the number
>> > of nodes, 3 in this test.
>> > Didn't know about KIP-227.
>> > The problems I see at 70k topics coming from ZK are related to any
>> > operation where ZK has to retrieve topic metadata. Just listing topics
>> > at 50k or 60k, you will experience a big delay in the response. I have
>> > no more details about these problems, but it is easy to reproduce the
>> > latency in the topic-list request.
>>
>> AFAIK Kafka doesn't do a full topic listing from ZK as part of normal
>> operations. If your consumer/producer code depends on doing --describe,
>> then that would be a problem, but I think that can be worked around.
>> Based on my profiling data so far, while things are working in
>> non-failure mode, none of the ZK functions pop up as "hot methods".
>>
>> > Thanks for pointing me to this parameter, vm.max_map_count; it wasn't
>> > on my radar. Could you tell me what value you use?
>>
>> I set it to the max allowable on Amzn Linux: vm.max_map_count=1215752192
>>
>> > Regarding topic naming, it's the other way around: I think the longer
>> > the topic names are, the sooner jute.maxbuffer overflows.
>>
>> I see; what value(s) have you tried, and how much gain did you see?
>>
>> > David
>> >
>> >
>> > 2018-01-30 4:40 GMT+01:00 Andrey Falko :
>> >
>> >> On Sun, Jan 28, 2018 at 8:45 AM, David Espinosa  wrote:
>> >> > Hi Monty,
>> >> >
>> >> > I'm also planning to use a large number of topics in Kafka, so I
>> >> > recently ran a test on a 3-node Kafka cluster where I created 100k
>> >> > topics with one partition, and sent 1M messages in total.
>> >>
>> >> Are your topic partitions replicated?
>> >>
>> >> > These are my conclusions:
>> >> >
>> >> >    - There is no limitation in Kafka itself regarding the number of
>> >> >    topics, but there is in ZooKeeper and in the system where the
>> >> >    Kafka nodes are allocated.
>> >>
>> >> There are also the problems being addressed in KIP-227:
>> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-227%3A+Introduce+Incremental+FetchRequests+to+Increase+Partition+Scalability
>> >>
>> >> >    - ZooKeeper will start having problems at around 70k topics; these
>> >> >    can be worked around by modifying a JVM buffer parameter
>> >> >    (-Djute.maxbuffer), but performance is reduced.
>> >>
>> >> What kind of problems do you see at 70k topics? If performance is
>> >> reduced by modifying jute.maxbuffer, won't that affect the performance
>> >> of Kafka in terms of how long it takes to recover from broker failure,
>> >> create/delete topics, and produce and consume?
>> >>
>> >> >    - Open file descriptors of the system are equivalent to [number of
>> >> >    topics] x [number of partitions per topic]. Set to 128k in my test
>> >> >    to avoid problems.
>> >> >    - System needs a big amount of memory for page caching.
>> >>
>> >> I also had to tune vm.max_map_count much higher.
>> >>
>> >> >
>> >> > So, after creating 100k topics with the required setup (system+JVM)
>> >> > but seeing problems at 70k, I feel safe not creating more than 50k,
>> >> > and ZooKeeper will always be my first suspect if a problem comes. I
>> >> > think with proper resources (memory) and system setup (open file
>> >> > descriptors), you don't have any real limitation regarding partitions.
>> >>
>> >> I can confirm the 50k number. After about 40k-45k topics, I start
>> >> seeing slowdowns in consumer offset-commit latencies, which eclipse
>> >> 50ms. Hopefully KIP-227 will alleviate that problem and leave ZK as
>> >> the last remaining hurdle. I'm testing with 3x replication per
>> >> partition and 10 brokers.
>> >>
>> >> > By the way, I used long topic names (about 30 characters), which
>> >> > can be important for ZK.
>> >>
>> >> I'd like to learn more about this: are you saying that long topic
>> >> names would improve ZK performance because that relates to bumping up
>> >> jute.maxbuffer?

Re: Recommended max number of topics (and data separation)

2018-01-31 Thread David Espinosa
I used:
-Djute.maxbuffer=50111000
and the gain was that I could increase the number of topics from 70k to
100k :P
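
A side note on that flag, since it bites people: jute.maxbuffer is a plain
JVM system property, read by the ZooKeeper servers and by ZooKeeper clients
alike (a Kafka broker is a ZK client), so it generally has to be raised on
every JVM that touches the oversized znodes. A minimal sketch of the
programmatic equivalent of the -D flag, which only takes effect if set
before the first ZK connection in that JVM:

public class JuteBufferFlag {
    public static void main(String[] args) {
        // Equivalent to launching with -Djute.maxbuffer=50111000 (the value
        // quoted above); must be set before any ZooKeeper client is created.
        System.setProperty("jute.maxbuffer", "50111000");
        System.out.println("jute.maxbuffer = "
                + System.getProperty("jute.maxbuffer"));
    }
}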

2018-01-30 23:25 GMT+01:00 Andrey Falko :

> On Tue, Jan 30, 2018 at 1:38 PM, David Espinosa  wrote:
> > Hi Andrey,
> > My topics are replicated with a replication factor equal to the number
> > of nodes, 3 in this test.
> > Didn't know about KIP-227.
> > The problems I see at 70k topics coming from ZK are related to any
> > operation where ZK has to retrieve topic metadata. Just listing topics
> > at 50k or 60k, you will experience a big delay in the response. I have
> > no more details about these problems, but it is easy to reproduce the
> > latency in the topic-list request.
>
> AFAIK Kafka doesn't do a full topic listing from ZK as part of normal
> operations. If your consumer/producer code depends on doing --describe,
> then that would be a problem, but I think that can be worked around.
> Based on my profiling data so far, while things are working in
> non-failure mode, none of the ZK functions pop up as "hot methods".
>
> > Thanks for pointing me to this parameter, vm.max_map_count; it wasn't
> > on my radar. Could you tell me what value you use?
>
> I set it to the max allowable on Amzn Linux: vm.max_map_count=1215752192
>
> > Regarding topic naming, it's the other way around: I think the longer
> > the topic names are, the sooner jute.maxbuffer overflows.
>
> I see; what value(s) have you tried, and how much gain did you see?
>
> > David
> >
> >
> > 2018-01-30 4:40 GMT+01:00 Andrey Falko :
> >
> >> On Sun, Jan 28, 2018 at 8:45 AM, David Espinosa  wrote:
> >> > Hi Monty,
> >> >
> >> > I'm also planning to use a large number of topics in Kafka, so I
> >> > recently ran a test on a 3-node Kafka cluster where I created 100k
> >> > topics with one partition, and sent 1M messages in total.
> >>
> >> Are your topic partitions replicated?
> >>
> >> > These are my conclusions:
> >> >
> >> >    - There is no limitation in Kafka itself regarding the number of
> >> >    topics, but there is in ZooKeeper and in the system where the
> >> >    Kafka nodes are allocated.
> >>
> >> There are also the problems being addressed in KIP-227:
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-227%3A+Introduce+Incremental+FetchRequests+to+Increase+Partition+Scalability
> >>
> >> >    - ZooKeeper will start having problems at around 70k topics; these
> >> >    can be worked around by modifying a JVM buffer parameter
> >> >    (-Djute.maxbuffer), but performance is reduced.
> >>
> >> What kind of problems do you see at 70k topics? If performance is
> >> reduced by modifying jute.maxbuffer, won't that affect the performance
> >> of Kafka in terms of how long it takes to recover from broker failure,
> >> create/delete topics, and produce and consume?
> >>
> >> >    - Open file descriptors of the system are equivalent to [number of
> >> >    topics] x [number of partitions per topic]. Set to 128k in my test
> >> >    to avoid problems.
> >> >    - System needs a big amount of memory for page caching.
> >>
> >> I also had to tune vm.max_map_count much higher.
> >>
> >> >
> >> > So, after creating 100k topics with the required setup (system+JVM)
> >> > but seeing problems at 70k, I feel safe not creating more than 50k,
> >> > and ZooKeeper will always be my first suspect if a problem comes. I
> >> > think with proper resources (memory) and system setup (open file
> >> > descriptors), you don't have any real limitation regarding partitions.
> >>
> >> I can confirm the 50k number. After about 40k-45k topics, I start
> >> seeing slowdowns in consumer offset-commit latencies, which eclipse
> >> 50ms. Hopefully KIP-227 will alleviate that problem and leave ZK as
> >> the last remaining hurdle. I'm testing with 3x replication per
> >> partition and 10 brokers.
> >>
> >> > By the way, I used long topic names (about 30 characters), which can
> >> > be important for ZK.
> >>
> >> I'd like to learn more about this: are you saying that long topic
> >> names would improve ZK performance because that relates to bumping up
> >> jute.maxbuffer?
> >>
> >> > Hope this information is of help to you.
> >> >
> >> > David
> >> >
> >> > 2018-01-28 2:22 GMT+01:00 Monty Hindman :
> >> >
> >> >> I'm designing a system and need some more clarity regarding Kafka's
> >> >> recommended limits on the number of topics and/or partitions. At a
> >> >> high level, our system would work like this:
> >> >>
> >> >> - A user creates a job X (X is a UUID).
> >> >> - The user uploads data for X to an input topic: X.in.
> >> >> - Workers process the data, writing results to an output topic: X.out.
> >> >> - The user downloads the data from X.out.
> >> >>
> >> >> It's important for the system that data for different jobs be kept
> >> >> separate, and that input and output data be kept separate. By
> >> >> "separate" I mean that there needs to be a reasonable way for users
> >> >> and the system's workers to query for the data they need (by job-id
> >> >> and by input-vs-output) and not get the data they don't need.

Re: Recommended max number of topics (and data separation)

2018-01-30 Thread Andrey Falko
On Tue, Jan 30, 2018 at 1:38 PM, David Espinosa  wrote:
> Hi Andrey,
> My topics are replicated with a replication factor equal to the number of
> nodes, 3 in this test.
> Didn't know about KIP-227.
> The problems I see at 70k topics coming from ZK are related to any
> operation where ZK has to retrieve topic metadata. Just listing topics at
> 50k or 60k, you will experience a big delay in the response. I have no
> more details about these problems, but it is easy to reproduce the
> latency in the topic-list request.

AFAIK Kafka doesn't do a full topic listing from ZK as part of normal
operations. If your consumer/producer code depends on doing --describe,
then that would be a problem, but I think that can be worked around.
Based on my profiling data so far, while things are working in
non-failure mode, none of the ZK functions pop up as "hot methods".

> Thanks for pointing me to this parameter, vm.max_map_count; it wasn't
> on my radar. Could you tell me what value you use?

I set it to the max allowable on Amzn Linux: vm.max_map_count=1215752192

> Regarding topic naming, it's the other way around: I think the longer
> the topic names are, the sooner jute.maxbuffer overflows.

I see; what value(s) have you tried, and how much gain did you see?

> David
>
>
> 2018-01-30 4:40 GMT+01:00 Andrey Falko :
>
>> On Sun, Jan 28, 2018 at 8:45 AM, David Espinosa  wrote:
>> > Hi Monty,
>> >
>> > I'm also planning to use a large number of topics in Kafka, so I
>> > recently ran a test on a 3-node Kafka cluster where I created 100k
>> > topics with one partition, and sent 1M messages in total.
>>
>> Are your topic partitions replicated?
>>
>> > These are my conclusions:
>> >
>> >    - There is no limitation in Kafka itself regarding the number of
>> >    topics, but there is in ZooKeeper and in the system where the Kafka
>> >    nodes are allocated.
>>
>> There are also the problems being addressed in KIP-227:
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-227%3A+Introduce+Incremental+FetchRequests+to+Increase+Partition+Scalability
>>
>> >    - ZooKeeper will start having problems at around 70k topics; these
>> >    can be worked around by modifying a JVM buffer parameter
>> >    (-Djute.maxbuffer), but performance is reduced.
>>
>> What kind of problems do you see at 70k topics? If performance is
>> reduced by modifying jute.maxbuffer, won't that affect the performance
>> of Kafka in terms of how long it takes to recover from broker failure,
>> create/delete topics, and produce and consume?
>>
>> >    - Open file descriptors of the system are equivalent to [number of
>> >    topics] x [number of partitions per topic]. Set to 128k in my test
>> >    to avoid problems.
>> >    - System needs a big amount of memory for page caching.
>>
>> I also had to tune vm.max_map_count much higher.
>>
>> >
>> > So, after creating 100k topics with the required setup (system+JVM)
>> > but seeing problems at 70k, I feel safe not creating more than 50k,
>> > and ZooKeeper will always be my first suspect if a problem comes. I
>> > think with proper resources (memory) and system setup (open file
>> > descriptors), you don't have any real limitation regarding partitions.
>>
>> I can confirm the 50k number. After about 40k-45k topics, I start
>> seeing slowdowns in consumer offset-commit latencies, which eclipse
>> 50ms. Hopefully KIP-227 will alleviate that problem and leave ZK as
>> the last remaining hurdle. I'm testing with 3x replication per
>> partition and 10 brokers.
>>
>> > By the way, I used long topic names (about 30 characters), which can
>> > be important for ZK.
>>
>> I'd like to learn more about this: are you saying that long topic
>> names would improve ZK performance because that relates to bumping up
>> jute.maxbuffer?
>>
>> > Hope this information is of help to you.
>> >
>> > David
>> >
>> > 2018-01-28 2:22 GMT+01:00 Monty Hindman :
>> >
>> >> I'm designing a system and need some more clarity regarding Kafka's
>> >> recommended limits on the number of topics and/or partitions. At a
>> >> high level, our system would work like this:
>> >>
>> >> - A user creates a job X (X is a UUID).
>> >> - The user uploads data for X to an input topic: X.in.
>> >> - Workers process the data, writing results to an output topic: X.out.
>> >> - The user downloads the data from X.out.
>> >>
>> >> It's important for the system that data for different jobs be kept
>> >> separate, and that input and output data be kept separate. By
>> >> "separate" I mean that there needs to be a reasonable way for users
>> >> and the system's workers to query for the data they need (by job-id
>> >> and by input-vs-output) and not get the data they don't need.
>> >>
>> >> Based on expected usage and our data retention policy, we would not
>> >> expect to need more than 12,000 active jobs at any one time -- in
>> >> other words, 24,000 topics. If we were to have 5 partitions per topic
>> >> (our cluster has 5 brokers), that would imply 120,000 partitions.

Re: Recommended max number of topics (and data separation)

2018-01-30 Thread David Espinosa
Hi Andrey,
My topics are replicated with a replication factor equal to the number of
nodes, 3 in this test.
Didn't know about KIP-227.
The problems I see at 70k topics coming from ZK are related to any
operation where ZK has to retrieve topic metadata. Just listing topics at
50k or 60k, you will experience a big delay in the response. I have no more
details about these problems, but it is easy to reproduce the latency in
the topic-list request.
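
(For anyone who wants to reproduce this: a small sketch that times the
topic-list round trip with the AdminClient from kafka-clients; the
bootstrap address is a placeholder.)

import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

public class ListTopicsLatency {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            long start = System.nanoTime();
            // One metadata round trip; with 50k-60k topics this call is
            // where the delay shows up.
            int count = admin.listTopics().names().get().size();
            System.out.printf("listed %d topics in %d ms%n", count,
                    (System.nanoTime() - start) / 1_000_000);
        }
    }
}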
Thanks for pointing me to this parameter, vm.max_map_count; it wasn't
on my radar. Could you tell me what value you use?
Regarding topic naming, it's the other way around: I think the longer the
topic names are, the sooner jute.maxbuffer overflows.
David


2018-01-30 4:40 GMT+01:00 Andrey Falko :

> On Sun, Jan 28, 2018 at 8:45 AM, David Espinosa  wrote:
> > Hi Monty,
> >
> > I'm also planning to use a large number of topics in Kafka, so I
> > recently ran a test on a 3-node Kafka cluster where I created 100k
> > topics with one partition, and sent 1M messages in total.
>
> Are your topic partitions replicated?
>
> > These are my conclusions:
> >
> >    - There is no limitation in Kafka itself regarding the number of
> >    topics, but there is in ZooKeeper and in the system where the Kafka
> >    nodes are allocated.
>
> There are also the problems being addressed in KIP-227:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-227%3A+Introduce+Incremental+FetchRequests+to+Increase+Partition+Scalability
>
> >    - ZooKeeper will start having problems at around 70k topics; these
> >    can be worked around by modifying a JVM buffer parameter
> >    (-Djute.maxbuffer), but performance is reduced.
>
> What kind of problems do you see at 70k topics? If performance is
> reduced by modifying jute.maxbuffer, won't that affect the performance
> of Kafka in terms of how long it takes to recover from broker failure,
> create/delete topics, and produce and consume?
>
> >    - Open file descriptors of the system are equivalent to [number of
> >    topics] x [number of partitions per topic]. Set to 128k in my test
> >    to avoid problems.
> >    - System needs a big amount of memory for page caching.
>
> I also had to tune vm.max_map_count much higher.
>
> >
> > So, after creating 100k topics with the required setup (system+JVM) but
> > seeing problems at 70k, I feel safe not creating more than 50k, and
> > ZooKeeper will always be my first suspect if a problem comes. I think
> > with proper resources (memory) and system setup (open file descriptors),
> > you don't have any real limitation regarding partitions.
>
> I can confirm the 50k number. After about 40k-45k topics, I start
> seeing slowdowns in consumer offset-commit latencies, which eclipse 50ms.
> Hopefully KIP-227 will alleviate that problem and leave ZK as the last
> remaining hurdle. I'm testing with 3x replication per partition and 10
> brokers.
>
> > By the way, I used long topic names (about 30 characters), which can be
> > important for ZK.
>
> I'd like to learn more about this: are you saying that long topic
> names would improve ZK performance because that relates to bumping up
> jute.maxbuffer?
>
> > Hope this information is of help to you.
> >
> > David
> >
> > 2018-01-28 2:22 GMT+01:00 Monty Hindman :
> >
> >> I'm designing a system and need some more clarity regarding Kafka's
> >> recommended limits on the number of topics and/or partitions. At a
> >> high level, our system would work like this:
> >>
> >> - A user creates a job X (X is a UUID).
> >> - The user uploads data for X to an input topic: X.in.
> >> - Workers process the data, writing results to an output topic: X.out.
> >> - The user downloads the data from X.out.
> >>
> >> It's important for the system that data for different jobs be kept
> >> separate, and that input and output data be kept separate. By
> >> "separate" I mean that there needs to be a reasonable way for users and
> >> the system's workers to query for the data they need (by job-id and by
> >> input-vs-output) and not get the data they don't need.
> >>
> >> Based on expected usage and our data retention policy, we would not
> >> expect to need more than 12,000 active jobs at any one time -- in other
> >> words, 24,000 topics. If we were to have 5 partitions per topic (our
> >> cluster has 5 brokers), that would imply 120,000 partitions. [These
> >> numbers refer only to main/primary partitions, not any replicas that
> >> might exist.]
> >>
> >> Those numbers seem to be far larger than the suggested limits I see
> >> online. For example, the Kafka FAQ on these matters seems to imply that
> >> the most relevant limit is the number of partitions (rather than
> >> topics) and sort of implies that 10,000 partitions might be a suggested
> >> guideline (https://goo.gl/fQs2md). Also implied is that systems should
> >> use fewer topics and instead partition the data within topics if
> >> further separation is needed (the FAQ entry uses the example of
> >> partitioning by user ID, which is roughly analogous to job ID in my use
> >> case).

Re: Recommended max number of topics (and data separation)

2018-01-29 Thread Andrey Falko
On Sun, Jan 28, 2018 at 8:45 AM, David Espinosa  wrote:
> Hi Monty,
>
> I'm also planning to use a large number of topics in Kafka, so I recently
> ran a test on a 3-node Kafka cluster where I created 100k topics with one
> partition, and sent 1M messages in total.

Are your topic partitions replicated?

> These are my conclusions:
>
>    - There is no limitation in Kafka itself regarding the number of
>    topics, but there is in ZooKeeper and in the system where the Kafka
>    nodes are allocated.

There are also the problems being addressed in KIP-227:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-227%3A+Introduce+Incremental+FetchRequests+to+Increase+Partition+Scalability

>    - ZooKeeper will start having problems at around 70k topics; these can
>    be worked around by modifying a JVM buffer parameter (-Djute.maxbuffer),
>    but performance is reduced.

What kind of problems do you see at 70k topics? If performance is
reduced by modifying jute.maxbuffer, won't that affect the performance
of Kafka in terms of how long it takes to recover from broker failure,
create/delete topics, and produce and consume?

>    - Open file descriptors of the system are equivalent to [number of
>    topics] x [number of partitions per topic]. Set to 128k in my test to
>    avoid problems.
>    - System needs a big amount of memory for page caching.

I also had to tune vm.max_map_count much higher.
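
(Context for readers: Kafka memory-maps its index files, one or more per
log segment, so map entries scale with partition and segment counts, and
the Linux default of 65530 runs out quickly at these topic counts. A quick
sketch, Linux only, to check the current value:)

import java.nio.file.Files;
import java.nio.file.Paths;

public class MaxMapCount {
    public static void main(String[] args) throws Exception {
        // Reads the live kernel setting; raise it via sysctl if the broker
        // logs mmap failures at high partition counts.
        String value = new String(Files.readAllBytes(
                Paths.get("/proc/sys/vm/max_map_count"))).trim();
        System.out.println("vm.max_map_count = " + value);
    }
}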

>
> So, after creating 100k topics with the required setup (system+JVM) but
> seeing problems at 70k, I feel safe not creating more than 50k, and
> ZooKeeper will always be my first suspect if a problem comes. I think with
> proper resources (memory) and system setup (open file descriptors), you
> don't have any real limitation regarding partitions.

I can confirm the 50k number. After about 40k-45k topics, I start
seeing slowdowns in consumer offset-commit latencies, which eclipse 50ms.
Hopefully KIP-227 will alleviate that problem and leave ZK as the last
remaining hurdle. I'm testing with 3x replication per partition and 10
brokers.
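
(For reference, a sketch of how such commit latencies can be observed from
a plain consumer with a recent kafka-clients; the group and topic names
are placeholders.)

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class CommitLatencyProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "latency-probe");
        props.put("enable.auto.commit", "false"); // commit manually, timed
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singleton("some-topic"));
            for (int i = 0; i < 10; i++) {
                consumer.poll(Duration.ofSeconds(1));
                long start = System.nanoTime();
                consumer.commitSync(); // the call whose latency degrades
                System.out.printf("commit %d took %d ms%n", i,
                        (System.nanoTime() - start) / 1_000_000);
            }
        }
    }
}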

> By the way, I used long topic names (about 30 characters), which can be
> important for ZK.

I'd like to learn more about this: are you saying that long topic
names would improve ZK performance because that relates to bumping up
jute.maxbuffer?

> Hope this information is of help to you.
>
> David
>
> 2018-01-28 2:22 GMT+01:00 Monty Hindman :
>
>> I'm designing a system and need some more clarity regarding Kafka's
>> recommended limits on the number of topics and/or partitions. At a high
>> level, our system would work like this:
>>
>> - A user creates a job X (X is a UUID).
>> - The user uploads data for X to an input topic: X.in.
>> - Workers process the data, writing results to an output topic: X.out.
>> - The user downloads the data from X.out.
>>
>> It's important for the system that data for different jobs be kept
>> separate, and that input and output data be kept separate. By "separate" I
>> mean that there needs to be a reasonable way for users and the system's
>> workers to query for the data they need (by job-id and by input-vs-output)
>> and not get the data they don't need.
>>
>> Based on expected usage and our data retention policy, we would not expect
>> to need more than 12,000 active jobs at any one time -- in other words,
>> 24,000 topics. If we were to have 5 partitions per topic (our cluster has 5
>> brokers), that would imply 120,000 partitions. [These numbers refer only to
>> main/primary partitions, not any replicas that might exist.]
>>
>> Those numbers seem to be far larger than the suggested limits I see online.
>> For example, the Kafka FAQ on these matters seems to imply that the most
>> relevant limit is the number of partitions (rather than topics) and sort of
>> implies that 10,000 partitions might be a suggested guideline (
>> https://goo.gl/fQs2md). Also implied is that systems should use fewer
>> topics and instead partition the data within topics if further separation
>> is needed (the FAQ entry uses the example of partitioning by user ID, which
>> is roughly analogous to job ID in my use case).
>>
>> The guidance in the FAQ is unclear to me:
>>
>> - Does the suggested limit of 10,000 refer to the total number of
>> partitions (ie, main partitions plus any replicas) or just the main
>> partitions?
>>
>> - If the most important limitation is number of partitions (rather than
>> number of topics), how does the suggested strategy of using fewer topics
>> and then partitioning by some other attribute (ie job ID) help at all?
>>
>> - Is my use case just a bad fit for Kafka? Or, is there a way for us to use
>> Kafka while still supporting the kinds of query patterns that we need (ie,
>> by job ID and by input-vs-output)?
>>
>> Thanks in advance for any guidance.
>>
>> Monty
>>


Re: Recommended max number of topics (and data separation)

2018-01-28 Thread David Espinosa
Hi Monty,

I'm also planning to use a large number of topics in Kafka, so I recently
ran a test on a 3-node Kafka cluster where I created 100k topics with one
partition, and sent 1M messages in total.
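
(A sketch of how a bulk-creation test like this can be driven with the
AdminClient; the name pattern and batch size are illustrative, and
batching keeps any single request modest.)

import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class BulkTopicCreate {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            List<NewTopic> batch = new ArrayList<>();
            for (int i = 0; i < 1_000; i++) {
                // ~30-character names, matching the note below on name
                // length; 1 partition each, replication factor 3.
                batch.add(new NewTopic(
                        String.format("load-test-topic-%014d", i),
                        1, (short) 3));
            }
            admin.createTopics(batch).all().get(); // wait for the batch
        }
    }
}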
These are my conclusions:

   - There is no limitation in Kafka itself regarding the number of topics,
   but there is in ZooKeeper and in the system where the Kafka nodes are
   allocated.
   - ZooKeeper will start having problems at around 70k topics; these can be
   worked around by modifying a JVM buffer parameter (-Djute.maxbuffer), but
   performance is reduced.
   - Open file descriptors of the system are equivalent to [number of
   topics] x [number of partitions per topic]. Set to 128k in my test to
   avoid problems; see the sketch after this list.
   - System needs a big amount of memory for page caching.
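
A sketch of checking the broker JVM's file-descriptor headroom against
that [topics] x [partitions] floor (UnixOperatingSystemMXBean is a
HotSpot-specific com.sun.management interface, so this is not portable):

import java.lang.management.ManagementFactory;
import com.sun.management.UnixOperatingSystemMXBean;

public class FdHeadroom {
    public static void main(String[] args) {
        UnixOperatingSystemMXBean os = (UnixOperatingSystemMXBean)
                ManagementFactory.getOperatingSystemMXBean();
        // At least one open segment file (plus indexes) per partition, so
        // topics x partitions is a floor on what the broker will need.
        long needed = 100_000L * 1; // 100k topics, 1 partition each
        System.out.printf("open=%d max=%d needed>=%d%n",
                os.getOpenFileDescriptorCount(),
                os.getMaxFileDescriptorCount(), needed);
    }
}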

So, after creating 100k topics with the required setup (system+JVM) but
seeing problems at 70k, I feel safe not creating more than 50k, and
ZooKeeper will always be my first suspect if a problem comes. I think with
proper resources (memory) and system setup (open file descriptors), you
don't have any real limitation regarding partitions.
By the way, I used long topic names (about 30 characters), which can be
important for ZK.
Hope this information is of help to you.

David

2018-01-28 2:22 GMT+01:00 Monty Hindman :

> I'm designing a system and need some more clarity regarding Kafka's
> recommended limits on the number of topics and/or partitions. At a high
> level, our system would work like this:
>
> - A user creates a job X (X is a UUID).
> - The user uploads data for X to an input topic: X.in.
> - Workers process the data, writing results to an output topic: X.out.
> - The user downloads the data from X.out.
>
> It's important for the system that data for different jobs be kept
> separate, and that input and output data be kept separate. By "separate" I
> mean that there needs to be a reasonable way for users and the system's
> workers to query for the data they need (by job-id and by input-vs-output)
> and not get the data they don't need.
>
> Based on expected usage and our data retention policy, we would not expect
> to need more than 12,000 active jobs at any one time -- in other words,
> 24,000 topics. If we were to have 5 partitions per topic (our cluster has 5
> brokers), that would imply 120,000 partitions. [These numbers refer only to
> main/primary partitions, not any replicas that might exist.]
>
> Those numbers seem to be far larger than the suggested limits I see online.
> For example, the Kafka FAQ on these matters seems to imply that the most
> relevant limit is the number of partitions (rather than topics) and sort of
> implies that 10,000 partitions might be a suggested guideline (
> https://goo.gl/fQs2md). Also implied is that systems should use fewer
> topics and instead partition the data within topics if further separation
> is needed (the FAQ entry uses the example of partitioning by user ID, which
> is roughly analogous to job ID in my use case).
>
> The guidance in the FAQ is unclear to me:
>
> - Does the suggested limit of 10,000 refer to the total number of
> partitions (ie, main partitions plus any replicas) or just the main
> partitions?
>
> - If the most important limitation is number of partitions (rather than
> number of topics), how does the suggested strategy of using fewer topics
> and then partitioning by some other attribute (ie job ID) help at all?
>
> - Is my use case just a bad fit for Kafka? Or, is there a way for us to use
> Kafka while still supporting the kinds of query patterns that we need (ie,
> by job ID and by input-vs-output)?
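
(For concreteness, a sketch of what the FAQ's "fewer topics, partition by
attribute" suggestion could look like here: one shared topic, records keyed
by job ID, so the default partitioner keeps each job's data together in a
single partition. The topic name jobs.in is hypothetical.)

import java.util.Properties;
import java.util.UUID;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class KeyedJobProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String jobId = UUID.randomUUID().toString();
            // The default partitioner hashes the key, so every record for
            // this jobId lands in the same partition of the shared topic;
            // consumers then filter by key instead of by topic name.
            producer.send(new ProducerRecord<>("jobs.in", jobId, "payload"));
        }
    }
}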
>
> Thanks in advance for any guidance.
>
> Monty
>