Re: How many partition can one single machine handle in Kafka?

2014-10-24 Thread Xiaobin She
Todd,

Thank you very much for your reply. My understanding of RAID 10 was wrong.

I understand that one cannot get purely sequential disk access even on a
single disk. The reason I'm interested in this question is that the Kafka
design document emphasizes that Kafka takes advantage of sequential disk
access to improve disk performance, and I can't understand how to achieve
this with thousands of open files.

I thought that, compared to one or a few files, thousands of open files
would make the disk access much more random, and the disk performance much
worse.

You mentioned that to increase overall IO capacity, one will have to use
multiple spindles with sufficiently fast disk speeds. But would a disk with
fewer files be more effective? Or is the number of files not an important
factor in the overall performance of Kafka?

Thanks again.

xiaobinshe



2014-10-23 22:01 GMT+08:00 Todd Palino tpal...@gmail.com:

 Your understanding of RAID 10 is slightly off. Because it is a combination
 of striping and mirroring, trying to say that there are 4000 open files per
 pair of disks is not accurate. The disk, as far as the system is concerned,
 is the entire RAID. Files are striped across all mirrors, so any open file
 will cross all 7 mirror sets.

 Even if you were to operate on a single disk, you're never going to be able
 to ensure sequential disk access with Kafka. Even if you have a single
 partition on a disk, there will be multiple log files for that partition
 and you will have to seek to read older data. What you have to do is use
 multiple spindles, with sufficiently fast disk speeds, to increase your
 overall IO capacity. You can also tune to get a little more. For example,
 we use a 120 second commit on that mount point to reduce the frequency of
 flushing to disk.

 -Todd
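For reference, the 120 second commit Todd mentions maps to ext4's
commit=nrsec mount option, which raises the journal commit interval from
its 5 second default (Jonathan's message further down notes LinkedIn runs
ext4 on software RAID). A hypothetical /etc/fstab entry; the device and
mount point here are made up for illustration:

# illustrative only; device, mount point and the other options are assumptions
/dev/md0  /data/kafka  ext4  defaults,noatime,commit=120  0 2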





Re: How many partition can one single machine handle in Kafka?

2014-10-24 Thread Todd Palino
Hmm, I haven't read the design doc lately, but I'm surprised that there's
even a discussion of sequential disk access. I suppose for small subsets of
the writes you can write larger blocks of sequential data, but that's about
the extent of it. Maybe one of the developers can speak more to that aspect.

As far as the number of files goes, it really doesn't matter that much
whether you have a few or a lot. Once you have more than one, the disk
access is random, so the performance is more like a cliff than a gentle
slope. As I said, we've found issues once we go above 4000 partitions, and
that's probably a combination of what the software can handle and the
number of open files.

-Todd




Re: How many partition can one single machine handle in Kafka?

2014-10-24 Thread Gwen Shapira
Todd,

Did you load-test using SSDs?
Got numbers to share?


Re: How many partition can one single machine handle in Kafka?

2014-10-24 Thread Todd Palino
We haven't done any testing of Kafka on SSDs, mostly because our storage
density needs are too high. Since our IO load has been fine on the current
model, we haven't pushed in that direction yet. Additionally, I haven't
done any real load testing since I got here, which is part of why we're
going to reevaluate our storage soon.

That said, we are using SSDs for the transaction log volume on our
Zookeeper nodes, with great success. We detailed some of that in the
presentation that Jonathan linked (no latency or outstanding requests). It
helps that we use very high quality SSD drives.

-Todd



Re: How many partition can one single machine handle in Kafka?

2014-10-23 Thread Neil Harkins
I've been thinking about this recently.
If kafka provided cmdline hooks to be executed on segment rotation,
similar to postgres' wal 'archive_command', configurations could store
only the current segments and all their random i/o on flash, then once
rotated, copy them sequentially onto larger/slower spinning disks,
or even S3.

-neil
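To make the idea concrete, here is a minimal Python sketch of what such a
hook could do. It is purely hypothetical (Kafka has no such hook, which is
Neil's point) and assumes the usual log layout: each partition directory
holds base-offset-named .log segment files, and the segment with the
highest base offset is the active one.

import os
import shutil

def archive_closed_segments(partition_dir, archive_dir):
    # Zero-padded base-offset names sort lexicographically == numerically,
    # so the last entry is the active segment.
    logs = sorted(f for f in os.listdir(partition_dir) if f.endswith(".log"))
    if len(logs) < 2:
        return  # only the active segment exists; nothing is closed yet
    for closed in logs[:-1]:  # every segment except the active one
        src = os.path.join(partition_dir, closed)
        dst = os.path.join(archive_dir, closed)
        if not os.path.exists(dst):
            shutil.copy2(src, dst)  # one large sequential read and write

archive_closed_segments("/flash/kafka-logs/mytopic-0",
                        "/spinning/archive/mytopic-0")

Copying a whole closed segment is a single large sequential transfer, which
is exactly the win being described.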



Re: How many partition can one single machine handle in Kafka?

2014-10-23 Thread Todd Palino
I've mentioned this a couple times in discussions recently as well. We were
discussing the concept of infinite retention for a certain type of service,
and how it might be accomplished. My suggestion was to have a combination
of storage types and the ability for Kafka to look for segments in two
different directory structures. This way you could expand the backend
storage as needed (which could be on an external storage appliance) while
still maintaining performance for recent segments.

I still think this is something worth pursuing at some point, and it should
be relatively easy to implement within the broker.

-Todd
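A rough sketch of the lookup side of that idea, again hypothetical (this is
a proposal, not anything the broker implements; the directory names are
invented): check a fast "recent" volume for a segment first, then fall back
to the slower backend storage.

import os

SEGMENT_DIRS = ["/fast-ssd/kafka-logs", "/slow-archive/kafka-logs"]

def find_segment(topic_partition, segment_name):
    # Return the path in the first directory tree that holds the segment.
    for base in SEGMENT_DIRS:
        candidate = os.path.join(base, topic_partition, segment_name)
        if os.path.exists(candidate):
            return candidate
    raise FileNotFoundError(segment_name)

print(find_segment("mytopic-0", "00000000000000000000.log"))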





Re: How many partition can one single machine handle in Kafka?

2014-10-23 Thread István
This is actually a very vague statement and does not cover every use case.
Having a RAID10 array of 6x250G SSDs is very different from having 4x1T
spinning drives. In my experience rebuilding a raid10 array that has
several smaller SSD disks is hardly noticeable from the service point of
view, because the IO write load is distributed amongst several disk pairs.
You lose, let's say, 1/2 to 1/4 of the per-node IO bandwidth (depending on
how many disk pairs you have). What configuration did you have that
experience with?
Was it fewer spinning disks?

Regards,
Istvan






-- 
the sun shines for all


Re: How many partition can one single machine handle in Kafka?

2014-10-23 Thread István
RAID has nothing to do with the overall availability of your system; it
just increases per-node reliability.

Regards,
Istvan





-- 
the sun shines for all


Re: How many partition can one single machine handle in Kafka?

2014-10-22 Thread Todd Palino
The number of brokers doesn't really matter here, as far as I can tell,
because the question is about what a single broker can handle. The number
of partitions in the cluster is governed by the ability of the controller
to manage the list of partitions for the cluster, and the ability of each
broker to keep that list (to serve metadata requests). The number of
partitions on a single broker is governed by that broker's ability to
handle the messages and files on disk. That's a much more limiting factor
than what the controller can do.

-Todd




Re: How many partition can one single machine handle in Kafka?

2014-10-22 Thread Todd Palino
In fact there are many more than 4000 open files. Many of our brokers run
with 28,000+ open files (regular file handles, not network connections). In
our case, we're beefing up the disk performance as much as we can by
running in a RAID-10 configuration with 14 disks.

-Todd




Re: How many partition can one single machine handle in Kafka?

2014-10-22 Thread Gwen Shapira
RAID-10?
Interesting choice for a system where the data is already replicated
between nodes. Is it to avoid the cost of large replication over the
network? how large are these disks?




Re: How many partition can one single machine handle in Kafka?

2014-10-22 Thread Jonathan Weeks
There are various costs when a broker fails: leader election for each
partition, possible issues for in-flight messages, client rebalancing, and
so on.

So even though replication provides partition redundancy, RAID 10 on each
broker is usually a good tradeoff, because it protects against the most
common cause of broker server failure (disk failure) and makes for smoother
operation overall.

Best Regards,

-Jonathan





Re: How many partition can one single machine handle in Kafka?

2014-10-22 Thread Gwen Shapira
Makes sense. Thanks :)




Re: How many partition can one single machine handle in Kafka?

2014-10-22 Thread Neha Narkhede
In my experience, RAID 10 doesn't really provide value in the presence of
replication. When a disk fails, the RAID resync process is so I/O intensive
that it renders the broker useless until it completes. When this happens,
you actually have to take the broker out of rotation and move the leaders
off of it to prevent it from serving requests in a degraded state. You
might as well shutdown the broker, delete the broker's data and let it
catch up from the leader.




Re: How many partition can one single machine handle in Kafka?

2014-10-22 Thread Jonathan Weeks
Neha, 

Do you mean RAID 10 or RAID 5 or 6? With RAID 5 or 6, recovery is definitely 
very painful, but less so with RAID 10.

We have been using the guidance here:

http://www.youtube.com/watch?v=19DvtEC0EbQ#t=190 (LinkedIn Site Reliability 
Engineers state they run RAID 10 on all Kafka clusters @34:40 or so)

Plus: https://cwiki.apache.org/confluence/display/KAFKA/Operations

LinkedIn
Hardware
We are using dual quad-core Intel Xeon machines with 24GB of memory. In general 
this should not matter too much, we only see pretty low CPU usage at peak even 
with GZIP compression enabled and a number of clients that don't batch 
requests. The memory is probably more than is needed for caching the active 
segments of the log.
The disk throughput is important. We have 8x7200 rpm SATA drives in a RAID 10 
array. In general this is the performance bottleneck, and more disks is more 
better. Depending on how you configure flush behavior you may or may not 
benefit from more expensive disks (if you flush often then higher RPM SAS 
drives may be better).
OS Settings
We use Linux. Ext4 is the filesystem and we run using software RAID 10. We 
haven't benchmarked filesystems so other filesystems may be superior.
We have added two tuning changes: (1) we upped the number of file descriptors 
since we have lots of topics and lots of connections, and (2) we upped the max 
socket buffer size to enable high-performance data transfer between data 
centers (described here).


Best Regards,

-Jonathan
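The "upped the number of file descriptors" tuning above is easy to
sanity-check from the broker host. A small Python sketch; the per-partition
file counts are illustrative assumptions, not LinkedIn's numbers:

import resource

# Rough estimate: every live segment is kept open as a .log plus an .index
# file, and the broker holds client/replication sockets on top of that.
partitions = 4000
open_files_per_partition = 7   # assumption for illustration
expected = partitions * open_files_per_partition

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft=%d hard=%d, expect roughly %d" % (soft, hard, expected))
if soft < expected:
    print("raise the nofile limit (ulimit -n / limits.conf) first")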



 



Re: How many partition can one single machine handle in Kafka?

2014-10-22 Thread Todd Palino
Yeah, Jonathan, I'm the LinkedIn SRE who said that :) And Neha, up until
recently, sat 8 feet from my desk. The data from the wiki page is off a
little bit as well (we're running 14 disks now, and 64 GB systems).

So to hit the first questions, RAID 10 gives higher read performance, and
also allows you to suffer a disk failure without having to drop the entire
cluster. As Neha noted, you're going to take a hit on the rebuild, and
because of ongoing traffic in the cluster it will be for a long time (we
can easily take half a day to rebuild a disk). But you still get some
benefit out of the RAID over just killing the data and letting it rebuild
from the replica, because during that time the cluster is not under
replicated, so you can suffer another failure. The more servers and disks
you have, the more often disks are going to fail, not to mention other
components. Both hardware and software. I like running on the safer side.

That said, I'm not sure RAID 10 is the answer either. We're going to be
doing some experimenting with other disk layouts shortly. We've inherited a
lot of our architecture, and many things have changed in that time. We're
probably going to test out RAID 5 and 6 to start with and see how much we
lose from the parity calculations.

-Todd



Re: How many partition can one single machine handle in Kafka?

2014-10-22 Thread Jonathan Weeks
I suppose it also is going to depend on:

a) How much spare I/O bandwidth the brokers have as well to support a rebuild 
while supporting ongoing requests. Our brokers have spare IO capacity.
b) How many brokers are in the cluster and what the replication factor is — 
e.g. if you have a larger cluster, it is easier to tolerate the loss of a 
single broker. We started with 3 brokers, so the loss of a single broker is 
quite significant — we would prefer possibly degraded performance to having a 
“down” broker.

I do understand that y’all both work at LinkedIn, my point is that all of the 
guidance to date (as recently as this summer) is that in production LinkedIn 
runs on RAID 10, so it is just a bit odd to hear a contrary recommendation, 
although I do understand that best practices are a moving, evolving target.

Best Regards,

-Jonathan



Re: How many partition can one single machine handle in Kafka?

2014-10-22 Thread Xiaobin She
Todd,

Thank you for the information.

With 28,000+ files and 14 disks, that means there are on average about 4000
open files per two disks (each pair being treated as one single disk), am I
right?

How do you manage to make all the write operations to these 4000 open files
sequential on the disk?

As far as I know, write operations to different files on the same disk will
cause random writes, which is not good for performance.

xiaobinshe
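Todd's striping correction can be made concrete with a toy model (the
stripe size and layout below are assumptions, just to show the shape of
RAID 10): consecutive chunks of a single file rotate across the mirror
pairs, so no pair "owns" a file and the 28,000 files do not divide into
4000 per pair.

PAIRS = 7                 # 14 disks in RAID 10 = 7 mirrored pairs
STRIPE = 256 * 1024       # assumed stripe size in bytes

def pair_for_offset(offset):
    # The mirror pair holding the stripe that contains this byte offset.
    return (offset // STRIPE) % PAIRS

# The first 7 stripes of any one file land on 7 different pairs:
print([pair_for_offset(i * STRIPE) for i in range(7)])  # [0, 1, 2, 3, 4, 5, 6]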







How many partition can one single machine handle in Kafka?

2014-10-21 Thread Xiaobin She
hello, everyone

I'm new to Kafka, and I'm wondering: what's the max number of partitions
one single machine can handle in Kafka?

Is there a suggested number?

Thanks.

xiaobinshe


Re: How many partition can one single machine handle in Kafka?

2014-10-21 Thread Guozhang Wang
Xiaobin,

This FAQ may give you some hints:

https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowdoIchoosethenumberofpartitionsforatopic




-- 
-- Guozhang


Re: How many partition can one single machine handle in Kafka?

2014-10-21 Thread Todd Palino
As far as the number of partitions a single broker can handle, we've set
our cap at 4000 partitions (including replicas). Above that we've seen some
performance and stability issues.

-Todd
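Some back-of-the-envelope arithmetic connecting this cap to the open-file
numbers that come up later in the thread; the per-partition figures below
are assumptions for illustration, not measurements:

partitions = 4000              # Todd's per-broker cap, replicas included
segments_per_partition = 3     # assumption: retention keeps a few segments
files_per_segment = 2          # each segment is a .log plus an .index file

# Kafka keeps handles open for every live segment, so:
print(partitions * segments_per_partition * files_per_segment)  # 24000
# plus sockets and other handles, which is how a broker passes 28,000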



Re: How many partition can one single machine handle in Kafka?

2014-10-21 Thread Neil Harkins

How many brokers? I'm curious: what kinds of problems would affect
a single broker with a large number of partitions, but not affect the
entire cluster with even more partitions?


Re: How many partition can one single machine handle in Kafka?

2014-10-21 Thread Xiaobin She
Todd,

Actually I'm wondering how Kafka handles so many partitions. With one
partition there is at least one file on disk, and with 4000 partitions
there will be at least 4000 files.

When all these partitions have write requests, how does Kafka make the
write operations on the disk sequential (which is emphasized in the design
document of Kafka) and make sure the disk access is efficient?

Thank you for your reply.

xiaobinshe
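For what it's worth, a minimal sketch of the write path being asked about
(my illustration, not Kafka's code): each partition appends to its own open
file, so writes are sequential within that partition's file, while writes
to many partitions interleave and look more random to the disk.

import os

LOG_DIR = "/tmp/kafka-sketch"
os.makedirs(LOG_DIR, exist_ok=True)

class PartitionLog:
    """One append-only file per partition, like a Kafka log segment."""
    def __init__(self, name):
        self.f = open(os.path.join(LOG_DIR, name + ".log"), "ab")

    def append(self, message):
        self.f.write(message)  # always an append: sequential in this file

# Four partitions, writes interleaved across them: each file grows
# sequentially, but the disk sees the interleaving across files.
logs = [PartitionLog("demo-%d" % p) for p in range(4)]
for i in range(8):
    logs[i % 4].append(("message %d\n" % i).encode())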


