Re: How many partition can one single machine handle in Kafka?

2014-10-24 Thread Xiaobin She
Todd,

Thank you very much for your reply. My understanding of RAID 10 was wrong.

I understand that one cannot get strictly sequential disk access even on a
single disk. The reason I'm interested in this question is that the Kafka
design document emphasizes that Kafka takes advantage of sequential disk
access to improve disk performance, and I can't understand how to achieve
this with thousands of open files.

I thought that, compared to one or a few files, thousands of open files would
make the disk access much more random and make disk performance much
worse.
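
To make the distinction in this question concrete: each partition log is append-only, so writes are sequential within a file even when many files are written in turn. The following is a minimal Python sketch, not Kafka code. `PartitionLog` and `interleaved_appends` are hypothetical names, and the sketch says nothing about how the disk head actually moves between files, which depends on the OS page cache and the I/O scheduler.

```python
import os
import tempfile


class PartitionLog:
    """Hypothetical stand-in for one partition's active log file."""

    def __init__(self, path):
        # Append mode: every write lands at the current end of the file.
        self._file = open(path, "ab")

    def append(self, record: bytes) -> int:
        """Append one record and return the byte offset where it starts."""
        offset = self._file.tell()
        self._file.write(record)
        return offset

    def close(self):
        self._file.close()


def interleaved_appends(num_partitions, rounds, record=b"x" * 100):
    """Append to many partition files in round-robin order.

    Returns a dict mapping partition index to the list of start offsets
    of the records appended to that partition's file.
    """
    offsets = {p: [] for p in range(num_partitions)}
    with tempfile.TemporaryDirectory() as log_dir:
        logs = [
            PartitionLog(os.path.join(log_dir, f"partition-{p}.log"))
            for p in range(num_partitions)
        ]
        try:
            for _ in range(rounds):
                for p, log in enumerate(logs):
                    offsets[p].append(log.append(record))
        finally:
            for log in logs:
                log.close()
    return offsets


if __name__ == "__main__":
    # Offsets grow monotonically inside each file, even though the
    # writes to different files are interleaved.
    print(interleaved_appends(num_partitions=4, rounds=3)[2])
```

Whether interleaved appends turn into random I/O at the device level is exactly the open question here; the sketch only shows the per-file access pattern.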

You mentioned that to increase overall IO capacity, one has to use
multiple spindles with sufficiently fast disk speeds, but would that be more
effective for a disk with fewer files? Or is the number of files not an
important factor in the overall performance of Kafka?

Thanks again.

xiaobinshe



2014-10-23 22:01 GMT+08:00 Todd Palino tpal...@gmail.com:

 Your understanding of RAID 10 is slightly off. Because it is a combination
 of striping and mirroring, trying to say that there are 4000 open files per
 pair of disks is not accurate. The disk, as far as the system is concerned,
 is the entire RAID. Files are striped across all mirrors, so any open file
 will cross all 7 mirror sets.
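
The striping point above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular RAID controller: the 64 KiB chunk size, the round-robin chunk placement, and the function name `mirror_sets_touched` are all assumptions made for the example.

```python
def mirror_sets_touched(file_size, num_disks=14, chunk_size=64 * 1024):
    """Return the mirror-set indexes that a contiguous file would span.

    Assumes RAID 10 built from mirrored pairs, with chunks striped
    round-robin across the pairs starting at mirror set 0.
    """
    num_mirror_sets = num_disks // 2  # 14 disks -> 7 mirrored pairs
    num_chunks = -(-file_size // chunk_size)  # ceiling division
    # After num_mirror_sets chunks the pattern repeats, so there is no
    # need to walk every chunk of a large file.
    chunks_to_check = min(num_chunks, num_mirror_sets)
    return {chunk % num_mirror_sets for chunk in range(chunks_to_check)}


if __name__ == "__main__":
    print(len(mirror_sets_touched(1024 ** 3)))  # a 1 GiB file spans all 7 sets
    print(mirror_sets_touched(1))               # a tiny file sits on one set: {0}
```

Under these assumptions, any file larger than seven chunks is spread over every mirror set, which is why dividing the open-file count by the number of disk pairs does not describe the real layout.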

 Even if you were to operate on a single disk, you're never going to be able
 to ensure sequential disk access with Kafka. Even if you have a single
 partition on a disk, there will be multiple log files for that partition
 and you will have to seek to read older data. What you have to do is use
 multiple spindles, with sufficiently fast disk speeds, to increase your
 overall IO capacity. You can also tune to get a little more. For example,
 we use a 120 second commit on that mount point to reduce the frequency of
 flushing to disk.

 -Todd
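
The "120 second commit" above is not spelled out further. Assuming it refers to the `commit=` mount option of ext3/ext4 (an assumption, the thread does not name the filesystem), a small Python helper can check what a mount point is actually using. `commit_interval_seconds` is a hypothetical name, and the input is the comma-separated options field as it appears in `/proc/mounts`.

```python
def commit_interval_seconds(mount_options: str):
    """Extract the commit interval from a mount options string.

    Returns the interval in seconds, or None when no `commit=` option
    is present (the filesystem default then applies).
    """
    for option in mount_options.split(","):
        key, _, value = option.partition("=")
        if key.strip() == "commit" and value.strip().isdigit():
            return int(value)
    return None


if __name__ == "__main__":
    print(commit_interval_seconds("rw,noatime,commit=120"))  # 120
    print(commit_interval_seconds("rw,relatime"))            # None
```

A longer commit interval trades a larger window of unflushed data for fewer flushes, so it is a tuning choice rather than a general recommendation.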





Re: How many partition can one single machine handle in Kafka?

2014-10-22 Thread Xiaobin She
Todd,

Thank you for the information.

With 28,000+ files and 14 disks, that means there are on average about 4000
open files on each pair of disks (which is treated as one single disk), am I
right?

How do you manage to make all the write operations to these 4000 open
files sequential on disk?

As far as I know, write operations to different files on the same disk will
cause random writes, which is not good for performance.

xiaobinshe




2014-10-23 1:00 GMT+08:00 Todd Palino tpal...@gmail.com:

 In fact there are many more than 4000 open files. Many of our brokers run
 with 28,000+ open files (regular file handles, not network connections). In
 our case, we're beefing up the disk performance as much as we can by
 running in a RAID-10 configuration with 14 disks.

 -Todd
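
With tens of thousands of open file handles per broker, the process file descriptor limit becomes relevant. The following Python sketch checks the current limit against an expected handle count. It uses the standard `resource` module (Unix only); `open_file_headroom` is a hypothetical helper, and the 28,000 figure is simply the number quoted above.

```python
import resource


def open_file_headroom(expected_open_files):
    """Return how many descriptors remain under the soft limit.

    A negative result means the expected number of open files would
    exceed the current soft limit. Returns None when the limit is
    unlimited.
    """
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        return None
    return soft_limit - expected_open_files


if __name__ == "__main__":
    headroom = open_file_headroom(28_000)
    if headroom is not None and headroom < 0:
        print(f"soft limit too low by {-headroom} descriptors")
```

This only inspects the limit of the process running the script; a broker started by an init system may run under different limits.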




How many partition can one single machine handle in Kafka?

2014-10-21 Thread Xiaobin She
Hello, everyone,

I'm new to Kafka, and I'm wondering: what is the maximum number of
partitions a single machine can handle in Kafka?

Is there a suggested number?

Thanks.

xiaobinshe


Re: How many partition can one single machine handle in Kafka?

2014-10-21 Thread Xiaobin She
Todd,

Actually, I'm wondering how Kafka handles so many partitions. With one
partition there is at least one file on disk, and with 4000 partitions
there will be at least 4000 files.

When all these partitions have write requests, how does Kafka make the write
operations on disk sequential (which is emphasized in the Kafka design
document) and make sure the disk access is efficient?
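
The "at least one file per partition" estimate is a lower bound. A rough Python sketch of the arithmetic follows, assuming each partition keeps several log segments and each segment consists of a log file plus an index file. The segment count and the function name `estimated_log_files` are illustrative assumptions, not measured values.

```python
def estimated_log_files(partitions, segments_per_partition, files_per_segment=2):
    """Estimate the number of on-disk log files for a broker.

    files_per_segment=2 assumes one log file and one index file per
    segment; other Kafka versions may keep additional files.
    """
    return partitions * segments_per_partition * files_per_segment


if __name__ == "__main__":
    print(estimated_log_files(4000, 1))  # 8000
    print(estimated_log_files(4000, 3))  # 24000
```

The real count depends on retention settings and segment size, so the result is only an order-of-magnitude guide.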

Thank you for your reply.

xiaobinshe



2014-10-22 5:10 GMT+08:00 Todd Palino tpal...@gmail.com:

 As far as the number of partitions a single broker can handle, we've set
 our cap at 4000 partitions (including replicas). Above that we've seen some
 performance and stability issues.

 -Todd
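
Because the cap above counts replicas, the per-broker figure depends on the replication factor as well as the partition count. A minimal Python sketch of that arithmetic, assuming replicas are spread evenly across brokers (real assignments may be uneven); `replicas_per_broker` and `exceeds_cap` are hypothetical names, and the default cap of 4000 is the figure quoted above.

```python
def replicas_per_broker(topics, num_brokers):
    """Average number of partition replicas hosted by each broker.

    topics is an iterable of (partition_count, replication_factor)
    pairs, one pair per topic.
    """
    total_replicas = sum(
        partitions * replication_factor
        for partitions, replication_factor in topics
    )
    return total_replicas / num_brokers


def exceeds_cap(topics, num_brokers, cap=4000):
    """True when the average per-broker replica count is above the cap."""
    return replicas_per_broker(topics, num_brokers) > cap


if __name__ == "__main__":
    topics = [(100, 3), (500, 2)]           # 300 + 1000 = 1300 replicas
    print(replicas_per_broker(topics, 10))  # 130.0
    print(exceeds_cap(topics, 10))          # False
```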
