That is a very broad statement, and it does not cover every use case.
A RAID10 array of 6x250G SSDs is very different from one of 4x1T
spinning drives. In my experience, rebuilding a RAID10 array made up of
several smaller SSDs is hardly noticeable from the service's point of
view, because the write I/O load is distributed among several disk pairs.
You get, let's say, 1/2 to 1/4 (depending on how many disk pairs you have)
of the per-node I/O bandwidth. What configuration was your experience with?
Was it fewer spinning disks?
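
A rough sketch of the arithmetic behind those fractions, under two
simplifying assumptions that are not measured facts: I/O is spread evenly
across the mirror pairs, and only the pair containing the replaced disk is
busy with the resync. The function name is made up for illustration.

```python
from fractions import Fraction


def rebuilding_share(num_disks: int) -> Fraction:
    """Fraction of mirror pairs busy with a resync after one disk is replaced.

    Assumes a plain RAID10 layout of two-disk mirrors with I/O spread
    evenly across the pairs (a simplification, not a measurement).
    """
    if num_disks < 4 or num_disks % 2 != 0:
        raise ValueError("RAID10 needs an even number of disks, at least 4")
    pairs = num_disks // 2
    # Only the pair that contains the replaced disk has to resync.
    return Fraction(1, pairs)


# 4, 6 and 8 disks match the 1/2...1/4 range; 14 is the array size
# mentioned further down the thread.
for disks in (4, 6, 8, 14):
    share = rebuilding_share(disks)
    print(f"{disks} disks: {disks // 2} pairs, {share} of the pairs resyncing")
```

This says nothing about how hard the resync hits the one affected pair; it
only shows why more, smaller pairs shrink the share of the array involved.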

Regards,
Istvan


On Wed, Oct 22, 2014 at 3:44 PM, Neha Narkhede <neha.narkh...@gmail.com>
wrote:

> In my experience, RAID 10 doesn't really provide value in the presence of
> replication. When a disk fails, the RAID resync process is so I/O intensive
> that it renders the broker useless until it completes. When this happens,
> you actually have to take the broker out of rotation and move the leaders
> off of it to prevent it from serving requests in a degraded state. You
> might as well shut down the broker, delete the broker's data and let it
> catch up from the leader.
>
> On Wed, Oct 22, 2014 at 11:20 AM, Gwen Shapira <gshap...@cloudera.com>
> wrote:
>
> > Makes sense. Thanks :)
> >
> > On Wed, Oct 22, 2014 at 11:10 AM, Jonathan Weeks
> > <jonathanbwe...@gmail.com> wrote:
> > > > There are various costs when a broker fails, including broker leader
> > > > election for each partition, etc., as well as exposing possible issues
> > > > for in-flight messages, and client rebalancing etc.
> > > >
> > > > So even though replication provides partition redundancy, RAID 10 on
> > > > each broker is usually a good tradeoff: it also guards against the most
> > > > common cause of broker server failure (e.g. disk failure) and gives
> > > > smoother operation overall.
> > >
> > > Best Regards,
> > >
> > > -Jonathan
> > >
> > >
> > > > On Oct 22, 2014, at 11:01 AM, Gwen Shapira <gshap...@cloudera.com>
> > > > wrote:
> > >
> > >> RAID-10?
> > >> Interesting choice for a system where the data is already replicated
> > >> between nodes. Is it to avoid the cost of large replication over the
> > >> network? how large are these disks?
> > >>
> > > >> On Wed, Oct 22, 2014 at 10:00 AM, Todd Palino <tpal...@gmail.com>
> > > >> wrote:
> > > >>> In fact there are many more than 4000 open files. Many of our brokers
> > > >>> run with 28,000+ open files (regular file handles, not network
> > > >>> connections). In our case, we're beefing up the disk performance as
> > > >>> much as we can by running in a RAID-10 configuration with 14 disks.
> > >>>
> > >>> -Todd
> > >>>
> > > >>> On Tue, Oct 21, 2014 at 7:58 PM, Xiaobin She <xiaobin...@gmail.com>
> > > >>> wrote:
> > >>>
> > >>>> Todd,
> > >>>>
> > > >>>> Actually, I'm wondering how Kafka handles so many partitions. With
> > > >>>> one partition there is at least one file on disk, and with 4000
> > > >>>> partitions there will be at least 4000 files.
> > > >>>>
> > > >>>> When all these partitions have write requests, how does Kafka keep
> > > >>>> the write operations on the disk sequential (which is emphasized in
> > > >>>> the Kafka design document) and make sure disk access is efficient?
> > >>>>
> > >>>> Thank you for your reply.
> > >>>>
> > >>>> xiaobinshe
> > >>>>
> > >>>>
> > >>>>
> > >>>> 2014-10-22 5:10 GMT+08:00 Todd Palino <tpal...@gmail.com>:
> > >>>>
> > > >>>>> As far as the number of partitions a single broker can handle,
> > > >>>>> we've set our cap at 4000 partitions (including replicas). Above
> > > >>>>> that we've seen some performance and stability issues.
> > >>>>>
> > >>>>> -Todd
> > >>>>>
> > > >>>>> On Tue, Oct 21, 2014 at 12:15 AM, Xiaobin She <xiaobin...@gmail.com>
> > > >>>>> wrote:
> > >>>>>
> > > >>>>>> Hello, everyone
> > > >>>>>>
> > > >>>>>> I'm new to Kafka. I'm wondering what the maximum number of
> > > >>>>>> partitions is that a single machine can handle in Kafka?
> > > >>>>>>
> > > >>>>>> Is there a suggested number?
> > >>>>>>
> > >>>>>> Thanks.
> > >>>>>>
> > >>>>>> xiaobinshe
> > >>>>>>
> > >>>>>
> > >>>>
> > >
> >
>



-- 
the sun shines for all
