In my experience, RAID 10 doesn't really provide value in the presence of
replication. When a disk fails, the RAID resync process is so I/O intensive
that it renders the broker useless until it completes. When this happens,
you have to take the broker out of rotation and move the leaders off it to
keep it from serving requests in a degraded state. At that point you might
as well shut down the broker, delete its data, and let it catch up from the
leader.
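For what it's worth, here is a minimal sketch of how one might list the
partitions a given broker currently leads before pulling it out of rotation.
It assumes a recent (3.x) Java AdminClient; the bootstrap address and the
broker id passed on the command line are placeholders, and the actual leader
movement would still be done with whatever reassignment tooling you normally
use:

    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.TopicDescription;
    import org.apache.kafka.common.TopicPartitionInfo;

    import java.util.Properties;
    import java.util.Set;

    public class LeadersOnBroker {
        public static void main(String[] args) throws Exception {
            // Broker being taken out of rotation (hypothetical id from the command line)
            int brokerId = Integer.parseInt(args[0]);

            Properties props = new Properties();
            // Assumed bootstrap address; replace with your cluster's
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            try (Admin admin = Admin.create(props)) {
                // Describe every topic and print the partitions led by the given broker
                Set<String> topics = admin.listTopics().names().get();
                for (TopicDescription td :
                        admin.describeTopics(topics).allTopicNames().get().values()) {
                    for (TopicPartitionInfo p : td.partitions()) {
                        if (p.leader() != null && p.leader().id() == brokerId) {
                            System.out.printf("%s-%d is led by broker %d%n",
                                    td.name(), p.partition(), brokerId);
                        }
                    }
                }
            }
        }
    }

This only reports what would have to move when the broker is drained; it
doesn't change any assignments itself.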

On Wed, Oct 22, 2014 at 11:20 AM, Gwen Shapira <gshap...@cloudera.com>
wrote:

> Makes sense. Thanks :)
>
> On Wed, Oct 22, 2014 at 11:10 AM, Jonathan Weeks
> <jonathanbwe...@gmail.com> wrote:
> > There are various costs when a broker fails, including leader election
> > for each of its partitions, possible issues for in-flight messages,
> > client rebalancing, and so on.
> >
> > So even though replication provides partition redundancy, RAID 10 on
> > each broker is usually a good tradeoff as well: it protects against the
> > most common cause of broker failure (disk failure) and makes for
> > smoother operation overall.
> >
> > Best Regards,
> >
> > -Jonathan
> >
> >
> > On Oct 22, 2014, at 11:01 AM, Gwen Shapira <gshap...@cloudera.com> wrote:
> >
> >> RAID-10?
> >> Interesting choice for a system where the data is already replicated
> >> between nodes. Is it to avoid the cost of re-replicating that much data
> >> over the network? How large are these disks?
> >>
> >> On Wed, Oct 22, 2014 at 10:00 AM, Todd Palino <tpal...@gmail.com> wrote:
> >>> In fact there are many more than 4000 open files. Many of our brokers
> >>> run with 28,000+ open files (regular file handles, not network
> >>> connections). In our case, we're beefing up the disk performance as
> >>> much as we can by running in a RAID-10 configuration with 14 disks.
> >>>
> >>> -Todd
> >>>
> >>> On Tue, Oct 21, 2014 at 7:58 PM, Xiaobin She <xiaobin...@gmail.com> wrote:
> >>>
> >>>> Todd,
> >>>>
> >>>> Actually I'm wondering how Kafka handles so many partitions. With one
> >>>> partition there is at least one file on disk, so with 4000 partitions
> >>>> there will be at least 4000 files.
> >>>>
> >>>> When all these partitions receive write requests, how does Kafka keep
> >>>> the disk writes sequential (which is emphasized in Kafka's design
> >>>> document) and make sure the disk access stays efficient?
> >>>>
> >>>> Thank you for your reply.
> >>>>
> >>>> xiaobinshe
> >>>>
> >>>>
> >>>>
> >>>> 2014-10-22 5:10 GMT+08:00 Todd Palino <tpal...@gmail.com>:
> >>>>
> >>>>> As far as the number of partitions a single broker can handle, we've
> >>>>> set our cap at 4000 partitions (including replicas). Above that we've
> >>>>> seen some performance and stability issues.
> >>>>>
> >>>>> -Todd
> >>>>>
> >>>>> On Tue, Oct 21, 2014 at 12:15 AM, Xiaobin She <xiaobin...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Hello, everyone,
> >>>>>>
> >>>>>> I'm new to Kafka, and I'm wondering: what is the maximum number of
> >>>>>> partitions a single machine can handle in Kafka?
> >>>>>>
> >>>>>> Is there a suggested number?
> >>>>>>
> >>>>>> Thanks.
> >>>>>>
> >>>>>> xiaobinshe
> >>>>>>
> >>>>>
> >>>>
> >
>
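One back-of-the-envelope way to connect the numbers quoted above (the
per-partition segment counts here are my own assumption, purely for
illustration): each partition is a directory of log segments, and the broker
holds a file handle to every segment's log and index file. At roughly 3-4
retained segments per partition, 4000 partitions x ~3.5 segments x 2 files
works out to about 28,000 handles, which lines up with the 28,000+ open
files Todd mentions. Writes remain sequential per partition because each
partition only appends to its active segment; across partitions, the OS page
cache and batched flushes are what keep the aggregate disk access efficient.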
