Hi Varun,

The hard drives I used are now in the hadoop/hbase cluster, but they
were wiped and formatted for the tests I did. The computer where I ran those
tests was one of the region servers. It was re-installed to be completely
clean, and it's now running a datanode and a RS.

Regarding RAID, I think you are confusing RAID0 and RAID1. It's RAID1 which
needs to write to both disks each time. RAID0 is more like JBOD, but faster.
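
For reference, both are created the same way on Linux with mdadm; only the
level changes (device names below are just examples):

  # RAID0: striping, full capacity of both disks
  mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc

  # RAID1: mirroring, half the capacity, every write hits both disks
  mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb /dev/sdc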

JM

2012/12/20 Varun Sharma <va...@pinterest.com>

> Hi Jean,
>
> Very interesting benchmark - how were these numbers arrived at? Is this on
> a real hbase cluster? To me, it feels kind of counterintuitive that RAID0
> beats JBOD on random seeks, because with RAID0 all disks need to seek at the
> same time, so performance should basically be as bad as the slowest seeking
> disk.
>
> Varun
>
> On Wed, Dec 19, 2012 at 5:14 PM, Michael Segel <michael_se...@hotmail.com>
> wrote:
>
> > Yeah,
> > I couldn't argue against LVMs when talking with the system admins.
> > In terms of speed it's noise, because the CPUs are pretty efficient, and
> > unless you have more than 1 drive per physical core you will end up
> > saturating your disk I/O.
> >
> > In terms of MapR, you want the raw disk. (But we're talking Apache)
> >
> >
> > On Dec 19, 2012, at 4:59 PM, Jean-Marc Spaggiari <jean-m...@spaggiari.org>
> > wrote:
> >
> > > Finally, it took me a while to run those tests because they took way
> > > longer than expected, but here are the results:
> > >
> > > http://www.spaggiari.org/bonnie.html
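> > >
> > > (If you want to reproduce the runs with bonnie++, a typical invocation,
> > > with path, size and label to adapt (size ~2x your RAM so the page cache
> > > doesn't hide the disks), looks like:
> > >
> > >   bonnie++ -d /mnt/test -s 16g -m test-label
> > > )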
> > >
> > > LVM is not really slower than JBOD and does not really take more CPU. So
> > > I will say, if you have to choose between the 2, take the one you
> > > prefer. Personally, I prefer LVM because it's easy to configure.
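> > >
> > > For the curious, the whole LVM setup is just (device names are
> > > examples):
> > >
> > >   pvcreate /dev/sdb /dev/sdc
> > >   vgcreate vg_data /dev/sdb /dev/sdc
> > >   lvcreate -l 100%FREE -n lv_data vg_data
> > >   mkfs.ext4 /dev/vg_data/lv_data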
> > >
> > > The big winner here is RAID0. It's WAY faster than anything else. But
> > > it's using twice the space... Your choice.
> > >
> > > I did not get a chance to test with the Ubuntu tool because it does not
> > > work with LVM drives.
> > >
> > > JM
> > >
> > > 2012/11/28, Michael Segel <michael_se...@hotmail.com>:
> > >> Ok, just a caveat.
> > >>
> > >> I am discussing MapR as part of a complete response. As Mohit posted,
> > >> MapR takes the raw device for their MapR File System.
> > >> They do their own striping within what they call a volume.
> > >>
> > >> But going back to Apache...
> > >> You can stripe drives, however I wouldn't recommend it. I don't think
> > >> the performance gains would really matter.
> > >> You're going to end up getting blocked first by disk i/o, then your
> > >> controller card, then your network... assuming 10GbE.
> > >>
> > >> With only 2 disks on an 8 core system, you will hit disk i/o first, and
> > >> then you'll watch your CPU I/O wait climb.
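> > >>
> > >> (Easy to watch with iostat from the sysstat package, e.g.
> > >>
> > >>   iostat -x 5
> > >>
> > >> and keep an eye on %iowait and the per-disk %util columns.)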
> > >>
> > >> HTH
> > >>
> > >> -Mike
> > >>
> > >> On Nov 28, 2012, at 7:28 PM, Jean-Marc Spaggiari <jean-m...@spaggiari.org>
> > >> wrote:
> > >>
> > >>> Hi Mike,
> > >>>
> > >>> Why not use LVM with MapR? Since LVM is reading from 2 drives almost
> > >>> at the same time, it should be better than RAID0 or a single drive,
> > >>> no?
> > >>>
> > >>> 2012/11/28, Michael Segel <michael_se...@hotmail.com>:
> > >>>> Just a couple of things.
> > >>>>
> > >>>> I'm neutral on the use of LVMs. Some would point out that there's some
> > >>>> overhead, but on the flip side, it can make managing the machines
> > >>>> easier.
> > >>>> If you're using MapR, you don't want to use LVMs but raw devices.
> > >>>>
> > >>>> In terms of GC, it's going to depend on the heap size and not the
> > >>>> total memory. With respect to HBase... MSLAB is the way to go.
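> > >>>>
> > >>>> (MSLAB is just a switch in hbase-site.xml; something like
> > >>>>
> > >>>>   <property>
> > >>>>     <name>hbase.hregion.memstore.mslab.enabled</name>
> > >>>>     <value>true</value>
> > >>>>   </property>
> > >>>>
> > >>>> turns it on.)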
> > >>>>
> > >>>>
> > >>>> On Nov 28, 2012, at 12:05 PM, Jean-Marc Spaggiari
> > >>>> <jean-m...@spaggiari.org>
> > >>>> wrote:
> > >>>>
> > >>>>> Hi Gregory,
> > >>>>>
> > >>>>> I found this about LVM:
> > >>>>> -> http://blog.andrew.net.au/2006/08/09
> > >>>>> -> http://www.phoronix.com/scan.php?page=article&item=fedora_15_lvm&num=2
> > >>>>>
> > >>>>> It seems that performance is still decent with it. I will most
> > >>>>> probably give it a try and bench that too... I have one new hard
> > >>>>> drive which should arrive tomorrow. Perfect timing ;)
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> JM
> > >>>>>
> > >>>>> 2012/11/28, Mohit Anchlia <mohitanch...@gmail.com>:
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On Nov 28, 2012, at 9:07 AM, Adrien Mogenet <adrien.moge...@gmail.com>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> Does HBase really benefit from 64 GB of RAM, since allocating too
> > >>>>>>> large a heap might increase GC time?
> > >>>>>>>
> > >>>>>> The benefit you get is from the OS cache.
> > >>>>>>> Another question: why not RAID 0, in order to aggregate disk
> > >>>>>>> bandwidth? (And thus keep the 3x replication factor.)
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Wed, Nov 28, 2012 at 5:58 PM, Michael Segel
> > >>>>>>> <michael_se...@hotmail.com> wrote:
> > >>>>>>>
> > >>>>>>>> Sorry,
> > >>>>>>>>
> > >>>>>>>> I need to clarify.
> > >>>>>>>>
> > >>>>>>>> 4GB per physical core is a good starting point.
> > >>>>>>>> So with 2 quad core chips, that is going to be 32GB.
> > >>>>>>>>
> > >>>>>>>> IMHO that's a minimum. If you go with HBase, you will want more.
> > >>>>>>>> (Actually, you will need more.) The next logical jump would be to
> > >>>>>>>> 48 or 64GB.
> > >>>>>>>>
> > >>>>>>>> If we start to price out memory, depending on the vendor and your
> > >>>>>>>> company's procurement, there really isn't much of a price
> > >>>>>>>> difference between 32, 48, or 64 GB.
> > >>>>>>>> Note that it also depends on the chips themselves. Also, you need
> > >>>>>>>> to see how many memory channels exist on the motherboard. You may
> > >>>>>>>> need to buy in pairs or triplets. Your hardware vendor can help
> > >>>>>>>> you. (Also, you need to keep an eye on your hardware vendor.
> > >>>>>>>> Sometimes they will give you higher density chips that are going
> > >>>>>>>> to be more expensive...) ;-)
> > >>>>>>>>
> > >>>>>>>> I tend to like having extra memory from the start.
> > >>>>>>>> It gives you a bit more freedom and also protects you from 'fat'
> > >>>>>>>> code.
> > >>>>>>>>
> > >>>>>>>> Looking at YARN... you will need more memory too.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> With respect to the hard drives...
> > >>>>>>>>
> > >>>>>>>> The best recommendation is to keep the drives as JBOD and then use
> > >>>>>>>> 3x replication.
> > >>>>>>>> In this case, make sure that the disk controller cards can handle
> > >>>>>>>> JBOD. (Some don't support JBOD out of the box.)
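> > >>>>>>>>
> > >>>>>>>> Each JBOD mount point then gets listed in hdfs-site.xml, e.g.
> > >>>>>>>> (paths are illustrative; on Hadoop 2 the property is
> > >>>>>>>> dfs.datanode.data.dir):
> > >>>>>>>>
> > >>>>>>>>   <property>
> > >>>>>>>>     <name>dfs.data.dir</name>
> > >>>>>>>>     <value>/disk1/dfs/data,/disk2/dfs/data,/disk3/dfs/data</value>
> > >>>>>>>>   </property>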
> > >>>>>>>>
> > >>>>>>>> With respect to RAID...
> > >>>>>>>>
> > >>>>>>>> If you are running MapR, no need for RAID.
> > >>>>>>>> If you are running an Apache derivative, you could use RAID 1,
> > >>>>>>>> then cut your replication to 2x. This makes it easier to manage
> > >>>>>>>> drive failures. (It's not the norm, but it works...) In some
> > >>>>>>>> clusters, they are using appliances like NetApp's E-Series, where
> > >>>>>>>> the machines see the drives as local attached storage, and I think
> > >>>>>>>> the appliances themselves are using RAID. I haven't played with
> > >>>>>>>> this configuration, however it could make sense and it's a valid
> > >>>>>>>> design.
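> > >>>>>>>>
> > >>>>>>>> (If you go the RAID 1 route, the 2x replication is just a one-line
> > >>>>>>>> change in hdfs-site.xml, down from the default of 3:
> > >>>>>>>>
> > >>>>>>>>   <property>
> > >>>>>>>>     <name>dfs.replication</name>
> > >>>>>>>>     <value>2</value>
> > >>>>>>>>   </property>
> > >>>>>>>> )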
> > >>>>>>>>
> > >>>>>>>> HTH
> > >>>>>>>>
> > >>>>>>>> -Mike
> > >>>>>>>>
> > >>>>>>>> On Nov 28, 2012, at 10:33 AM, Jean-Marc Spaggiari
> > >>>>>>>> <jean-m...@spaggiari.org>
> > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Hi Mike,
> > >>>>>>>>>
> > >>>>>>>>> Thanks for all those details!
> > >>>>>>>>>
> > >>>>>>>>> So to simplify the equation, for 16 virtual cores we need 48 to
> > >>>>>>>>> 64GB, which means 3 to 4GB per core. So with quad cores, 12GB to
> > >>>>>>>>> 16GB is a good start? Or did I simplify it too much?
> > >>>>>>>>>
> > >>>>>>>>> Regarding the hard drives: if you add more than one drive, do you
> > >>>>>>>>> need to build them into RAID or a similar system? Or can
> > >>>>>>>>> Hadoop/HBase be configured to use more than one drive?
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>>
> > >>>>>>>>> JM
> > >>>>>>>>>
> > >>>>>>>>> 2012/11/27, Michael Segel <michael_se...@hotmail.com>:
> > >>>>>>>>>>
> > >>>>>>>>>> OK... I don't know why Cloudera is so hung up on 32GB. ;-) [It's
> > >>>>>>>>>> an inside joke...]
> > >>>>>>>>>>
> > >>>>>>>>>> So here's the problem...
> > >>>>>>>>>>
> > >>>>>>>>>> By default, your child processes in a map/reduce job get 512MB.
> > >>>>>>>>>> The majority of the time, this gets raised to 1GB.
> > >>>>>>>>>>
> > >>>>>>>>>> 8 cores (dual quad core) show up as 16 virtual processors in
> > >>>>>>>>>> Linux. (Note: this is why, when people talk about the number of
> > >>>>>>>>>> cores, you have to specify physical cores or logical cores....)
> > >>>>>>>>>>
> > >>>>>>>>>> So if you were to oversubscribe and have, let's say, 12 mappers
> > >>>>>>>>>> and 12 reducers, that's 24 slots. Which means that you would need
> > >>>>>>>>>> 24GB of memory reserved just for the child processes. This would
> > >>>>>>>>>> leave 8GB for the DN, TT and the rest of the Linux OS processes.
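> > >>>>>>>>>>
> > >>>>>>>>>> (For the record, a 12/12 split with 1GB children is roughly this
> > >>>>>>>>>> in mapred-site.xml:
> > >>>>>>>>>>
> > >>>>>>>>>>   <property>
> > >>>>>>>>>>     <name>mapred.tasktracker.map.tasks.maximum</name>
> > >>>>>>>>>>     <value>12</value>
> > >>>>>>>>>>   </property>
> > >>>>>>>>>>   <property>
> > >>>>>>>>>>     <name>mapred.tasktracker.reduce.tasks.maximum</name>
> > >>>>>>>>>>     <value>12</value>
> > >>>>>>>>>>   </property>
> > >>>>>>>>>>   <property>
> > >>>>>>>>>>     <name>mapred.child.java.opts</name>
> > >>>>>>>>>>     <value>-Xmx1024m</value>
> > >>>>>>>>>>   </property>
> > >>>>>>>>>> )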
> > >>>>>>>>>>
> > >>>>>>>>>> Can you live with that? Sure.
> > >>>>>>>>>> Now add in R, HBase, Impala, or some other set of tools on top
> > >>>>>>>>>> of the cluster.
> > >>>>>>>>>>
> > >>>>>>>>>> Ooops! Now you are in trouble because you will swap.
> > >>>>>>>>>> Also, adding in R, you may want to bump up those child procs
> > >>>>>>>>>> from 1GB to 2GB. That means the 24 slots would now require 48GB.
> > >>>>>>>>>> Now you swap, and if that happens you will see HBase in a
> > >>>>>>>>>> cascading failure.
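> > >>>>>>>>>>
> > >>>>>>>>>> (That's just mapred.child.java.opts going from -Xmx1024m to
> > >>>>>>>>>> -Xmx2048m: 24 slots x 2GB = 48GB before the DN, TT and the OS
> > >>>>>>>>>> get anything.)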
> > >>>>>>>>>>
> > >>>>>>>>>> So while you can do a rolling restart with the changed
> > >>>>>>>>>> configuration (reducing the number of mappers and reducers), you
> > >>>>>>>>>> end up with fewer slots, which means longer run times for your
> > >>>>>>>>>> jobs. (Fewer slots == less parallelism.)
> > >>>>>>>>>>
> > >>>>>>>>>> Looking at the price of memory... you can get 48GB or even 64GB
> > >>>>>>>>>> for around the same price point. (8GB chips.)
> > >>>>>>>>>>
> > >>>>>>>>>> And I didn't even talk about adding SOLR, again a memory hog...
> > >>>>>>>>>> ;-)
> > >>>>>>>>>>
> > >>>>>>>>>> Note that I matched the number of mappers with reducers. You
> > >>>>>>>>>> could go with fewer reducers if you want. I tend to recommend a
> > >>>>>>>>>> ratio of 2:1 mappers to reducers, depending on the work flow....
> > >>>>>>>>>>
> > >>>>>>>>>> As to the disks... no, 7200 RPM SATA III drives are fine. The
> > >>>>>>>>>> SATA III interface is pretty much standard in the new kit being
> > >>>>>>>>>> shipped.
> > >>>>>>>>>> It's just that you don't have enough drives. 8 cores should mean
> > >>>>>>>>>> 8 spindles, if available.
> > >>>>>>>>>> Otherwise you end up seeing your CPU load climb on wait states
> > >>>>>>>>>> as the processes wait for the disk i/o to catch up.
> > >>>>>>>>>>
> > >>>>>>>>>> I mean, you could build out a cluster with 4 x 3.5" 2TB drives
> > >>>>>>>>>> in a 1U chassis based on price. You're making a trade-off, and
> > >>>>>>>>>> you should be aware of the performance hit you will take.
> > >>>>>>>>>>
> > >>>>>>>>>> HTH
> > >>>>>>>>>>
> > >>>>>>>>>> -Mike
> > >>>>>>>>>>
> > >>>>>>>>>> On Nov 27, 2012, at 1:52 PM, Jean-Marc Spaggiari
> > >>>>>>>>>> <jean-m...@spaggiari.org> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi Michael,
> > >>>>>>>>>>>
> > >>>>>>>>>>> So are you recommending 32GB per node?
> > >>>>>>>>>>>
> > >>>>>>>>>>> What about the disks? Are SATA drives too slow?
> > >>>>>>>>>>>
> > >>>>>>>>>>> JM
> > >>>>>>>>>>>
> > >>>>>>>>>>> 2012/11/26, Michael Segel <michael_se...@hotmail.com>:
> > >>>>>>>>>>>> Uhm, those specs are actually now out of date.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> If you're running HBase, or want to also run R on top of
> > >>>>>>>>>>>> Hadoop, you will need to add more memory.
> > >>>>>>>>>>>> Also, forget 1GbE, get 10GbE; and with 2 SATA drives, you will
> > >>>>>>>>>>>> be disk i/o bound way too quickly.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Nov 26, 2012, at 8:05 AM, Marcos Ortiz <mlor...@uci.cu>
> > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Are you asking about hardware recommendations?
> > >>>>>>>>>>>>> Eric Sammer, in his "Hadoop Operations" book, did a great job
> > >>>>>>>>>>>>> on this. For mid-size clusters (up to 300 nodes):
> > >>>>>>>>>>>>> Processor: dual quad-core 2.6 GHz
> > >>>>>>>>>>>>> RAM: 24 GB DDR3
> > >>>>>>>>>>>>> Dual 1 Gb Ethernet NICs
> > >>>>>>>>>>>>> A SAS drive controller
> > >>>>>>>>>>>>> At least two SATA II drives in a JBOD configuration
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> The replication factor depends heavily on the primary use of
> > >>>>>>>>>>>>> your cluster.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On 11/26/2012 08:53 AM, David Charle wrote:
> > >>>>>>>>>>>>>> Hi,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> What's the recommended node setup for the NN, HMaster and ZK
> > >>>>>>>>>>>>>> for a larger cluster, let's say 50-100+ nodes?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Also, what would be the ideal replication factor for larger
> > >>>>>>>>>>>>>> clusters when you have 3-4 racks?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>> David
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> --
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Marcos Luis Ortíz Valmaseda
> > >>>>>>>>>>>>> about.me/marcosortiz <http://about.me/marcosortiz>
> > >>>>>>>>>>>>> @marcosluis2186 <http://twitter.com/marcosluis2186>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> --
> > >>>>>>> Adrien Mogenet
> > >>>>>>> 06.59.16.64.22
> > >>>>>>> http://www.mogenet.me
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>
> > >>
> > >>
> > >
> >
> >
>
