Hi Varun,

The hard drives I used are now part of the Hadoop/HBase cluster, but they were wiped and freshly formatted for the tests I ran. The machine where I ran those tests was one of the region servers. It was re-installed from scratch, and it's now running a DataNode and a RegionServer.
Regarding RAID, I think you are confusing RAID0 and RAID1. It's RAID1 that needs to access both disks each time. RAID0 is more like JBOD, but faster.

JM

2012/12/20 Varun Sharma <va...@pinterest.com>
> Hi Jean,
>
> Very interesting benchmark - how are these numbers arrived at? Is this on a real HBase cluster? To me, it felt kind of counter-intuitive that RAID0 beats JBOD on random seeks, because with RAID0 all disks need to seek at the same time, so the performance should basically be as bad as the slowest-seeking disk.
>
> Varun
>
> On Wed, Dec 19, 2012 at 5:14 PM, Michael Segel <michael_se...@hotmail.com> wrote:
>
> > Yeah,
> > I couldn't argue against LVMs when talking with the system admins.
> > In terms of speed it's noise, because the CPUs are pretty efficient, and unless you have more than 1 drive per physical core, you will end up saturating your disk I/O.
> >
> > In terms of MapR, you want the raw disk. (But we're talking Apache.)
> >
> > On Dec 19, 2012, at 4:59 PM, Jean-Marc Spaggiari <jean-m...@spaggiari.org> wrote:
> >
> > > Finally, it took me a while to run those tests because it took way longer than expected, but here are the results:
> > >
> > > http://www.spaggiari.org/bonnie.html
> > >
> > > LVM is not really slower than JBOD and does not really take more CPU. So I will say: if you have to choose between the 2, take the one you prefer. Personally, I prefer LVM because it's easy to configure.
> > >
> > > The big winner here is RAID0. It's WAY faster than anything else. But it's using twice the space... Your choice.
> > >
> > > I did not get a chance to test with the Ubuntu tool because it does not work with LVM drives.
> > >
> > > JM
> > >
> > > 2012/11/28, Michael Segel <michael_se...@hotmail.com>:
> > >> Ok, just a caveat.
> > >>
> > >> I am discussing MapR as part of a complete response. As Mohit posted, MapR takes the raw device for their MapR File System.
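[Editor's note: a toy Python model of the RAID0 vs RAID1 distinction discussed above. This is purely illustrative, not a benchmark: RAID0 stripes, so a given block write touches one disk; RAID1 mirrors, so every write must hit all disks in the mirror set. The function names are hypothetical.]

```python
# Toy model: which disks does a single block write touch? (Illustrative only.)

def raid0_disks_for_write(block_index, num_disks=2):
    """RAID0 stripes blocks round-robin: each write lands on exactly one disk."""
    return [block_index % num_disks]

def raid1_disks_for_write(block_index, num_disks=2):
    """RAID1 mirrors: every write must go to all disks in the mirror set."""
    return list(range(num_disks))

if __name__ == "__main__":
    for i in range(4):
        print(i, "RAID0 ->", raid0_disks_for_write(i),
                 "RAID1 ->", raid1_disks_for_write(i))
```

Under this model, RAID0 aggregates bandwidth across disks while RAID1 pays the double-write cost on every block, which is the distinction JM is drawing.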
> > >> They do stripe on their own within what they call a volume.
> > >>
> > >> But going back to Apache...
> > >> You can stripe drives, however I wouldn't recommend it. I don't think the performance gains would really matter.
> > >> You're going to end up getting blocked first by disk I/O, then your controller card, then your network... assuming 10GbE.
> > >>
> > >> With only 2 disks on an 8-core system, you will hit disk I/O first, and then you'll watch your CPU I/O wait climb.
> > >>
> > >> HTH
> > >>
> > >> -Mike
> > >>
> > >> On Nov 28, 2012, at 7:28 PM, Jean-Marc Spaggiari <jean-m...@spaggiari.org> wrote:
> > >>
> > >>> Hi Mike,
> > >>>
> > >>> Why not use LVM with MapR? Since LVM is reading from 2 drives almost at the same time, it should be better than RAID0 or a single drive, no?
> > >>>
> > >>> 2012/11/28, Michael Segel <michael_se...@hotmail.com>:
> > >>>> Just a couple of things.
> > >>>>
> > >>>> I'm neutral on the use of LVMs. Some would point out that there's some overhead, but on the flip side, it can make managing the machines easier. If you're using MapR, you don't want to use LVMs but raw devices.
> > >>>>
> > >>>> In terms of GC, it's going to depend on the heap size and not the total memory. With respect to HBase... MSLAB is the way to go.
> > >>>>
> > >>>> On Nov 28, 2012, at 12:05 PM, Jean-Marc Spaggiari <jean-m...@spaggiari.org> wrote:
> > >>>>
> > >>>>> Hi Gregory,
> > >>>>>
> > >>>>> I found this about LVM:
> > >>>>> -> http://blog.andrew.net.au/2006/08/09
> > >>>>> -> http://www.phoronix.com/scan.php?page=article&item=fedora_15_lvm&num=2
> > >>>>>
> > >>>>> It seems that performance is still correct with it. I will most probably give it a try and bench that too... I have one new hard drive which should arrive tomorrow.
> > >>>>> Perfect timing ;)
> > >>>>>
> > >>>>> JM
> > >>>>>
> > >>>>> 2012/11/28, Mohit Anchlia <mohitanch...@gmail.com>:
> > >>>>>>
> > >>>>>> On Nov 28, 2012, at 9:07 AM, Adrien Mogenet <adrien.moge...@gmail.com> wrote:
> > >>>>>>
> > >>>>>>> Does HBase really benefit from 64 GB of RAM, since allocating too large a heap might increase GC time?
> > >>>>>>>
> > >>>>>> The benefit you get is from the OS cache.
> > >>>>>>
> > >>>>>>> Another question: why not RAID 0, in order to aggregate disk bandwidth? (and thus keep the 3x replication factor)
> > >>>>>>>
> > >>>>>>> On Wed, Nov 28, 2012 at 5:58 PM, Michael Segel <michael_se...@hotmail.com> wrote:
> > >>>>>>>
> > >>>>>>>> Sorry,
> > >>>>>>>>
> > >>>>>>>> I need to clarify.
> > >>>>>>>>
> > >>>>>>>> 4GB per physical core is a good starting point. So with 2 quad-core chips, that is going to be 32GB.
> > >>>>>>>>
> > >>>>>>>> IMHO that's a minimum. If you go with HBase, you will want more. (Actually you will need more.) The next logical jump would be to 48 or 64GB.
> > >>>>>>>>
> > >>>>>>>> If we start to price out memory, depending on the vendor and your company's procurement, there really isn't much of a price difference between 32, 48, or 64 GB. Note that it also depends on the chips themselves. Also, you need to see how many memory channels exist on the motherboard. You may need to buy in pairs or triplets. Your hardware vendor can help you. (Also, you need to keep an eye on your hardware vendor. Sometimes they will give you higher-density chips that are going to be more expensive...)
> > >>>>>>>> ;-)
> > >>>>>>>>
> > >>>>>>>> I tend to like having extra memory from the start. It gives you a bit more freedom and also protects you from 'fat' code.
> > >>>>>>>>
> > >>>>>>>> Looking at YARN... you will need more memory too.
> > >>>>>>>>
> > >>>>>>>> With respect to the hard drives...
> > >>>>>>>>
> > >>>>>>>> The best recommendation is to keep the drives as JBOD and then use 3x replication. In this case, make sure that the disk controller cards can handle JBOD. (Some don't support JBOD out of the box.)
> > >>>>>>>>
> > >>>>>>>> With respect to RAID...
> > >>>>>>>>
> > >>>>>>>> If you are running MapR, there is no need for RAID. If you are running an Apache derivative, you could use RAID 1, then cut your replication to 2x. This makes it easier to manage drive failures. (It's not the norm, but it works...) In some clusters, they are using appliances like NetApp's E-Series, where the machines see the drives as local attached storage, and I think the appliances themselves are using RAID. I haven't played with this configuration, however it could make sense and it's a valid design.
> > >>>>>>>>
> > >>>>>>>> HTH
> > >>>>>>>>
> > >>>>>>>> -Mike
> > >>>>>>>>
> > >>>>>>>> On Nov 28, 2012, at 10:33 AM, Jean-Marc Spaggiari <jean-m...@spaggiari.org> wrote:
> > >>>>>>>>
> > >>>>>>>>> Hi Mike,
> > >>>>>>>>>
> > >>>>>>>>> Thanks for all those details!
> > >>>>>>>>>
> > >>>>>>>>> So to simplify the equation, for 16 virtual cores we need 48 to 64GB, which means 3 to 4GB per core. So with quad cores, 12GB to 16GB are a good start? Or did I simplify it too much?
> > >>>>>>>>>
> > >>>>>>>>> Regarding the hard drives.
> > >>>>>>>>> If you add more than one drive, do you need to build them into RAID or similar systems? Or can Hadoop/HBase be configured to use more than one drive?
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>>
> > >>>>>>>>> JM
> > >>>>>>>>>
> > >>>>>>>>> 2012/11/27, Michael Segel <michael_se...@hotmail.com>:
> > >>>>>>>>>>
> > >>>>>>>>>> OK... I don't know why Cloudera is so hung up on 32GB. ;-) [It's an inside joke...]
> > >>>>>>>>>>
> > >>>>>>>>>> So here's the problem...
> > >>>>>>>>>>
> > >>>>>>>>>> By default, your child processes in a map/reduce job get 512MB. The majority of the time, this gets raised to 1GB.
> > >>>>>>>>>>
> > >>>>>>>>>> 8 cores (dual quad cores) show up as 16 virtual processors in Linux. (Note: this is why, when people talk about the number of cores, you have to specify physical cores or logical cores...)
> > >>>>>>>>>>
> > >>>>>>>>>> So if you were to oversubscribe and have, let's say, 12 mappers and 12 reducers, that's 24 slots, which means that you would need 24GB of memory reserved just for the child processes. This would leave 8GB for the DN, TT and the rest of the Linux OS processes.
> > >>>>>>>>>>
> > >>>>>>>>>> Can you live with that? Sure. Now add in R, HBase, Impala, or some other set of tools on top of the cluster.
> > >>>>>>>>>>
> > >>>>>>>>>> Oops! Now you are in trouble because you will swap. Also, adding in R, you may want to bump up those child procs from 1GB to 2GB. That means the 24 slots would now require 48GB.
> > >>>>>>>>>> Now you have swap, and if that happens you will see HBase in a cascading failure.
> > >>>>>>>>>>
> > >>>>>>>>>> So while you can do a rolling restart with the changed configuration (reducing the number of mappers and reducers), you end up with fewer slots, which will mean longer run times for your jobs. (Fewer slots == less parallelism.)
> > >>>>>>>>>>
> > >>>>>>>>>> Looking at the price of memory... you can get 48GB or even 64GB for around the same price point. (8GB chips)
> > >>>>>>>>>>
> > >>>>>>>>>> And I didn't even talk about adding SOLR, again a memory hog... ;-)
> > >>>>>>>>>>
> > >>>>>>>>>> Note that I matched the number of mappers with reducers. You could go with fewer reducers if you want. I tend to recommend a ratio of 2:1 mappers to reducers, depending on the workflow...
> > >>>>>>>>>>
> > >>>>>>>>>> As to the disks... no, 7200 RPM SATA III drives are fine. The SATA III interface is pretty much available in the new kit being shipped. It's just that you don't have enough drives. 8 cores should mean 8 spindles, if available. Otherwise you end up seeing your CPU load climb on wait states as the processes wait for the disk I/O to catch up.
> > >>>>>>>>>>
> > >>>>>>>>>> I mean, you could build out a cluster with 4 x 3.5" 2TB drives in a 1U chassis based on price. You're making a trade-off, and you should be aware of the performance hit you will take.
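[Editor's note: Michael's slot arithmetic can be sketched as a quick back-of-the-envelope helper. The 12+12 slots, the 1GB and 2GB child heaps, and the 32GB node are the figures from the thread; the function name is hypothetical.]

```python
def slot_memory_gb(mappers, reducers, child_heap_gb):
    """Memory reserved for map/reduce child JVMs alone (slots * heap)."""
    return (mappers + reducers) * child_heap_gb

node_ram_gb = 32  # the 32GB node discussed in the thread
for heap in (1, 2):
    need = slot_memory_gb(12, 12, heap)        # 24GB at 1GB heap, 48GB at 2GB
    headroom = node_ram_gb - need              # 8GB left, then -16GB (swap!)
    print(f"{heap}GB child heap: {need}GB for slots, "
          f"{headroom}GB left for DN/TT/OS")
```

Running it reproduces the thread's numbers: at 1GB child heaps the 24 slots consume 24GB, leaving 8GB for the DataNode, TaskTracker and OS; at 2GB heaps the same 24 slots would need 48GB, so a 32GB node swaps.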
> > >>>>>>>>>>
> > >>>>>>>>>> HTH
> > >>>>>>>>>>
> > >>>>>>>>>> -Mike
> > >>>>>>>>>>
> > >>>>>>>>>> On Nov 27, 2012, at 1:52 PM, Jean-Marc Spaggiari <jean-m...@spaggiari.org> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi Michael,
> > >>>>>>>>>>>
> > >>>>>>>>>>> So are you recommending 32GB per node?
> > >>>>>>>>>>>
> > >>>>>>>>>>> What about the disks? Are SATA drives too slow?
> > >>>>>>>>>>>
> > >>>>>>>>>>> JM
> > >>>>>>>>>>>
> > >>>>>>>>>>> 2012/11/26, Michael Segel <michael_se...@hotmail.com>:
> > >>>>>>>>>>>> Uhm, those specs are actually now out of date.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> If you're running HBase, or want to also run R on top of Hadoop, you will need to add more memory. Also, forget 1GbE, go 10GbE; and with 2 SATA drives, you will be disk I/O bound way too quickly.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Nov 26, 2012, at 8:05 AM, Marcos Ortiz <mlor...@uci.cu> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Are you asking about hardware recommendations? Eric Sammer, in his "Hadoop Operations" book, did a great job on this. For mid-size clusters (up to 300 nodes):
> > >>>>>>>>>>>>> Processor: a dual quad-core 2.6 GHz
> > >>>>>>>>>>>>> RAM: 24 GB DDR3
> > >>>>>>>>>>>>> Dual 1 Gb Ethernet NICs
> > >>>>>>>>>>>>> A SAS drive controller
> > >>>>>>>>>>>>> At least two SATA II drives in a JBOD configuration
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> The replication factor depends heavily on the primary use of your cluster.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On 11/26/2012 08:53 AM, David Charle wrote:
> > >>>>>>>>>>>>>> Hi,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> What are the recommended node counts for the NN, HMaster and ZK for a larger cluster, let's say 50-100+ nodes?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Also, what would be the ideal replication factor for larger clusters when you have 3-4 racks?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>> David
> > >>>>>>>>>>>>>> 10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS INFORMATICAS... CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> http://www.uci.cu
> > >>>>>>>>>>>>>> http://www.facebook.com/universidad.uci
> > >>>>>>>>>>>>>> http://www.flickr.com/photos/universidad_uci
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> --
> > >>>>>>>>>>>>> Marcos Luis Ortíz Valmaseda
> > >>>>>>>>>>>>> about.me/marcosortiz <http://about.me/marcosortiz>
> > >>>>>>>>>>>>> @marcosluis2186 <http://twitter.com/marcosluis2186>
> > >>>>>>>
> > >>>>>>> --
> > >>>>>>> Adrien Mogenet
> > >>>>>>> 06.59.16.64.22
> > >>>>>>> http://www.mogenet.me
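[Editor's note: as a footnote to the replication-factor question, the trade-off discussed in the thread (JBOD with 3x HDFS replication vs RAID1 with 2x replication) can be compared with a quick capacity calculation. This is a sketch; the 12TB raw figure per node is just an example, not from the thread.]

```python
def usable_tb(raw_tb, hdfs_replication, raid_mirror_factor=1):
    """Usable logical capacity after RAID mirroring and HDFS replication.

    raid_mirror_factor is 1 for JBOD/RAID0 and 2 for a RAID1 mirror.
    """
    return raw_tb / (hdfs_replication * raid_mirror_factor)

raw = 12  # e.g. 6 x 2TB drives per node (hypothetical example)
print("JBOD  + 3x replication:", usable_tb(raw, 3), "TB usable")
print("RAID1 + 2x replication:", usable_tb(raw, 2, raid_mirror_factor=2), "TB usable")
```

Under these assumptions, JBOD with 3x replication yields 4TB usable per 12TB raw, while RAID1 with 2x replication yields only 3TB, since mirroring and HDFS replication multiply: the RAID1 scheme trades capacity for easier drive-failure handling, as Michael notes.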