On Thu, May 21, 2015 at 1:04 AM, Serega Sheypak <serega.shey...@gmail.com> wrote:
> > Do you have the system sharing
> There are 2 HDD 7200 2TB each. There is a 300GB OS partition on each drive
> with mirroring enabled. I can't persuade devops that mirroring could cause
> IO issues. What arguments can I bring? They use OS partition mirroring: when
> a disk fails, we can use the other partition to boot the OS and continue to work...
>

You are already compromised i/o-wise having two disks only. I don't have the
experience to say for sure, but basic physics would seem to dictate that having
your two disks (partially) mirrored compromises your i/o even more. You are in
a bit of a hard place. Your operators want the machine to boot even after it
loses 50% of its disk.

> > Do you have to compact? In other words, do you have read SLAs?
> Unfortunately, I have a mixed workload from web applications. I need to write
> and read and the SLA is < 50ms.
>

Ok. You get the bit that seeks are about 10ms each, so with two disks you can
do 2x100 seeks a second presuming no one else is using the disk.

> > How are your read times currently?
> Cloudera Manager says it's 4K reads per second and 500 writes per second
>
> > Does your working dataset fit in RAM or do reads have to go to disk?
> I have several tables of 500GB each and many small tables of 10-20GB. Small
> tables are loaded hourly/daily using bulkload (prepare HFiles using MR and
> move them to HBase using the bulk-load utility). Big tables are used by
> webapps, which read and write them.
>

These HFiles are created on the same cluster with MR? (i.e. they are using up i/os)

> > It looks like you are running at about three storefiles per column family
> is it hbase.hstore.compactionThreshold=3?
>
> > What if you upped the threshold at which minors run?
> you mean bump hbase.hstore.compactionThreshold to 8 or 10?
>

Yes. The downside is that your reads may require more seeks to find a keyvalue.
Can you cache more? Can you make it so files are bigger before you flush?
(Example knobs are sketched at the bottom of this mail.)

> > Do you have a downtime during which you could schedule compactions?
> Unfortunately no. It should work 24/7, and sometimes it doesn't.
>

So, it is running at full bore 24/7? There is no 'downtime'... a time when the
traffic is not so heavy?

> > Are you managing the major compactions yourself or are you having hbase do
> it for you?
> HBase, once a day hbase.hregion.majorcompaction=1day
>

Have you studied your compactions? You realize that a major compaction will do
a full rewrite of your dataset? When they run, how many storefiles are there?
Do you have to run once a day? Can you not run once a week? Can you manage the
compactions yourself... and run them a region at a time in a rolling manner
across the cluster rather than have them just run whenever it suits them once
a day? (A rough sketch of doing this with the admin API is at the bottom of
this mail.)

> I can disable WAL. It's ok to lose some data in case of RS failure. I'm
> not doing banking transactions.
> If I disable WAL, could it help?
>

It could, but don't. Enable deferred sync'ing first if you can 'lose' some data
(also sketched below). Work on your flushing and compactions before you mess
w/ the WAL. What version of hbase are you on? You say CDH, but the newer your
hbase, the better it does generally.

St.Ack

> 2015-05-20 18:04 GMT+03:00 Stack <st...@duboce.net>:
>
> > On Mon, May 18, 2015 at 4:26 PM, Serega Sheypak <serega.shey...@gmail.com>
> > wrote:
> >
> > > Hi, we are using extremely cheap HW:
> > > 2 HDD 7200
> > > 4*2 core (Hyperthreading)
> > > 32GB RAM
> > >
> > > We met serious IO performance issues.
> > > We have more or less even distribution of read/write requests. The same
> > > for datasize.
> > >
> > > ServerName                              Request Per Second  Read Request Count  Write Request Count
> > > node01.domain.com,60020,1430172017193   195                 171871826           16761699
> > > node02.domain.com,60020,1426925053570   24                  34314930            16006603
> > > node03.domain.com,60020,1430860939797   22                  32054801            16913299
> > > node04.domain.com,60020,1431975656065   33                  1765121             253405
> > > node05.domain.com,60020,1430484646409   27                  42248883            16406280
> > > node07.domain.com,60020,1426776403757   27                  36324492            16299432
> > > node08.domain.com,60020,1426775898757   26                  38507165            13582109
> > > node09.domain.com,60020,1430440612531   27                  34360873            15080194
> > > node11.domain.com,60020,1431989669340   28                  44307               13466
> > > node12.domain.com,60020,1431927604238   30                  5318096             2020855
> > > node13.domain.com,60020,1431372874221   29                  31764957            15843688
> > > node14.domain.com,60020,1429640630771   41                  36300097            13049801
> > >
> > > ServerName                              Num. Stores  Num. Storefiles  Storefile Size  Uncompressed Storefile Size  Index Size  Bloom Size
> > > node01.domain.com,60020,1430172017193   82           186              1052080m        76496mb                      641849k     310111k
> > > node02.domain.com,60020,1426925053570   82           179              1062730m        79713mb                      649610k     318854k
> > > node03.domain.com,60020,1430860939797   82           179              1036597m        76199mb                      627346k     307136k
> > > node04.domain.com,60020,1431975656065   82           400              1034624m        76405mb                      655954k     289316k
> > > node05.domain.com,60020,1430484646409   82           185              1111807m        81474mb                      688136k     334127k
> > > node07.domain.com,60020,1426776403757   82           164              1023217m        74830mb                      631774k     296169k
> > > node08.domain.com,60020,1426775898757   81           171              1086446m        79933mb                      681486k     312325k
> > > node09.domain.com,60020,1430440612531   81           160              1073852m        77874mb                      658924k     309734k
> > > node11.domain.com,60020,1431989669340   81           166              1006322m        75652mb                      664753k     264081k
> > > node12.domain.com,60020,1431927604238   82           188              1050229m        75140mb                      652970k     304137k
> > > node13.domain.com,60020,1431372874221   82           178              937557m         70042mb                      601684k     257607k
> > > node14.domain.com,60020,1429640630771   82           145              949090m         69749mb                      592812k     266677k
> > >
> > > When compaction starts, a random node gets to 100% I/O utilization, with io
> > > waits of seconds, even tens of seconds.
> > >
> > > What are the approaches to optimize minor and major compactions when you
> > > are I/O bound...?
> >
> > Yeah, with two disks, you will be crimped. Do you have the system sharing
> > with hbase/hdfs or is hdfs running on one disk only?
> >
> > Do you have to compact? In other words, do you have read SLAs? How are
> > your read times currently? Does your working dataset fit in RAM or do
> > reads have to go to disk? It looks like you are running at about three
> > storefiles per column family. What if you upped the threshold at which
> > minors run? Do you have a downtime during which you could schedule
> > compactions? Are you managing the major compactions yourself or are you
> > having hbase do it for you?
> >
> > St.Ack
> >
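The knobs discussed above, for reference. Property names are the standard
hbase-site.xml ones already mentioned in this thread; the values here are only
examples of the direction to move, not recommendations, so tune them for your
cluster:

  hbase.hstore.compactionThreshold=8          (up from 3: let more storefiles accumulate before a minor compaction runs)
  hbase.hregion.memstore.flush.size=268435456 (256MB: bigger flushes mean fewer, larger files; the default in this era is 128MB)
  hbase.hregion.majorcompaction=604800000     (7 days in ms instead of 1 day; 0 disables time-based majors so you can run them yourself)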
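A rough sketch of the "manage compactions yourself, a region at a time" idea.
It assumes an HBase 1.0-style Java client API (ConnectionFactory/Admin); on
0.94/0.98 the HBaseAdmin equivalents differ slightly. The table name and the
pacing interval are made up for illustration; you could poll the compaction
state or watch storefile counts instead of sleeping blind.

  import java.util.concurrent.TimeUnit;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HRegionInfo;
  import org.apache.hadoop.hbase.TableName;
  import org.apache.hadoop.hbase.client.Admin;
  import org.apache.hadoop.hbase.client.Connection;
  import org.apache.hadoop.hbase.client.ConnectionFactory;

  public class RollingMajorCompact {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      try (Connection connection = ConnectionFactory.createConnection(conf);
           Admin admin = connection.getAdmin()) {
        TableName table = TableName.valueOf("big_table");   // hypothetical table name
        for (HRegionInfo region : admin.getTableRegions(table)) {
          // Queue a major compaction for this one region only; the call is asynchronous.
          admin.majorCompactRegion(region.getRegionName());
          // Crude pacing so the whole table is not being rewritten at once.
          TimeUnit.MINUTES.sleep(10);
        }
      }
    }
  }

Run it off-peak from cron (or by hand) with hbase.hregion.majorcompaction set
to 0 so the automatic daily majors are not fighting it.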
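And a sketch of the deferred sync'ing suggestion: on 0.96+ the old
deferred-log-flush flag became a per-table Durability setting, so switching the
big tables to ASYNC_WAL keeps the WAL but syncs edits in batches instead of on
every write. Again, the table name is made up, and depending on your version
you may need the table disabled or online schema update enabled for modifyTable
to go through.

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HTableDescriptor;
  import org.apache.hadoop.hbase.TableName;
  import org.apache.hadoop.hbase.client.Admin;
  import org.apache.hadoop.hbase.client.Connection;
  import org.apache.hadoop.hbase.client.ConnectionFactory;
  import org.apache.hadoop.hbase.client.Durability;

  public class DeferredSync {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      try (Connection connection = ConnectionFactory.createConnection(conf);
           Admin admin = connection.getAdmin()) {
        TableName table = TableName.valueOf("big_table");        // hypothetical table name
        HTableDescriptor desc = admin.getTableDescriptor(table);
        // ASYNC_WAL: edits still go to the WAL but are sync'd in batches,
        // trading a small window of possible data loss for fewer disk syncs.
        desc.setDurability(Durability.ASYNC_WAL);
        admin.modifyTable(table, desc);
      }
    }
  }

The same can be done per mutation with Put#setDurability if only some writes
can tolerate the risk.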