Hi Jack,

It's interesting that the META server is getting hit hard for an MR job that only inputs from HBase. Each task should only be accessing one region with a single scan, so each task should only hit META once unless there are RS failures, etc.
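
To make that expectation concrete, here is a rough toy model of the reasoning above. It is not HBase client code: `MetaServer`, `TaskClient`, and `run_job` are hypothetical names invented for illustration, and the model assumes each task keeps its own region-location cache that is only dropped when a region server failure forces a re-lookup.

```python
from dataclasses import dataclass, field


@dataclass
class MetaServer:
    """Stand-in for the server hosting the catalog table; it only counts lookups."""
    lookups: int = 0

    def locate(self, region: str) -> str:
        self.lookups += 1
        return f"rs-for-{region}"  # hypothetical server name


@dataclass
class TaskClient:
    """One map task's client, with its own region-location cache (an assumption)."""
    meta: MetaServer
    cache: dict = field(default_factory=dict)

    def scan(self, region: str, rows: int) -> None:
        for _ in range(rows):
            # Only the first row of the scan misses the cache and asks META.
            if region not in self.cache:
                self.cache[region] = self.meta.locate(region)

    def invalidate(self, region: str) -> None:
        # Models a region server failure: the cached location is thrown away.
        self.cache.pop(region, None)


def run_job(regions, rows_per_region, failures=0):
    """Run one task per region and return the total number of META lookups."""
    meta = MetaServer()
    for i, region in enumerate(regions):
        client = TaskClient(meta)
        client.scan(region, rows_per_region)
        if i < failures:
            # The first `failures` tasks lose their location and must retry.
            client.invalidate(region)
            client.scan(region, 1)
    return meta.lookups


if __name__ == "__main__":
    print(run_job(["r1", "r2", "r3"], rows_per_region=1000))              # 3
    print(run_job(["r1", "r2", "r3"], rows_per_region=1000, failures=1))  # 4
```

In this model the lookup count scales with the number of tasks (plus retries), not with the number of rows scanned, which is why a heavily loaded META server is surprising for a scan-only job.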
Am I misunderstanding the way your MR job works?

Thanks
-Todd

On Thu, Dec 23, 2010 at 4:29 PM, Jack Levin <[email protected]> wrote:
> We are doing a map-reduce run to extract all keys from 500 tables with
> a few million records each. What we noticed is that the region server
> that holds .META. seems to be somewhat CPU bound. Question: is it
> possible to pin, or create affinity for, region servers that are beefy
> enough to hold .META.? E.g. one nice thing would be to have it in the
> .xml config:
> <RegionServerMetaAffinity>host,host,host, n, ..</RegionServerMetaAffinity>
> Here is what we see:
>
> top - 16:22:44 up 78 days, 22:42, 1 user, load average: 5.64, 6.73, 4.75
> Tasks: 76 total, 2 running, 74 sleeping, 0 stopped, 0 zombie
> Cpu0 : 54.5%us, 25.6%sy, 0.0%ni, 5.3%id, 7.0%wa, 1.7%hi, 6.0%si, 0.0%st
> Cpu1 : 56.6%us, 18.9%sy, 0.0%ni, 6.0%id, 4.3%wa, 2.0%hi, 12.3%si, 0.0%st
> Mem: 8196168k total, 8145884k used, 50284k free, 13792k buffers
> Swap: 0k total, 0k used, 0k free, 2493392k cached
>
>   PID USER   PR NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
> 24456 root   20  0 5593m 5.1g  10m S 117.4 64.7  14995:53 /usr/java/jdk1.6.0_12/bin/java -Xmx5000m -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -verbose:gc -XX:+PrintGCDe
> 25987 hadoop 20  0 2676m 144m 9608 S  54.2  1.8   5516:36 /usr/java/jdk1.6.0_12/bin/java -Xmx2048m -server -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhad
>   242 root   15 -5     0    0    0 S   2.0  0.0 423:52.39 [kswapd0]
>
> That server is dual-core, and great for normal region serving, but
> meta interaction implies a lot of fast transactions, which would be
> better served on an 8-core box.
>
> -Jack
>

-- 
Todd Lipcon
Software Engineer, Cloudera
