> I'm using
> http://hbase.apache.org/docs/r0.20.6/api/org/apache/hadoop/hbase/mapreduce/IdentityTableReducer.html

Did you set it up with TableMapReduceUtil?

> Not explicitly set by me

If you use TableMapReduceUtil, then it's set to 2MB by default, but
looking at the RS logs the write buffer is probably not the problem.

> 1 family

Good

> LZO

Excellent

> Indeed:
>
> memstore size 138.7m is >= than blocking 128.0m size
> 2010-11-24 17:12:49,136 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Blocking updates for 'IPC Server handler 4 on 60020' on region
> raw_occurrence_record,,1290613896288.841ac149ecacf4b721ac232960e98761.:
> memstore size 138.7m is >= than blocking 128.0m size
> 2010-11-24 17:12:49,155 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Blocking updates for 'IPC Server handler 10 on 60020' on region
> raw_occurrence_record,,1290613896288.841ac149ecacf4b721ac232960e98761.:
> memstore size 146.3m is >= than blocking 128.0m size
> 2010-11-24 17:12:49,169 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Blocking updates for 'IPC Server handler 5 on 60020' on region
> raw_occurrence_record,,1290613896288.841ac149ecacf4b721ac232960e98761.:
> memstore size 148.8m is >= than blocking 128.0m size
> 2010-11-24 17:12:49,193 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Blocking updates for 'IPC Server handler 8 on 60020' on region
>
> I guess this is bad, but could benefit from some guidance...

How many regions do you have in your table? If you started with only 1
region (e.g. a new table), then all the load will go to that single
region. It's a good idea to create your tables pre-split if you're
planning to do a massive upload into them. See this method and the
others like it:
http://hbase.apache.org/docs/r0.89.20100924/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#createTable(org.apache.hadoop.hbase.HTableDescriptor, byte[][])

To find how many regions you have in "raw_occurrence_record", go to the
master web UI and click on the table's name in the tables list.
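
To make the pre-split suggestion concrete, here is a minimal sketch of
building the byte[][] of split keys that createTable takes. It assumes
row keys are spread evenly over the first byte (for example hashed or
salted keys), which may well not hold for raw_occurrence_record. The
class and method names (PreSplitSketch, evenSplits) are made up for
illustration, and the HBase call itself only appears in a comment so
the snippet runs without the HBase jars.

```java
public class PreSplitSketch {

    /**
     * Returns numRegions - 1 one-byte split keys that cut the first-byte
     * range [0x00, 0xFF] into roughly equal slices.
     * Assumption: row keys are uniformly distributed on their first byte.
     */
    public static byte[][] evenSplits(int numRegions) {
        if (numRegions < 2 || numRegions > 256) {
            throw new IllegalArgumentException("numRegions must be in [2, 256]");
        }
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            // Boundary between slice i-1 and slice i, as an unsigned byte value.
            int boundary = (i * 256) / numRegions;
            splits[i - 1] = new byte[] { (byte) boundary };
        }
        return splits;
    }

    public static void main(String[] args) {
        // Print the split keys for a table that should start with 4 regions.
        for (byte[] key : evenSplits(4)) {
            System.out.printf("0x%02X%n", key[0] & 0xFF);
        }
        // With the HBase client on the classpath, this array would be passed as
        // the second argument of HBaseAdmin.createTable(HTableDescriptor, byte[][]).
    }
}
```

If your keys are not evenly distributed, you would pick the split
points from a sample of real row keys instead; the point is only that
the table starts with more than one region to take the writes.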

Finally, you might want to do a bulk load instead, see
http://hbase.apache.org/docs/r0.89.20100924/bulk-loads.html

> What's the best way to do this please and I will?

Open conf/hbase-env.sh and go to:

# Uncomment below to enable java garbage collection logging.
# export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:$HBASE_HOME/logs/gc-hbase.log"

J-D