You have this line:

2010-01-08 21:25:24,709 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 66413ms, ten times longer than scheduled: 3000

That's a garbage collector pause that lasted more than a minute, which is longer than the default timeout after which a region server is considered dead (40 seconds in 0.20 unless you are using 0.20.3RC1). The master replayed the write-ahead logs and reopened the regions elsewhere.

You want to set a larger heap in conf/hbase-env.sh because the default of 1GB is way too low; give it as much as you can without swapping.

J-D

On Sat, Jan 9, 2010 at 4:06 AM, Dmitriy Lyfar <[email protected]> wrote:
> Hello,
>
> 2010/1/5 Jean-Daniel Cryans <[email protected]>
>
>> WRT your last 2 emails, HBase ships with defaults that are working
>> safely for most of the users and in no way tuned for one time upload.
>> Playing with the memstore size like you did makes sense.
>>
>> Now you said you were inserting with row key being reversed ts... are
>> all threads using the same key space when uploading? I ask this
>> because if all 60 threads are hitting almost always the same region
>> (different one in time), then all 60 threads are just filling up
>> really fast the same memstore, then all wait for the snapshot,
>> eventually all wait for the same region split and in the mean time
>> fills the same WAL which will probably be rolled some times. Is it the
>> case?
>>
>> You could also post a region server log for us to analyze.
>>
>
> Now I'm using random int keys to distribute loading between regionservers.
> Now I not use threaded client, but multiprocessed one. And timings still
> almost same (sometimes random keys are faster).
> I left cluster for night stress testing. I've ran several clients, each of
> them inserts 100K of 25Kb records. I noticed that one of my regionservers
> were closed. I've analyzed logs and seems there were timeout with zookeeper
> service which caused closing of regionserver.
> Cluster continued its work, but test's timings were increased. I have few
> questions.
> Should I shutdown all cluster in such case to return closed regionserver to
> work?
> What master will do in such cases? Will it reassign regions to another
> servers? How it impacts on read/write performance?
> Logs of this regionserver is here: http://pastebin.com/m1c25e2ae
>
> --
> Thank you, Lyfar Dmitriy
>
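
For reference, here is a minimal sketch of the heap change suggested in the reply above. HBASE_HEAPSIZE is the heap setting in conf/hbase-env.sh and is given in MB. The value 4000 is only an assumption for illustration; pick a number that fits the RAM on your nodes and leaves room for the other daemons running there.

```shell
# conf/hbase-env.sh
# Maximum heap for the HBase daemons, in MB (the shipped default is 1000).
# 4000 is an illustrative value, not a recommendation: size it to the
# node's physical memory so that the JVM never has to swap.
export HBASE_HEAPSIZE=4000
```

The file is read when the daemons start, so the region servers need a restart before the new heap size applies.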
