With only 1GB of memory, you definitely want to reduce the heap size given to the JVMs running the regionservers and datanodes. You can change this for HBase in conf/hbase-env.sh: export HBASE_HEAPSIZE=SIZE_IN_MB. I'd configure everything so that you won't run out of memory even when every Java process is maxed out. Also, 1GB is on the low end when you want to run both a regionserver and a DataNode on one machine; at least 2GB is preferred.
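As a rough sketch of that budgeting, something like the following could be used to sanity-check the numbers before putting them in the env files. The specific sizes and the OS reserve are illustrative assumptions, not tuned recommendations; HADOOP_HEAPSIZE is the equivalent setting in conf/hadoop-env.sh, and both values are in MB.

```shell
#!/bin/sh
# Hypothetical memory budget for a 1GB node running a DataNode and a regionserver.
# All sizes below are assumed values for illustration only.
TOTAL_RAM_MB=1024
OS_RESERVE_MB=256      # assumed headroom for the OS and file cache
HADOOP_HEAPSIZE=256    # would be set as "export HADOOP_HEAPSIZE=256" in conf/hadoop-env.sh
HBASE_HEAPSIZE=384     # would be set as "export HBASE_HEAPSIZE=384" in conf/hbase-env.sh

# Sum what is committed if every Java process fills its heap.
# Note: a JVM uses some memory beyond its heap, so leave extra slack.
COMMITTED_MB=$((OS_RESERVE_MB + HADOOP_HEAPSIZE + HBASE_HEAPSIZE))

if [ "$COMMITTED_MB" -le "$TOTAL_RAM_MB" ]; then
  echo "fits: ${COMMITTED_MB} MB of ${TOTAL_RAM_MB} MB"
else
  echo "overcommitted: ${COMMITTED_MB} MB of ${TOTAL_RAM_MB} MB"
fi
```

On a node that also runs the master and namenode, those processes would need to be added to the sum as well.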
Can you confirm whether the machine is swapping during this time? Java/HBase will try to make use of all the heap it has been given, so it seems likely that swapping is the issue.

What are the other hardware specs on this machine? Number of cores, for example?

Also, I'd be very careful about leaving only 1GB of disk space free on a machine. Are you still writing to this instance when that is all the free space it has?

JG

-----Original Message-----
From: Jean-Adrien [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 16, 2008 7:33 AM
To: [email protected]
Subject: Regionserver sleeps too much

Hello again.

My second question concerns one of my region servers (often the same one), which shuts down often because it misses the window for sending heartbeats to the master. Maybe it is overloaded, but it misses the window for about 6 min. I switched the log file to debug mode, but I haven't found anything more interesting. The last action is a compaction, but it ends normally. Maybe it is followed by a heavy Hadoop task? Or maybe it is linked to the fact that there is only 1GB of disk space free? That is the only difference I notice between this node and the others.

Note that its hostname is the first in the regionservers list. Does this position increase the amount of work (e.g. is the META table always loaded here)?

By the way, on a computer that has (only) 1GB of RAM, should I decrease the JVM max allowed memory for the heaps of the Hadoop datanode and the HBase regionserver (the default is 1GB for each, I think) to avoid endless swapping?

Nothing in JIRA seems to match my problem. Any other ideas?
--- region server log ---
2008-10-16 15:18:45,812 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for region: table-0.3,PLQ80+70101200 :key/miss;j1DB44040DD81BA02D4E0E9A0D8698DA9
2008-10-16 15:18:45,812 INFO org.apache.hadoop.hbase.regionserver.HRegion: starting compaction on region table-0.3,PLQ80+70101200 :key/miss;j1DB44040DD81BA02D4E0E9A0D8698DA9,1224096999059
2008-10-16 15:18:45,820 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Skipped compaction of 1 file; compaction size of 1082805005/header: 483.5k; Skipped 3 files, size: 488461
2008-10-16 15:18:45,826 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Skipped compaction of 1 file; compaction size of 1082805005/bytes: 136.5m; Skipped 2 files, size: 141612841
2008-10-16 15:18:45,833 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Skipped compaction of 1 file; compaction size of 1082805005/info: 1.1m; Skipped 3 files, size: 1109592
2008-10-16 15:18:45,833 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region table-0.3,PLQ80+70101200 :key/miss;j1DB44040DD81BA02D4E0E9A0D8698DA9,1224096999059 in 0sec
2008-10-16 15:24:32,656 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 265463ms, ten times longer than scheduled: 3000
2008-10-16 15:24:32,656 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to master for 265463 milliseconds - aborting server
2008-10-16 15:24:32,656 DEBUG org.apache.hadoop.hbase.RegionHistorian: Offlined
2008-10-16 15:24:32,657 INFO org.apache.hadoop.ipc.Server: Stopping server on 60020
2

Again, thank you for your advice.

--
Jean-Adrien

Cluster setup:
Ubuntu linux
4 regionservers / datanodes
1 is master / namenode as well.
java-6-sun
Total size of hdfs: 81.98 GB (replication factor 3)
fsck -> healthy
hadoop: 0.18.1
hbase: 0.18.0 (jar of hadoop replaced with 0.18.1)
1Gb ram per node

--
View this message in context: http://www.nabble.com/Regionserver-sleeps-too-much-tp20014722p20014722.html
Sent from the HBase User mailing list archive at Nabble.com.
