Here is the partial config I used: http://pastebin.com/1Dpbb2LA
I verified that there is no hbase-0.90.1.jar in lib dir.

Thanks

On Sun, Feb 13, 2011 at 8:59 AM, Ted Yu <[email protected]> wrote:

> BTW
> The timeout (when calling flushCommits) happened midnight, so I didn't
> capture jstack.
>
> In hadoop1 region server log, I see this around time of timeout in 4th run:
>
> 2011-02-13 08:25:01,015 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Finished snapshotting, commencing flushing stores
> 2011-02-13 08:25:01,016 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call flushRegion(REGION => {NAME =>
> 'NIGHTLYDEVGRIDSGRIDSQL-THREEGPPSPEECHCALLS-1297583809865,2>&U\xF6\xB582>&U\xF6\xB582>&U\xF6\xB582>&U\xF6\xB582>&T,1297583814638.8cb772d452dee232306dfab0b472ec9a.',
> STARTKEY => '2>&U\xF6\xB582>&U\xF6\xB582>&U\xF6\xB582>&U\xF6\xB582>&T',
> ENDKEY =>
> '2\xC1\xA3\xDFhVz2\xC1\xA3\xDFhVz2\xC1\xA3\xDFhVz2\xC1\xA3\xDFhVz2\xC1\xA3\xDD',
> ENCODED => 8cb772d452dee232306dfab0b472ec9a, TABLE => {{NAME =>
> 'NIGHTLYDEVGRIDSGRIDSQL-THREEGPPSPEECHCALLS-1297583809865', FAMILIES =>
> [{NAME => 'd', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS =>
> '2', COMPRESSION => 'GZ', TTL => '31536000', BLOCKSIZE => '65536', IN_MEMORY
> => 'false', BLOCKCACHE => 'false'}, {NAME => 'i', BLOOMFILTER => 'ROW',
> REPLICATION_SCOPE => '0', VERSIONS => '2', COMPRESSION => 'GZ', TTL =>
> '31536000', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE =>
> 'false'}, {NAME => 'v', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0',
> VERSIONS => '2', COMPRESSION => 'GZ', TTL => '31536000', BLOCKSIZE =>
> '65536', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}}) from
> 10.202.50.76:62489: output error
> 2011-02-13 08:25:01,020 WARN org.apache.hadoop.ipc.HBaseServer: PRI IPC
> Server handler 3 on 60020 caught: java.nio.channels.ClosedChannelException
>     at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1339)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:727)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:792)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1083)
>
> On Thu, Feb 10, 2011 at 2:41 PM, Ted Yu <[email protected]> wrote:
>
>> I replaced hbase jar with hbase-0.90.1.jar
>> I also upgraded client side jar to hbase-0.90.1.jar
>>
>> Our map tasks were running faster than before for about 50 minutes.
>> However, map tasks then timed out calling flushCommits(). This happened even
>> after fresh restart of hbase.
>>
>> I don't see any exception in region server logs.
>>
>> In master log, I found:
>>
>> 2011-02-10 18:24:15,286 DEBUG
>> org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region
>> -ROOT-,,0.70236052 on sjc1-hadoop6.X.com,60020,1297362251595
>> 2011-02-10 18:24:15,349 INFO
>> org.apache.hadoop.hbase.catalog.CatalogTracker: Failed verification of
>> .META.,,1 at address=null;
>> org.apache.hadoop.hbase.NotServingRegionException:
>> org.apache.hadoop.hbase.NotServingRegionException: Region is not online:
>> .META.,,1
>> 2011-02-10 18:24:15,350 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign:
>> master:60000-0x12e10d0e31e0000 Creating (or updating) unassigned node for
>> 1028785192 with OFFLINE state
>>
>> I am attaching region server (which didn't respond to stop-hbase.sh)
>> jstack.
>>
>> FYI
>>
>> On Thu, Feb 10, 2011 at 10:10 AM, Stack <[email protected]> wrote:
>>
>>> Thats probably enough Ted. The 0.90.1 hbase-default.xml has an extra
>>> config. to enable the experimental HBASE-3455 feature but you can copy
>>> that over if you want to try playing with it (it defaults off so you'd
>>> copy over the config. if you wanted to set it to true).
>>>
>>> St.Ack
>>>
>>
>>
>
