From the log I see that nothing was logged between 08:25 and 09:15. Why did
this happen?

08:25:06.274INFOorg.apache.hadoop.hbase.regionserver.wal.HLog

moving old hlog file
/hbase/.logs/etp-hdfs-n1-sg.passport.local,60020,1414102905372/etp-hdfs-n1-sg.passport.local%2C60020%2C1414102905372.1420856706020
whose highest sequenceid is 229359 to
/hbase/.oldlogs/etp-hdfs-n1-sg.passport.local%2C60020%2C1414102905372.1420856706020

09:15:52.020INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
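
One way to spot silent periods like this in a long log is to scan for large
jumps between consecutive timestamps. Below is a minimal Python sketch. It
assumes the fused HH:MM:SS.mmmLEVEL prefix seen in the paste, optional "> "
quote markers, and a log that covers a single day. The name find_gaps, the
10-minute threshold, and the sample lines are illustrative choices, not part
of HBase.

```python
import re
from datetime import datetime, timedelta

# Matches the fused "HH:MM:SS.mmmLEVEL..." prefix seen in the pasted log,
# optionally preceded by a mail quote marker ("> ").
STAMP = re.compile(r"^(?:>\s*)?(\d{2}:\d{2}:\d{2}\.\d{3})")


def find_gaps(lines, threshold=timedelta(minutes=10)):
    """Return (start, end, gap) for consecutive timestamps more than
    `threshold` apart. Lines without a leading timestamp are skipped."""
    gaps = []
    previous = None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue
        current = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        if previous is not None and current - previous > threshold:
            gaps.append((previous.time(), current.time(), current - previous))
        previous = current
    return gaps


# A few timestamp lines taken from the log in this thread (assumed sample).
sample = [
    "08:15:59.283INFO",
    "08:25:06.274INFOorg.apache.hadoop.hbase.regionserver.wal.HLog",
    "> 09:15:52.020INFOorg.apache.hadoop.hbase.regionserver.HRegionServer",
    "> 09:17:36.142INFOorg.apache.hadoop.hbase.regionserver.ShutdownHook",
    "> 10:16:40.584INFOorg.apache.hadoop.hbase.util.VersionInfo",
]

for start, end, gap in find_gaps(sample):
    print(start, "->", end, "silent for", gap)
```

On these sample lines it flags the 08:25 to 09:15 window and the roughly one
hour between the shutdown and the 10:16 restart. A gap alone does not say
whether the process was stalled or simply had nothing to log, so it is only
a starting point for correlating with GC or VM-level data.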

Regards,
Shuai

On Mon, Jan 12, 2015 at 3:47 AM, Serega Sheypak <serega.shey...@gmail.com>
wrote:

> Hi, I have a PoC HBase cluster running on 3 VMs.
> The deployment schema is:
> NODE01 NN, SN, HMaster (HM), RegionServer (RS), Zookeeper server (ZK), DN
> NODE02 RegionServer, DN
> NODE03 RegionServer, DN
>
> Suddenly ONLY HBase went offline, all of its services: HM and RS.
> HDFS was working, with no alerts there.
> The ZK server was working, with no alerts there.
> VMware didn't raise any alerts.
> Only a restart of the HBase service helped.
>
> We are using this:
> http://www.cloudera.com/content/cloudera/en/downloads/cdh/cdh-4-7-0.html
> hbase-0.94.15+113
>
> I dug into the logs and found this:
> 08:15:51.968INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
>
> regionserver60020.periodicFlusher requesting flush for region
>
> epd_documents,403ded58-45fa-4526-ae5f-da69683bc620,1418822716508.f2cca08a8628d1660a4143f4383a5457.
> after a delay of 3026
>
> 08:15:55.011INFOorg.apache.hadoop.hbase.regionserver.StoreFile
>
> Bloom filter type for
>
> hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8e68424066dc4c02a60ca57ec98128fb:
> ROW, CompoundBloomFilterWriter
>
> 08:15:55.012INFOorg.apache.hadoop.hbase.regionserver.StoreFile
>
> Delete Family Bloom filter type for
>
> hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8e68424066dc4c02a60ca57ec98128fb:
> CompoundBloomFilterWriter
>
> 08:15:55.035INFOorg.apache.hadoop.hbase.regionserver.StoreFile
>
> General Bloom and NO DeleteFamily was added to HFile
>
> (hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8e68424066dc4c02a60ca57ec98128fb)
>
> 08:15:55.035INFOorg.apache.hadoop.hbase.regionserver.Store
>
> Flushed , sequenceid=229362, memsize=7.7 K, into tmp file
>
> hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8e68424066dc4c02a60ca57ec98128fb
>
> 08:15:55.053INFOorg.apache.hadoop.hbase.regionserver.StoreFile$Reader
>
> Loaded ROW (CompoundBloomFilter) metadata for
> 8e68424066dc4c02a60ca57ec98128fb
>
> 08:15:55.072INFOorg.apache.hadoop.hbase.regionserver.StoreFile$Reader
>
> Loaded ROW (CompoundBloomFilter) metadata for
> 8e68424066dc4c02a60ca57ec98128fb
>
> 08:15:55.073INFOorg.apache.hadoop.hbase.regionserver.Store
>
> Added
> hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/CF/8e68424066dc4c02a60ca57ec98128fb,
> entries=8, sequenceid=229362, filesize=2.7 K
>
> 08:15:55.076INFOorg.apache.hadoop.hbase.regionserver.HRegion
>
> Finished memstore flush of ~7.7 K/7840, currentsize=0/0 for region
>
> epd_documents,403ded58-45fa-4526-ae5f-da69683bc620,1418822716508.f2cca08a8628d1660a4143f4383a5457.
> in 80ms, sequenceid=229362, compaction requested=true
>
> 08:15:55.077INFOorg.apache.hadoop.hbase.regionserver.HRegion
>
> Starting compaction on CF in region
>
> epd_documents,403ded58-45fa-4526-ae5f-da69683bc620,1418822716508.f2cca08a8628d1660a4143f4383a5457.
>
> 08:15:55.077INFOorg.apache.hadoop.hbase.regionserver.Store
>
> Starting compaction of 4 file(s) in CF of
>
> epd_documents,403ded58-45fa-4526-ae5f-da69683bc620,1418822716508.f2cca08a8628d1660a4143f4383a5457.
> into
> tmpdir=hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp,
> seqid=229362, totalSize=76.6 M
>
> 08:15:55.096INFOorg.apache.hadoop.hbase.regionserver.StoreFile
>
> Bloom filter type for
>
> hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8bf8e92031834676b5d40b352120c5f2:
> ROW, CompoundBloomFilterWriter
>
> 08:15:55.097INFOorg.apache.hadoop.hbase.regionserver.StoreFile
>
> Delete Family Bloom filter type for
>
> hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8bf8e92031834676b5d40b352120c5f2:
> CompoundBloomFilterWriter
>
> 08:15:59.245INFOorg.apache.hadoop.hbase.regionserver.StoreFile
>
> General Bloom and NO DeleteFamily was added to HFile
>
> (hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8bf8e92031834676b5d40b352120c5f2)
>
> 08:15:59.255INFOorg.apache.hadoop.hbase.regionserver.StoreFile$Reader
>
> Loaded ROW (CompoundBloomFilter) metadata for
> 8bf8e92031834676b5d40b352120c5f2
>
> 08:15:59.255INFOorg.apache.hadoop.hbase.regionserver.Store
>
> Renaming compacted file at
>
> hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/.tmp/8bf8e92031834676b5d40b352120c5f2
> to
> hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/f2cca08a8628d1660a4143f4383a5457/CF/8bf8e92031834676b5d40b352120c5f2
>
> 08:15:59.266INFOorg.apache.hadoop.hbase.regionserver.StoreFile$Reader
>
> Loaded ROW (CompoundBloomFilter) metadata for
> 8bf8e92031834676b5d40b352120c5f2
>
> 08:15:59.282INFOorg.apache.hadoop.hbase.regionserver.Store
>
> Completed major compaction of 4 file(s) in CF of
>
> epd_documents,403ded58-45fa-4526-ae5f-da69683bc620,1418822716508.f2cca08a8628d1660a4143f4383a5457.
> into 8bf8e92031834676b5d40b352120c5f2, size=76.6 M; total size for
> store is 76.6 M
>
> 08:15:59.283INFO
> org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest
>
> completed compaction:
>
> regionName=epd_documents,403ded58-45fa-4526-ae5f-da69683bc620,1418822716508.f2cca08a8628d1660a4143f4383a5457.,
> storeName=CF, fileCount=4, fileSize=76.6 M, priority=3,
> time=8079772165508216; duration=4sec
>
> 08:25:06.274INFOorg.apache.hadoop.hbase.regionserver.wal.HLog
>
> Roll
> /hbase/.logs/etp-hdfs-n1-sg.passport.local,60020,1414102905372/etp-hdfs-n1-sg.passport.local%2C60020%2C1414102905372.1420860306123,
> entries=3, filesize=7313.  for
>
> /hbase/.logs/etp-hdfs-n1-sg.passport.local,60020,1414102905372/etp-hdfs-n1-sg.passport.local%2C60020%2C1414102905372.1420863906224
>
> 08:25:06.274INFOorg.apache.hadoop.hbase.regionserver.wal.HLog
>
> moving old hlog file
>
> /hbase/.logs/etp-hdfs-n1-sg.passport.local,60020,1414102905372/etp-hdfs-n1-sg.passport.local%2C60020%2C1414102905372.1420856706020
> whose highest sequenceid is 229359 to
>
> /hbase/.oldlogs/etp-hdfs-n1-sg.passport.local%2C60020%2C1414102905372.1420856706020
>
> 09:15:52.020INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
>
> regionserver60020.periodicFlusher requesting flush for region
> epd_documents,,1418822716508.af285fa424bad011e58c32a88875b947. after a
> delay of 14844
>
> 09:16:02.021INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
>
> regionserver60020.periodicFlusher requesting flush for region
> epd_documents,,1418822716508.af285fa424bad011e58c32a88875b947. after a
> delay of 21848
>
> 09:16:06.880INFOorg.apache.hadoop.hbase.regionserver.StoreFile
>
> Bloom filter type for
>
> hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/af285fa424bad011e58c32a88875b947/.tmp/a148cdce49964965b7b6f4728881427d:
> ROW, CompoundBloomFilterWriter
>
> 09:16:06.880INFOorg.apache.hadoop.hbase.regionserver.StoreFile
>
> Delete Family Bloom filter type for
>
> hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/af285fa424bad011e58c32a88875b947/.tmp/a148cdce49964965b7b6f4728881427d:
> CompoundBloomFilterWriter
>
> 09:16:06.914INFOorg.apache.hadoop.hbase.regionserver.StoreFile
>
> General Bloom and NO DeleteFamily was added to HFile
>
> (hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/af285fa424bad011e58c32a88875b947/.tmp/a148cdce49964965b7b6f4728881427d)
>
> 09:16:06.914INFOorg.apache.hadoop.hbase.regionserver.Store
>
> Flushed , sequenceid=229369, memsize=16.3 K, into tmp file
>
> hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/af285fa424bad011e58c32a88875b947/.tmp/a148cdce49964965b7b6f4728881427d
>
> 09:16:06.932INFOorg.apache.hadoop.hbase.regionserver.StoreFile$Reader
>
> Loaded ROW (CompoundBloomFilter) metadata for
> a148cdce49964965b7b6f4728881427d
>
> 09:16:06.952INFOorg.apache.hadoop.hbase.regionserver.StoreFile$Reader
>
> Loaded ROW (CompoundBloomFilter) metadata for
> a148cdce49964965b7b6f4728881427d
>
> 09:16:06.952INFOorg.apache.hadoop.hbase.regionserver.Store
>
> Added
> hdfs://etp-hdfs-n1-sg.passport.local:8020/hbase/epd_documents/af285fa424bad011e58c32a88875b947/CF/a148cdce49964965b7b6f4728881427d,
> entries=16, sequenceid=229369, filesize=4.5 K
>
> 09:16:06.955INFOorg.apache.hadoop.hbase.regionserver.HRegion
>
> Finished memstore flush of ~16.3 K/16648, currentsize=0/0 for region
> epd_documents,,1418822716508.af285fa424bad011e58c32a88875b947. in
> 90ms, sequenceid=229369, compaction requested=false
>
> 09:16:55.986INFOorg.apache.zookeeper.ClientCnxn
>
> Client session timed out, have not heard from server in 40003ms for
> sessionid 0x1493f19431b0005, closing socket connection and attempting
> reconnect
>
> 09:16:56.552INFOorg.apache.zookeeper.ClientCnxn
>
> Client session timed out, have not heard from server in 40020ms for
> sessionid 0x1493f19431b0000, closing socket connection and attempting
> reconnect
>
> 09:16:57.535INFOorg.apache.zookeeper.ClientCnxn
>
> Opening socket connection to server
> etp-hdfs-n1-sg.passport.local/172.16.160.8:2181. Will not attempt to
> authenticate using SASL (java.lang.SecurityException: Unable to locate
> a login configuration)
>
> 09:16:57.550INFOorg.apache.zookeeper.ClientCnxn
>
> Socket connection established to
> etp-hdfs-n1-sg.passport.local/172.16.160.8:2181, initiating session
>
> 09:16:58.448INFOorg.apache.zookeeper.ClientCnxn
>
> Opening socket connection to server
> etp-hdfs-n1-sg.passport.local/172.16.160.8:2181. Will not attempt to
> authenticate using SASL (java.lang.SecurityException: Unable to locate
> a login configuration)
>
> 09:16:58.449INFOorg.apache.zookeeper.ClientCnxn
>
> Socket connection established to
> etp-hdfs-n1-sg.passport.local/172.16.160.8:2181, initiating session
>
> 09:17:20.581INFOorg.apache.zookeeper.ClientCnxn
>
> Unable to reconnect to ZooKeeper service, session 0x1493f19431b0005
> has expired, closing socket connection
>
> 09:17:20.582INFO
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
>
> This client just lost it's session with ZooKeeper, will automatically
> reconnect when needed.
>
> 09:17:20.582INFO
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
>
> ZK session expired. This disconnect could have been caused by a
> network partition or a long-running GC pause, either way it's
> recommended that you verify your environment.
>
> 09:17:20.582INFOorg.apache.zookeeper.ClientCnxn
>
> EventThread shut down
>
> 09:17:20.584INFOorg.apache.zookeeper.ClientCnxn
>
> Unable to reconnect to ZooKeeper service, session 0x1493f19431b0000
> has expired, closing socket connection
>
> 09:17:20.633FATALorg.apache.hadoop.hbase.regionserver.HRegionServer
>
> ABORTING region server
> etp-hdfs-n1-sg.passport.local,60020,1414102905372:
> regionserver:60020-0x1493f19431b0000
> regionserver:60020-0x1493f19431b0000 received expired from ZooKeeper,
> aborting
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired
>         at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:384)
>         at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
>         at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
>         at
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
>
> 09:17:20.733FATALorg.apache.hadoop.hbase.regionserver.HRegionServer
>
> RegionServer abort: loaded coprocessors are: []
>
> 09:17:20.736INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
>
> Dump of metrics: requestsPerSecond=0, numberOfOnlineRegions=3,
> numberOfStores=3, hlogFileCount=2, hlogFileSizeMB=365,
> numberOfStorefiles=4, storefileIndexSizeMB=0, rootIndexSizeKB=2,
> totalStaticIndexSizeKB=650, totalStaticBloomSizeKB=256,
> memstoreSizeMB=0, mbInMemoryWithoutWAL=0, numberOfPutsWithoutWAL=0,
> readRequestsCount=279141, writeRequestsCount=91793,
> compactionQueueSize=0, flushQueueSize=0, usedHeapMB=455,
> maxHeapMB=1015, blockCacheSizeMB=2.66, blockCacheFreeMB=251.26,
> blockCacheCount=7, blockCacheHitCount=596347,
> blockCacheMissCount=3360967, blockCacheEvictedCount=36743,
> blockCacheHitRatio=15%, blockCacheHitCachingRatio=93%,
> hdfsBlocksLocalityIndex=100, slowHLogAppendCount=0,
> fsReadLatencyHistogramMean=330098.85,
> fsReadLatencyHistogramCount=2970773.00,
> fsReadLatencyHistogramMedian=312261.50,
> fsReadLatencyHistogram75th=357708.50,
> fsReadLatencyHistogram95th=3147993.00,
> fsReadLatencyHistogram99th=4431904.00,
> fsReadLatencyHistogram999th=4431904.00,
> fsPreadLatencyHistogramMean=3245622.44,
> fsPreadLatencyHistogramCount=36742.00,
> fsPreadLatencyHistogramMedian=3180486.50,
> fsPreadLatencyHistogram75th=3733068.50,
> fsPreadLatencyHistogram95th=6424537.75,
> fsPreadLatencyHistogram99th=24206578.00,
> fsPreadLatencyHistogram999th=24206578.00,
> fsWriteLatencyHistogramMean=356278.05,
> fsWriteLatencyHistogramCount=2983888.00,
> fsWriteLatencyHistogramMedian=538147.00,
> fsWriteLatencyHistogram75th=570668.00,
> fsWriteLatencyHistogram95th=570668.00,
> fsWriteLatencyHistogram99th=570668.00,
> fsWriteLatencyHistogram999th=570668.00
>
> 09:17:20.749INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
>
> STOPPED: regionserver:60020-0x1493f19431b0000
> regionserver:60020-0x1493f19431b0000 received expired from ZooKeeper,
> aborting
>
> 09:17:20.750INFOorg.apache.zookeeper.ClientCnxn
>
> EventThread shut down
>
> 09:17:20.755INFOorg.apache.hadoop.ipc.HBaseServer
>
> Stopping server on 60020
>
> 09:17:20.756INFOorg.apache.hadoop.ipc.HBaseServer
>
> IPC Server handler 0 on 60020: exiting
>
> 09:17:20.756INFOorg.apache.hadoop.ipc.HBaseServer
>
> PRI IPC Server handler 1 on 60020: exiting
>
> 09:17:20.756INFOorg.apache.hadoop.ipc.HBaseServer
>
> PRI IPC Server handler 0 on 60020: exiting
>
> 09:17:20.756INFOorg.apache.hadoop.ipc.HBaseServer
>
> Stopping IPC Server listener on 60020
>
> 09:17:20.757INFOorg.apache.hadoop.ipc.HBaseServer
>
> PRI IPC Server handler 3 on 60020: exiting
>
> 09:17:20.757INFOorg.apache.hadoop.ipc.HBaseServer
>
> PRI IPC Server handler 8 on 60020: exiting
>
> 09:17:20.757INFOorg.apache.hadoop.ipc.HBaseServer
>
> PRI IPC Server handler 4 on 60020: exiting
>
> 09:17:20.757INFOorg.apache.hadoop.ipc.HBaseServer
>
> IPC Server handler 3 on 60020: exiting
>
> 09:17:20.757INFOorg.apache.hadoop.ipc.HBaseServer
>
> PRI IPC Server handler 6 on 60020: exiting
>
> 09:17:20.757INFOorg.apache.hadoop.ipc.HBaseServer
>
> IPC Server handler 2 on 60020: exiting
>
> 09:17:20.757INFOorg.apache.hadoop.ipc.HBaseServer
>
> IPC Server handler 6 on 60020: exiting
>
> 09:17:20.757INFOorg.apache.hadoop.ipc.HBaseServer
>
> PRI IPC Server handler 5 on 60020: exiting
>
> 09:17:20.757INFOorg.apache.hadoop.ipc.HBaseServer
>
> PRI IPC Server handler 7 on 60020: exiting
>
> 09:17:20.757INFOorg.apache.hadoop.ipc.HBaseServer
>
> PRI IPC Server handler 2 on 60020: exiting
>
> 09:17:20.757INFOorg.apache.hadoop.hbase.regionserver.SplitLogWorker
>
> Sending interrupt to stop the worker thread
>
> 09:17:20.758INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
>
> Stopping infoServer
>
> 09:17:20.758INFOorg.apache.hadoop.ipc.HBaseServer
>
> IPC Server handler 4 on 60020: exiting
>
> 09:17:20.758INFOorg.apache.hadoop.ipc.HBaseServer
>
> Stopping IPC Server Responder
>
> 09:17:20.758INFOorg.apache.hadoop.ipc.HBaseServer
>
> IPC Server handler 7 on 60020: exiting
>
> 09:17:20.757INFOorg.apache.hadoop.ipc.HBaseServer
>
> IPC Server handler 8 on 60020: exiting
>
> 09:17:20.757INFOorg.apache.hadoop.ipc.HBaseServer
>
> IPC Server handler 1 on 60020: exiting
>
> 09:17:20.757INFOorg.apache.hadoop.ipc.HBaseServer
>
> PRI IPC Server handler 9 on 60020: exiting
>
> 09:17:20.758INFOorg.apache.hadoop.ipc.HBaseServer
>
> IPC Server handler 5 on 60020: exiting
>
> 09:17:20.758INFOorg.apache.hadoop.ipc.HBaseServer
>
> Stopping IPC Server Responder
>
> 09:17:20.761INFOorg.apache.hadoop.ipc.HBaseServer
>
> IPC Server handler 9 on 60020: exiting
>
> 09:17:20.763INFOorg.apache.hadoop.hbase.regionserver.SplitLogWorker
>
> SplitLogWorker interrupted while waiting for task, exiting:
> java.lang.InterruptedException
>
> 09:17:20.767INFOorg.apache.hadoop.ipc.HBaseServer
>
> REPL IPC Server handler 0 on 60020: exiting
>
> 09:17:20.767INFOorg.apache.hadoop.hbase.regionserver.SplitLogWorker
>
> SplitLogWorker etp-hdfs-n1-sg.passport.local,60020,1414102905372 exiting
>
> 09:17:20.767INFOorg.apache.hadoop.ipc.HBaseServer
>
> REPL IPC Server handler 2 on 60020: exiting
>
> 09:17:20.768INFOorg.mortbay.log
>
> Stopped SelectChannelConnector@0.0.0.0:60030
>
> 09:17:20.768INFOorg.apache.hadoop.ipc.HBaseServer
>
> REPL IPC Server handler 1 on 60020: exiting
>
> 09:17:20.908INFO
> org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager
>
> Stopping RegionServerSnapshotManager abruptly.
>
> 09:17:20.908INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker
>
> regionserver60020.compactionChecker exiting
>
> 09:17:20.908INFOorg.apache.hadoop.hbase.regionserver.LogRoller
>
> LogRoller exiting.
>
> 09:17:20.909INFOorg.apache.hadoop.hbase.regionserver.MemStoreFlusher
>
> regionserver60020.cacheFlusher exiting
>
> 09:17:20.920INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
>
> aborting server etp-hdfs-n1-sg.passport.local,60020,1414102905372
>
> 09:17:20.924INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
>
> Waiting on 3 regions to close
>
> 09:17:20.943INFOorg.apache.hadoop.hbase.regionserver.Store
>
> Closed CF
>
> 09:17:20.949INFOorg.apache.hadoop.hbase.regionserver.HRegion
>
> Closed epd_documents,,1418822716508.af285fa424bad011e58c32a88875b947.
>
> 09:17:20.951INFOorg.apache.hadoop.hbase.regionserver.Store
>
> Closed info
>
> 09:17:20.953INFOorg.apache.hadoop.hbase.regionserver.HRegion
>
> Closed .META.,,1.1028785192
>
> 09:17:20.956INFOorg.apache.hadoop.hbase.regionserver.Store
>
> Closed CF
>
> 09:17:20.957ERRORorg.apache.hadoop.hbase.regionserver.HRegion
>
> Memstore size is 13376
>
> 09:17:20.957INFOorg.apache.hadoop.hbase.regionserver.HRegion
>
> Closed
> epd_documents,403ded58-45fa-4526-ae5f-da69683bc620,1418822716508.f2cca08a8628d1660a4143f4383a5457.
>
> 09:17:21.125INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
>
> stopping server etp-hdfs-n1-sg.passport.local,60020,1414102905372; all
> regions closed.
>
> 09:17:21.126INFOorg.apache.hadoop.hbase.regionserver.wal.HLog
>
> regionserver60020.logSyncer exiting
>
> 09:17:21.240INFOorg.apache.hadoop.hbase.regionserver.Leases
>
> regionserver60020 closing leases
>
> 09:17:21.241INFOorg.apache.hadoop.hbase.regionserver.Leases
>
> regionserver60020 closed leases
>
> 09:17:22.021INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer$PeriodicMemstoreFlusher
>
> regionserver60020.periodicFlusher exiting
>
> 09:17:22.031WARNorg.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper
>
> Possibly transient ZooKeeper exception:
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired for
> /hbase/rs/etp-hdfs-n1-sg.passport.local,60020,1414102905372
>
> 09:17:22.031INFOorg.apache.hadoop.hbase.util.RetryCounter
>
> Sleeping 2000ms before retry #1...
>
> 09:17:23.958INFOorg.apache.hadoop.hbase.regionserver.Leases
>
> regionserver60020.leaseChecker closing leases
>
> 09:17:23.959INFOorg.apache.hadoop.hbase.regionserver.Leases
>
> regionserver60020.leaseChecker closed leases
>
> 09:17:24.032WARNorg.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper
>
> Possibly transient ZooKeeper exception:
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired for
> /hbase/rs/etp-hdfs-n1-sg.passport.local,60020,1414102905372
>
> 09:17:24.032INFOorg.apache.hadoop.hbase.util.RetryCounter
>
> Sleeping 4000ms before retry #2...
>
> 09:17:28.032WARNorg.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper
>
> Possibly transient ZooKeeper exception:
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired for
> /hbase/rs/etp-hdfs-n1-sg.passport.local,60020,1414102905372
>
> 09:17:28.033INFOorg.apache.hadoop.hbase.util.RetryCounter
>
> Sleeping 8000ms before retry #3...
>
> 09:17:36.033WARNorg.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper
>
> Possibly transient ZooKeeper exception:
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired for
> /hbase/rs/etp-hdfs-n1-sg.passport.local,60020,1414102905372
>
> 09:17:36.033ERRORorg.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper
>
> ZooKeeper delete failed after 3 retries
>
> 09:17:36.033WARNorg.apache.hadoop.hbase.regionserver.HRegionServer
>
> Failed deleting my ephemeral node
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired for
> /hbase/rs/etp-hdfs-n1-sg.passport.local,60020,1414102905372
>         at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
>         at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
>         at
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.delete(RecoverableZooKeeper.java:133)
>         at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1196)
>         at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1185)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.deleteMyEphemeralNode(HRegionServer.java:1132)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:897)
>         at java.lang.Thread.run(Thread.java:662)
>
> 09:17:36.035INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
>
> stopping server etp-hdfs-n1-sg.passport.local,60020,1414102905372;
> zookeeper connection closed.
>
> 09:17:36.035INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
>
> regionserver60020 exiting
>
> 09:17:36.051ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine
>
> Region server exiting
> java.lang.RuntimeException: HRegionServer Aborted
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:65)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:84)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:4106)
>
> 09:17:36.139INFOorg.apache.hadoop.hbase.regionserver.ShutdownHook
>
> Shutdown hook starting; hbase.shutdown.hook=true;
>
> fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@727f3b8a
>
> 09:17:36.140INFOorg.apache.hadoop.hbase.regionserver.HRegionServer
>
> STOPPED: Shutdown hook
>
> 09:17:36.141INFOorg.apache.hadoop.hbase.regionserver.ShutdownHook
>
> Starting fs shutdown hook thread.
>
> 09:17:36.142INFOorg.apache.hadoop.hbase.regionserver.ShutdownHook
>
> Shutdown hook finished.
>
> 10:16:40.584INFOorg.apache.hadoop.hbase.util.VersionInfo
>
> HBase 0.94.15-cdh4.7.0
>
> 10:16:40.608INFOorg.apache.hadoop.hbase.util.VersionInfo
>
> Subversion
> file:///data/1/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hbase-0.94.15-cdh4.7.0
> -r Unknown
>
> 10:16:40.608INFOorg.apache.hadoop.hbase.util.VersionInfo
>
> Compiled by jenkins on Wed May 28 10:48:53 PDT 2014
>
> 10:16:40.912INFOorg.apache.hadoop.hbase.util.ServerCommandLine
>
> vmName=Java HotSpot(TM) 64-Bit Server VM, vmVendor=Sun Microsystems
> Inc., vmVersion=20.6-b01
>
> 10:16:40.913INFOorg.apache.hadoop.hbase.util.ServerCommandLine
>
> vmInputArguments=[-XX:OnOutOfMemoryError=kill, -9, %p, -Xmx1000m,
> -Djava.net.preferIPv4Stack=true, -Xms1073741824, -Xmx1073741824,
> -XX:+UseParNewGC, -XX:+UseConcMarkSweepGC,
> -XX:-CMSConcurrentMTEnabled, -XX:CMSInitiatingOccupancyFraction=70,
> -XX:+CMSParallelRemarkEnabled, -Dhbase.log.dir=/var/log/hbase,
>
> -Dhbase.log.file=hbase-cmf-hbase1-REGIONSERVER-etp-hdfs-n1-sg.passport.local.log.out,
>
> -Dhbase.home.dir=/opt/cloudera/parcels/CDH-4.7.0-1.cdh4.7.0.p0.40/lib/hbase,
> -Dhbase.id.str=, -Dhbase.root.logger=INFO,RFA,
>
> -Djava.library.path=/opt/cloudera/parcels/CDH-4.7.0-1.cdh4.7.0.p0.40/lib/hadoop/lib/native:/opt/cloudera/parcels/CDH-4.7.0-1.cdh4.7.0.p0.40/lib/hbase/lib/native/Linux-amd64-64,
> -Dhbase.security.logger=INFO,DRFAS]
>
> 10:16:41.392INFOorg.apache.hadoop.ipc.HBaseServer
>
> Starting Thread-2
>
> 10:16:41.393INFOorg.apache.hadoop.ipc.HBaseServer
>
> Starting Thread-2
>
>
> This FATAL
>
> 09:17:20.633FATALorg.apache.hadoop.hbase.regionserver.HRegionServer
>
> ABORTING region server
> etp-hdfs-n1-sg.passport.local,60020,1414102905372:
> regionserver:60020-0x1493f19431b0000
> regionserver:60020-0x1493f19431b0000 received expired from ZooKeeper,
> aborting
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired
>         at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:384)
>         at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
>         at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
>         at
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
>
> was found on each RS.
> I didn't see any problems with ZK or memory usage. The load is extremely
> low, just a few puts/gets per second; you can see the file sizes in the logs.
> What could it be, and how can I catch the root cause?
>
