Bah, mis-pasted and sent with the same clipboard action; please ignore.
On Fri, Jun 6, 2014 at 6:35 PM, Andrew Purtell <apurt...@apache.org> wrote:

> The requested URL /D2N6ZDvk,http://paste2.org/a64LXD0X
> <http://paste2.org/a64LXD0X> was not found on this server. The link you
> followed is probably outdated.
>
> On Fri, Jun 6, 2014 at 6:33 PM, sunweiwei <su...@asiainfo-linkage.com>
> wrote:
>
>> Hi
>>
>> The symptom reproduced again.
>> I pasted the logs at http://paste2.org/D2N6ZDvk and
>> http://paste2.org/a64LXD0X.
>> One is the regionserver jstack output. The other is the regionserver
>> log, grepped so that it only includes the unflushed region.
>>
>> Thanks
>>
>> -----Original Message-----
>> From: sunweiwei [mailto:su...@asiainfo-linkage.com]
>> Sent: June 5, 2014 14:51
>> To: user@hbase.apache.org
>> Subject: Re: Re: Re: forcing flush not works
>>
>> I'm sorry, but the regionserver log has been deleted...
>>
>> To Stack:
>> Yes, it is always the same two regions of table BT_D_BF001_201406 that
>> can't be flushed.
>>
>> Previously I saved only a little of the log, from when table
>> BT_D_BF001_201405 had lots of regions:
>>
>> 2014-05-27 22:40:52,025 DEBUG [regionserver60020.logRoller] regionserver.LogRoller: HLog roll requested
>> 2014-05-27 22:40:52,039 DEBUG [regionserver60020.logRoller] wal.FSHLog: cleanupCurrentWriter waiting for transactions to get synced total 450500823 synced till here 450500779
>> 2014-05-27 22:40:52,049 INFO [regionserver60020.logRoller] wal.FSHLog: Rolled WAL /apps/hbase/data/WALs/hadoop03,60020,1401173211108/hadoop03%2C60020%2C1401173211108.1401201646659 with entries=94581, filesize=122.2 M; new WAL /apps/hbase/data/WALs/hadoop03,60020,1401173211108/hadoop03%2C60020%2C1401173211108.1401201652025
>> 2014-05-27 22:40:52,049 INFO [regionserver60020.logRoller] wal.FSHLog: Too many hlogs: logs=156, maxlogs=32; forcing flush of 2 regions(s): a5b94272f0fdd477bf320e428059fe87, f1a60d3ea5820cb672832c59531de89d
>> 2014-05-27 22:40:52,073 DEBUG [Thread-17] regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=6.1 G
>> 2014-05-27 22:40:52,074 DEBUG [Thread-17] regionserver.MemStoreFlusher: Under global heap pressure: Region BT_D_BF001_201405,8618989870918460036571102456550000000012320000002014050815160220140508151602174000000000462000001094000080000000000000000000548090000000,1401009042160.47633a80bd6fede708c05c9fcc9e2b39. has too many store files, but is 27.6 M vs best flushable region's 0. Choosing the bigger.
>> 2014-05-27 22:40:52,075 INFO [Thread-17] regionserver.MemStoreFlusher: Flush of region BT_D_BF001_201405,8618989870918460036571102456550000000012320000002014050815160220140508151602174000000000462000001094000080000000000000000000548090000000,1401009042160.47633a80bd6fede708c05c9fcc9e2b39. due to global heap pressure
>> 2014-05-27 22:40:52,075 DEBUG [Thread-17] regionserver.HRegion: Started memstore flush for BT_D_BF001_201405,8618989870918460036571102456550000000012320000002014050815160220140508151602174000000000462000001094000080000000000000000000548090000000,1401009042160.47633a80bd6fede708c05c9fcc9e2b39., current region memstore size 27.6 M
>> 2014-05-27 22:40:52,599 INFO [Thread-17] regionserver.DefaultStoreFlusher: Flushed, sequenceid=10069900941, memsize=27.6 M, hasBloomFilter=true, into tmp file hdfs://hdpcluster/apps/hbase/data/data/default/BT_D_BF001_201405/47633a80bd6fede708c05c9fcc9e2b39/.tmp/a89428808e1a4be4a1bf7bd9ec8ece88
>> 2014-05-27 22:40:52,608 DEBUG [Thread-17] regionserver.HRegionFileSystem: Committing store file hdfs://hdpcluster/apps/hbase/data/data/default/BT_D_BF001_201405/47633a80bd6fede708c05c9fcc9e2b39/.tmp/a89428808e1a4be4a1bf7bd9ec8ece88 as hdfs://hdpcluster/apps/hbase/data/data/default/BT_D_BF001_201405/47633a80bd6fede708c05c9fcc9e2b39/cf/a89428808e1a4be4a1bf7bd9ec8ece88
>> 2014-05-27 22:40:52,617 INFO [Thread-17] regionserver.HStore: Added hdfs://hdpcluster/apps/hbase/data/data/default/BT_D_BF001_201405/47633a80bd6fede708c05c9fcc9e2b39/cf/a89428808e1a4be4a1bf7bd9ec8ece88, entries=44962, sequenceid=10069900941, filesize=5.5 M
>> 2014-05-27 22:40:52,618 INFO [Thread-17] regionserver.HRegion: Finished memstore flush of ~27.6 M/28933240, currentsize=43.6 K/44664 for region BT_D_BF001_201405,8618989870918460036571102456550000000012320000002014050815160220140508151602174000000000462000001094000080000000000000000000548090000000,1401009042160.47633a80bd6fede708c05c9fcc9e2b39. in 542ms, sequenceid=10069900941, compaction requested=true
>> 2014-05-27 22:40:52,618 DEBUG [Thread-17] regionserver.CompactSplitThread: Small Compaction requested: system; Because: Thread-17; compaction_queue=(4896:19152), split_queue=0, merge_queue=0
>>
>> Thanks
>>
>> -----Original Message-----
>> From: ramkrishna vasudevan [mailto:ramkrishna.s.vasude...@gmail.com]
>> Sent: June 5, 2014 13:43
>> To: user@hbase.apache.org
>> Subject: Re: Re: Re: forcing flush not works
>>
>> > I still (highly) suspect that there is something wrong with the flush
>> > queue (some entry pushed into it can't be polled out).
>>
>> Yes, I have that suspicion too. Maybe new logs will help to uncover the
>> issue.
>>
>> On Thu, Jun 5, 2014 at 11:06 AM, Stack <st...@duboce.net> wrote:
>>
>> > Is it always the same two regions that get stuck, or does it vary?
>> > Another set of example logs may help uncover the sequence of
>> > trouble-causing events.
>> >
>> > Thanks,
>> > St.Ack
>> >
>> > On Wed, Jun 4, 2014 at 7:31 PM, sunweiwei <su...@asiainfo-linkage.com>
>> > wrote:
>> >
>> > > My log is similar to HBASE-10499.
>> > >
>> > > Thanks
>> > >
>> > > -----Original Message-----
>> > > From: saint....@gmail.com [mailto:saint....@gmail.com] on behalf of
>> > > Stack
>> > > Sent: June 3, 2014 23:10
>> > > To: Hbase-User
>> > > Subject: Re: Re: forcing flush not works
>> > >
>> > > Mind posting a link to your log? Sounds like HBASE-10499, as Honghua
>> > > says.
>> > > St.Ack
>> > >
>> > > On Tue, Jun 3, 2014 at 2:34 AM, sunweiwei <su...@asiainfo-linkage.com>
>> > > wrote:
>> > >
>> > > > Thanks. Maybe it is the same as HBASE-10499.
>> > > > I stopped the regionserver and then started it again; then HBase
>> > > > went back to normal.
>> > > > This is the jstack output from when the 2 regions could not be
>> > > > flushed:
>> > > >
>> > > > "Thread-17" prio=10 tid=0x00007f6210383800 nid=0x6540 waiting on condition [0x00007f61e0a26000]
>> > > >    java.lang.Thread.State: TIMED_WAITING (parking)
>> > > >         at sun.misc.Unsafe.park(Native Method)
>> > > >         - parking to wait for <0x000000041ae0e6b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>> > > >         at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
>> > > >         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
>> > > >         at java.util.concurrent.DelayQueue.poll(DelayQueue.java:201)
>> > > >         at java.util.concurrent.DelayQueue.poll(DelayQueue.java:39)
>> > > >         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:228)
>> > > >         at java.lang.Thread.run(Thread.java:662)
>> > > >
>> > > > -----Original Message-----
>> > > > From: Honghua Feng (冯宏华) [mailto:fenghong...@xiaomi.com]
>> > > > Sent: June 3, 2014 16:34
>> > > > To: user@hbase.apache.org
>> > > > Subject: Re: forcing flush not works
>> > > >
>> > > > The same symptom as HBASE-10499?
>> > > >
>> > > > I still (highly) suspect that there is something wrong with the
>> > > > flush queue (some entry pushed into it can't be polled out).
>> > > > ________________________________________
>> > > > From: sunweiwei [su...@asiainfo-linkage.com]
>> > > > Sent: June 3, 2014 15:43
>> > > > To: user@hbase.apache.org
>> > > > Subject: forcing flush not works
>> > > >
>> > > > Hi
>> > > >
>> > > > I'm using a heavy-write HBase 0.96 cluster. I find this in the
>> > > > regionserver log:
>> > > >
>> > > > 2014-06-03 15:13:19,445 INFO [regionserver60020.logRoller] wal.FSHLog: Too many hlogs: logs=33, maxlogs=32; forcing flush of 3 regions(s): 1a7dda3c3815c19970ace39fd99abfe8, aff81bc46aa7d3ed51a01f11f23c8320, d5666e003f598147b4dda509f173a779
>> > > > 2014-06-03 15:13:23,869 INFO [regionserver60020.logRoller] wal.FSHLog: Too many hlogs: logs=34, maxlogs=32; forcing flush of 2 regions(s): aff81bc46aa7d3ed51a01f11f23c8320, d5666e003f598147b4dda509f173a779
>> > > > ...
>> > > > 2014-06-03 15:18:14,778 INFO [regionserver60020.logRoller] wal.FSHLog: Too many hlogs: logs=93, maxlogs=32; forcing flush of 2 regions(s): aff81bc46aa7d3ed51a01f11f23c8320, d5666e003f598147b4dda509f173a779
>> > > >
>> > > > It seems that the 2 regions can't be flushed, and the WALs directory
>> > > > keeps growing. Then I find this in the client log:
>> > > >
>> > > > INFO | AsyncProcess-waitForMaximumCurrentTasks [2014-06-03 15:30:53] - : Waiting for the global number of running tasks to be equals or less than 0, tasksSent=15819, tasksDone=15818, currentTasksDone=15818, tableName=BT_D_BF001_201406
>> > > >
>> > > > Then the write speed becomes very slow.
>> > > > After I flush the 2 regions manually, the write speed goes back to
>> > > > normal, but only for a short while.
>> > > >
>> > > > Any suggestion will be appreciated. Thanks.
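(The manual flush mentioned above can be issued from the HBase shell with
flush 'TABLENAME', or through the Java client. Below is a minimal sketch
against the 0.96-era HBaseAdmin API, assuming an hbase-site.xml on the
classpath; the table name is taken from this thread, and the exact call
pattern is illustrative rather than the reporter's actual command.)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class ManualFlush {
        public static void main(String[] args) throws Exception {
            // Picks up hbase-site.xml from the classpath.
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);
            try {
                // Flushes every region of the table. flush() also accepts
                // a region name, so the stuck regions named in the
                // "Too many hlogs" lines can be targeted individually.
                admin.flush("BT_D_BF001_201406");
            } finally {
                admin.close();
            }
        }
    }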
--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
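(A note on the flush-queue suspicion raised in the thread: the jstack above
shows the MemStoreFlusher handler thread parked in DelayQueue.poll(), which
by itself is only the idle wait for work. The suspicion quoted twice in the
thread — an entry recorded as pending that never becomes pollable — would
produce exactly the observed symptom. The sketch below illustrates that
pattern; it is a simplified illustration, not HBase's actual code, and all
class and method names in it are invented for the example.)

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    public class FlushQueueSketch {
        // One pending flush request; it becomes pollable once its delay
        // has elapsed.
        static final class FlushEntry implements Delayed {
            final String regionName;
            final long whenNanos;

            FlushEntry(String regionName, long delayMs) {
                this.regionName = regionName;
                this.whenNanos = System.nanoTime()
                        + TimeUnit.MILLISECONDS.toNanos(delayMs);
            }

            @Override
            public long getDelay(TimeUnit unit) {
                return unit.convert(whenNanos - System.nanoTime(),
                        TimeUnit.NANOSECONDS);
            }

            @Override
            public int compareTo(Delayed other) {
                return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                        other.getDelay(TimeUnit.NANOSECONDS));
            }
        }

        private final DelayQueue<FlushEntry> flushQueue =
                new DelayQueue<FlushEntry>();
        private final Map<String, FlushEntry> regionsInQueue =
                new ConcurrentHashMap<String, FlushEntry>();

        // Writers call this to request a flush; the map deduplicates so a
        // region is queued at most once.
        public void requestFlush(String regionName) {
            FlushEntry entry = new FlushEntry(regionName, 0);
            if (regionsInQueue.putIfAbsent(regionName, entry) == null) {
                // If any code path recorded the region in the map but then
                // failed to add it to the queue, the region would never be
                // flushed again: every later request would see a "pending"
                // entry and return without queueing anything.
                flushQueue.add(entry);
            }
        }

        // The handler loop. This poll is where the jstack shows Thread-17
        // parked (TIMED_WAITING) while it waits for work.
        public void handlerLoop() throws InterruptedException {
            while (true) {
                FlushEntry entry = flushQueue.poll(10, TimeUnit.SECONDS);
                if (entry == null) {
                    continue; // timed out with nothing ready; poll again
                }
                regionsInQueue.remove(entry.regionName);
                System.out.println("flushing region " + entry.regionName);
            }
        }
    }

Restarting the regionserver empties both the map and the queue, which is
consistent with the restart workaround reported earlier in the thread.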