[ https://issues.apache.org/jira/browse/HBASE-27850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17756348#comment-17756348 ]
Haoze Wu edited comment on HBASE-27850 at 8/19/23 7:51 PM:
-----------------------------------------------------------

Hello, may I have the full master and region server logs?

was (Author: functioner):
Hello, may I have the logs printed out?

> TimeoutIOException: Failed to get sync result after 300000 ms for txid=16920651960, WAL system stuck?
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-27850
>                 URL: https://issues.apache.org/jira/browse/HBASE-27850
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 2.2.6
>         Environment: hbase 2.2.6
> hadoop 3.3.1
>            Reporter: longping_jie
>            Priority: Major
>         Attachments: 49151.log1
>
>
> On a node in an RSGroup that hosts only one table, the write call queue
> became blocked at a certain moment. From that moment on, the read and write
> QPS of this table dropped to 0 and clients could not read or write the
> table. From the point in time when the RS call queue became blocked, the
> following error was reported continuously in the log:
>
> 2023-05-08 12:42:27,310 ERROR [MemStoreFlusher.2]
> regionserver.MemStoreFlusher: Cache flush failed for region
> user_feature_v2,eacf_1658057555,1660314723816.2376cc2326b5372131cc530b115d959a.
> org.apache.hadoop.hbase.exceptions.TimeoutIOException: Failed to get sync
> result after 300000 ms for txid=16920651960, WAL system stuck?
>     at org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:155)
>     at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.blockOnSync(AbstractFSWAL.java:743)
>     at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.sync(AsyncFSWAL.java:625)
>     at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.sync(AsyncFSWAL.java:602)
>     at org.apache.hadoop.hbase.regionserver.HRegion.doSyncOfUnflushedWALChanges(HRegion.java:2754)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2691)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2549)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2523)
>     at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2409)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:611)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:580)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$1000(MemStoreFlusher.java:68)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:360)
>     at java.lang.Thread.run(Thread.java:748)
>
> The data in the node memstore cannot be flushed to the WAL file, other
> indicators of the node are normal, and HDFS is not under pressure. After
> restarting the blocked node, the table returned to normal.
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)