[ https://issues.apache.org/jira/browse/HBASE-20770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16519624#comment-16519624 ]
stack commented on HBASE-20770: ------------------------------- I will. Was thinking of updating the 60 seconds too... to ten minutes or something. Thanks. > WAL cleaner logs way too much; gets clogged when lots of work to do > ------------------------------------------------------------------- > > Key: HBASE-20770 > URL: https://issues.apache.org/jira/browse/HBASE-20770 > Project: HBase > Issue Type: Bug > Components: logging > Reporter: stack > Priority: Critical > Fix For: 2.0.2 > > > Been here before (HBASE-7214 and HBASE-19652). Testing on large cluster, > Master log is in a continuous spew of logging output fililng disks. It is > stuck making no progress but hard to tell because it is logging minutiae > rather than a call on whats actually wrong. > Log is full of this: > {code} > 2018-06-21 01:19:12,761 DEBUG > org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Removing > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/e98cdb817bb3af5fa26e2b885a0b2ec6/meta/bd49572de3914b66985fff5ea2ca7403 > 2018-06-21 01:19:12,761 DEBUG > org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Removing > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/e98cdb817bb3af5fa26e2b885a0b2ec6/meta/fad01294c6ca421db209e89b5b97d364 > 2018-06-21 01:19:12,823 WARN > org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Wait more than 60000 ms > for deleting > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/d3f759d0495257fc1d33ae780b634455/tiny/b72bac4036444dcf9265c7b5664fd403, > exit... > 2018-06-21 01:19:12,823 DEBUG > org.apache.hadoop.hbase.master.cleaner.CleanerChore: Cleaning under > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/665bfa38c86a28d641ce08f8fea0a7f9 > 2018-06-21 01:19:12,824 WARN > org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Wait more than 60000 ms > for deleting > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/2425053ad86823081b368e00bc471e56/tiny/6ea3cb1174434aecbc448e322e2a062c, > exit... > 2018-06-21 01:19:12,824 DEBUG > org.apache.hadoop.hbase.master.cleaner.CleanerChore: Cleaning under > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/e98cdb817bb3af5fa26e2b885a0b2ec6/big > 2018-06-21 01:19:12,824 DEBUG > org.apache.hadoop.hbase.master.cleaner.CleanerChore: Cleaning under > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/e98cdb817bb3af5fa26e2b885a0b2ec6/tiny > 2018-06-21 01:19:12,827 DEBUG > org.apache.hadoop.hbase.master.cleaner.CleanerChore: Cleaning under > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/665bfa38c86a28d641ce08f8fea0a7f9/meta > 2018-06-21 01:19:12,844 DEBUG > org.apache.hadoop.hbase.master.cleaner.CleanerChore: Cleaning under > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/17f85c98389104b19358f6751da577d0 > 2018-06-21 01:19:12,844 DEBUG > org.apache.hadoop.hbase.master.cleaner.CleanerChore: Cleaning under > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/c98e276423813aaa74d848983c47d93c > 2018-06-21 01:19:12,844 DEBUG > org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Removing > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/665bfa38c86a28d641ce08f8fea0a7f9/meta/90f21dec28d140cda48d37eeb44d37e8 > 2018-06-21 01:19:12,844 DEBUG > org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Removing > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/665bfa38c86a28d641ce08f8fea0a7f9/meta/8a4cf6410d5a4201963bc1415945f877 > 2018-06-21 01:19:12,848 DEBUG > org.apache.hadoop.hbase.master.cleaner.CleanerChore: Cleaning under > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/c98e276423813aaa74d848983c47d93c/meta > 2018-06-21 01:19:12,849 DEBUG > org.apache.hadoop.hbase.master.cleaner.CleanerChore: Cleaning under > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/17f85c98389104b19358f6751da577d0/meta > 2018-06-21 01:19:12,927 DEBUG > org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Removing > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/17f85c98389104b19358f6751da577d0/meta/6043fce5761e4479819b15405183f193 > 2018-06-21 01:19:12,927 DEBUG > org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Removing > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/c98e276423813aaa74d848983c47d93c/meta/69e6bf4650124859b2bc7ddf134be642 > 2018-06-21 01:19:13,011 DEBUG > org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Removing > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/17f85c98389104b19358f6751da577d0/meta/1a46700fbc434574a005c0b55879d5ed > 2018-06-21 01:19:13,011 DEBUG > org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Removing > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/c98e276423813aaa74d848983c47d93c/meta/44beca2adfb544999dd82e9cf8151be1 > 2018-06-21 01:19:13,048 WARN > org.apache.hadoop.hbase.master.cleaner.HFileCleaner: Wait more than 60000 ms > for deleting > hdfs://ns1/hbase/archive/data/default/IntegrationTestBigLinkedList/1f60d5dcf666b1e47de3445c369b6411/tiny/29b934a9c8f74ec1a5806f3fabeee094, > exit... > {code} > It looks like we make no progress because we give up after 60 seconds (see > above). The dir in this case has ~17k WALs in it (Cluster has hundreds of > servers and has been running test for a while). When I do a listing, the > count of WALs is not changing. > There is also this output below that does not help. > {code} > 2018-06-21 00:16:05,852 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,gW^6\xB5Z}\x00,1529565022477.4c27c2ef56a0aa3adae325a20d0a229b. > 2018-06-21 00:16:05,855 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,gP\x10\xE2\xD1\xFD\xC2\x00,1529565022477.cee1017900682956e90e94be1e1c10da. > 2018-06-21 00:16:05,858 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,gH\xC3\x8E\xEE\xA1\x07\x00,1529565022477.d4184dddf837bb3c19eb0cc0d87ccdf5. > 2018-06-21 00:16:05,861 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,g6\x82=693\x80,1529565022477.87faebdd14107f09126efa169b543dd6. > 2018-06-21 00:16:05,863 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,gS\xB7\x8C\xC3\xAC\x1F\x80,1529565022477.4d5a3efc33988180ad11c4eac8e64a4a. > 2018-06-21 00:16:05,866 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,g^\xAB\x8A\x98\xB78\x00,1529565022477.6a6945672c19b937fc2ccde13d2aff5d. > 2018-06-21 00:16:05,869 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,gmF2_p\xAE\x00,1529565022477.6607c6fef270c71c44caa1b8e69a63cb. > 2018-06-21 00:16:05,872 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,gi\x9F\x88m\xC2P\x80,1529565022477.5ecbc236145f85d857ffea9ad3c1d89c. > 2018-06-21 00:16:05,875 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,g[\x04\xE0\xA7\x08\xDA\x80,1529565022477.02a09a3ce49e05ea35a2ed51d306b37d. > 2018-06-21 00:16:05,878 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,gp\xEC\xDCQ\x1F\x0B\x80,1529565022477.0ce748886ee50a79691c7569a17ea30a. > 2018-06-21 00:16:05,881 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,g{\xE0\xDA&*$\x00,1529565022477.8d65e26b122c25f411c19555c9103b03. > 2018-06-21 00:16:05,884 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,g\x86\xD4\xD7\xFB5<\x80,1529565022477.88c1deebe2e914d9f323a557d1517ae8. > 2018-06-21 00:16:05,888 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,g\x95o\x7F\xC1\xEE\xB2\x80,1529565022477.1c3849e505edf7efd1391ae8da88ec09. > 2018-06-21 00:16:05,891 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,g\x8E"+\xDE\x91\xF7\x80,1529565022477.bb4eb5e775869e6cc3bc95fbc661c487. > 2018-06-21 00:16:05,894 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,ge\xF8\xDE|\x13\xF3\x00,1529565022477.76afec6687b4ee782177424af54e8767. > 2018-06-21 00:16:05,897 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,gt\x93\x86B\xCDi\x00,1529565022477.0dd540d73e6dd115585de4e576b771ca. > 2018-06-21 00:16:05,900 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,g\x99\x16)\xB3\x9D\x10\x00,1529565022477.0dd49eaefa19cf1935f41661aa4e1dde. > 2018-06-21 00:16:05,903 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,gx:04{\xC6\x80,1529565022477.525b1aff5c2361101afce5f6c05a68b6. > 2018-06-21 00:16:05,906 DEBUG > org.apache.hadoop.hbase.master.balancer.RegionLocationFinder: > HDFSBlocksDistribution not found in cache for region > IntegrationTestBigLinkedList,g\x9C\xBC\xD3\xA5Km\x80,1529565022477.9fbb338f7e006c02c3a84ca926749674. > ... > {code} > And this.... > {code} > 2018-06-21 00:16:39,617 DEBUG > org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will read > input token of size 104 for processing by initSASLContext > 2018-06-21 00:16:39,617 DEBUG > org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will send > token of size 0 from initSASLContext. > 2018-06-21 00:16:39,618 DEBUG > org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will read > input token of size 104 for processing by initSASLContext > 2018-06-21 00:16:39,618 DEBUG > org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will read > input token of size 50 for processing by initSASLContext > 2018-06-21 00:16:39,618 DEBUG > org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will send > token of size 0 from initSASLContext. > 2018-06-21 00:16:39,618 DEBUG > org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will send > token of size 50 from initSASLContext. > 2018-06-21 00:16:39,618 DEBUG > org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will read > input token of size 104 for processing by initSASLContext > 2018-06-21 00:16:39,618 DEBUG > org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClient: SASL client context > established. Negotiated QoP: auth > 2018-06-21 00:16:39,618 DEBUG > org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will send > token of size 0 from initSASLContext. > 2018-06-21 00:16:39,618 DEBUG > org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will read > input token of size 50 for processing by initSASLContext > 2018-06-21 00:16:39,618 DEBUG > org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will send > token of size 50 from initSASLContext. > 2018-06-21 00:16:39,619 DEBUG > org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClient: SASL client context > established. Negotiated QoP: auth > 2018-06-21 00:16:39,619 DEBUG > org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will read > input token of size 50 for processing by initSASLContext > 2018-06-21 00:16:39,619 DEBUG > org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler: Will send > token of size 50 from initSASLContext. > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)