[ 
https://issues.apache.org/jira/browse/HBASE-24686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17153100#comment-17153100
 ] 

Sean Busbey commented on HBASE-24686:
-------------------------------------

Better logging here especially including who called close would help us chase 
down a master failure case [~mchakka] has seen unsteadily in test lately for 
branch-2.2. Basically the master's primary connection gets closed but the 
master does not abort and then a bunch of stuff fails.

e.g. the master will log will show the normalizer failing

{code}

2020-07-03 09:57:42,005 ERROR 
org.apache.hadoop.hbase.master.normalizer.RegionNormalizerChore: Failed to 
normalize regions.
java.io.IOException: connection is closed
        at 
org.apache.hadoop.hbase.MetaTableAccessor.getMetaHTable(MetaTableAccessor.java:241)
        at 
org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:797)
        at 
org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:768)
        at 
org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:727)
        at 
org.apache.hadoop.hbase.MetaTableAccessor.fullScanTables(MetaTableAccessor.java:215)
        at 
org.apache.hadoop.hbase.master.TableStateManager.getTablesInStates(TableStateManager.java:189)
        at 
org.apache.hadoop.hbase.master.HMaster.normalizeRegions(HMaster.java:1815)
        at 
org.apache.hadoop.hbase.master.normalizer.RegionNormalizerChore.chore(RegionNormalizerChore.java:48)
        at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:188)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at 
org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
{code}

also if you try to use the shell or the java api to create tables that will 
fail at a similar point once the master needs to call 
{{MetaTableAccessor.getMetaHTable}}.

Right now the only indication of a connection closing is a line like this:

{code}
2020-07-03 09:52:38,751 INFO 
org.apache.hadoop.hbase.client.ConnectionImplementation: Closing master 
protocol: MasterService
{code}

a good starting point would be to update ConnectionImplementation with an INFO 
message that says when we are done closing everything and a DEBUG message at 
the start that includes the call stack.

> [LOG] Log improvement in Connection#close
> -----------------------------------------
>
>                 Key: HBASE-24686
>                 URL: https://issues.apache.org/jira/browse/HBASE-24686
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client, logging
>    Affects Versions: 2.2.3
>            Reporter: mokai
>            Priority: Major
>
> We met some customers used hbase connection improperly, some threads call 
> failed since the shared connection closed by one of the threads.
> It's better to print the details when connection closing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to