[jira] [Work logged] (HIVE-27097) Improve the retry strategy for Metastore client and server

ASF GitHub Bot (Jira) Mon, 20 Mar 2023 09:02:38 -0700


     [ 
https://issues.apache.org/jira/browse/HIVE-27097?focusedWorklogId=851816&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851816
 ]


ASF GitHub Bot logged work on HIVE-27097:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Mar/23 16:01
            Start Date: 20/Mar/23 16:01
    Worklog Time Spent: 10m 
      Work Description: wecharyu commented on code in PR #4076:
URL: https://github.com/apache/hive/pull/4076#discussion_r1142356918


##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java:
##########
@@ -227,6 +237,23 @@ public Result invokeInternal(final Object proxy, final Method method, final Obje
     }
   }
 
+  private boolean isRecoverableException(Throwable t) {

Review Comment:
   The RPC flow in HMS looks like:
   **RetryingMetaStoreClient <---> RetryingHMSHandler <---> DBMS**
   
   Such recoverable exceptions always occur between `RetryingHMSHandler` and the 
`DBMS`, so I think it is `RetryingHMSHandler`'s duty to handle the retry. 
Handling it there can also reduce reconnections from the metastore client to 
the metastore server.
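
   To make the idea concrete, here is a minimal sketch of a server-side check, not the code from this PR. It assumes that a failure is worth retrying unless its cause chain contains `java.sql.SQLIntegrityConstraintViolationException`, the non-retryable example named in the issue description. The class name `RecoverableExceptions`, the method name `isRecoverable`, and the depth limit are hypothetical.

```java
import java.sql.SQLIntegrityConstraintViolationException;

// Hypothetical helper for illustration only; not the method added in PR #4076.
final class RecoverableExceptions {
    // Assumed bound so that a cyclic cause chain cannot loop forever.
    private static final int MAX_CAUSE_DEPTH = 32;

    private RecoverableExceptions() {}

    /**
     * Returns false when the cause chain contains an exception that a retry
     * cannot fix, and true otherwise.
     */
    static boolean isRecoverable(Throwable t) {
        int depth = 0;
        for (Throwable cur = t; cur != null && depth < MAX_CAUSE_DEPTH; cur = cur.getCause()) {
            if (cur instanceof SQLIntegrityConstraintViolationException) {
                // A constraint violation fails the same way on every attempt.
                return false;
            }
            depth++;
        }
        return true;
    }
}
```

   With a check like this in the handler, a database hiccup is retried on the server, while a constraint violation is returned to the client at once.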





Issue Time Tracking
-------------------

    Worklog Id:     (was: 851816)
    Time Spent: 2h 50m  (was: 2h 40m)

> Improve the retry strategy for Metastore client and server
> ----------------------------------------------------------
>
>                 Key: HIVE-27097
>                 URL: https://issues.apache.org/jira/browse/HIVE-27097
>             Project: Hive
>          Issue Type: Improvement
>          Components: Hive
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Wechar
>            Assignee: Wechar
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> *Background*
> Hive provides *{{RetryingMetaStoreClient}}* and *{{RetryingHMSHandler}}* to 
> retry when a Thrift request fails:
>  * RetryingMetaStoreClient retries on *Thrift-related exceptions* and some 
> *MetaException*
>  * RetryingHMSHandler retries on every {*}JDOException{*} or 
> *NucleusException*.
> *Motivation*
> The current retry mechanism leads to many unnecessary retries in both client 
> and server. To simplify the process, we introduce the following retry mechanism:
>  * The client side only handles communication errors, i.e., 
> {*}TTransportException{*}.
>  * The server side can skip exceptions that always fail again even with a 
> retry, like {*}SQLIntegrityConstraintViolationException{*}.
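
The client-side rule above can be sketched as a small retry loop. This is an illustration under stated assumptions, not the RetryingMetaStoreClient implementation: `TransportOnlyRetry`, `call`, and `maxAttempts` are hypothetical names, and the nested `TransportException` is a stand-in for Thrift's `TTransportException` so that the sketch compiles without the Thrift library.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch: retry a request only when the transport fails.
final class TransportOnlyRetry {
    /** Stand-in for Thrift's TTransportException (assumption, keeps the sketch dependency-free). */
    static class TransportException extends Exception {
        TransportException(String message) {
            super(message);
        }
    }

    private TransportOnlyRetry() {}

    /** Runs the request, retrying up to maxAttempts times on transport errors only. */
    static <T> T call(Callable<T> request, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        TransportException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.call();
            } catch (TransportException e) {
                // Communication failure: a real client would reconnect before the next attempt.
                last = e;
            }
            // Any other exception is not caught and propagates without a retry.
        }
        throw last;
    }
}
```

A request that fails with any other exception is reported after a single attempt, which is the reduction in unnecessary client retries that the description aims for.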



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
