Andrew Kyle Purtell created PHOENIX-7820:
--------------------------------------------
Summary: ConnectionQueryServicesImpl.createSnapshot bounded retry
on transient exception
Key: PHOENIX-7820
URL: https://issues.apache.org/jira/browse/PHOENIX-7820
Project: Phoenix
Issue Type: Sub-task
Components: core
Reporter: Andrew Kyle Purtell
Assignee: Andrew Kyle Purtell
Fix For: 5.4.0, 5.3.1
{{ConnectionQueryServicesImpl.createSnapshot()}} invokes {{admin.snapshot()}}
during the Phoenix upgrade path. Because the call is not retried, a transient
HMaster issue surfaces as an upgrade failure. For example, the master's
per-table lock can be briefly held by a concurrent admin operation, or an
RPC-level retry can resubmit an already-accepted snapshot request, causing the
master to reject the duplicate. The fix is to wrap the snapshot call in a small
bounded retry loop (5 attempts, 1 s backoff).
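
A minimal sketch of such a bounded retry loop follows. It is not the actual
patch. The class {{BoundedRetry}}, the interface {{SnapshotAction}}, and the
method {{runWithRetry}} are hypothetical names used for illustration only, and
the action stands in for the {{admin.snapshot()}} call. As a simplification it
retries on any {{IOException}}; a real fix would presumably retry only on
exceptions it considers transient.

```java
import java.io.IOException;

public class BoundedRetry {

    /** Hypothetical stand-in for the snapshot call made during upgrade. */
    @FunctionalInterface
    public interface SnapshotAction {
        void run() throws IOException;
    }

    /**
     * Runs the action up to maxAttempts times, sleeping backoffMs between
     * failed attempts. Rethrows the last IOException if every attempt fails.
     */
    public static void runWithRetry(SnapshotAction action, int maxAttempts, long backoffMs)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                action.run();
                return; // success, stop retrying
            } catch (IOException e) {
                // Assumption: every IOException is treated as transient here.
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMs); // fixed backoff, no sleep after the final attempt
                }
            }
        }
        throw last; // all attempts failed
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated action: fails twice, then succeeds on the third attempt.
        runWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("simulated transient failure");
            }
        }, 5, 1000L);
        System.out.println("attempts used: " + calls[0]);
    }
}
```

With the values from the description, the call site would pass 5 attempts and
a 1000 ms backoff, so in the worst case the loop adds about 4 s of sleep before
the last exception is rethrown.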
--
This message was sent by Atlassian Jira
(v8.20.10#820010)