Josh Elser created PHOENIX-4537:
-----------------------------------
Summary: RegionServer initiating compaction can trigger schema
migration and deadlock the system
Key: PHOENIX-4537
URL: https://issues.apache.org/jira/browse/PHOENIX-4537
Project: Phoenix
Issue Type: Bug
Reporter: Romil Choksi
Fix For: 5.0.0, 4.14.0
[~sergey.soldatov] has been doing some great digging around a test failure
we've been seeing at $dayjob. The situation goes like this.
0. Run some arbitrary load
1. Stop HBase
2. Enable schema mapping ({{phoenix.schema.isNamespaceMappingEnabled=true}} and
{{phoenix.schema.mapSystemTablesToNamespace=true}} in hbase-site.xml)
3. Start HBase
4. By chance, have the SYSTEM.CATALOG table need a compaction before any
client connects for the first time
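For reference, the two settings from step 2 would look roughly like this in hbase-site.xml. This is only a sketch of the two properties named above in the standard Hadoop configuration layout; everything else already in the file is omitted.

```xml
<configuration>
  <!-- Map Phoenix schemas onto HBase namespaces -->
  <property>
    <name>phoenix.schema.isNamespaceMappingEnabled</name>
    <value>true</value>
  </property>
  <!-- Also move the SYSTEM tables into the SYSTEM namespace
       (e.g. SYSTEM.CATALOG becomes SYSTEM:CATALOG) -->
  <property>
    <name>phoenix.schema.mapSystemTablesToNamespace</name>
    <value>true</value>
  </property>
</configuration>
```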
When the RegionServer initiates the compaction, it will end up running
{{UngroupedAggregateRegionObserver.clearTsOnDisabledIndexes}} which opens a
Phoenix connection. While the RegionServer won't upgrade system tables, it
*will* try to migrate them to their namespace-mapped variants (e.g.
SYSTEM.CATALOG to SYSTEM:CATALOG).
However, one of the first steps in the schema migration is to disable the
SYSTEM.CATALOG table. The table can't be disabled until its region is CLOSED,
and the region cannot be CLOSED until the compaction is finished. The
compaction, in turn, is blocked waiting on the Phoenix connection that is
running the migration. *deadlock*
The "obvious" fix is to prevent RegionServers from triggering system table
migrations, but Sergey and I both think that this will end badly (RegionServers
falling over because they expect the tables to be migrated and they aren't).
Thoughts? [~ankit.singhal], [~jamestaylor], any others?