[ https://issues.apache.org/jira/browse/PHOENIX-4537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16335178#comment-16335178 ]
James Taylor edited comment on PHOENIX-4537 at 1/22/18 11:47 PM:
-----------------------------------------------------------------

The upgrade *should not* happen from a server-side connection. We have code in place for this already. See QueryUtil.getConnectionOnServer():
{code}
public static Connection getConnectionOnServer(Properties props, Configuration conf)
        throws ClassNotFoundException, SQLException {
    UpgradeUtil.doNotUpgradeOnFirstConnection(props);
    return getConnection(props, conf);
}
{code}
If we're going to go this route, can't we prevent this migration in a similar manner?

Sounds like we need a bit more discussion on this one before committing given the comfort level, [~elserj]. What about the idea I outlined above of doing PHOENIX-4530 and removing the clearTsOnDisabledIndexes altogether?


was (Author: jamestaylor):
The upgrade *should not* happen from a server-side connection. We have code in place for this already. See QueryUtil.getConnectionOnServer():
{code}
public static Connection getConnectionOnServer(Properties props, Configuration conf)
        throws ClassNotFoundException, SQLException {
    UpgradeUtil.doNotUpgradeOnFirstConnection(props);
    return getConnection(props, conf);
}
{code}
Sounds like we need a bit more discussion on this one before committing given the comfort level, [~elserj]. What about the idea I outlined above of doing PHOENIX-4530 and removing the clearTsOnDisabledIndexes altogether?

> RegionServer initiating compaction can trigger schema migration and deadlock the system
> ---------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4537
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4537
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Romil Choksi
>            Assignee: Josh Elser
>            Priority: Critical
>             Fix For: 5.0.0, 4.14.0
>
>         Attachments: PHOENIX-4537.001.patch
>
>
> [~sergey.soldatov] has been doing some great digging around a test failure
> we've been seeing at $dayjob.
> The situation goes like this.
> 0. Run some arbitrary load
> 1. Stop HBase
> 2. Enable schema mapping ({{phoenix.schema.isNamespaceMappingEnabled=true}}
> and {{phoenix.schema.mapSystemTablesToNamespace=true}} in hbase-site.xml)
> 3. Start HBase
> 4. Circumstantially, have the SYSTEM.CATALOG table need a compaction to run
> before a client first connects
>
> When the RegionServer initiates the compaction, it will end up running
> {{UngroupedAggregateRegionObserver.clearTsOnDisabledIndexes}}, which opens a
> Phoenix connection. While the RegionServer won't upgrade system tables, it
> *will* try to migrate them into the schema-mapped variants (e.g.
> SYSTEM.CATALOG to SYSTEM:CATALOG).
>
> However, one of the first steps in the schema migration is to disable the
> SYSTEM.CATALOG table. The table can't be disabled until the region is CLOSED,
> and the region cannot be CLOSED until the compaction is finished. *deadlock*
>
> The "obvious" fix is to prevent RegionServers from triggering system table
> migrations, but Sergey and [~elserj] both think that this will end badly
> (RegionServers falling over because they expect the tables to be migrated and
> they aren't).
>
> Thoughts? [~ankit.singhal], [~jamestaylor], any others?

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
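
For readers following the thread, the suggestion in the comment (guard the migration the same way getConnectionOnServer() already guards the upgrade) could look roughly like the sketch below. This is only an illustration of the pattern, not Phoenix code: the class name, the property key, and both methods are hypothetical, and it assumes the guard can be carried as a connection property, which the thread does not confirm.
{code}
import java.util.Properties;

/**
 * Illustrative sketch only. None of these names are Phoenix APIs;
 * they are hypothetical stand-ins for a server-side migration guard.
 */
final class ServerSideMigrationGuard {
    // Hypothetical property key, invented for this example.
    static final String SKIP_MIGRATION_ATTRIB = "example.skipSystemTableMigration";

    private ServerSideMigrationGuard() {}

    /**
     * Returns a copy of the given properties flagged as server-side,
     * so that a connection opened with them would skip the migration.
     */
    static Properties markServerSide(Properties props) {
        Properties copy = new Properties();
        copy.putAll(props);
        copy.setProperty(SKIP_MIGRATION_ATTRIB, Boolean.TRUE.toString());
        return copy;
    }

    /**
     * Decides whether a connection should attempt the system table migration:
     * only when namespace mapping is enabled and the skip flag is not set.
     */
    static boolean shouldMigrateSystemTables(Properties props, boolean namespaceMappingEnabled) {
        boolean skip = Boolean.parseBoolean(props.getProperty(SKIP_MIGRATION_ATTRIB, "false"));
        return namespaceMappingEnabled && !skip;
    }
}
{code}
Under these assumptions, a client connection with namespace mapping enabled would still run the migration, while a connection opened during compaction with flagged properties would not, so it would never try to disable SYSTEM.CATALOG. The sketch does not address the concern raised in the description that RegionServers may fail if they expect migrated tables that do not exist yet.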