[ https://issues.apache.org/jira/browse/PHOENIX-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15314909#comment-15314909 ]
Hadoop QA commented on PHOENIX-2940:
------------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12808056/PHOENIX-2940.001.patch
  against master branch at commit 7dcf95a40063a25917a68c56c68fe61a11a4ef8b.
  ATTACHMENT ID: 12808056

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 10 new or modified tests.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 31 warning messages.

    {color:red}-1 release audit{color}. The applied patch generated 7 release audit warnings (more than the master's current 0 warnings).

    {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100:
    + // Avoid querying the stats table because we're holding the rowLock here. Issuing an RPC to a remote
    + rowKeyOrderOptimizable, transactional, updateCacheFrequency, PTableStats.EMPTY_STATS, baseColumnCount,
    + tableStats = useStats() ? context.getConnection().getQueryServices().getTableStats(physicalTableName, currentSCN) : PTableStats.EMPTY_STATS;
    + private static final Logger statsRefreshLogger = LoggerFactory.getLogger(PhoenixStatsRefreshTask.class);
    + statsRefreshLogger.debug("Cannot get Phoenix SYSTEM.STATS table. Maybe it doesn't exist yet", e);
    + PTableStats tableStats = readStatistics(statsHTable, tableName, Long.MAX_VALUE);
    + statsRefreshLogger.debug("Failed to fetch table stats for {}", table.getName(), e);
    + return queryServices.connection.getTable(PhoenixDatabaseMetaData.SYSTEM_STATS_NAME_BYTES);
    + PTableStats readStatistics(HTableInterface statsTable, byte[] tableName, long clientTimeStamp) throws IOException {
    + static class PhoenixStatsCacheRemovalListener implements RemovalListener<ImmutableBytesPtr, PTableStats> {

    {color:red}-1 core tests{color}. The patch failed these unit tests:
    ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.rpc.PhoenixServerRpcIT
    ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.TenantSpecificTablesDMLIT

Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/382//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/382//artifact/patchprocess/patchReleaseAuditWarnings.txt
Javadoc warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/382//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/382//console

This message is automatically generated.

> Remove STATS RPCs from rowlock
> ------------------------------
>
>                 Key: PHOENIX-2940
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2940
>             Project: Phoenix
>          Issue Type: Improvement
>         Environment: HDP 2.3 + Apache Phoenix 4.6.0
>            Reporter: Nick Dimiduk
>            Assignee: Josh Elser
>             Fix For: 4.9.0
>
>         Attachments: PHOENIX-2940.001.patch
>
>
> We have an unfortunate situation wherein we potentially execute many RPCs
> while holding a row lock. This problem is discussed in detail on the user
> list thread ["Write path blocked by MetaDataEndpoint acquiring region
> lock"|http://search-hadoop.com/m/9UY0h2qRaBt6Tnaz1&subj=Write+path+blocked+by+MetaDataEndpoint+acquiring+region+lock].
> In some situations, the
> [MetaDataEndpoint|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L492]
> coprocessor will attempt to refresh its view of the schema definitions and
> statistics. This involves [taking a
> rowlock|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2862],
> executing a scan against the [local
> region|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L542],
> and then a scan against a [potentially
> remote|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L964]
> statistics table.
> This issue is apparently exacerbated by the use of user-provided timestamps
> (in my case, the use of the ROW_TIMESTAMP feature, or perhaps as in
> PHOENIX-2607). When combined with other issues (PHOENIX-2939), we end up with
> total gridlock in our handler threads -- everyone queued behind the rowlock,
> scanning and rescanning SYSTEM.STATS. Because this happens in the
> MetaDataEndpoint, the means by which all clients refresh their knowledge of
> schema, gridlock in that RS can effectively stop all forward progress on the
> cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
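
For readers following the issue description above, here is a minimal sketch of the general idea of keeping a remote statistics lookup out of a lock-protected code path. It is not the code in PHOENIX-2940.001.patch. Every name in it (`StatsOutsideLockSketch`, `TableStats`, `StatsCache`, `buildTableUnderLock`) is hypothetical, and a plain `ReentrantLock` stands in for the HBase row lock. The only assumption carried over from the report is that an empty-stats placeholder (compare `PTableStats.EMPTY_STATS` in the flagged lines) can be returned while a separate refresh fills the cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Illustrative only: none of these types are Phoenix or HBase classes. */
public class StatsOutsideLockSketch {

    /** Stand-in for table statistics; EMPTY is the "no stats known yet" placeholder. */
    public static final class TableStats {
        public static final TableStats EMPTY = new TableStats(0L);
        public final long guidePostCount;

        public TableStats(long guidePostCount) {
            this.guidePostCount = guidePostCount;
        }
    }

    /** Cache that is read under the lock but only refreshed outside of it. */
    public static final class StatsCache {
        private final Map<String, TableStats> cache = new ConcurrentHashMap<>();
        // Hypothetical slow call, standing in for a scan of a remote stats table.
        private final Function<String, TableStats> remoteLoader;

        public StatsCache(Function<String, TableStats> remoteLoader) {
            this.remoteLoader = remoteLoader;
        }

        /** Never invokes the loader: returns the cached value, or EMPTY if none. */
        public TableStats getIfPresent(String table) {
            return cache.getOrDefault(table, TableStats.EMPTY);
        }

        /** Meant to be called from a background task, never while the lock is held. */
        public void refresh(String table) {
            cache.put(table, remoteLoader.apply(table));
        }
    }

    private final ReentrantLock rowLock = new ReentrantLock();
    private final StatsCache statsCache;

    public StatsOutsideLockSketch(StatsCache statsCache) {
        this.statsCache = statsCache;
    }

    /** Does only local work under the lock; no remote call is issued here. */
    public TableStats buildTableUnderLock(String table) {
        rowLock.lock();
        try {
            return statsCache.getIfPresent(table);
        } finally {
            rowLock.unlock();
        }
    }

    public static void main(String[] args) {
        AtomicInteger rpcCount = new AtomicInteger();
        StatsCache cache = new StatsCache(t -> {
            rpcCount.incrementAndGet();
            return new TableStats(42L);
        });
        StatsOutsideLockSketch endpoint = new StatsOutsideLockSketch(cache);

        // Before any refresh, the locked path sees the placeholder and makes no RPC.
        System.out.println(endpoint.buildTableUnderLock("T").guidePostCount); // prints 0
        cache.refresh("T"); // the one simulated RPC, performed outside the lock
        System.out.println(endpoint.buildTableUnderLock("T").guidePostCount); // prints 42
        System.out.println(rpcCount.get()); // prints 1
    }
}
```

The trade-off in this sketch is that a caller may briefly see empty or stale statistics, in exchange for the locked section never waiting on another server.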