Rushabh Shah created PHOENIX-6826:
-------------------------------------
Summary: Don't invalidate meta cache if
CQSI#getTableRegionLocation encounters IOException.
Key: PHOENIX-6826
URL: https://issues.apache.org/jira/browse/PHOENIX-6826
Project: Phoenix
Issue Type: Improvement
Components: core
Affects Versions: 4.16.1
Reporter: Rushabh Shah
We have 2 places in CQSI where we invalidate meta data cache (by setting
reload=true) if CQSI#getTableRegionLocation encounters IOException.
See code below
@Override
public HRegionLocation getTableRegionLocation(byte[] tableName, byte[] row)
throws SQLException {
/*
* Use HConnection.getRegionLocation as it uses the cache in
HConnection, to get the region
* to which specified row belongs to.
*/
int retryCount = 0, maxRetryCount = 1;
boolean reload =false;
while (true) {
try {
return
connection.getRegionLocator(TableName.valueOf(tableName)).getRegionLocation(row,
reload);
} catch (org.apache.hadoop.hbase.TableNotFoundException e) {
String fullName = Bytes.toString(tableName);
throw new
TableNotFoundException(SchemaUtil.getSchemaNameFromFullName(fullName),
SchemaUtil.getTableNameFromFullName(fullName));
} catch (IOException e) {
if (retryCount++ < maxRetryCount) { // One retry, in case split
occurs while navigating
reload = true;
continue;
}
throw new
SQLExceptionInfo.Builder(SQLExceptionCode.GET_TABLE_REGIONS_FAIL)
.setRootCause(e).build().buildException();
}
}
}
Once we set reload flag to true, it will invalidate the cache from hbase
client. This means client will make RPC request to hbase:meta region server.
Consider the situation where multiple threads across different jvms make call
to getRegionLocation and if not found in cache, it will make an RPC request.
We have seen requests to hbase:meta region server so high that hbase:meta was
overwhelmed for hours and actual customer queries were failing.
Do we need to invalidate the metadata cache?
--
This message was sent by Atlassian Jira
(v8.20.10#820010)