[
https://issues.apache.org/jira/browse/PHOENIX-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702197#comment-14702197
]
James Taylor commented on PHOENIX-1812:
---------------------------------------
Thanks for the revisions, [~tdsilva]. I think a few things can be simplified a
bit more now that we're consistently tracking when we resolved the table.
Here's some feedback:
- Combine the logic of the getCurrentScn() and getCurrentTxnTimestamp() to get
the correct client timestamp wrt metadata. If transactional, we should be using
the read pointer scaled down by 1M through our TransactionUtil function. If non
transactional, then the logic will look like the getCurrentScn() function.
{code}
+ private long getCurrentScn() {
Long scn = connection.getSCN();
- long clientTimeStamp = scn == null ? HConstants.LATEST_TIMESTAMP : scn;
- return clientTimeStamp;
+ long currentScn = scn == null ? HConstants.LATEST_TIMESTAMP : scn;
+ return currentScn;
}
+ private long getCurrentTxnTimestamp() {
+ Transaction transaction =
connection.getMutationState().getTransaction();
+ long currentTxnTimestamp = transaction == null ?
HConstants.LATEST_TIMESTAMP : transaction.getReadPointer();
+ return currentTxnTimestamp;
+ }
+
{code}
- Simplify the logic in MetaDataClient.updateCache() by not having a special
case for scn versus txn in this code:
{code}
+ long currentScn = getCurrentScn();
+ long currentTxnTimestamp = getCurrentTxnTimestamp();
+ // Do not make rpc to getTable if
+ // 1. table is a system table
+ // 2. currentScn is set and equals tableTablestamp-1 (in which case
there can't possibly be a newer schema)
+ // 3. currentScn is set and equals tableResolvedTimestamp
+ // 4. table is transactional and the current txn timestamp equals
tableResolvedTimestamp
+ if (table != null && !alwaysHitServer
+ && ( systemTable ||
+ (currentScn!=HConstants.LATEST_TIMESTAMP &&
(tableTimestamp == currentScn-1 || currentScn == tableResolvedTimestamp)) ||
+ (table.isTransactional() &&
currentTxnTimestamp == tableResolvedTimestamp))) {
+ return new
MetaDataMutationResult(MutationCode.TABLE_ALREADY_EXISTS,
QueryConstants.UNSET_TIMESTAMP, table);
}
{code}
- We don't need that special case of currentScn!=HConstants.LATEST_TIMESTAMP
nor do we need the special case of tableTimestamp == currentScn-1. If the table
was resolved already at our scn, then it shouldn't be resolved again. Something
like this instead:
{code}
+ long resolvedTimestamp = getResolvedTimestamp();
+ // Do not make rpc to getTable if
+ // 1. table is a system table
+ // 2. table was already resolved as-of that timestamp
+ if (table != null && !alwaysHitServer && ( systemTable ||
resolvedTimestamp == tableResolvedTimestamp) {
+ return new
MetaDataMutationResult(MutationCode.TABLE_ALREADY_EXISTS,
QueryConstants.UNSET_TIMESTAMP, table);
}
{code}
- For the rest of the MetaDataClient.updateCache(), make sure to pass through
resolvedTimestamp. I think you'll need to pass this through the
addIndexesFromPhysicalTable call too. That way, if we're transactional, we're
consistently resolving everything as of the scaled down read pointer.
{code}
int maxTryCount = tenantId == null ? 1 : 2;
@@ -347,7 +377,8 @@ public class MetaDataClient {
do {
final byte[] schemaBytes = PVarchar.INSTANCE.toBytes(schemaName);
final byte[] tableBytes = PVarchar.INSTANCE.toBytes(tableName);
- result = connection.getQueryServices().getTable(tenantId,
schemaBytes, tableBytes, tableTimestamp, clientTimeStamp);
+ ConnectionQueryServices queryServices =
connection.getQueryServices();
+ result = queryServices.getTable(tenantId, schemaBytes,
tableBytes, tableTimestamp, resolvedTimestamp); // HERE
if (SYSTEM_CATALOG_SCHEMA.equals(schemaName)) {
return result;
@@ -372,14 +403,18 @@ public class MetaDataClient {
// server again.
if (table != null) {
// Ensures that table in result is set to table found in
our cache.
- result.setTable(table);
if (code == MutationCode.TABLE_ALREADY_EXISTS) {
- // Although this table is up-to-date, the parent table
may not be.
- // In this case, we update the parent table which may
in turn pull
- // in indexes to add to this table.
- if (addIndexesFromPhysicalTable(result)) {
- connection.addTable(result.getTable());
- }
+ result.setTable(table);
+ // Although this table is up-to-date, the parent table
may not be.
+ // In this case, we update the parent table which may
in turn pull
+ // in indexes to add to this table.
+ if (addIndexesFromPhysicalTable(result,
resolvedTimestamp)) { // HERE (I think)
+ connection.addTable(result.getTable(),
resolvedTimestamp); // HERE
+ }
+ else {
+ // if we aren't adding the table, we still need
to update the resolved time of the table
+ connection.updateResolvedTimestamp(table,
resolvedTimestamp); // HERE
+ }
return result;
}
{code}
- Add one more overloaded updateCache call that allows us to pass in the
resolveTimestamp when we already know it. Maybe a Long parameter with null
being the default? We'll use this from addIndexesFromPhysicalTable where we
already have the resolvedTimestamp, as otherwise when we call it with the
parent table, we'll end up making another RPC to get the latest parent table
metadata. I think it's better to consistently use the same resolveTimestamp for
these cases.
- Seems like there are a couple of place where you're calling getCurrentSCN(),
but I'm not sure if this takes into account if a table or index is
transactional. Maybe just make a pass through and ensure it's consistent. For
example:
{code}
@@ -2700,13 +2747,13 @@ public class MetaDataClient {
*/
boolean isSharedIndex = table.getViewIndexId() != null;
if (isSharedIndex) {
- return
connection.getQueryServices().getTableStats(table.getPhysicalName().getBytes(),
getClientTimeStamp());
+ return
connection.getQueryServices().getTableStats(table.getPhysicalName().getBytes(),
getCurrentScn());
}
{code}
> Only sync table metadata when necessary
> ---------------------------------------
>
> Key: PHOENIX-1812
> URL: https://issues.apache.org/jira/browse/PHOENIX-1812
> Project: Phoenix
> Issue Type: Sub-task
> Reporter: James Taylor
> Assignee: Thomas D'Silva
> Attachments: PHOENIX-1812-v2.patch, PHOENIX-1812-v3.patch,
> PHOENIX-1812-v4-WIP.patch, PHOENIX-1812-v5.patch, PHOENIX-1812.patch,
> PHOENIX-1812.patch, PHOENIX-1812.patch
>
>
> With transactions, we hold the timestamp at the point when the transaction
> was opened. We can prevent the MetaDataEndpoint getTable RPC in
> MetaDataClient.updateCache() to check that the client has the latest table if
> we've already checked at the current transaction ID timestamp. We can keep
> track of which tables we've already updated in PhoenixConnection.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)