[jira] [Commented] (PHOENIX-1263) Only cache guideposts on physical PTable

James Taylor (JIRA) Sun, 28 Sep 2014 11:48:07 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151175#comment-14151175
 ]


James Taylor commented on PHOENIX-1263:
---------------------------------------

Thanks for the patch, [~ramkrishna]. Here's some feedback:
- Most of our methods that take a tenantId take it as the first argument, so 
please make it the first argument of your new call in MetaDataClient. Also, 
keep the system-table check where it is now (in the lowest-level updateCache 
call), and just set tenantId to null there if (systemTable). Expose just one 
extra public method, updateCache(PName tenantId, String schemaName, String 
tableName), and pass false for alwaysHitServer when it calls through to the 
private method (which you tweak to pass through tenantId).
- In ParallelIterators, you need to call your new updateCache with a null 
tenantId and the physical name of the table. You're calling it now with the 
logical name, which may be a view. Then, instead of passing tableRef through 
getSplits, pass through a PTable: the one you get from result.getTable() on 
the result of the updateCache call. Something like this:
{code}
diff --git a/phoenix-core/src/main/java/org/apache/phoenix/iterate/ParallelIterators.java b/phoenix-core/src/main/java/org/apache/phoenix/iterate/ParallelIterators.java
index a2dabe3..ca3ae04 100644
--- a/phoenix-core/src/main/java/org/apache/phoenix/iterate/ParallelIterators.java
+++ b/phoenix-core/src/main/java/org/apache/phoenix/iterate/ParallelIterators.java
@@ -55,6 +55,7 @@ import org.apache.phoenix.query.ConnectionQueryServices;
 import org.apache.phoenix.query.KeyRange;
 import org.apache.phoenix.query.QueryConstants;
 import org.apache.phoenix.query.QueryServices;
+import org.apache.phoenix.schema.MetaDataClient;
 import org.apache.phoenix.schema.PColumnFamily;
 import org.apache.phoenix.schema.PTable;
 import org.apache.phoenix.schema.PTable.IndexType;
@@ -107,6 +108,8 @@ public class ParallelIterators extends ExplainTable implements ResultIterators {
             RowProjector projector, GroupBy groupBy, Integer limit, ParallelIteratorFactory iteratorFactory)
             throws SQLException {
         super(context, tableRef, groupBy);
         PTable physicalTable = tableRef.getTable();
         String physicalName = tableRef.getTable().getPhysicalName();
         if (!physicalName.equals(physicalTable.getName())) { // tableRef is not for the physical table
             String physicalSchemaName = SchemaUtil.getSchemaNameFromFullName(physicalName);
             String physicalTableName = SchemaUtil.getTableNameFromFullName(physicalName);
             MetaDataClient client = new MetaDataClient(context.getConnection());
             // TODO: this will be an extra RPC to ensure we have the latest guideposts, but is almost always
             // unnecessary. We should instead track the last time an update cache was done for the
             // physical table and not do it again until some interval has passed (it's ok to use stale stats).
             MetaDataMutationResult result = client.updateCache(
                 null, /* use global tenant id to get physical table */
                 physicalSchemaName,
                 physicalTableName);
             physicalTable = result.getTable();
         }
         this.splits = getSplits(context, physicalTable, statement.getHint());
{code}
- In MetaDataEndpointImpl, only call the method that queries the stats table if 
tenantId == null:
{code}
    private PTable getTable(RegionScanner scanner, long clientTimeStamp, long tableTimeStamp)
         throws IOException, SQLException {
       ...
        PTableStats stats = tenantId == null ? updateStatsInternal(physicalTableName.getBytes(), columns) : null;
{code}
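
To make the first point concrete, here is a rough, self-contained sketch of the overload shape I have in mind. It is not the real MetaDataClient: the class name UpdateCacheSketch, the Lookup record of what gets sent, the use of String in place of PName for tenantId, and the "SYSTEM" schema check are all assumptions for illustration only.

```java
/** Hypothetical stand-in for MetaDataClient; illustrates only the overload layout. */
final class UpdateCacheSketch {
    /** Hypothetical record of what the lowest-level call would send to the server. */
    static final class Lookup {
        final String tenantId;        // null means the global (non-tenant) scope
        final String schemaName;
        final String tableName;
        final boolean alwaysHitServer;

        Lookup(String tenantId, String schemaName, String tableName, boolean alwaysHitServer) {
            this.tenantId = tenantId;
            this.schemaName = schemaName;
            this.tableName = tableName;
            this.alwaysHitServer = alwaysHitServer;
        }
    }

    // Assumption: system tables are recognized by schema name in this sketch.
    private static final String SYSTEM_SCHEMA = "SYSTEM";

    private final String connectionTenantId;

    UpdateCacheSketch(String connectionTenantId) {
        this.connectionTenantId = connectionTenantId;
    }

    /** Existing-style entry point: uses the tenant of the connection. */
    Lookup updateCache(String schemaName, String tableName) {
        return updateCache(connectionTenantId, schemaName, tableName, false);
    }

    /** The one extra public method: tenantId comes first, alwaysHitServer is false. */
    Lookup updateCache(String tenantId, String schemaName, String tableName) {
        return updateCache(tenantId, schemaName, tableName, false);
    }

    /** Lowest-level call: the system-table check stays here and nulls out tenantId. */
    private Lookup updateCache(String tenantId, String schemaName, String tableName,
                               boolean alwaysHitServer) {
        boolean systemTable = SYSTEM_SCHEMA.equals(schemaName);
        if (systemTable) {
            tenantId = null;
        }
        return new Lookup(tenantId, schemaName, tableName, alwaysHitServer);
    }
}
```

With this layout, a caller such as ParallelIterators can ask for the physical table by passing a null tenantId, while every existing caller keeps going through the two-argument form unchanged.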


> Only cache guideposts on physical PTable
> ----------------------------------------
>
>                 Key: PHOENIX-1263
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1263
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: James Taylor
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: Phoenix-1263_1.patch
>
>
> Rather than caching the guideposts on all tenant-specific tables, we should 
> cache them only on the physical table. On the client side, we should also 
> update the cache with the latest for the base multi-tenant table when we 
> update the cache for a tenant-specific table. Then, when we look up the 
> guideposts, we should ensure that we're getting them from the physical table.
> Otherwise, it'll be difficult to keep the guideposts cached on the PTable in 
> sync across all tenant-specific tables (not to mention using quite a bit of 
> memory).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
