[
https://issues.apache.org/jira/browse/PHOENIX-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149649#comment-14149649
]
James Taylor commented on PHOENIX-1263:
---------------------------------------
The work you're doing on PHOENIX-1296 uncovered this issue, but the fix is
different from what your patch does. I propose that we purposely *don't*
store guidepost information on tenant-specific tables. The reason is that we
may have 100K+ tenants, and it'd be extremely wasteful to duplicate the
guidepost information on every tenant-specific table. We should instead not even
attempt to read the stats table when we build a tenant-specific table (i.e.
tenantId is not null). The other issue that makes it problematic to store the
guideposts on tenant-specific tables is how we'd need to invalidate them all
when the base/physical table is analyzed. In this case, if we did store them
there, all the cached tenant-specific tables would need to be invalidated
(otherwise they'd continue to store the old guideposts).
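A minimal sketch of the "skip the stats table when tenantId is not null" check
might look like the following. StatsStore and guidepostsFor are hypothetical
stand-ins invented for illustration, not existing Phoenix classes or methods;
the real change would live wherever the PTable is built.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TenantStatsSkipSketch {
    /** Hypothetical stand-in for the stats table: physical table name -> guidepost keys. */
    static final class StatsStore {
        private final Map<String, List<String>> guidepostsByTable = new HashMap<>();
        int reads = 0; // counts how often the stats table is actually consulted

        void put(String physicalName, List<String> guideposts) {
            guidepostsByTable.put(physicalName, guideposts);
        }

        List<String> read(String physicalName) {
            reads++;
            List<String> found = guidepostsByTable.get(physicalName);
            return found == null ? Collections.<String>emptyList() : found;
        }
    }

    /** Returns the guideposts to attach to the table being built (hypothetical helper). */
    static List<String> guidepostsFor(String tenantId, String physicalName, StatsStore stats) {
        if (tenantId != null) {
            // Tenant-specific table: don't even attempt to read the stats table.
            return Collections.emptyList();
        }
        // Physical table: the only place guideposts are stored.
        return stats.read(physicalName);
    }

    public static void main(String[] args) {
        StatsStore stats = new StatsStore();
        stats.put("S.T", Arrays.asList("k10", "k20", "k30"));
        System.out.println("physical: " + guidepostsFor(null, "S.T", stats).size());
        System.out.println("tenant:   " + guidepostsFor("tenant1", "S.T", stats).size());
        System.out.println("stats reads: " + stats.reads);
    }
}
```

With this shape, re-analyzing the physical table never requires touching any
cached tenant-specific table, since none of them hold guideposts.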
My proposal is to store the guideposts only on the physical table. Then on the
client-side, make sure we're always using the physical table to get the
guideposts. So for the DefaultParallelIteratorRegionSplitter constructor,
instead of storing the TableRef, just store the PTable, and get the PTable
through a new utility method created by refactoring the
MetaDataClient.updateCache(String schemaName, String tableName) method. The
reason to refactor is that you don't want to use the connection you have to
resolve the table, because it might give you back a tenant-specific table and
you want to make sure you get the physical table. Just add a tenantId argument
and pass in null for your call, while the other existing method would call it
with connection.getTenantId(). Then your call can get the PTable from
result.getTable().
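Roughly, the refactoring could be sketched as below. The Table, Result,
Connection and MetaDataClient classes here are simplified stand-ins written
only for this example (they are not the real Phoenix types), and the in-memory
catalog map is an assumption standing in for the actual metadata lookup. Only
the overload shape, the null tenantId for the physical table, and
result.getTable() come from the proposal above.

```java
import java.util.HashMap;
import java.util.Map;

public class PhysicalTableLookupSketch {
    /** Simplified stand-in for PTable. */
    static final class Table {
        final String tenantId;    // null for the physical table
        final String name;
        final int guidepostCount; // tenant-specific tables carry none
        Table(String tenantId, String name, int guidepostCount) {
            this.tenantId = tenantId;
            this.name = name;
            this.guidepostCount = guidepostCount;
        }
    }

    /** Simplified stand-in for the result returned by updateCache. */
    static final class Result {
        private final Table table;
        Result(Table table) { this.table = table; }
        Table getTable() { return table; }
    }

    /** Simplified stand-in for the connection; only exposes the tenant id. */
    static final class Connection {
        private final String tenantId;
        Connection(String tenantId) { this.tenantId = tenantId; }
        String getTenantId() { return tenantId; }
    }

    static final class MetaDataClient {
        private final Connection connection;
        private final Map<String, Table> catalog; // assumed in-memory metadata

        MetaDataClient(Connection connection, Map<String, Table> catalog) {
            this.connection = connection;
            this.catalog = catalog;
        }

        /** Existing signature: resolves relative to the connection's tenant. */
        Result updateCache(String schemaName, String tableName) {
            return updateCache(connection.getTenantId(), schemaName, tableName);
        }

        /** New overload: the caller chooses the tenant; null means the physical table. */
        Result updateCache(String tenantId, String schemaName, String tableName) {
            return new Result(catalog.get(key(tenantId, schemaName + "." + tableName)));
        }
    }

    static String key(String tenantId, String fullName) {
        return (tenantId == null ? "" : tenantId) + "|" + fullName;
    }

    /** What the region splitter would call to get the table that holds guideposts. */
    static Table physicalTable(MetaDataClient client, String schemaName, String tableName) {
        return client.updateCache(null, schemaName, tableName).getTable();
    }

    public static void main(String[] args) {
        Map<String, Table> catalog = new HashMap<>();
        catalog.put(key(null, "S.T"), new Table(null, "S.T", 3));
        catalog.put(key("tenant1", "S.T"), new Table("tenant1", "S.T", 0));
        MetaDataClient client = new MetaDataClient(new Connection("tenant1"), catalog);

        Table viaConnection = client.updateCache("S", "T").getTable();
        Table physical = physicalTable(client, "S", "T");
        System.out.println("via connection tenant: " + viaConnection.tenantId);
        System.out.println("physical guideposts:   " + physical.guidepostCount);
    }
}
```

The point of the overload is that the splitter never depends on which tenant
the connection happens to belong to.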
> Only cache guideposts on physical PTable
> ----------------------------------------
>
> Key: PHOENIX-1263
> URL: https://issues.apache.org/jira/browse/PHOENIX-1263
> Project: Phoenix
> Issue Type: Sub-task
> Reporter: James Taylor
> Assignee: ramkrishna.s.vasudevan
>
> Rather than caching the guideposts on all tenant-specific tables, we should
> cache them only on the physical table. On the client side, we should also
> update the cache with the latest for the base multi-tenant table when we
> update the cache for a tenant-specific table. Then when we look up the
> guideposts, we should ensure that we're getting them from the physical table.
> Otherwise, it'll be difficult to keep the guideposts cached on the PTable in
> sync across all tenant-specific tables (not to mention using quite a bit of
> memory).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)