[ https://issues.apache.org/jira/browse/PHOENIX-7484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17905727#comment-17905727 ]
Sanjeet Malhotra commented on PHOENIX-7484:
-------------------------------------------
> Is the cache miss due to UCF being ALWAYS?
[~vjasani] The PhoenixConnection#getTable() call that I fixed, which is used
during mutation plan creation, does not even check UCF
(UPDATE_CACHE_FREQUENCY). I find that odd, and in my opinion it needs to be
fixed. There is already a bug open for this:
https://issues.apache.org/jira/browse/PHOENIX-4475.
> Upserts on a multi-tenant tables using tenant connection are taking 5K-6K %
> more time than non-tenant connection
> ----------------------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-7484
> URL: https://issues.apache.org/jira/browse/PHOENIX-7484
> Project: Phoenix
> Issue Type: Bug
> Reporter: Sanjeet Malhotra
> Assignee: Sanjeet Malhotra
> Priority: Major
>
> Upserts over a tenant connection on a multi-tenant table take 5K-6K% more
> time than upserts over a non-tenant connection for 2M rows. Here, the time
> taken means the total time spent in the `executeUpdate()` and `commit()`
> calls. The batch size and schema were the same when testing with the tenant
> connection and the non-tenant connection.
> On further analysis, it turned out that when doing upserts (for 2M rows) on
> a multi-tenant table over a tenant connection, 13K-14K% more time was spent
> in the `executeUpdate()` call than over a non-tenant connection. This whole
> regression comes from the mutation plan creation phase of `executeUpdate()`.
> The root cause is that, with a tenant connection, we always hit SYSCAT to
> get the PTable object during mutation plan creation. So every call to
> `executeUpdate()` over a tenant connection results in a PTable lookup from
> SYSCAT during mutation plan creation, adding ~1 ms to every call, and for
> 2M rows this accumulates to 29-33 mins.
>
> For multi-tenant tables, the PTableKey in the metadata cache has a null
> tenant Id because the table was created over a non-tenant connection. When
> we use a tenant connection for upserts, the PTableKey used to look up the
> PTableRef in the client-side metadata cache carries the connection's tenant
> Id, i.e. it is non-null. Thus, the PTableRef lookup results in a cache miss,
> and we immediately fall back to `getTableNoCache()`, which ends up hitting
> SYSCAT. Instead, we should first fall back to looking in the metadata cache
> again with a null tenant Id in the PTableKey, and only if we still don't
> find the PTableRef should we fall back to `getTableNoCache()`.
>
> Code pointer:
> https://github.com/apache/phoenix/blob/7682e3cee82e9cecb952eddaade1c544e6bd502d/phoenix-core-client/src/main/java/org/apache/phoenix/jdbc/PhoenixConnection.java#L766-L768
>
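The lookup order proposed in the description above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual Phoenix code: `Key`, `TableCache`, `put()` and the `serverLookups` counter are hypothetical stand-ins for PTableKey, the client metadata cache and a SYSCAT round trip, and the `getTable()` and `getTableNoCache()` signatures here are simplified assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class TenantCacheFallbackSketch {

    // Hypothetical stand-in for PTableKey: tenantId is null for tables
    // created over a non-tenant (global) connection.
    static final class Key {
        final String tenantId;
        final String name;

        Key(String tenantId, String name) {
            this.tenantId = tenantId;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return Objects.equals(tenantId, other.tenantId)
                    && Objects.equals(name, other.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(tenantId, name);
        }
    }

    // Hypothetical stand-in for the client-side metadata cache.
    static final class TableCache {
        private final Map<Key, String> cache = new HashMap<>();
        int serverLookups = 0; // counts simulated SYSCAT round trips

        void put(String tenantId, String name, String table) {
            cache.put(new Key(tenantId, name), table);
        }

        String getTable(String tenantId, String name) {
            // 1. Look up with the connection's tenant Id.
            String table = cache.get(new Key(tenantId, name));
            // 2. On a miss over a tenant connection, retry with a null
            //    tenant Id, which is how multi-tenant tables are keyed.
            if (table == null && tenantId != null) {
                table = cache.get(new Key(null, name));
            }
            // 3. Only if both cache lookups miss, go to the server.
            if (table == null) {
                table = getTableNoCache(tenantId, name);
            }
            return table;
        }

        // Simulates the expensive SYSCAT lookup; the result is not cached
        // here, to keep the sketch short.
        String getTableNoCache(String tenantId, String name) {
            serverLookups++;
            return "loaded:" + name;
        }
    }

    public static void main(String[] args) {
        TableCache cache = new TableCache();
        cache.put(null, "MT_TABLE", "ptable:MT_TABLE");

        // Tenant connection finds the globally keyed entry without a server hit.
        System.out.println(cache.getTable("tenant1", "MT_TABLE"));
        System.out.println("server lookups: " + cache.serverLookups);
    }
}
```

With only step 1 in place, every `getTable("tenant1", ...)` call in this sketch would reach `getTableNoCache()`, which mirrors the per-call SYSCAT hit described in the issue.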
--
This message was sent by Atlassian Jira
(v8.20.10#820010)