[
https://issues.apache.org/jira/browse/PHOENIX-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15830800#comment-15830800
]
James Taylor commented on PHOENIX-3583:
---------------------------------------
Thanks for the explanation, [~elserj]. Now I understand what you're getting at.
There's a small bit of code that decides whether to tack on the IndexMaintainer
to the mutations themselves (as an attribute) or make a separate, single RPC
per region server to cache them for usage when the mutations are processed:
{code}
public static boolean useIndexMetadataCache(PhoenixConnection connection,
        List<? extends Mutation> mutations, int indexMetaDataByteLength) {
    ReadOnlyProps props = connection.getQueryServices().getProps();
    int threshold = props.getInt(INDEX_MUTATE_BATCH_SIZE_THRESHOLD_ATTRIB,
            QueryServicesOptions.DEFAULT_INDEX_MUTATE_BATCH_SIZE_THRESHOLD);
    return (indexMetaDataByteLength > ServerCacheClient.UUID_LENGTH &&
            mutations.size() > threshold);
}
{code}
So the value of INDEX_MUTATE_BATCH_SIZE_THRESHOLD_ATTRIB determines the number
of rows above which a separate RPC is made. The default is only 3 rows. Perhaps
we should bump that up substantially if the RPCs are becoming a bottleneck? It
would have the effect of making the payload larger (by numRowsInBatchToRS *
sizeofIndexMaintainer). Unfortunately, there's no mechanism in HBase to add an
attribute only to the RPC to the RS as opposed to having to repeat it on every
mutation (HBASE-9291).
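
To make the trade-off concrete, here is a rough standalone sketch of the same
decision plus the payload estimate above. It is not Phoenix code: the class
name, the plain-int signatures, and the sample numbers (threshold of 3, a
UUID length of 8 bytes, a 2 KB IndexMaintainer) are all assumptions chosen for
illustration only.

```java
public class IndexMetadataTradeoff {

    // Same condition as useIndexMetadataCache, but with plain ints standing in
    // for the connection props and the mutation list (hypothetical signature).
    static boolean useIndexMetadataCache(int numMutations, int indexMetaDataByteLength,
                                         int threshold, int uuidLength) {
        return indexMetaDataByteLength > uuidLength && numMutations > threshold;
    }

    // Extra bytes sent to one region server when the IndexMaintainer is attached
    // to every mutation: numRowsInBatchToRS * sizeofIndexMaintainer.
    static long attributePayloadBytes(int numRowsInBatchToRS, int sizeofIndexMaintainer) {
        return (long) numRowsInBatchToRS * sizeofIndexMaintainer;
    }

    public static void main(String[] args) {
        int uuidLength = 8;           // assumed value, for illustration
        int maintainerBytes = 2048;   // assumed serialized IndexMaintainer size
        int batchRows = 1000;         // assumed rows in one batch to one RS

        for (int threshold : new int[] {3, 5000}) {
            boolean separateRpc =
                    useIndexMetadataCache(batchRows, maintainerBytes, threshold, uuidLength);
            // With a separate caching RPC, the mutations carry no maintainer payload.
            long extraBytes = separateRpc ? 0 : attributePayloadBytes(batchRows, maintainerBytes);
            System.out.println("threshold=" + threshold
                    + " separateRpc=" + separateRpc
                    + " extraAttributeBytes=" + extraBytes);
        }
    }
}
```

Raising the threshold swaps one caching RPC per region server for a payload
that grows linearly with the batch size, so whether it helps probably depends
on how large the serialized IndexMaintainer is in practice.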
> Prepare IndexMaintainer on server itself
> ----------------------------------------
>
> Key: PHOENIX-3583
> URL: https://issues.apache.org/jira/browse/PHOENIX-3583
> Project: Phoenix
> Issue Type: Bug
> Reporter: Ankit Singhal
> Assignee: Ankit Singhal
> Attachments: PHOENIX-3583.patch
>
>
> -- reuse the cache of PTable and its lifecycle.
> -- With the new implementation, we will be doing an RPC to the meta table per
> mini batch, which could be an overhead, but the same configuration
> "updateCacheFrequency" can be used to control the frequency of touching the
> SYSTEM.CATALOG endpoint for updated PTable or index maintainers.
> -- It is expected that 99% of the time the table is old and the RPC will be
> returned with an empty result (so it may be less costly), as opposed to the
> current implementation where we have to send the index maintainer payload to
> each region server per upsert batch.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)