[ 
https://issues.apache.org/jira/browse/PHOENIX-7727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7727:
-------------------------------------

    Assignee: Viraj Jasani  (was: Tanuj Khurana)

> Eliminate IndexMetadataCache RPCs by leveraging server PTable cache
> -------------------------------------------------------------------
>
>                 Key: PHOENIX-7727
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7727
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Tanuj Khurana
>            Assignee: Viraj Jasani
>            Priority: Major
>
> ServerCachingEndpointImpl coproc implements the server cache RPC protocol. 
> One use case of the cache RPCs is server-side index updates. Whenever the 
> client commits a batch of mutations, if the mutation count is greater than 
> _*phoenix.index.mutableBatchSizeThreshold*_ (default value 3), the client 
> does not send the index maintainer metadata as a mutation attribute; instead 
> it uses the server cache RPCs to populate the cache on the region servers 
> and sends just the cache key as a mutation attribute. This is an 
> optimization that avoids duplicating the index maintainer information on 
> every mutation of the batch. Batches of 100 - 1000 mutations are typical, so 
> the optimization is useful.
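> A minimal sketch of that client-side decision follows; everything except 
> the config key is a simplified stand-in, not the actual Phoenix client code:
> {code:java}
> import java.util.List;
> import org.apache.hadoop.hbase.client.Mutation;
>
> // Hedged sketch of the client-side decision described above. Everything
> // except the config key is a simplified stand-in, not real Phoenix code.
> class IndexMetadataAttachmentSketch {
>     static final String THRESHOLD_KEY = "phoenix.index.mutableBatchSizeThreshold";
>
>     void attach(List<Mutation> batch, byte[] serializedMaintainers, int threshold) {
>         if (batch.size() > threshold) {
>             // Large batch: populate the server cache once per target region
>             // server, then tag each mutation with only the cache key.
>             byte[] cacheId = sendServerCacheRpcs(serializedMaintainers);
>             for (Mutation m : batch) {
>                 m.setAttribute("INDEX_UUID", cacheId); // attribute name is illustrative
>             }
>         } else {
>             // Small batch: inline the full index maintainer metadata.
>             for (Mutation m : batch) {
>                 m.setAttribute("INDEX_MD", serializedMaintainers); // illustrative
>             }
>         }
>     }
>
>     // Stand-in for the ServerCachingEndpointImpl cache-population round trip.
>     byte[] sendServerCacheRpcs(byte[] payload) {
>         return new byte[] {0, 1, 2, 3, 4, 5, 6, 7}; // placeholder cache key
>     }
> }
> {code}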
> However, this RPC approach has several downsides (a hedged sketch of the 
> flow in steps 1 - 3 follows this list):
>  # In order to determine which region servers to send the cache RPCs to, 
> the client first builds a scan ranges object from the primary keys of the 
> mutations. The size of this object is proportional to the commit size, and 
> because it is rebuilt on every commit batch it adds GC overhead.
>  # The client then calls _*getAllTableRegions*_, which can make calls to 
> meta if the table region locations are not cached in the HBase client meta 
> cache, adding latency on the client side. Once it receives the region list, 
> it intersects the region boundaries with the scan ranges it constructed to 
> determine the region servers hosting the regions that will receive the 
> mutations.
>  # The actual caching RPCs are then executed in parallel, but they are 
> subject to standard HBase client retry policies and can be retried on 
> timeouts or regions in transition, potentially adding more latency.
>  # Furthermore, when the server processes these mutations in the 
> IndexRegionObserver coproc and tries to fetch the index maintainer metadata 
> from the cache, it is not guaranteed to find the cache entry. This happens 
> when a region moves or splits after the cache RPC is sent but before the 
> data table mutations arrive. It also happens when the server is overloaded 
> and RPCs queue up, so that by the time the server processes the batch RPC 
> the cache entry has already expired (default TTL 30s). If the metadata is 
> not found, a DoNotRetryIOException is returned to the client and handled 
> within the Phoenix MutationState class, which then repeats the entire 
> sequence. Worse, on receiving this error the Phoenix client first calls 
> _*clearTableRegionCache*_ before retrying (a sketch of this retry path 
> follows the error logs below).
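> To make steps 1 - 3 concrete, here is a hedged sketch of the client-side 
> flow; the types and helper methods are simplified stand-ins, not the actual 
> Phoenix internals:
> {code:java}
> import java.util.List;
> import org.apache.hadoop.hbase.HRegionLocation;
> import org.apache.hadoop.hbase.client.Mutation;
>
> // Hedged sketch of steps 1 - 3 above; ScanRangesSketch and the helper
> // methods are illustrative stand-ins, not the actual Phoenix classes.
> class ServerCacheRpcFlowSketch {
>     static class ScanRangesSketch {} // stand-in for Phoenix's ScanRanges
>
>     void populateServerCache(List<Mutation> batch, byte[] indexMetadata) {
>         // Step 1: build scan ranges from the batch's primary keys; the
>         // object grows with the commit size, adding GC pressure per commit.
>         ScanRangesSketch ranges = buildScanRanges(batch);
>
>         // Step 2: fetch all region locations (may call hbase:meta when the
>         // client meta cache is cold), then intersect with the scan ranges
>         // to find the servers that will receive the mutations.
>         List<HRegionLocation> regions = getAllTableRegions();
>         List<HRegionLocation> targets = intersect(regions, ranges);
>
>         // Step 3: send the cache-population RPCs in parallel; each RPC is
>         // subject to standard HBase retry policies (timeouts, RITs).
>         targets.parallelStream().forEach(loc -> sendAddServerCacheRpc(loc, indexMetadata));
>     }
>
>     ScanRangesSketch buildScanRanges(List<Mutation> batch) { return new ScanRangesSketch(); }
>     List<HRegionLocation> getAllTableRegions() { return List.of(); }
>     List<HRegionLocation> intersect(List<HRegionLocation> r, ScanRangesSketch s) { return r; }
>     void sendAddServerCacheRpc(HRegionLocation loc, byte[] payload) {}
> }
> {code}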
>  
> Sample error logs that we have seen in production:
> {code:java}
> 2025-10-20 07:38:21,800 INFO  
> [t.FPRWQ.Fifo.write.handler=120,queue=20,port=60020] util.IndexManagementUtil 
> - Rethrowing 
> org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR 2008 
> (INT10): Unable to find cached index metadata.  key=4619765145502425070 
> region=FOO.TEST1,00D1H000000N1TASDER,1708858336233.1ae49454ee9993697a7cc9e34c899b25.host=server.net,60020,1757812136389
>  Index update failed
>     at 
> org.apache.phoenix.util.ClientUtil.createIOException(ClientUtil.java:166)
>     at 
> org.apache.phoenix.util.ClientUtil.throwIOException(ClientUtil.java:182)
>     at 
> org.apache.phoenix.index.PhoenixIndexMetaDataBuilder.getIndexMetaDataCache(PhoenixIndexMetaDataBuilder.java:101)
>     at 
> org.apache.phoenix.index.PhoenixIndexMetaDataBuilder.getIndexMetaData(PhoenixIndexMetaDataBuilder.java:51)
>     at 
> org.apache.phoenix.index.PhoenixIndexBuilder.getIndexMetaData(PhoenixIndexBuilder.java:92)
>     at 
> org.apache.phoenix.index.PhoenixIndexBuilder.getIndexMetaData(PhoenixIndexBuilder.java:69)
>     at 
> org.apache.phoenix.hbase.index.builder.IndexBuildManager.getIndexMetaData(IndexBuildManager.java:85)
>     at 
> org.apache.phoenix.hbase.index.IndexRegionObserver.getPhoenixIndexMetaData(IndexRegionObserver.java:1090)
>     at 
> org.apache.phoenix.hbase.index.IndexRegionObserver.preBatchMutateWithExceptions(IndexRegionObserver.java:1214)
>     at 
> org.apache.phoenix.hbase.index.IndexRegionObserver.preBatchMutate(IndexRegionObserver.java:514)
>     at {code}
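> The client-side handling of this error, described in step 4 above, roughly 
> follows the sketch below; the helper names are illustrative, not the actual 
> MutationState code:
> {code:java}
> import org.apache.hadoop.hbase.DoNotRetryIOException;
>
> // Hedged sketch of the retry path from step 4; the helpers are
> // illustrative stand-ins for the actual Phoenix MutationState logic.
> class CacheMissRetrySketch {
>     void commitWithRetries(int maxRetries) throws Exception {
>         int attempts = 0;
>         while (true) {
>             try {
>                 sendDataTableBatch(); // mutations carrying only the cache key
>                 return;
>             } catch (DoNotRetryIOException e) {
>                 // ERROR 2008 (INT10): the entry expired (30s TTL) or the
>                 // region moved/split before the mutations arrived.
>                 if (!isMissingIndexMetadataError(e) || ++attempts > maxRetries) {
>                     throw e;
>                 }
>                 clearTableRegionCache();  // drops cached region locations
>                 repopulateServerCache();  // redo the whole cache RPC sequence
>             }
>         }
>     }
>
>     void sendDataTableBatch() throws DoNotRetryIOException {}
>     boolean isMissingIndexMetadataError(Exception e) { return true; }
>     void clearTableRegionCache() {}
>     void repopulateServerCache() {}
> }
> {code}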
> There is a better solution which addresses most of the above problems. 
> Previously, the IndexRegionObserver coproc did not have the logical name of 
> the table while it was processing a batch of mutations, so it could not 
> tell whether the entity into which data is being upserted is a table or a 
> view. Because of this the server could not determine whether the entity in 
> question has an index, so it relied on the client to tell it by annotating 
> the mutations with index maintainer metadata. But PHOENIX-5521 started 
> annotating each mutation with enough metadata that the server can 
> deterministically figure out the Phoenix schema object the mutation 
> targets. With this information the server can simply call _*getTable()*_ 
> and rely on the CQSI cache, and the UPDATE_CACHE_FREQUENCY set on the table 
> controls the schema freshness. There are already other places on the server 
> where we make getTable calls, such as compaction and server metadata 
> caching.
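> A hedged sketch of that server-side path, using the public 
> _*PhoenixRuntime.getTable*_ API as a stand-in for however the coproc 
> ultimately resolves the table (the wiring here is illustrative, not a 
> committed design):
> {code:java}
> import java.sql.Connection;
> import java.sql.SQLException;
> import java.util.List;
> import org.apache.phoenix.schema.PTable;
> import org.apache.phoenix.util.PhoenixRuntime;
>
> // Hedged sketch of the proposed server-side resolution. How the coproc
> // obtains a connection and the logical table name (per the PHOENIX-5521
> // mutation attributes) is elided; the wiring here is illustrative.
> class ServerSideIndexMetadataSketch {
>     PTable resolveDataTable(Connection serverConn, String logicalName)
>             throws SQLException {
>         // getTable() is answered from the server's CQSI/PTable cache;
>         // staleness is bounded by the table's UPDATE_CACHE_FREQUENCY.
>         return PhoenixRuntime.getTable(serverConn, logicalName);
>     }
>
>     List<PTable> indexesToMaintain(PTable dataTable) {
>         // With the PTable in hand, the coproc can enumerate indexes and
>         // build index maintainers locally, with no client-populated cache.
>         return dataTable.getIndexes();
>     }
> }
> {code}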
> This will greatly simplify the implementation and should also improve batch 
> write times on tables with indexes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
