[
https://issues.apache.org/jira/browse/PHOENIX-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kadir Ozdemir reassigned PHOENIX-6761:
--------------------------------------
Assignee: Kadir Ozdemir
> Phoenix Client Side Metadata Caching Improvement
> ------------------------------------------------
>
> Key: PHOENIX-6761
> URL: https://issues.apache.org/jira/browse/PHOENIX-6761
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Kadir Ozdemir
> Assignee: Kadir Ozdemir
> Priority: Major
>
> CQSI (ConnectionQueryServicesImpl) maintains a client-side metadata cache of
> schemas, tables, and functions, and evicts the least recently used table
> entries when the cache grows beyond its configured size.
> Each time a Phoenix connection is created, the client-side metadata cache
> maintained by the CQSI object creating this connection is cloned for the
> connection. Thus, we have two levels of caches, one at the Phoenix connection
> level and the other at the CQSI level.
> When a Phoenix client needs to update the client-side cache, it updates both
> caches (on the connection object and on the CQSI object). The Phoenix client
> first attempts to retrieve a table from the connection level cache. If the
> table is not there, the client does not check the CQSI level cache; instead,
> it retrieves the table from the server and then updates both the connection
> and CQSI level caches.
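>
> The lookup path described above can be sketched roughly as follows. This is
> a minimal illustration, not Phoenix code: TwoLevelLookupSketch, SharedCache,
> Connection, and fetchFromServer are hypothetical names, and plain strings
> stand in for table metadata.
>

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-ins only; none of these names are Phoenix classes. */
final class TwoLevelLookupSketch {
    /** Counts simulated server round trips. */
    static int serverCalls = 0;

    /** Simulated server-side metadata lookup. */
    static String fetchFromServer(String tableName) {
        serverCalls++;
        return "metadata:" + tableName;
    }

    /** Shared cache, playing the role of the CQSI level cache. */
    static final class SharedCache {
        final Map<String, String> tables = new HashMap<>();
    }

    /** Per-connection cache, cloned from the shared cache at creation. */
    static final class Connection {
        final SharedCache shared;
        final Map<String, String> local;

        Connection(SharedCache shared) {
            this.shared = shared;
            this.local = new HashMap<>(shared.tables); // clone on connect
        }

        String getTable(String name) {
            String table = local.get(name);
            if (table == null) {
                // Local miss: the shared cache is NOT consulted.
                table = fetchFromServer(name);
                local.put(name, table);         // update connection level cache
                shared.tables.put(name, table); // update shared cache
            }
            return table;
        }
    }
}
```

>
> In this sketch, a connection created before a table was cached still goes to
> the server for that table, even if another connection has since placed it in
> the shared cache.
>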
> PMetaDataCache provides caching for tables, schemas and functions but it
> maintains separate caches internally, one cache for each type of metadata.
> The cache for the tables is actually a cache of PTableRef objects. PTableRef
> holds a reference to the table object as well as the estimated size of the
> table object, the create time, last access time, and resolved time. The
> create time is set to the last access time value provided when the PTableRef
> object is inserted into the cache. The resolved time is also provided when
> the PTableRef object is inserted into the cache. Both the create time and
> the resolved time are final fields (i.e., they are never updated). PTableRef
> provides a setter method to update the last access time. PMetaDataCache
> updates the last access time whenever the table is retrieved from the cache.
> The LRU eviction policy is implemented using the last access time. The
> eviction policy is not implemented for schemas and functions. The
> configuration parameter for the frequency of updating the cache is
> phoenix.default.update.cache.frequency. It can be defined at the cluster or
> table level. Setting it to zero means the cache is not used.
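>
> The shape of a table ref as described above might look like the following
> sketch. The class and field names here (TableRefSketch, TableRef,
> TableCache) are illustrative assumptions and are not taken from the Phoenix
> source; timestamps are passed in explicitly to keep the example
> deterministic.
>

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; names and fields are illustrative, not Phoenix's. */
final class TableRefSketch {
    static final class TableRef {
        final Object table;        // reference to the table object
        final long estimatedSize;  // estimated size of the table object
        final long createTime;     // final: copied from the initial access time
        final long resolvedTime;   // final: supplied at insertion
        private volatile long lastAccessTime; // the only mutable field

        TableRef(Object table, long estimatedSize, long lastAccessTime, long resolvedTime) {
            this.table = table;
            this.estimatedSize = estimatedSize;
            this.createTime = lastAccessTime;
            this.resolvedTime = resolvedTime;
            this.lastAccessTime = lastAccessTime;
        }

        long getLastAccessTime() { return lastAccessTime; }

        void setLastAccessTime(long lastAccessTime) { this.lastAccessTime = lastAccessTime; }
    }

    /** Table cache that refreshes the last access time on every retrieval. */
    static final class TableCache {
        private final Map<String, TableRef> refs = new HashMap<>();

        void put(String key, Object table, long size, long now, long resolvedTime) {
            refs.put(key, new TableRef(table, size, now, resolvedTime));
        }

        TableRef get(String key, long now) {
            TableRef ref = refs.get(key);
            if (ref != null) {
                ref.setLastAccessTime(now); // feeds the LRU eviction policy
            }
            return ref;
        }
    }
}
```

>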
> The purpose of eviction is to limit the memory consumed by the cache. The
> expected behavior is that when a table is removed from the cache, the table
> (PTableImpl) object is also garbage collected. However, this does not really
> happen, because multiple caches hold references to the same object and each
> cache maintains its own table refs and thus its own access times. This means
> that the access time for the same table may differ from one cache to another,
> and when one cache evicts an object, another cache may still hold on to the
> same object.
> Although individual caches implement the LRU eviction policy, the overall
> memory eviction policy for the actual table objects is more like an age-based
> cache. If a table is frequently accessed from the connection level caches,
> the last access time maintained by the corresponding table ref objects for
> this table will be updated. However, these updates on the access times will
> not be visible to the CQSI level cache. The table refs in the CQSI level
> cache have the same create time and access time.
> Since an object inserted into the local cache of a connection object is
> also inserted into the cache on the CQSI object, the CQSI level cache will
> grow faster than the caches on the connection objects. When the cache reaches
> its maximum size, each newly inserted table will evict one of the existing
> tables in the cache. Since the access times of these tables are not updated
> in the CQSI level cache, it is likely that the table that has stayed in the
> cache for the longest period of time will be evicted (regardless of whether
> the same table is frequently accessed via the connection level caches). This
> defeats the purpose of an LRU cache.
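>
> The divergence of access times between caches can be illustrated with a
> small sketch. The names below (DivergingAccessTimesSketch, Ref, Cache,
> lruVictim) are hypothetical, and the linear scan for the victim is only a
> stand-in for whatever eviction code runs in practice.
>

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of two caches that wrap the same table objects. */
final class DivergingAccessTimesSketch {
    /** Each cache wraps the shared table object in its own ref. */
    static final class Ref {
        final Object table;
        final long createTime;
        long lastAccessTime;

        Ref(Object table, long now) {
            this.table = table;
            this.createTime = now;
            this.lastAccessTime = now;
        }
    }

    static final class Cache {
        final Map<String, Ref> refs = new HashMap<>();

        void put(String name, Object table, long now) {
            refs.put(name, new Ref(table, now));
        }

        Object get(String name, long now) {
            Ref ref = refs.get(name);
            if (ref == null) {
                return null;
            }
            ref.lastAccessTime = now; // only THIS cache's ref is touched
            return ref.table;
        }

        /** Name of the entry with the oldest last access time (the LRU victim). */
        String lruVictim() {
            String victim = null;
            long oldest = Long.MAX_VALUE;
            for (Map.Entry<String, Ref> e : refs.entrySet()) {
                if (e.getValue().lastAccessTime < oldest) {
                    oldest = e.getValue().lastAccessTime;
                    victim = e.getKey();
                }
            }
            return victim;
        }
    }
}
```

>
> Suppose HOT is inserted at time 1 and COLD at time 2 into both a shared
> cache and a connection level cache, and HOT is then read at time 10 through
> the connection level cache only. The connection level cache picks COLD as
> its victim, while the shared cache picks HOT, because its ref for HOT still
> carries the insertion time. Both refs for HOT point to the same table
> object, so evicting it from one cache does not make it collectable.
>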
> Another problem with the current cache is related to the choice of its
> internal data structures and its eviction implementation. The table refs in
> the cache are maintained in a hash map that maps a table key (a pair of a
> tenant id and a table name) to a table ref. When the size of a cache (the
> total byte size of the table objects referred to by the cache) reaches its
> configured limit, the cache computes how much overage adding a new table
> would cause. Then all the table refs in this cache are cloned into a priority
> queue as well as a new cache. This queue uses the access time to determine
> the order of its elements (i.e., table refs). The table refs that should not
> be evicted are removed from the queue, which leaves the table refs to be
> evicted in the queue. Finally, the table refs left in the queue are removed
> from the new cache. The new cache replaces the old one. It is clear that this
> is an expensive operation in terms of memory allocations and CPU time. Worse,
> once the cache reaches its limit, every insertion is likely to cause an
> eviction, so this expensive operation is repeated for each such insertion.
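>
> A simplified sketch of the eviction procedure described above is shown
> below. It uses java.util.PriorityQueue ordered so that the most recently
> accessed ref is at the head; the class and method names
> (CloneAndPruneEvictionSketch, Ref, cloneMinusOverage) and the plain string
> keys are assumptions made for illustration and may differ from the actual
> implementation.
>

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

/** Hypothetical sketch of the clone-and-prune eviction described above. */
final class CloneAndPruneEvictionSketch {
    static final class Ref {
        final String key;
        final long size;
        final long lastAccessTime;

        Ref(String key, long size, long lastAccessTime) {
            this.key = key;
            this.size = size;
            this.lastAccessTime = lastAccessTime;
        }
    }

    /** Returns a pruned copy of the cache with at least overage bytes evicted. */
    static Map<String, Ref> cloneMinusOverage(Map<String, Ref> cache, long overage) {
        if (overage <= 0) {
            throw new IllegalArgumentException("overage must be positive");
        }
        // Most recently accessed ref sits at the head, so poll() drops refs
        // that should be kept and leaves the eviction candidates behind.
        PriorityQueue<Ref> toEvict = new PriorityQueue<>(
                Comparator.comparingLong((Ref r) -> r.lastAccessTime).reversed());
        Map<String, Ref> newCache = new HashMap<>(cache.size());
        long toEvictBytes = 0;
        for (Ref ref : cache.values()) {
            // Every ref is copied into the new cache: one allocation per entry.
            newCache.put(ref.key, new Ref(ref.key, ref.size, ref.lastAccessTime));
            toEvict.add(ref);
            toEvictBytes += ref.size;
            // Shrink the queue while the remaining refs still cover the overage.
            while (toEvictBytes - toEvict.peek().size >= overage) {
                toEvictBytes -= toEvict.poll().size;
            }
        }
        // Whatever is left in the queue is removed from the new cache.
        for (Ref ref : toEvict) {
            newCache.remove(ref.key);
        }
        return newCache;
    }
}
```

>
> Note that the whole cache is copied and every ref passes through the queue
> even when only a single entry ends up being evicted.
>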
> Since Phoenix connections are supposed to be short-lived, maintaining a
> separate cache for each connection object, and especially cloning the entire
> cache content (and then pruning the entries belonging to other tenants when
> the connection is tenant-specific), is not justified. The cost of such a
> clone operation by itself would offset the gain of not accessing the CQSI
> level cache, as the number of such accesses per connection should be small
> because Phoenix connections are short-lived.
> In addition, the impact of Phoenix connection leaks (connections that are
> not closed by applications) and of simply long-lived connections is
> exacerbated, since these connections hold references to a large set of table
> objects.
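>
> The per-connection clone and tenant pruning can be sketched as below. The
> names (PerConnectionCloneSketch, Entry, cloneForConnection) are
> hypothetical, and the sketch assumes that entries with a null tenant id are
> global and are kept for every connection.
>

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of cloning the shared cache for each new connection. */
final class PerConnectionCloneSketch {
    static final class Entry {
        final String tenantId; // null for global tables (an assumption)
        final Object table;

        Entry(String tenantId, Object table) {
            this.tenantId = tenantId;
            this.table = table;
        }
    }

    /** Copies the shared cache, then prunes entries of other tenants. */
    static Map<String, Entry> cloneForConnection(Map<String, Entry> shared,
                                                 String connectionTenantId) {
        // O(n) copy on every connect, however few lookups the connection makes.
        Map<String, Entry> local = new HashMap<>(shared);
        if (connectionTenantId != null) {
            local.values().removeIf(
                    e -> e.tenantId != null && !e.tenantId.equals(connectionTenantId));
        }
        return local;
    }
}
```

>
> The cloned map shares the table objects with the shared cache, so a
> connection that is never closed keeps every table it cloned reachable even
> after the shared cache has evicted it.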
--
This message was sent by Atlassian Jira
(v8.20.10#820010)