[ https://issues.apache.org/jira/browse/PHOENIX-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kadir Ozdemir reassigned PHOENIX-6761:
--------------------------------------

    Assignee: Kadir Ozdemir

> Phoenix Client Side Metadata Caching Improvement
> ------------------------------------------------
>
>                 Key: PHOENIX-6761
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6761
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Kadir Ozdemir
>            Assignee: Kadir Ozdemir
>            Priority: Major
>
> CQSI (ConnectionQueryServicesImpl) maintains a client-side metadata cache
> (i.e., schemas, tables, and functions) that evicts the least recently used
> table entries when the cache size grows beyond the configured size.
> Each time a Phoenix connection is created, the client-side metadata cache
> maintained by the CQSI object creating this connection is cloned for the
> connection. Thus, we have two levels of caches, one at the Phoenix connection
> level and the other at the CQSI level.
> When a Phoenix client needs to update the client-side cache, it updates both
> caches (on the connection object and on the CQSI object). The Phoenix client
> attempts to retrieve a table from the connection level cache. If the table
> is not there, the Phoenix client does not check the CQSI level cache;
> instead, it retrieves the object from the server and then updates both the
> connection and CQSI level caches.
> PMetaDataCache provides caching for tables, schemas, and functions, but it
> maintains separate caches internally, one cache for each type of metadata.
> The cache for the tables is actually a cache of PTableRef objects. PTableRef
> holds a reference to the table object as well as the estimated size of the
> table object, the create time, last access time, and resolved time. The
> create time is set to the last access time value provided when the PTableRef
> object is inserted into the cache. The resolved time is also provided when
> the PTableRef object is inserted into the cache. Both the create time and
> resolved time are final fields (i.e., they are not updated).
> PTableRef provides a setter method to update the last access time.
> PMetaDataCache updates the last access time whenever the table is retrieved
> from the cache. The LRU eviction policy is implemented using the last access
> time. The eviction policy is not implemented for schemas and functions. The
> configuration parameter for the frequency of updating the cache is
> phoenix.default.update.cache.frequency. This can be defined at the cluster or
> table level. When it is set to zero, the cache is not used.
> The purpose of eviction is to limit the memory consumed by the cache. The
> expected behavior is that when a table is removed from the cache, the table
> (PTableImpl) object is also garbage collected. However, this does not really
> happen because multiple caches hold references to the same object, and each
> cache maintains its own table refs and thus its own access times. This means
> that the access time for the same table may differ from one cache to another,
> and when one cache evicts an object, another cache may still hold on to the
> same object.
> Although individual caches implement the LRU eviction policy, the overall
> memory eviction policy for the actual table objects behaves more like an
> age-based cache. If a table is frequently accessed from the connection level
> caches, the last access time maintained by the corresponding table ref
> objects for this table will be updated. However, these updates to the access
> times are not visible to the CQSI level cache. The table refs in the CQSI
> level cache have the same create time and access time.
> Since an object inserted into the local cache of a connection object is also
> inserted into the cache on the CQSI object, the CQSI level cache will grow
> faster than the caches on the connection objects. When the cache reaches its
> maximum size, a newly inserted table will result in evicting one of the
> existing tables in the cache.
> Since the access times of these tables are not updated in the CQSI level
> cache, it is likely that the table that has stayed in the cache for the
> longest period of time will be evicted (regardless of whether the same table
> is frequently accessed via the connection level caches). This defeats the
> purpose of an LRU cache.
> Another problem with the current cache is related to the choice of its
> internal data structures and its eviction implementation. The table refs in
> the cache are maintained in a hash map which maps a table key (a pair of a
> tenant id and a table name) to a table ref. When the size of a cache (the
> total byte size of the table objects referred to by the cache) reaches its
> configured limit, the overage that adding a new table would cause is
> computed. Then all the table refs in this cache are cloned into a priority
> queue as well as a new cache. This queue uses the access time to determine
> the order of its elements (i.e., table refs). The table refs that should not
> be evicted are removed from the queue, which leaves the table refs to be
> evicted in the queue. Finally, the table refs left in the queue are removed
> from the new cache. The new cache replaces the old one. It is clear that
> this is an expensive operation in terms of memory allocations and CPU time.
> The bad news is that when the cache reaches its limit, every insertion would
> likely cause an eviction, and this expensive operation will be repeated for
> each such insertion.
> Since Phoenix connections are supposed to be short lived, maintaining a
> separate cache for each connection object and especially cloning the entire
> cache content (and then pruning the entries belonging to other tenants when
> the connection is a tenant specific connection) are not justified.
> The cost of such a clone operation by itself would offset the gain of not
> accessing the CQSI level cache, as the number of such accesses per connection
> should be small because Phoenix connections are short lived.
> Also, the impact of Phoenix connection leaks (connections that are not closed
> by applications) and of simply long lived connections will be exacerbated,
> since these connections will hold references to a large set of table
> objects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
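
For illustration only, the clone-and-prune eviction described in the issue can be sketched roughly as follows. This is a simplified approximation, not the Phoenix code: EvictionSketch, TableRef, and all method names are hypothetical, a plain string stands in for the (tenant id, table name) key, and java.util.PriorityQueue is assumed in place of whatever queue the real implementation uses. It shows why each insertion at the size limit costs a full copy of the cache.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch of the clone-and-prune eviction described in the issue.
// None of these names are Phoenix APIs.
class EvictionSketch {

    // Stand-in for PTableRef: a key, an estimated size, and a last access time.
    static final class TableRef {
        final String key;         // stands in for the (tenant id, table name) pair
        final long estimatedSize; // bytes
        long lastAccessTime;

        TableRef(String key, long estimatedSize, long lastAccessTime) {
            this.key = key;
            this.estimatedSize = estimatedSize;
            this.lastAccessTime = lastAccessTime;
        }
    }

    private final long maxByteSize;
    private long currentByteSize = 0;
    private Map<String, TableRef> refs = new HashMap<>();

    EvictionSketch(long maxByteSize) {
        this.maxByteSize = maxByteSize;
    }

    boolean contains(String key) {
        return refs.containsKey(key);
    }

    // A cache hit refreshes the last access time of the table ref.
    TableRef get(String key, long now) {
        TableRef ref = refs.get(key);
        if (ref != null) {
            ref.lastAccessTime = now;
        }
        return ref;
    }

    // Inserts newRef and returns the keys evicted to make room for it.
    List<String> put(TableRef newRef) {
        TableRef old = refs.remove(newRef.key);
        if (old != null) {
            currentByteSize -= old.estimatedSize;
        }
        List<String> evicted = new ArrayList<>();
        long overage = currentByteSize + newRef.estimatedSize - maxByteSize;
        if (overage > 0) {
            // Max-heap on access time: the head is the most recently used candidate.
            PriorityQueue<TableRef> toEvict = new PriorityQueue<>(
                Comparator.comparingLong((TableRef r) -> r.lastAccessTime).reversed());
            long toEvictBytes = 0;
            Map<String, TableRef> newRefs = new HashMap<>();
            for (TableRef ref : refs.values()) {
                // Every ref is cloned into the new map and queued: O(n) allocations.
                newRefs.put(ref.key,
                    new TableRef(ref.key, ref.estimatedSize, ref.lastAccessTime));
                toEvict.add(ref);
                toEvictBytes += ref.estimatedSize;
                // Drop the newest candidates while the rest still cover the overage.
                while (toEvictBytes - toEvict.peek().estimatedSize >= overage) {
                    toEvictBytes -= toEvict.poll().estimatedSize;
                }
            }
            // Whatever is left in the queue is removed from the new map.
            for (TableRef ref : toEvict) {
                newRefs.remove(ref.key);
                currentByteSize -= ref.estimatedSize;
                evicted.add(ref.key);
            }
            refs = newRefs; // the new map replaces the old one
        }
        refs.put(newRef.key, newRef);
        currentByteSize += newRef.estimatedSize;
        return evicted;
    }

    public static void main(String[] args) {
        EvictionSketch cache = new EvictionSketch(100);
        cache.put(new TableRef("A", 40, 1));
        cache.put(new TableRef("B", 40, 2));
        cache.get("A", 3); // A is now more recently used than B
        System.out.println(cache.put(new TableRef("C", 40, 4))); // prints [B]
    }
}
```

In this sketch, once the cache is full every put() copies all n table refs and performs O(n log n) queue work, even when only one entry needs to go. If two such caches held separate TableRef objects for the same table, a get() on one would not refresh the access time seen by the other, which is the visibility problem the issue describes between connection level and CQSI level caches.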