[ 
https://issues.apache.org/jira/browse/PHOENIX-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tanuj Khurana updated PHOENIX-6761:
-----------------------------------
    Fix Version/s: 5.2.0
                   5.1.3

> Phoenix Client Side Metadata Caching Improvement
> ------------------------------------------------
>
>                 Key: PHOENIX-6761
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6761
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Kadir Ozdemir
>            Assignee: Palash Chauhan
>            Priority: Major
>             Fix For: 5.2.0, 5.1.3
>
>         Attachments: PHOENIX-6761.master.initial.patch
>
>
> CQSI maintains a client-side metadata cache of schemas, tables, and 
> functions that evicts the least recently used table entries when the cache 
> size grows beyond the configured limit.
> Each time a Phoenix connection is created, the client-side metadata cache 
> maintained by the CQSI object creating this connection is cloned for the 
> connection. Thus, we have two levels of caches, one at the Phoenix connection 
> level and the other at the CQSI level. 
> When a Phoenix client needs to update the client-side cache, it updates both 
> caches (on the connection object and on the CQSI object). The Phoenix client 
> attempts to retrieve a table from the connection-level cache. If the table 
> is not there, the Phoenix client does not check the CQSI-level cache; 
> instead, it retrieves the object from the server and finally updates both 
> the connection-level and CQSI-level caches.
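The lookup path described above can be sketched roughly as follows; the class, field, and method names here are illustrative placeholders, not Phoenix's actual API:

```java
import java.util.HashMap;
import java.util.Map;

// Rough sketch of the two-level lookup described above. Class, field, and
// method names are illustrative placeholders, not Phoenix's actual API.
class TwoLevelCacheSketch {
    // CQSI-level cache, shared by all connections created by this CQSI.
    static final Map<String, String> cqsiCache = new HashMap<>();
    // Per-connection cache, cloned from the CQSI-level cache at creation.
    final Map<String, String> connCache = new HashMap<>(cqsiCache);

    String getTable(String key) {
        String table = connCache.get(key);
        if (table == null) {
            // On a miss the CQSI-level cache is NOT consulted; the client
            // fetches the metadata from the server...
            table = fetchFromServer(key);
            // ...and then updates both levels.
            connCache.put(key, table);
            cqsiCache.put(key, table);
        }
        return table;
    }

    // Stand-in for the RPC that resolves the table on the server side.
    String fetchFromServer(String key) {
        return "PTable[" + key + "]";
    }
}
```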
> PMetaDataCache provides caching for tables, schemas and functions but it 
> maintains separate caches internally, one cache for each type of metadata. 
> The cache for the tables is actually a cache of PTableRef objects. PTableRef 
> holds a reference to the table object as well as the estimated size of the 
> table object, the create time, last access time, and resolved time. The 
> create time is set to the last access time value provided when the PTableRef 
> object is inserted into the cache. The resolved time is also provided when 
> the PTableRef object is inserted into the cache. Both the create time and 
> resolved time are final fields (i.e., they are not updated). PTableRef 
> provides a setter method to update the last access time, and PMetaDataCache 
> updates the last access time whenever the table is retrieved from the cache. 
> The LRU eviction policy is implemented using the last access time; no 
> eviction policy is implemented for schemas and functions. The configuration 
> parameter for the frequency of updating the cache is 
> phoenix.default.update.cache.frequency. It can be defined at the cluster or 
> table level; when it is set to zero, the cache is not used.
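A minimal sketch of the PTableRef fields as described above (simplified and illustrative; the real class carries more state than this):

```java
// Minimal sketch of the PTableRef fields described above. Simplified and
// illustrative; the real class in Phoenix carries more state than this.
class PTableRefSketch {
    final Object table;          // reference to the PTable object
    final long estimatedSize;    // estimated byte size of the table object
    final long createTime;       // set from the access time supplied at insertion
    final long resolvedTime;     // also supplied at insertion; never updated
    private volatile long lastAccessTime; // the only mutable field

    PTableRefSketch(Object table, long estimatedSize,
                    long lastAccessTime, long resolvedTime) {
        this.table = table;
        this.estimatedSize = estimatedSize;
        this.createTime = lastAccessTime; // create time == initial access time
        this.resolvedTime = resolvedTime;
        this.lastAccessTime = lastAccessTime;
    }

    long getLastAccessTime() { return lastAccessTime; }

    // Called by the cache on every successful lookup; drives LRU eviction.
    void setLastAccessTime(long t) { lastAccessTime = t; }
}
```

Because createTime and resolvedTime are final, only lastAccessTime ever changes after insertion, so the LRU ordering depends entirely on that one field.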
> Obviously the eviction of the cache is to limit the memory consumed by the 
> cache. The expected behavior is that when a table is removed from the cache, 
> the table (PTableImpl) object is also garbage collected. However, this does 
> not really happen, because multiple caches hold references to the same 
> object and each cache maintains its own table refs and thus its own access 
> times. This means that the access time for the same table may differ from 
> one cache to another; when one cache evicts an object, another cache may 
> still hold on to the same object. 
> Although each individual cache implements the LRU eviction policy, the 
> overall eviction policy for the actual table objects behaves more like an 
> age-based cache. If a table is frequently accessed from the connection-level 
> caches, the last access time maintained by the corresponding table ref 
> objects for this table will be updated. However, these updates on the access 
> times are not visible to the CQSI-level cache, where the table refs keep the 
> same create time and access time. 
> Since whenever an object is inserted into the local cache of a connection 
> object it is also inserted into the cache on the CQSI object, the CQSI-level 
> cache grows faster than the caches on the connection objects. When the 
> cache reaches its maximum size, each newly inserted table results in 
> evicting one of the existing tables in the cache. Since the access times of 
> these tables are not updated on the CQSI-level cache, it is likely that the 
> table that has stayed in the cache for the longest period of time will be 
> evicted (regardless of whether the same table is frequently accessed via the 
> connection-level caches). This obviously defeats the purpose of an LRU cache.
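A toy demonstration of this aging effect, assuming hypothetical per-cache access-time maps (not Phoenix's actual data structures):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Toy demonstration of the aging problem described above: the CQSI-level
// cache never sees connection-level accesses, so it evicts by insertion age.
// The maps below stand in for each cache's own set of table refs.
class AgingEvictionDemo {
    // Returns the key with the smallest recorded access time, i.e. the entry
    // an LRU policy over THESE timestamps would evict first.
    static String pickVictim(Map<String, Long> accessTimes) {
        return Collections.min(accessTimes.entrySet(),
                Map.Entry.comparingByValue()).getKey();
    }

    public static void main(String[] args) {
        // CQSI-level view: access times frozen at insertion time.
        Map<String, Long> cqsiAccessTime = new HashMap<>();
        cqsiAccessTime.put("T1", 100L);
        cqsiAccessTime.put("T2", 200L);

        // Connection-level view: T1 is hot and its access time advances, but
        // the CQSI-level entry is a separate PTableRef and never sees this.
        Map<String, Long> connAccessTime = new HashMap<>(cqsiAccessTime);
        connAccessTime.put("T1", 900L);

        // The CQSI cache evicts by its OWN (stale) access times, so the
        // hot table T1 is evicted anyway.
        System.out.println("evicted: " + pickVictim(cqsiAccessTime));
    }
}
```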
> Another problem with the current cache is related to the choice of its 
> internal data structures and its eviction implementation. The table refs in 
> the cache are maintained in a hash map which maps a table key (a pair of a 
> tenant id and a table name) to a table ref. When the size of a cache (the 
> total byte size of the table objects referred to by the cache) reaches its 
> configured limit, the overage that adding a new table would cause is 
> computed. Then all the table refs in this cache are cloned into a priority 
> queue as well as into a new cache. This queue uses the access time to 
> determine the order of its elements (i.e., table refs). The table refs that 
> should not be evicted are removed from the queue, which leaves the table 
> refs to be evicted in the queue. Finally, the table refs left in the queue 
> are removed from the new cache, and the new cache replaces the old one. It 
> is clear that this is an expensive operation in terms of memory allocations 
> and CPU time. The bad news is that when the cache reaches its limit, every 
> insertion will likely cause an eviction, and this expensive operation is 
> repeated for each such insertion.
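The eviction path described above can be approximated as follows. This is a simplified sketch, not the actual PMetaDataCache code: access times stand in for PTableRef objects, and it prunes directly from the oldest end rather than first removing the survivors from the queue, which yields the same result.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Simplified sketch of the eviction path described above: the whole cache is
// cloned, the entries are also copied into a priority queue ordered by last
// access time, and the oldest entries are pruned from the clone until enough
// bytes are freed. Names are illustrative, not Phoenix's internals.
class CloneEvictSketch {
    // A table key maps to its last access time, standing in for a PTableRef.
    static Map<String, Long> evict(Map<String, Long> cache,
                                   long bytesToFree, long bytesPerEntry) {
        Map<String, Long> newCache = new HashMap<>(cache);        // full clone
        Comparator<Map.Entry<String, Long>> oldestFirst =
                Map.Entry.comparingByValue();
        PriorityQueue<Map.Entry<String, Long>> byAccessTime =
                new PriorityQueue<>(oldestFirst);
        byAccessTime.addAll(cache.entrySet());                    // second full copy
        long freed = 0;
        while (freed < bytesToFree && !byAccessTime.isEmpty()) {
            newCache.remove(byAccessTime.poll().getKey());        // evict oldest
            freed += bytesPerEntry;
        }
        return newCache; // replaces the old cache wholesale
    }
}
```

Both copies are proportional to the cache size, so once the cache sits at its limit every insertion pays this cost; by contrast, a structure such as an access-ordered LinkedHashMap can evict its eldest entry in constant time without any cloning.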
> Since Phoenix connections are supposed to be short-lived, maintaining a 
> separate cache for each connection object, and especially cloning the entire 
> cache content (and then pruning the entries belonging to other tenants when 
> the connection is a tenant-specific connection), is not justified. The cost 
> of such a clone operation by itself would offset the gain of not accessing 
> the CQSI-level cache, as the number of such accesses per connection should 
> be small given that Phoenix connections are short-lived. 
> Also, the impact of Phoenix connection leaks (connections that are not 
> closed by applications) and of simply long-lived connections will be 
> exacerbated, since these connections will hold references to a large set of 
> table objects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
