Viraj Jasani created PHOENIX-7667:
-------------------------------------

Summary: Strict vs Relaxed TTL
Key: PHOENIX-7667
URL: https://issues.apache.org/jira/browse/PHOENIX-7667
Project: Phoenix
Issue Type: Improvement
Reporter: Viraj Jasani

Types of TTLs supported by Phoenix:

*Literal TTL:* A simple numeric value specifying the TTL in seconds, e.g.
{code:java}
CREATE TABLE T1 (id VARCHAR PRIMARY KEY, COL1 VARCHAR, COL2 INTEGER) TTL = 86400
{code}
Literal TTL values:
* {{NONE}} / {{TTL_EXPRESSION_NOT_DEFINED}}: TTL is not specified for the table.
* {{FOREVER}} / {{TTL_EXPRESSION_FOREVER}}: Rows never expire.
* {{TTL_EXPRESSION_DEFINED_IN_TABLE_DESCRIPTOR}}: Clients older than Phoenix 5.3 set the TTL value on the HBase TableDescriptor.
* User-provided TTL: Literal value of the TTL in seconds.

*Conditional TTL:* A boolean expression that determines row expiration based on column values, e.g.
{code:java}
CREATE TABLE T1 (id VARCHAR PRIMARY KEY, status VARCHAR) TTL = 'status = ''EXPIRED'' OR TO_NUMBER(CURRENT_TIME()) - TO_NUMBER(PHOENIX_ROW_TIMESTAMP()) >= 108000000'
{code}
As of Phoenix 5.3.0 (client and server), both types of TTL are stored in the SYSTEM.CATALOG table.

By default, a Conditional TTL expression is parsed and compiled, the compiled expression is serialized into bytes, and the serialized bytes are sent as a Scan attribute. That Scan attribute is deserialized and used by many region coprocessors and scanner implementations, including TTLRegionScanner, GlobalIndexRegionScanner and IndexRegionObserver.

To provide a strict TTL view to users, the region observers perform extra computation on each row to determine whether it has already expired and therefore should not be processed. For instance, IndexRegionObserver performs a read for the given upsert to identify whether the row has already expired.

While the extra cost incurred by the region observers helps provide strict TTL expiration, some use cases might not require strict TTL expiry. Let's define the types of TTL use cases:
* *Strict TTL expiration:* As soon as the TTL expires for a given row, the row must no longer be visible to user queries.
* *Relaxed TTL expiration:* After the TTL expires, it can take several hours to days before the row stops being visible to user queries.

Costs involved in achieving strict Conditional TTL expiry:
* TTLRegionScanner masks expired rows (Literal and Conditional TTL).
* IndexRegionObserver performs a read for each update operation (no additional cost when the table has a covered index).
* GlobalIndexRegionScanner evaluates the TTL expression for each data table row during rebuild.

For users who do not need immediate row masking after TTL expiry, we can provide an optional configuration to avoid the extra cost associated with strict TTL expiration. Relaxed TTL expiration is expected to rely only on major compaction. After a major compaction successfully expires or deletes the given row, the client will no longer be able to read the expired row.
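
To make the difference between the two modes concrete, here is a minimal Java sketch of the intended visibility semantics for a Literal TTL. It is only an illustration of the proposal, not Phoenix code: the class {{TtlVisibilitySketch}}, the enum {{ExpiryMode}} and both methods are hypothetical names, and the {{>=}} comparison for expiry is an assumption borrowed from the Conditional TTL example in this issue.

```java
/** Hypothetical illustration only; none of these names exist in Phoenix. */
final class TtlVisibilitySketch {

    /** The two expiration modes described in this issue. */
    enum ExpiryMode { STRICT, RELAXED }

    /**
     * Assumption: a row has outlived a literal TTL once its age in
     * milliseconds reaches ttlSeconds * 1000.
     */
    static boolean isExpired(long rowTimestampMs, long nowMs, long ttlSeconds) {
        return nowMs - rowTimestampMs >= ttlSeconds * 1000L;
    }

    /**
     * Whether a user query can still see the row.
     * STRICT: an expired row is masked at read time, whether or not it is still on disk.
     * RELAXED: no read-time masking; the row disappears only after a major
     * compaction has physically removed it.
     */
    static boolean isVisible(ExpiryMode mode, boolean expired, boolean removedByMajorCompaction) {
        if (removedByMajorCompaction) {
            return false; // nothing left on disk to read in either mode
        }
        if (mode == ExpiryMode.STRICT) {
            return !expired; // extra server-side check on every read
        }
        return true; // RELAXED: still readable until compaction catches up
    }
}
```

With {{TTL = 86400}}, a row written exactly one day ago counts as expired in this sketch. Under {{STRICT}} it is hidden immediately; under {{RELAXED}} it stays readable until a major compaction removes it, which is the window of "several hours to days" mentioned above.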