Viraj Jasani created PHOENIX-7667:
-------------------------------------

             Summary: Strict vs Relaxed TTL
                 Key: PHOENIX-7667
                 URL: https://issues.apache.org/jira/browse/PHOENIX-7667
             Project: Phoenix
          Issue Type: Improvement
            Reporter: Viraj Jasani


Types of TTLs supported by Phoenix:

*Literal TTL:* A simple numeric value specifying the TTL in seconds.
e.g.
{code:java}
CREATE TABLE T1 (id VARCHAR PRIMARY KEY, COL1 VARCHAR, COL2 INTEGER) TTL = 86400
{code}
Literal TTL values:
 * {{NONE}} / {{TTL_EXPRESSION_NOT_DEFINED}}: No TTL is specified for the table.
 * {{FOREVER}} / {{TTL_EXPRESSION_FOREVER}}: Rows never expire.
 * {{TTL_EXPRESSION_DEFINED_IN_TABLE_DESCRIPTOR}}: Clients older than Phoenix 5.3 set the TTL value in the HBase TableDescriptor.
 * User-provided TTL: A literal TTL value in seconds.

*Conditional TTL:* A boolean expression that determines row expiration based on column values.
e.g.
{code:java}
CREATE TABLE T1 (id VARCHAR PRIMARY KEY, status VARCHAR)
TTL = 'status = ''EXPIRED'' OR TO_NUMBER(CURRENT_TIME()) - TO_NUMBER(PHOENIX_ROW_TIMESTAMP()) >= 108000000'
{code}
As of Phoenix 5.3.0 (client and server), both types of TTL are stored in the
SYSTEM.CATALOG table.
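
Both kinds of TTL reduce to a per-row predicate. The following is a minimal sketch of that idea in plain Java, not Phoenix code: the class and method names are hypothetical, and it assumes that the timestamp difference in the example expression is measured in milliseconds (so 108000000 would correspond to 30 hours).

```java
import java.util.concurrent.TimeUnit;

/** Illustrative only: models TTL checks as plain predicates; not Phoenix code. */
final class TtlPredicateSketch {

    private TtlPredicateSketch() {
    }

    /** Literal TTL: the row expires once its age reaches ttlSeconds. */
    static boolean isExpiredLiteral(long rowTimestampMs, long nowMs, long ttlSeconds) {
        long ageMs = nowMs - rowTimestampMs;
        return ageMs >= TimeUnit.SECONDS.toMillis(ttlSeconds);
    }

    /**
     * Conditional TTL mirroring the example expression:
     * status = 'EXPIRED' OR (current time - row timestamp) >= 108000000.
     */
    static boolean isExpiredConditional(String status, long rowTimestampMs, long nowMs) {
        long maxAgeMs = 108_000_000L; // assumed to be milliseconds (30 hours)
        return "EXPIRED".equals(status) || (nowMs - rowTimestampMs) >= maxAgeMs;
    }
}
```

With TTL = 86400, a row written at time 0 is treated as expired by {{isExpiredLiteral}} from 86,400,000 ms onward. The conditional variant expires a row either when its status column says so or when it is old enough, whichever comes first.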

By default, a conditional TTL is handled by parsing and compiling the TTL
expression, serializing the compiled expression into bytes, and sending those
bytes as a Scan attribute. The Scan attribute for Conditional TTL is
deserialized and used by many region coprocessors and scanner implementations,
including TTLRegionScanner, GlobalIndexRegionScanner, and IndexRegionObserver.
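
The serialize, send, and deserialize round trip can be sketched as follows. This is a simplified stand-in rather than Phoenix's implementation: {{CompiledTtl}}, the attribute name {{TTL_ATTR}}, and the byte layout are all hypothetical, and a plain {{Map<String, byte[]>}} stands in for the attributes carried by an HBase Scan.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a Map stands in for the attributes of an HBase Scan. */
final class TtlScanAttributeSketch {

    /** Hypothetical attribute name; not the name Phoenix uses. */
    static final String TTL_ATTR = "_HypotheticalTtlExpr";

    private TtlScanAttributeSketch() {
    }

    /** Stand-in for a compiled expression: status match OR age threshold. */
    static final class CompiledTtl {
        final String expiredStatus;
        final long maxAgeMs;

        CompiledTtl(String expiredStatus, long maxAgeMs) {
            this.expiredStatus = expiredStatus;
            this.maxAgeMs = maxAgeMs;
        }

        boolean isExpired(String status, long ageMs) {
            return expiredStatus.equals(status) || ageMs >= maxAgeMs;
        }
    }

    /** Client side: turn the compiled expression into bytes. */
    static byte[] serialize(CompiledTtl ttl) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(ttl.expiredStatus);
            out.writeLong(ttl.maxAgeMs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Server side: rebuild the expression from the attribute bytes. */
    static CompiledTtl deserialize(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String status = in.readUTF();
            long maxAgeMs = in.readLong();
            return new CompiledTtl(status, maxAgeMs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Client side: attach the serialized expression to the scan attributes. */
    static Map<String, byte[]> buildScanAttributes(CompiledTtl ttl) {
        Map<String, byte[]> attributes = new HashMap<>();
        attributes.put(TTL_ATTR, serialize(ttl));
        return attributes;
    }
}
```

In this model, every server-side component that needs the TTL expression pays the deserialization cost once per scan and then evaluates the expression per row.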

To give users a strict view of TTL, the region observers perform extra
computation on each row to determine whether it has already expired and
therefore should not be processed. For instance, IndexRegionObserver performs a
read for each upsert to check whether the row has already expired. While the
extra cost incurred by the region observers helps provide strict TTL
expiration, some use cases might not require strict TTL expiry.

Let’s define the types of TTL use cases:
 * *Strict TTL expiration:* As soon as the TTL expires for a given row, the row must not be visible to user queries.
 * *Relaxed TTL expiration:* After the TTL expires, it can take several hours to days before the row stops being visible to user queries.

Costs involved in achieving strict Conditional TTL expiry:
 * TTLRegionScanner masks expired rows (Literal and Conditional TTL).
 * IndexRegionObserver performs a read for each update operation (no additional cost when the table has a covered index).
 * GlobalIndexRegionScanner evaluates the TTL expression for each data table row during rebuild.

For users who do not need immediate row masking after TTL expiry, we can
provide an optional configuration to avoid the extra cost of enforcing strict
TTL expiration. Relaxed TTL expiration is expected to rely only on major
compaction. After a major compaction successfully expires or deletes a given
row, the client will no longer be able to read the expired row.
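
The difference between the two modes can be illustrated with a toy in-memory model. This is a sketch under stated assumptions, not a proposed implementation: the {{strictTtl}} switch is a hypothetical stand-in for the optional configuration (this issue does not name one), and {{majorCompact}} only imitates the effect of a major compaction on expired rows.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: toy store contrasting strict and relaxed TTL visibility. */
final class TtlModeSketch {

    static final class Row {
        final String id;
        final long timestampMs;

        Row(String id, long timestampMs) {
            this.id = id;
            this.timestampMs = timestampMs;
        }
    }

    private final List<Row> stored = new ArrayList<>();
    private final long ttlMs;
    private final boolean strictTtl; // hypothetical switch, not a real Phoenix property

    TtlModeSketch(long ttlMs, boolean strictTtl) {
        this.ttlMs = ttlMs;
        this.strictTtl = strictTtl;
    }

    void upsert(String id, long nowMs) {
        stored.add(new Row(id, nowMs));
    }

    private boolean isExpired(Row row, long nowMs) {
        return nowMs - row.timestampMs >= ttlMs;
    }

    /** Strict mode pays a per-row check at read time; relaxed mode skips it. */
    List<String> scan(long nowMs) {
        List<String> visible = new ArrayList<>();
        for (Row row : stored) {
            if (strictTtl && isExpired(row, nowMs)) {
                continue; // read-time masking of an expired row
            }
            visible.add(row.id);
        }
        return visible;
    }

    /** Imitates major compaction: expired rows are physically removed. */
    void majorCompact(long nowMs) {
        stored.removeIf(row -> isExpired(row, nowMs));
    }
}
```

In strict mode an expired row disappears from {{scan}} as soon as its TTL elapses. In relaxed mode the same row stays visible until {{majorCompact}} runs, which is the trade-off described above: no per-read or per-write expiry checks, in exchange for delayed invisibility.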



