Viraj Jasani created PHOENIX-7667:
-------------------------------------
Summary: Strict vs Relaxed TTL
Key: PHOENIX-7667
URL: https://issues.apache.org/jira/browse/PHOENIX-7667
Project: Phoenix
Issue Type: Improvement
Reporter: Viraj Jasani
Types of TTLs supported by Phoenix:
*Literal TTL:* A simple numeric value specifying the TTL in seconds.
e.g.
{code:java}
CREATE TABLE T1 (id VARCHAR PRIMARY KEY, COL1 VARCHAR, COL2 INTEGER) TTL = 86400
{code}
Literal TTL values:
* {{NONE}} / {{TTL_EXPRESSION_NOT_DEFINED}}: No TTL is specified for the
table.
* {{FOREVER}} / {{TTL_EXPRESSION_FOREVER}}: Rows never expire.
* {{TTL_EXPRESSION_DEFINED_IN_TABLE_DESCRIPTOR}}: Clients older than
Phoenix 5.3 set the TTL value in the HBase TableDescriptor.
* {{User provided TTL}}: Literal value of the TTL in seconds.
*Conditional TTL:* A boolean expression to determine the row expiration based
on column values.
e.g.
{code:java}
CREATE TABLE T1 (id VARCHAR PRIMARY KEY, status VARCHAR)
TTL = 'status = ''EXPIRED'' OR TO_NUMBER(CURRENT_TIME()) - TO_NUMBER(PHOENIX_ROW_TIMESTAMP()) >= 108000000'
{code}
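To make the semantics of the expression above concrete, here is a minimal Java sketch of the same predicate. It assumes that the difference of the two TO_NUMBER() values is in milliseconds (so 108000000 corresponds to 30 hours); the class and method names are hypothetical and are not part of Phoenix.

```java
public class ConditionalTtlSketch {
    // Threshold from the example expression; assumed to be milliseconds (30 hours).
    static final long MAX_AGE_MS = 108_000_000L;

    // Mirrors: status = 'EXPIRED' OR (current time - row timestamp) >= 108000000
    // A null status never matches the first condition, as in SQL.
    static boolean isExpired(String status, long rowTimestampMs, long nowMs) {
        return "EXPIRED".equals(status) || nowMs - rowTimestampMs >= MAX_AGE_MS;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        // Row explicitly marked as expired: expires regardless of age.
        System.out.println(isExpired("EXPIRED", now, now));
        // Fresh row with another status: not expired yet.
        System.out.println(isExpired("ACTIVE", now, now));
        // Row written 31 hours ago: expired by age.
        System.out.println(isExpired("ACTIVE", now - 31L * 3_600_000L, now));
    }
}
```

A row expires as soon as either condition holds, so the status check acts as an early expiry and the age check as an upper bound on row lifetime.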
As of Phoenix 5.3.0 (client and server), both types of TTLs are stored in
the SYSTEM.CATALOG table.
The default behavior of conditional TTL is to parse and compile the TTL
expression, serialize the compiled expression into bytes, and send the
serialized bytes as a Scan attribute. The Scan attribute for Conditional TTL
is deserialized and used by several region coprocessors and scanner
implementations, including TTLRegionScanner, GlobalIndexRegionScanner and
IndexRegionObserver.
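The serialize, attach, and deserialize round trip described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the attribute key, the CompiledTtl class and its byte layout are hypothetical, and a plain Map stands in for the Scan attributes so that the example runs without HBase or Phoenix on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

public class TtlScanAttributeSketch {
    // Hypothetical attribute key; not the name Phoenix uses.
    static final String TTL_ATTR = "sketch.ttl.expression";

    // Toy stand-in for a compiled TTL expression.
    static final class CompiledTtl {
        final String expiredStatus;
        final long maxAgeMs;

        CompiledTtl(String expiredStatus, long maxAgeMs) {
            this.expiredStatus = expiredStatus;
            this.maxAgeMs = maxAgeMs;
        }

        boolean isExpired(String status, long rowTsMs, long nowMs) {
            return expiredStatus.equals(status) || nowMs - rowTsMs >= maxAgeMs;
        }
    }

    // Client side: turn the compiled expression into bytes.
    static byte[] serialize(CompiledTtl ttl) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(ttl.expiredStatus);
            out.writeLong(ttl.maxAgeMs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Server side: rebuild the expression from the attribute bytes.
    static CompiledTtl deserialize(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String status = in.readUTF();
            long maxAgeMs = in.readLong();
            return new CompiledTtl(status, maxAgeMs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the attributes carried by a Scan.
        Map<String, byte[]> scanAttributes = new HashMap<>();
        scanAttributes.put(TTL_ATTR, serialize(new CompiledTtl("EXPIRED", 108_000_000L)));

        // A scanner or coprocessor would read the attribute and evaluate it per row.
        CompiledTtl ttl = deserialize(scanAttributes.get(TTL_ATTR));
        System.out.println(ttl.isExpired("EXPIRED", 0L, 1L));
    }
}
```

The point of the sketch is that every server-side component consuming the attribute pays the deserialization and per-row evaluation cost.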
To provide a strict TTL view to users, the region observers perform
extra computation on each row to determine whether the row has already expired
and therefore should not be processed. For instance, IndexRegionObserver
performs a read for the given upsert to identify whether the row has already
expired. While the extra cost incurred by the region observers helps provide
strict TTL expiration, some use cases might not require strict TTL expiry.
Let’s define the types of TTL use cases:
* *Strict TTL expiration:* As soon as the TTL expires for the given row,
the row must not be visible to user queries.
* *Relaxed TTL expiration:* After the TTL expires, it can take several
hours to days before the row stops being visible to user queries.
Costs involved in achieving strict Conditional TTL expiry:
* TTLRegionScanner masks expired rows (Literal and Conditional TTL)
* IndexRegionObserver performs a read for each update operation (no additional
cost when the table has a covered index)
* GlobalIndexRegionScanner evaluates the TTL expression for each data table row
during rebuild
For users who do not need rows to be masked immediately after TTL
expiry, we can provide an optional configuration to avoid the extra cost
associated with enforcing strict TTL expiration. Relaxed TTL expiration is
expected to rely only on major compaction. After a major compaction
successfully expires or deletes the given row, the client will no longer be
able to read the expired row.
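The difference between the two modes can be summarized with a small Java sketch. The TtlMode enum and isVisible method are hypothetical; the issue proposes an optional configuration but does not name it, so nothing here reflects an actual Phoenix property or API.

```java
public class TtlVisibilitySketch {
    // Hypothetical mode switch standing in for the proposed optional configuration.
    enum TtlMode { STRICT, RELAXED }

    // expired:       the TTL has elapsed for the row
    // compactedAway: a major compaction has already removed the row
    static boolean isVisible(TtlMode mode, boolean expired, boolean compactedAway) {
        if (compactedAway) {
            return false; // nothing left to read in either mode
        }
        if (mode == TtlMode.STRICT) {
            return !expired; // read path masks the row as soon as it expires
        }
        return true; // relaxed: readable until major compaction removes it
    }

    public static void main(String[] args) {
        System.out.println(isVisible(TtlMode.STRICT, true, false));  // false
        System.out.println(isVisible(TtlMode.RELAXED, true, false)); // true
        System.out.println(isVisible(TtlMode.RELAXED, true, true));  // false
    }
}
```

In relaxed mode the only work left is what major compaction already does, which is where the cost saving comes from.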