[ 
https://issues.apache.org/jira/browse/PHOENIX-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated PHOENIX-7667:
----------------------------------
    Description: 
Types of TTLs supported by Phoenix:

*Literal TTL:* A simple numeric value specifying the TTL in seconds.
e.g.
{code:java}
CREATE TABLE T1 (id VARCHAR PRIMARY KEY, COL1 VARCHAR, COL2 INTEGER) TTL = 86400
{code}
Literal TTL values:
 * {{NONE}} / {{TTL_EXPRESSION_NOT_DEFINED}}: No TTL is specified for the 
table.
 * {{FOREVER}} / {{TTL_EXPRESSION_FOREVER}}: Rows never expire.
 * {{TTL_EXPRESSION_DEFINED_IN_TABLE_DESCRIPTOR}}: Clients older than 
Phoenix 5.3 set the TTL value on the HBase TableDescriptor.
 * User-provided TTL: Literal value of the TTL in seconds.

*Conditional TTL:* A boolean expression that determines row expiration based 
on column values.
e.g.
{code:java}
CREATE TABLE T1 (id VARCHAR PRIMARY KEY, status VARCHAR) TTL = 'status = ''EXPIRED'' OR TO_NUMBER(CURRENT_TIME()) - TO_NUMBER(PHOENIX_ROW_TIMESTAMP()) >= 108000000'
{code}
As of Phoenix 5.3.0 (client and server), both types of TTLs are stored in 
the SYSTEM.CATALOG table.
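
To make the conditional TTL example above concrete, here is a minimal Java 
sketch of the same predicate. It assumes that TO_NUMBER on a time or 
timestamp yields milliseconds since the epoch, so 108000000 corresponds to 
30 hours. The class and method names are hypothetical and are not Phoenix 
APIs.

```java
import java.time.Duration;

final class ConditionalTtlSketch {
    // Threshold from the example expression, assumed to be in milliseconds.
    static final long MAX_AGE_MS = 108_000_000L;

    // Mirrors: status = 'EXPIRED' OR now - row_timestamp >= 108000000.
    // A null status is treated as "no match", which approximates SQL NULL handling here.
    static boolean isExpired(String status, long rowTimestampMs, long nowMs) {
        return "EXPIRED".equals(status) || (nowMs - rowTimestampMs) >= MAX_AGE_MS;
    }

    public static void main(String[] args) {
        // 108,000,000 ms / 3,600,000 ms per hour = 30 hours
        System.out.println(Duration.ofMillis(MAX_AGE_MS).toHours()); // prints 30

        long now = System.currentTimeMillis();
        long oneHourAgo = now - Duration.ofHours(1).toMillis();
        long thirtyOneHoursAgo = now - Duration.ofHours(31).toMillis();

        System.out.println(isExpired("ACTIVE", oneHourAgo, now));        // young row, not expired
        System.out.println(isExpired("EXPIRED", oneHourAgo, now));       // expired by status
        System.out.println(isExpired("ACTIVE", thirtyOneHoursAgo, now)); // expired by age
    }
}
```

Either branch of the OR is sufficient: a row expires as soon as its status 
column is set to EXPIRED, or once it is at least 30 hours old.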

By default, a conditional TTL expression is parsed and compiled, the compiled 
expression is serialized into bytes, and the serialized bytes are sent as a 
Scan attribute. The scan attribute for Conditional TTL is deserialized and 
used by many region coprocessors and scanner implementations, including 
TTLRegionScanner, GlobalIndexRegionScanner and IndexRegionObserver.
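
The round trip described above (compile, serialize, attach as a scan 
attribute, deserialize on the server) can be sketched as follows. This is an 
illustration only: the attribute key, the CompiledTtl class and its byte 
layout are hypothetical stand-ins, a plain Map stands in for the Scan 
attributes, and none of this reflects the actual Phoenix serialization 
format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

final class TtlScanAttributeSketch {
    // Hypothetical attribute key; not the name Phoenix uses.
    static final String TTL_ATTR = "sketch.conditional.ttl";

    // Stand-in for a compiled TTL expression: status match OR age threshold.
    static final class CompiledTtl {
        final String expiredStatus;
        final long maxAgeMs;

        CompiledTtl(String expiredStatus, long maxAgeMs) {
            this.expiredStatus = expiredStatus;
            this.maxAgeMs = maxAgeMs;
        }

        // Client side: turn the compiled expression into bytes.
        byte[] serialize() {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeUTF(expiredStatus);
                out.writeLong(maxAgeMs);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return bytes.toByteArray();
        }

        // Server side: rebuild the expression from the attribute bytes.
        static CompiledTtl deserialize(byte[] data) {
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
                String status = in.readUTF();
                long maxAge = in.readLong();
                return new CompiledTtl(status, maxAge);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        boolean isExpired(String status, long rowTimestampMs, long nowMs) {
            return expiredStatus.equals(status) || (nowMs - rowTimestampMs) >= maxAgeMs;
        }
    }

    public static void main(String[] args) {
        // Client: attach the serialized expression to the scan attributes.
        Map<String, byte[]> scanAttributes = new HashMap<>();
        scanAttributes.put(TTL_ATTR, new CompiledTtl("EXPIRED", 108_000_000L).serialize());

        // Server: a scanner or observer deserializes it and evaluates each row.
        CompiledTtl ttl = CompiledTtl.deserialize(scanAttributes.get(TTL_ATTR));
        System.out.println(ttl.isExpired("EXPIRED", 0L, 1L));
        System.out.println(ttl.isExpired("ACTIVE", 0L, 1L));
    }
}
```

The point of the sketch is that every server-side component that wants to 
apply the TTL must deserialize and evaluate the expression, which is part of 
the cost discussed below.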

To provide a strict TTL view to users, the region observers perform extra 
computation on each row to determine whether the row has already expired and 
therefore should not be processed. For instance, IndexRegionObserver 
performs a read for the given upsert to identify whether the row has already 
expired. While the extra cost incurred by the region observers helps provide 
strict TTL expiration, some use cases might not require strict TTL expiry.

Let’s define the types of TTL use cases:
 * *Strict TTL expiration:* As soon as the TTL expires for a given row, the 
row must not be visible to user queries.
 * *Relaxed TTL expiration:* After the TTL expires, it can take several hours 
to days before the row stops being visible to user queries.

Costs incurred to achieve strict Conditional TTL expiry:
 * TTLRegionScanner masks expired rows (Literal and Conditional TTL)
 * IndexRegionObserver performs a read for each update operation (no 
additional cost when the table has a covered index)
 * GlobalIndexRegionScanner evaluates the TTL expression for each data table 
row during rebuild

For users who do not need rows to be masked immediately after TTL expiry, we 
can provide an optional configuration that avoids the extra cost of strict 
TTL expiration. Relaxed TTL expiration is expected to rely only on major 
compaction. After a major compaction successfully expires or deletes the 
given row, the client will no longer be able to read the expired row.
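
The difference between the two modes can be illustrated with a toy in-memory 
model. This is a sketch under stated assumptions, not Phoenix code: the 
class, its methods and the millisecond TTL are hypothetical, a map stands in 
for the stored rows, and majorCompact stands in for an HBase major 
compaction.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class StrictVsRelaxedTtlSketch {
    private final Map<String, Long> rowTimestamps = new LinkedHashMap<>();
    private final long ttlMs;
    private final boolean strict;

    StrictVsRelaxedTtlSketch(long ttlMs, boolean strict) {
        this.ttlMs = ttlMs;
        this.strict = strict;
    }

    void upsert(String rowKey, long nowMs) {
        rowTimestamps.put(rowKey, nowMs);
    }

    private boolean isExpired(long rowTimestampMs, long nowMs) {
        return nowMs - rowTimestampMs >= ttlMs;
    }

    // Strict mode pays a per-row check at read time to mask expired rows.
    // Relaxed mode returns whatever is still physically stored.
    List<String> scan(long nowMs) {
        List<String> visible = new ArrayList<>();
        for (Map.Entry<String, Long> row : rowTimestamps.entrySet()) {
            if (strict && isExpired(row.getValue(), nowMs)) {
                continue;
            }
            visible.add(row.getKey());
        }
        return visible;
    }

    // Stand-in for major compaction: physically drops expired rows in both modes.
    void majorCompact(long nowMs) {
        rowTimestamps.values().removeIf(ts -> isExpired(ts, nowMs));
    }

    public static void main(String[] args) {
        StrictVsRelaxedTtlSketch strictTable = new StrictVsRelaxedTtlSketch(1_000L, true);
        StrictVsRelaxedTtlSketch relaxedTable = new StrictVsRelaxedTtlSketch(1_000L, false);
        strictTable.upsert("r1", 0L);
        relaxedTable.upsert("r1", 0L);

        System.out.println(strictTable.scan(1_000L));  // expired row is masked at read time
        System.out.println(relaxedTable.scan(1_000L)); // expired row is still readable

        relaxedTable.majorCompact(1_000L);
        System.out.println(relaxedTable.scan(1_000L)); // gone once compaction has run
    }
}
```

In the relaxed model the read path does no TTL work at all, which is the 
saving being proposed; the trade-off is that expired rows remain visible 
until the next major compaction.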

New PTable attribute to determine whether the TTL is strict or relaxed: 
{{"IS_STRICT_TTL"}}

> Strict vs Relaxed TTL
> ---------------------
>
>                 Key: PHOENIX-7667
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7667
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Major
>             Fix For: 5.3.0
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
