[ 
https://issues.apache.org/jira/browse/PHOENIX-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417866#comment-15417866
 ] 

James Taylor commented on PHOENIX-3176:
---------------------------------------

It's important to run the scan as of the timestamp at which the table was 
resolved so that all of our parallel scans see a consistent point in time. One 
example where we depend on this is an UPSERT SELECT that reads from and writes 
to the same table: we ensure that the select doesn't see the new rows being 
inserted, so that we don't get into an infinite loop. See 
UpsertSelectAutoCommitIT.testUpsertSelectDoesntSeeUpsertedData(). Does the time 
range on the scan already take into account the time at which the table was 
resolved? It'd be interesting to have a similar test on a table that uses the 
ROW_TIMESTAMP feature. Maybe we need a special case for this feature (which 
would be a shame)? If so, we'd still need to handle the UPSERT SELECT case.
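
To make the trade-off concrete, here is a toy model in Python. It is only a sketch, not Phoenix code: ToyTable, scan and upsert_select are hypothetical names, and a "table" is just a list of (row key, timestamp) pairs. It shows that capping the scan at the resolve time is what stops an UPSERT SELECT on the same table from re-reading its own writes, and that the same cap hides a row whose timestamp lies in the future.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical stand-in for a table: a list of (row_key, timestamp) pairs."""
    cells: list = field(default_factory=list)

    def put(self, key, ts):
        self.cells.append((key, ts))

    def scan(self, upper_ts=None):
        """Yield keys with timestamp < upper_ts (exclusive); None means no cap."""
        i = 0
        # Index-based loop so that rows appended while scanning are visited too,
        # mimicking a scan that can observe concurrent writes.
        while i < len(self.cells):
            key, ts = self.cells[i]
            if upper_ts is None or ts < upper_ts:
                yield key
            i += 1


def upsert_select(table, resolve_ts, capped, max_writes=10):
    """Copy every visible row back into the same table; return rows written."""
    written = 0
    upper = resolve_ts if capped else None
    for key in table.scan(upper):
        if written >= max_writes:  # safety stop so the uncapped case terminates
            break
        # New cells are stamped at or after the resolve time.
        table.put(key + "'", resolve_ts + written)
        written += 1
    return written


def demo():
    capped_table = ToyTable([("a", 1), ("b", 2)])
    uncapped_table = ToyTable([("a", 1), ("b", 2)])
    capped = upsert_select(capped_table, resolve_ts=100, capped=True)
    uncapped = upsert_select(uncapped_table, resolve_ts=100, capped=False)

    # A row stamped in the future is hidden by the very same cap.
    future_table = ToyTable([("past", 1), ("future", 500)])
    visible = list(future_table.scan(upper_ts=100))
    return capped, uncapped, visible


if __name__ == "__main__":
    print(demo())
```

In this model the capped run writes exactly the two original rows, while the uncapped run keeps feeding on its own output until the safety stop. Lifting the cap for ROW_TIMESTAMP tables would fix the hidden future row but bring the loop back, which is why the UPSERT SELECT case would still need handling.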

> Rows will be skipped which are having future timestamp in row_timestamp column
> ------------------------------------------------------------------------------
>
>                 Key: PHOENIX-3176
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3176
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.6.0
>            Reporter: Ankit Singhal
>             Fix For: 4.8.1
>
>         Attachments: PHOENIX-3176.patch
>
>
> Rows are skipped when the row_timestamp column holds a future timestamp:
> {code}
> 0: jdbc:phoenix:localhost> CREATE TABLE historian.data (
> . . . . . . . . . . . . .> assetid unsigned_int not null,
> . . . . . . . . . . . . .> metricid unsigned_int not null,
> . . . . . . . . . . . . .> ts timestamp not null,
> . . . . . . . . . . . . .> val double
> . . . . . . . . . . . . .> CONSTRAINT pk PRIMARY KEY (assetid, metricid, ts row_timestamp))
> . . . . . . . . . . . . .> IMMUTABLE_ROWS=true;
> No rows affected (1.283 seconds)
> 0: jdbc:phoenix:localhost> upsert into historian.data values(1,2,'2015-01-01',1.2);
> 1 row affected (0.047 seconds)
> 0: jdbc:phoenix:localhost> upsert into historian.data values(1,2,'2018-01-01',1.2);
> 1 row affected (0.005 seconds)
> 0: jdbc:phoenix:localhost> select * from historian.data;
> +----------+-----------+--------------------------+------+
> | ASSETID  | METRICID  |            TS            | VAL  |
> +----------+-----------+--------------------------+------+
> | 1        | 2         | 2015-01-01 00:00:00.000  | 1.2  |
> +----------+-----------+--------------------------+------+
> 1 row selected (0.04 seconds)
> 0: jdbc:phoenix:localhost> select count(*) from historian.data;
> +-----------+
> | COUNT(1)  |
> +-----------+
> | 1         |
> +-----------+
> 1 row selected (0.013 seconds)
> {code}
> Explain plan, where the upper bound of the scan's time range is capped at the query compile time:
> {code}
> | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER HISTORIAN.DATA  |
> |     ROW TIMESTAMP FILTER [0, 1470901929982)                  |
> |     SERVER FILTER BY FIRST KEY ONLY                          |
> |     SERVER AGGREGATE INTO SINGLE ROW                         |
> {code}
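
As a quick sanity check on the ROW TIMESTAMP FILTER range in the plan, the sketch below converts the exclusive upper bound 1470901929982 from epoch milliseconds to a UTC date and tests the two upserted values against it. The helper names to_ms and visible are hypothetical, and interpreting the date literals as midnight UTC is an assumption.

```python
from datetime import datetime, timezone

# Exclusive upper bound taken from the ROW TIMESTAMP FILTER line of the plan.
UPPER_BOUND_MS = 1470901929982


def to_ms(date_str):
    """Epoch milliseconds for midnight on a YYYY-MM-DD date (UTC is an assumption)."""
    dt = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp()) * 1000


def visible(date_str, upper_ms=UPPER_BOUND_MS):
    """True if the row timestamp falls inside the half-open range [0, upper_ms)."""
    return 0 <= to_ms(date_str) < upper_ms


# The bound interpreted as a wall-clock time, i.e. when the query was compiled.
compile_time = datetime.fromtimestamp(UPPER_BOUND_MS / 1000, tz=timezone.utc)

if __name__ == "__main__":
    print(compile_time.date())
    print(visible("2015-01-01"), visible("2018-01-01"))
```

The bound falls on 2016-08-11 UTC, so the 2015-01-01 row lies inside the range and the 2018-01-01 row lies outside it, which matches the single row returned by both queries.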



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
