[jira] [Commented] (PHOENIX-914) Native HBase timestamp support to optimize date range queries in Phoenix

James Taylor (JIRA) Mon, 21 Sep 2015 21:16:48 -0700

    [ https://issues.apache.org/jira/browse/PHOENIX-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14901900#comment-14901900 ]


James Taylor commented on PHOENIX-914:
--------------------------------------

Nice work on the additional testing, [~samarthjain]. Things are looking really 
good.

bq. I need to pass both - whether row time stamp needs to be automatically 
selected from the table timestamp as well as the actual position of the row 
timestamp column.
Just use the absence of the attribute to indicate that the optimization isn't 
being used, and always call validateRowTimestampValue(). You would have set 
the expression on the client side to a constant ts value, right? We can live 
with the extra v < 0 check. In theory, if 
rowTimestampExpression.isStateless(), you could pre-calculate the bytes 
outside of the loop and use that value instead of calling 
validateRowTimestampValue().
{code}
+                                        if (rowTimestampColPos == i) {
+                                                validateRowTimestampValue(c, values[i], i, expression);
+                                        }
{code}
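
To illustrate the stateless case, here is a rough, self-contained sketch. 
TsExpression, validateAndEncode() and encodeRows() are hypothetical stand-ins 
made up for this example. They are not the Phoenix Expression interface or the 
patch's validateRowTimestampValue(), and the 8-byte big-endian encoding is an 
assumption, not Phoenix's actual serialization.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the upsert loop; none of these names are Phoenix APIs. */
final class RowTimestampSketch {

    /** Hypothetical minimal view of an expression: a value plus a statelessness flag. */
    interface TsExpression {
        boolean isStateless();   // true when the value cannot change between rows
        long evaluate();         // the row timestamp in epoch milliseconds
    }

    /** Validates and serializes one timestamp; rejects negatives (the "v < 0" check). */
    static byte[] validateAndEncode(long ts) {
        if (ts < 0) {
            throw new IllegalArgumentException("row timestamp must be non-negative: " + ts);
        }
        // Assumed encoding for illustration only: 8 bytes, big-endian.
        return ByteBuffer.allocate(Long.BYTES).putLong(ts).array();
    }

    /** Encodes the timestamp for each row, hoisting the work out of the loop when possible. */
    static List<byte[]> encodeRows(int rowCount, TsExpression expr) {
        // Stateless expression: validate and encode once, before the loop.
        byte[] precomputed = expr.isStateless() ? validateAndEncode(expr.evaluate()) : null;
        List<byte[]> out = new ArrayList<>(rowCount);
        for (int i = 0; i < rowCount; i++) {
            // Stateful expression: fall back to validating every row.
            // Note: in the stateless case all rows share the same array instance.
            out.add(precomputed != null ? precomputed : validateAndEncode(expr.evaluate()));
        }
        return out;
    }
}
```

With a stateless expression, evaluate() and the negative check run once no 
matter how many rows are in the batch. Otherwise the behavior matches the 
per-row call in the patch.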

bq. The code in UpsertCompiler requires the value part in the 
LiteralExpressionNode to be of the same type.
Phoenix will auto coerce the value. It won't go from long -> Date 
automatically, so you'll need to have two conditions. Something like the 
following (if this introduces issues, it's not a huge deal; I'm just trying to 
cut down on explicit references to PDataType instances and reduce code):
{code}
    PDataType type = col.getDataType();
    if (type.isCoercibleTo(PTimestamp.INSTANCE)) {
        return new LiteralParseNode(new Timestamp(-1), PTimestamp.INSTANCE);
    } else if (type == PLong.INSTANCE || type == PUnsignedLong.INSTANCE) {
        return new LiteralParseNode(-1L, PLong.INSTANCE);
    }
    throw new IllegalArgumentException();
{code}

Minor nit: it looks like there are comment-only changes to 
ReadIsolationLevelIT.java, so perhaps revert those?


> Native HBase timestamp support to optimize date range queries in Phoenix 
> -------------------------------------------------------------------------
>
>                 Key: PHOENIX-914
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-914
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.0.0
>            Reporter: Vladimir Rodionov
>            Assignee: Samarth Jain
>         Attachments: PHOENIX-914.patch, PHOENIX-914.patch, wip.patch
>
>
> For many applications, one of the columns of a table can be (and must be) 
> naturally mapped 
> to the HBase timestamp. This enables an optimization in StoreScanner, 
> where HFiles whose timestamps fall outside the range of
> a Scan will be skipped. Let us say that we have time-series 
> data (EVENTS) and custom compaction, where we create a
> series of HFiles with continuous non-overlapping timestamp ranges.
> CREATE TABLE IF NOT EXISTS ODS.EVENTS (
>     METRICID  VARCHAR NOT NULL,
>     METRICNAME VARCHAR,
>     SERVICENAME VARCHAR NOT NULL,
>     ORIGIN VARCHAR NOT NULL,
>     APPID VARCHAR,
>     IPID VARCHAR,
>     NVALUE DOUBLE,
>     TIME TIMESTAMP NOT NULL  /+ TIMESTAMP +/,
>     DATA VARCHAR,
>     SVALUE VARCHAR
>     CONSTRAINT PK PRIMARY KEY (METRICID, SERVICENAME, ORIGIN, APPID, IPID, 
> TIME)
> ) SALT_BUCKETS=40, IMMUTABLE_ROWS=true,VERSIONS=1,DATA_BLOCK_ENCODING='NONE';
> Note the TIME TIMESTAMP NOT NULL  /+ TIMESTAMP +/ declaration: this is the hint to 
> Phoenix that the column
> TIME must be mapped to the HBase timestamp. 
> The Query:
> Select all events of type 'X' for last 7 days
> SELECT * from EVENTS WHERE METRICID = 'X' and TIME < NOW() and TIME > NOW() - 
> 7*24*3600000; (this may be not correct SQL syntax of course)
> These types of queries will be efficiently optimized if:
> 1. Phoenix maps the TIME column to the HBase timestamp
> 2. Phoenix is smart enough to map the WHERE clause on the TIME column to the 
> Scan time range 
> Although this:
> Properties props = new Properties();
> props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(ts));
> Connection conn = DriverManager.getConnection(myUrl, props);
> conn.createStatement().execute("UPSERT INTO myTable VALUES ('a')");
> conn.commit();
> will work in my case, it may not be efficient from a performance point of view 
> because for every INSERT/UPSERT 
> a new Connection object and a new Statement are created. Besides this, we still 
> need optimization 2 (see above). 
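
As a rough illustration of point 2 in the description above, the sketch below 
turns the two strict bounds from the example query (TIME > lower AND TIME < 
upper) into a half-open [min, max) range in epoch milliseconds. That is the 
shape HBase's Scan.setTimeRange(minStamp, maxStamp) expects: minStamp 
inclusive, maxStamp exclusive. TimeRangeSketch, Range, fromExclusiveBounds() 
and lastDays() are hypothetical names for this example only; this is not how 
Phoenix actually compiles the WHERE clause.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; not part of Phoenix or HBase. */
final class TimeRangeSketch {

    /** Half-open range [min, max) in epoch milliseconds. */
    static final class Range {
        final long min;  // inclusive
        final long max;  // exclusive
        Range(long min, long max) { this.min = min; this.max = max; }
    }

    /** Translates "TIME > lowerExclusive AND TIME < upperExclusive" into [min, max). */
    static Range fromExclusiveBounds(long lowerExclusive, long upperExclusive) {
        if (lowerExclusive >= upperExclusive) {
            throw new IllegalArgumentException("predicate selects no timestamps");
        }
        long min = lowerExclusive + 1;  // strict ">" becomes an inclusive lower bound
        long max = upperExclusive;      // strict "<" already matches an exclusive upper bound
        return new Range(min, max);
    }

    /** Range for "the last N days" relative to a caller-supplied current time. */
    static Range lastDays(long nowMillis, int days) {
        long window = TimeUnit.DAYS.toMillis(days);  // 7 days = 7*24*3600000 ms
        return fromExclusiveBounds(nowMillis - window, nowMillis);
    }
}
```

A caller could then pass range.min and range.max to a scan's time range, so 
that HFiles whose timestamps fall entirely outside the window can be skipped.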



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
