[ https://issues.apache.org/jira/browse/PHOENIX-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14901989#comment-14901989 ]
Samarth Jain commented on PHOENIX-914:
--------------------------------------
bq. Just use the absence of the attribute to indicate that the optimization isn't being used, and just always call your validateRowTimestampValue() call
I need to discern what value I should be using when setting the row timestamp
column value. The absence/presence of the attribute only tells me whether the
row timestamp column is on the table or not. It doesn't tell me whether I
should use the server timestamp value ts or whether the scanner results
already have the value I need. I hope I am not missing something obvious here.
The other alternative was to use projectedTable.getRowTimestampColPos() to
determine the column position and then do something like this:
{code}
boolean useServerTimestamp = false;
int rowTimestampColPos = projectedTable.getRowTimestampColPos();
byte[] useServerTimestampAttr =
        scan.getAttribute(BaseScannerRegionObserver.ROWTIMESTAMP_USE_SERVER_TIME);
if (useServerTimestampAttr != null) {
    useServerTimestamp = (boolean) PBoolean.INSTANCE.toObject(useServerTimestampAttr);
}
if (rowTimestampColPos == i) {
    if (useServerTimestamp) {
        // no need to validate here since server time will always be greater than zero
        values[i] = PLong.INSTANCE.toBytes(ts);
    } else {
        validateRowTimestampValue(c, values[i], i, expression);
    }
}
{code}
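(For context, a hypothetical sketch of the client-side counterpart; the
attribute name comes from the snippet above, but its exact placement in the
client code path is an assumption on my part:)
{code}
// Hypothetical sketch, not Phoenix's actual code path: the client would set
// the attribute on the Scan only when the row timestamp optimization is in
// play, so its absence on the server means "do nothing special".
Scan scan = new Scan();
scan.setAttribute(BaseScannerRegionObserver.ROWTIMESTAMP_USE_SERVER_TIME,
        PBoolean.INSTANCE.toBytes(true));
{code}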
Unfortunately, I can't use projectedTable.getRowTimestampColPos() to determine
the row timestamp column position. This is because older clients won't be
sending over PColumn.isRowtimestampCol(), so the value returned by
ptable.getRowTimestampColPos() will always be -1.
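(To make that concrete, a contrived standalone sketch with a hypothetical
stand-in for PColumn: a flag that old clients never serialize deserializes as
false on the server, so a position lookup keyed off it can only return -1:)
{code}
import java.util.Arrays;
import java.util.List;

// Contrived sketch of the compatibility gap; Col is a hypothetical stand-in
// for PColumn, not a Phoenix type.
public class RowTimestampColPosDemo {
    static class Col {
        final boolean isRowTimestamp;
        Col(boolean isRowTimestamp) { this.isRowTimestamp = isRowTimestamp; }
    }

    static int getRowTimestampColPos(List<Col> cols) {
        for (int i = 0; i < cols.size(); i++) {
            if (cols.get(i).isRowTimestamp) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // The flag is lost in transit from an old client: every column reads false.
        List<Col> fromOldClient = Arrays.asList(new Col(false), new Col(false));
        System.out.println(getRowTimestampColPos(fromOldClient)); // -1
    }
}
{code}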
Using the following for the row timestamp column node fails:
{code}
PDataType type = col.getDataType();
if (type.isCoercibleTo(PTimestamp.INSTANCE)) {
    return new LiteralParseNode(new Timestamp(-1), PTimestamp.INSTANCE);
} else if (type == PLong.INSTANCE || type == PUnsignedLong.INSTANCE) {
    return new LiteralParseNode(-1L, PLong.INSTANCE);
}
throw new IllegalArgumentException();
{code}
The place where it fails in LiteralExpression is here:
{code}
if (!actualType.isCoercibleTo(type, value)
        && (!actualType.equals(PVarchar.INSTANCE)
            || !(type.equals(PDate.INSTANCE) || type.equals(PTimestamp.INSTANCE)
                 || type.equals(PTime.INSTANCE)))) {
    throw TypeMismatchException.newException(type, actualType, value.toString());
}
{code}
For a DATE type row timestamp column, the actualType of the literal expression
is PTimestamp, which is not coercible to the column type PDate because of this
condition in PTimestamp.java:
{code}
if (equalsAny(targetType, PDate.INSTANCE, PTime.INSTANCE)) {
    return ((java.sql.Timestamp) value).getNanos() == 0;
}
{code}
The nanos part isn't zero.
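(A minimal JDK-only sketch of why: the Timestamp(long) constructor normalizes
a negative millisecond value by borrowing a whole second, which leaves a
non-zero nanos part:)
{code}
import java.sql.Timestamp;

// Demonstrates why new Timestamp(-1) fails a getNanos() == 0 check:
// -1 ms is stored as -1000 ms plus 999,000,000 ns.
public class TimestampNanosDemo {
    public static void main(String[] args) {
        Timestamp ts = new Timestamp(-1);
        System.out.println(ts.getNanos()); // prints 999000000, not 0
    }
}
{code}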
> Native HBase timestamp support to optimize date range queries in Phoenix
> -------------------------------------------------------------------------
>
> Key: PHOENIX-914
> URL: https://issues.apache.org/jira/browse/PHOENIX-914
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 4.0.0
> Reporter: Vladimir Rodionov
> Assignee: Samarth Jain
> Attachments: PHOENIX-914.patch, PHOENIX-914.patch, wip.patch
>
>
> For many applications one of the columns of a table can be (and must be)
> naturally mapped to the HBase timestamp. What this gives us is the
> optimization in StoreScanner where HFiles with timestamps out of the range of
> a Scan operator will be omitted. Let us say that we have time-series data
> (EVENTS) and a custom compaction, where we create a series of HFiles with
> continuous non-overlapping timestamp ranges.
> CREATE TABLE IF NOT EXISTS ODS.EVENTS (
>     METRICID VARCHAR NOT NULL,
>     METRICNAME VARCHAR,
>     SERVICENAME VARCHAR NOT NULL,
>     ORIGIN VARCHAR NOT NULL,
>     APPID VARCHAR,
>     IPID VARCHAR,
>     NVALUE DOUBLE,
>     TIME TIMESTAMP NOT NULL /+ TIMESTAMP +/,
>     DATA VARCHAR,
>     SVALUE VARCHAR,
>     CONSTRAINT PK PRIMARY KEY (METRICID, SERVICENAME, ORIGIN, APPID, IPID, TIME)
> ) SALT_BUCKETS=40, IMMUTABLE_ROWS=true, VERSIONS=1, DATA_BLOCK_ENCODING='NONE';
> Note the TIME TIMESTAMP NOT NULL /+ TIMESTAMP +/ - this is the hint telling
> Phoenix that the column TIME must be mapped to the HBase timestamp.
> The query:
> Select all events of type 'X' for the last 7 days
> SELECT * FROM EVENTS WHERE METRICID = 'X' AND TIME < NOW() AND TIME > NOW() -
> 7*24*3600000; (this may not be correct SQL syntax, of course)
> These types of queries will be efficiently optimized if:
> 1. Phoenix maps the TIME column to the HBase timestamp.
> 2. Phoenix is smart enough to map the WHERE clause on the TIME attribute to
> the Scan time range.
> Although this:
> Properties props = new Properties();
> props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(ts));
> Connection conn = DriverManager.getConnection(myUrl, props);
> conn.createStatement().execute("UPSERT INTO myTable VALUES ('a')");
> conn.commit();
> will work in my case, it may not be efficient from a performance point of
> view because a new Connection object and a new Statement are created for
> every INSERT/UPSERT; besides this, we still need optimization 2 (see above).
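(To illustrate the requested optimization in plain HBase client API terms - a
minimal sketch, not Phoenix code; the seven-day window mirrors the query
above:)
{code}
import java.io.IOException;
import org.apache.hadoop.hbase.client.Scan;

public class TimeRangeScanSketch {
    public static Scan lastSevenDays() throws IOException {
        long now = System.currentTimeMillis();
        Scan scan = new Scan();
        // Restrict the scan to [now - 7 days, now); the StoreScanner can then
        // skip HFiles whose timestamp ranges fall entirely outside this window.
        scan.setTimeRange(now - 7L * 24 * 3600 * 1000, now);
        return scan;
    }
}
{code}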
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)