[ https://issues.apache.org/jira/browse/PHOENIX-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16235907#comment-16235907 ]
Josh Mahonin commented on PHOENIX-4347:
---------------------------------------

Can you post this question to the phoenix-users mailing list? I suspect someone may have run into this already and found a way to do it. However, if you're able to provide a reproducible unit test in PhoenixSparkIT [1] that necessitates a patch, a contribution would be most welcome. Thanks!

[1] https://github.com/apache/phoenix/blob/master/phoenix-spark/src/it/scala/org/apache/phoenix/spark/PhoenixSparkIT.scala

> Spark Dataset loaded using Phoenix Spark Datasource - Timestamp filter issue
> ----------------------------------------------------------------------------
>
>                 Key: PHOENIX-4347
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4347
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.11.0
>         Environment: CentOS 6.5, Fedora 25
>            Reporter: Lokesh Kumar
>            Priority: Major
>              Labels: phoenix, spark-sql
>
> Created a Phoenix table with the schema below:
> {code:java}
> CREATE TABLE IF NOT EXISTS sample_table (
>     id VARCHAR NOT NULL,
>     metricid VARCHAR NOT NULL,
>     timestamp TIMESTAMP NOT NULL,
>     metricvalue DOUBLE,
>     CONSTRAINT st_pk PRIMARY KEY(id, metricid, timestamp)) SALT_BUCKETS = 20;
> {code}
> Inserted some data into this table and loaded it as a Spark Dataset using the Phoenix Spark datasource ('org.apache.phoenix.spark') options.
> The Spark Dataset's schema is:
> root
>  |-- ID: string (nullable = true)
>  |-- METRICID: string (nullable = true)
>  |-- TIMESTAMP: timestamp (nullable = true)
>  |-- METRICVALUE: double (nullable = true)
> I apply the Dataset's filter operation on the TIMESTAMP column as follows:
> {code:java}
> Dataset<Row> ds = <Derived from Phoenix>
> ds = ds.filter("TIMESTAMP >= CAST('2017-10-31 00:00:00.0' AS TIMESTAMP)");
> {code}
> This operation throws an exception:
> testPhoenixTimestamp(DatasetTest): org.apache.phoenix.exception.PhoenixParserException: ERROR 604 (42P00): Syntax error. Mismatched input. Expecting "RPAREN", got "00" at line 1, column 145.
> The generated query looks like this:
> {code:java}
> 2017-11-02 15:29:31,722 INFO [main] org.apache.phoenix.mapreduce.PhoenixInputFormat
> Select Statement: SELECT "ID","METRICID","TIMESTAMP","0"."METRICVALUE" FROM SAMPLE_TABLE WHERE ( "TIMESTAMP" IS NOT NULL AND "TIMESTAMP" >= 2017-10-31 00:00:00.0)
> {code}
> The issue is with the timestamp filter condition: the timestamp value is not wrapped in the to_timestamp() function, so the bare, unquoted literal breaks the Phoenix parser.
> I have fixed this locally in the compileValue() function of the org.apache.phoenix.spark.PhoenixRelation class by checking the value's class: if it is java.sql.Timestamp, I wrap the value in the to_timestamp() function.
> Please let me know if there is another way of correctly querying Timestamp values in Phoenix through Spark's Dataset API.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
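As a side note, the value-compilation fix described in the report can be sketched in plain Java. This is a minimal, hypothetical illustration of the idea (the class and method names here, `FilterValueCompiler` and `compileValue`, are stand-ins and are not the actual code in `org.apache.phoenix.spark.PhoenixRelation`): when turning a pushed-down Spark filter value into a Phoenix WHERE-clause literal, a `java.sql.Timestamp` is wrapped in `TO_TIMESTAMP('...')` instead of being spliced in as a bare string.

```java
import java.sql.Timestamp;

// Hypothetical sketch of the fix described above. Phoenix cannot parse a
// bare "2017-10-31 00:00:00.0" literal in a WHERE clause, so timestamp
// values are wrapped in TO_TIMESTAMP('...'); strings are quoted (with
// embedded quotes escaped); other values pass through unchanged.
public class FilterValueCompiler {
    public static String compileValue(Object value) {
        if (value instanceof Timestamp) {
            // Timestamp.toString() yields "yyyy-mm-dd hh:mm:ss.f", which
            // TO_TIMESTAMP accepts as its argument.
            return "TO_TIMESTAMP('" + value + "')";
        } else if (value instanceof String) {
            return "'" + ((String) value).replace("'", "''") + "'";
        }
        return String.valueOf(value);
    }
}
```

With this logic, the failing predicate from the log would instead be compiled as `"TIMESTAMP" >= TO_TIMESTAMP('2017-10-31 00:00:00.0')`, which the Phoenix parser accepts.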