[jira] [Commented] (SPARK-11788) Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC dataframes causes SQLServerException
[ https://issues.apache.org/jira/browse/SPARK-11788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964438#comment-15964438 ] Björn-Elmar Macek commented on SPARK-11788: --- I have an issue which maybe related and occurs in 2.1.0. I created a table which contains a Timestamp column. An describe results in: user_id int null userCreatedAt timestamp null country_iso string null When i query as follows ... select date_format(userCreatedAt, ""), userCreatedAt, date_format(userCreatedAt, "") = "2017" from result where date_format(userCreatedAt, "") = "2017" order by userCreatedAt ... the third column contains only true values as it should be due to the expression being in the where clause. When i execute the same query on the same table in which i replaced userCreatedAt by a java.sql.Timestamp created from the userCreatedAt column, the result contains lots of false values. Also, date_format(userCreatedAt, "") returns the correct year. Can anybody reproduce this issue and will it be fixed? > Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC > dataframes causes SQLServerException > > > Key: SPARK-11788 > URL: https://issues.apache.org/jira/browse/SPARK-11788 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 >Reporter: Martin Tapp >Assignee: Huaxin Gao > Fix For: 1.5.3, 1.6.0 > > > I have a MSSQL table that has a timestamp column and am reading it using > DataFrameReader.jdbc. Adding a where clause which compares a timestamp range > causes a SQLServerException. > The problem is in > https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala#L264 > (compileValue) which should surround timestamps/dates with quotes (only does > it for strings). > Sample pseudo-code: > val beg = new java.sql.Timestamp(...) > val end = new java.sql.Timestamp(...) > val filtered = jdbcdf.where($"TIMESTAMP_COLUMN" >= beg && $"TIMESTAMP_COLUMN" > < end) > Generated SQL query: "TIMESTAMP_COLUMN >= 2015-01-01 00:00:00.0" > Query should use quotes around timestamp: "TIMESTAMP_COLUMN >= '2015-01-01 > 00:00:00.0'" > Fallback is to filter client-side which is extremely inefficient as the whole > table needs to be downloaded to each Spark executor. > Thanks -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-11788) Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC dataframes causes SQLServerException
[ https://issues.apache.org/jira/browse/SPARK-11788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022775#comment-15022775 ] Huaxin Gao commented on SPARK-11788: I will change to timestampValue and dateValue. In the test case, I intentionally make the date and timestamp 1 year less than the value in the table because I am using > $"B" > date && $"C" > timestamp > Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC > dataframes causes SQLServerException > > > Key: SPARK-11788 > URL: https://issues.apache.org/jira/browse/SPARK-11788 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 >Reporter: Martin Tapp > > I have a MSSQL table that has a timestamp column and am reading it using > DataFrameReader.jdbc. Adding a where clause which compares a timestamp range > causes a SQLServerException. > The problem is in > https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala#L264 > (compileValue) which should surround timestamps/dates with quotes (only does > it for strings). > Sample pseudo-code: > val beg = new java.sql.Timestamp(...) > val end = new java.sql.Timestamp(...) > val filtered = jdbcdf.where($"TIMESTAMP_COLUMN" >= beg && $"TIMESTAMP_COLUMN" > < end) > Generated SQL query: "TIMESTAMP_COLUMN >= 2015-01-01 00:00:00.0" > Query should use quotes around timestamp: "TIMESTAMP_COLUMN >= '2015-01-01 > 00:00:00.0'" > Fallback is to filter client-side which is extremely inefficient as the whole > table needs to be downloaded to each Spark executor. > Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-11788) Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC dataframes causes SQLServerException
[ https://issues.apache.org/jira/browse/SPARK-11788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022845#comment-15022845 ] Huaxin Gao commented on SPARK-11788: Sorry, I just realized that actually I checked in the $"B" === date && $"C" === timestamp instead of the $"B" > date && $"C" > timestamp I will change. > Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC > dataframes causes SQLServerException > > > Key: SPARK-11788 > URL: https://issues.apache.org/jira/browse/SPARK-11788 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 >Reporter: Martin Tapp > > I have a MSSQL table that has a timestamp column and am reading it using > DataFrameReader.jdbc. Adding a where clause which compares a timestamp range > causes a SQLServerException. > The problem is in > https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala#L264 > (compileValue) which should surround timestamps/dates with quotes (only does > it for strings). > Sample pseudo-code: > val beg = new java.sql.Timestamp(...) > val end = new java.sql.Timestamp(...) > val filtered = jdbcdf.where($"TIMESTAMP_COLUMN" >= beg && $"TIMESTAMP_COLUMN" > < end) > Generated SQL query: "TIMESTAMP_COLUMN >= 2015-01-01 00:00:00.0" > Query should use quotes around timestamp: "TIMESTAMP_COLUMN >= '2015-01-01 > 00:00:00.0'" > Fallback is to filter client-side which is extremely inefficient as the whole > table needs to be downloaded to each Spark executor. > Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-11788) Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC dataframes causes SQLServerException
[ https://issues.apache.org/jira/browse/SPARK-11788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022200#comment-15022200 ] Martin Tapp commented on SPARK-11788: - Code must use timestampValue and dateValue instead of value. Test case uses wrong year for timestamp and date between dataframe data and assert (year is offset by 1) > Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC > dataframes causes SQLServerException > > > Key: SPARK-11788 > URL: https://issues.apache.org/jira/browse/SPARK-11788 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 >Reporter: Martin Tapp > > I have a MSSQL table that has a timestamp column and am reading it using > DataFrameReader.jdbc. Adding a where clause which compares a timestamp range > causes a SQLServerException. > The problem is in > https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala#L264 > (compileValue) which should surround timestamps/dates with quotes (only does > it for strings). > Sample pseudo-code: > val beg = new java.sql.Timestamp(...) > val end = new java.sql.Timestamp(...) > val filtered = jdbcdf.where($"TIMESTAMP_COLUMN" >= beg && $"TIMESTAMP_COLUMN" > < end) > Generated SQL query: "TIMESTAMP_COLUMN >= 2015-01-01 00:00:00.0" > Query should use quotes around timestamp: "TIMESTAMP_COLUMN >= '2015-01-01 > 00:00:00.0'" > Fallback is to filter client-side which is extremely inefficient as the whole > table needs to be downloaded to each Spark executor. > Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-11788) Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC dataframes causes SQLServerException
[ https://issues.apache.org/jira/browse/SPARK-11788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15018511#comment-15018511 ] Martin Tapp commented on SPARK-11788: - Any udpate, this seems like an easy fix? > Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC > dataframes causes SQLServerException > > > Key: SPARK-11788 > URL: https://issues.apache.org/jira/browse/SPARK-11788 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 >Reporter: Martin Tapp > > I have a MSSQL table that has a timestamp column and am reading it using > DataFrameReader.jdbc. Adding a where clause which compares a timestamp range > causes a SQLServerException. > The problem is in > https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala#L264 > (compileValue) which should surround timestamps/dates with quotes (only does > it for strings). > Sample pseudo-code: > val beg = new java.sql.Timestamp(...) > val end = new java.sql.Timestamp(...) > val filtered = jdbcdf.where($"TIMESTAMP_COLUMN" >= beg && $"TIMESTAMP_COLUMN" > < end) > Generated SQL query: "TIMESTAMP_COLUMN >= 2015-01-01 00:00:00.0" > Query should use quotes around timestamp: "TIMESTAMP_COLUMN >= '2015-01-01 > 00:00:00.0'" > Fallback is to filter client-side which is extremely inefficient as the whole > table needs to be downloaded to each Spark executor. > Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-11788) Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC dataframes causes SQLServerException
[ https://issues.apache.org/jira/browse/SPARK-11788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15018826#comment-15018826 ] Huaxin Gao commented on SPARK-11788: Hi Martin, I will try to have a Pull Request soon. > Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC > dataframes causes SQLServerException > > > Key: SPARK-11788 > URL: https://issues.apache.org/jira/browse/SPARK-11788 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 >Reporter: Martin Tapp > > I have a MSSQL table that has a timestamp column and am reading it using > DataFrameReader.jdbc. Adding a where clause which compares a timestamp range > causes a SQLServerException. > The problem is in > https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala#L264 > (compileValue) which should surround timestamps/dates with quotes (only does > it for strings). > Sample pseudo-code: > val beg = new java.sql.Timestamp(...) > val end = new java.sql.Timestamp(...) > val filtered = jdbcdf.where($"TIMESTAMP_COLUMN" >= beg && $"TIMESTAMP_COLUMN" > < end) > Generated SQL query: "TIMESTAMP_COLUMN >= 2015-01-01 00:00:00.0" > Query should use quotes around timestamp: "TIMESTAMP_COLUMN >= '2015-01-01 > 00:00:00.0'" > Fallback is to filter client-side which is extremely inefficient as the whole > table needs to be downloaded to each Spark executor. > Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-11788) Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC dataframes causes SQLServerException
[ https://issues.apache.org/jira/browse/SPARK-11788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15018928#comment-15018928 ] Apache Spark commented on SPARK-11788: -- User 'huaxingao' has created a pull request for this issue: https://github.com/apache/spark/pull/9872 > Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC > dataframes causes SQLServerException > > > Key: SPARK-11788 > URL: https://issues.apache.org/jira/browse/SPARK-11788 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 >Reporter: Martin Tapp > > I have a MSSQL table that has a timestamp column and am reading it using > DataFrameReader.jdbc. Adding a where clause which compares a timestamp range > causes a SQLServerException. > The problem is in > https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala#L264 > (compileValue) which should surround timestamps/dates with quotes (only does > it for strings). > Sample pseudo-code: > val beg = new java.sql.Timestamp(...) > val end = new java.sql.Timestamp(...) > val filtered = jdbcdf.where($"TIMESTAMP_COLUMN" >= beg && $"TIMESTAMP_COLUMN" > < end) > Generated SQL query: "TIMESTAMP_COLUMN >= 2015-01-01 00:00:00.0" > Query should use quotes around timestamp: "TIMESTAMP_COLUMN >= '2015-01-01 > 00:00:00.0'" > Fallback is to filter client-side which is extremely inefficient as the whole > table needs to be downloaded to each Spark executor. > Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-11788) Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC dataframes causes SQLServerException
[ https://issues.apache.org/jira/browse/SPARK-11788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15009575#comment-15009575 ] Huaxin Gao commented on SPARK-11788: I would like to work on this problem. > Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC > dataframes causes SQLServerException > > > Key: SPARK-11788 > URL: https://issues.apache.org/jira/browse/SPARK-11788 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.5.2 >Reporter: Martin Tapp > > I have a MSSQL table that has a timestamp column and am reading it using > DataFrameReader.jdbc. Adding a where clause which compares a timestamp range > causes a SQLServerException. > The problem is in > https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala#L264 > (compileValue) which should surround timestamps/dates with quotes (only does > it for strings). > Sample pseudo-code: > val beg = new java.sql.Timestamp(...) > val end = new java.sql.Timestamp(...) > val filtered = jdbcdf.where($"TIMESTAMP_COLUMN" >= beg && $"TIMESTAMP_COLUMN" > < end) > Generated SQL query: "TIMESTAMP_COLUMN >= 2015-01-01 00:00:00.0" > Query should use quotes around timestamp: "TIMESTAMP_COLUMN >= '2015-01-01 > 00:00:00.0'" > Fallback is to filter client-side which is extremely inefficient as the whole > table needs to be downloaded to each Spark executor. > Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org