Hi guys,

I'm running against a Parquet-backed table in Hive, 'dim_promo_date_curr_p', which has the following data:

scala> sqlContext.sql("select * from pz.dim_promo_date_curr_p").show(3)
15/06/18 00:53:21 INFO ParseDriver: Parsing command: select * from pz.dim_promo_date_curr_p
15/06/18 00:53:21 INFO ParseDriver: Parse Completed
+----------+-------------+-----------+
|clndr_date|pw_start_date|pw_end_date|
+----------+-------------+-----------+
|2015-02-18|   2015-02-18| 2015-02-24|
|2015-11-13|   2015-11-11| 2015-11-17|
|2015-03-31|   2015-03-25| 2015-03-31|
|2015-07-21|   2015-07-15| 2015-07-21|
+----------+-------------+-----------+

Running a query from the Spark 1.4 shell with the sqlContext (Hive) using date_add, it seems to work except for the value that comes from the table. I’ve only seen this on the 31st of March, no other dates:

scala> sqlContext.sql("SELECT DATE_ADD(CLNDR_DATE, 7) as wrong, DATE_ADD('2015-03-30', 7) as right30, DATE_ADD('2015-03-31', 7) as right31, DATE_ADD('2015-04-01', 7) as right01 FROM pz.dim_promo_date_curr_p WHERE CLNDR_DATE='2015-03-31'").show
15/06/18 00:57:32 INFO ParseDriver: Parsing command: SELECT DATE_ADD(CLNDR_DATE, 7) as wrong, DATE_ADD('2015-03-30', 7) as right30, DATE_ADD('2015-03-31', 7) as right31, DATE_ADD('2015-04-01', 7) as right01 FROM pz.dim_promo_date_curr_p WHERE CLNDR_DATE='2015-03-31'
15/06/18 00:57:32 INFO ParseDriver: Parse Completed
+----------+----------+----------+----------+
|     wrong|   right30|   right31|   right01|
+----------+----------+----------+----------+
|2015-04-06|2015-04-06|2015-04-07|2015-04-08|
+----------+----------+----------+----------+

The result for the table column is one day short (2015-04-06 instead of 2015-04-07), even though the WHERE clause matches the 31st. When the date is passed as a string literal, DATE_ADD seems to work fine.

The problem appears in Spark 1.3.1 as well. I'm not sure whether this is coming from Hive, but it seems like a bug. I’ve raised a JIRA:
https://issues.apache.org/jira/browse/SPARK-8421

Cheers,
Nathan
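
For anyone checking the expected values, here is a minimal sketch of the plain calendar arithmetic, outside Spark and Hive. It assumes Java 8's java.time is available; the object name ExpectedDateAdd and the helper addDays are made up for illustration and are not part of any Spark or Hive API. It only shows what adding 7 days should give for these inputs, not how DATE_ADD is implemented.

```scala
import java.time.LocalDate

// Hypothetical helper object, for illustration only.
object ExpectedDateAdd {
  // Parse an ISO date (yyyy-MM-dd) and add a number of calendar days.
  // LocalDate carries no time zone, so no time-of-day effects are involved.
  def addDays(isoDate: String, days: Long): String =
    LocalDate.parse(isoDate).plusDays(days).toString

  def main(args: Array[String]): Unit = {
    // The three literal dates used in the query above.
    Seq("2015-03-30", "2015-03-31", "2015-04-01").foreach { d =>
      println(s"$d + 7 days = ${addDays(d, 7)}")
    }
  }
}
```

Under these assumptions, 2015-03-31 plus 7 days comes out as 2015-04-07, which agrees with the right31 column and not with the value returned for CLNDR_DATE.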