[ https://issues.apache.org/jira/browse/SPARK-24545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eric Blanco updated SPARK-24545:
--------------------------------
    Description: 
Hello,

I tried to get the hour out of a date and it works except if the hour is 2.
It works well in Scala but in PySpark it shows hour 3 instead of hour 2.

Example:

    from pyspark.sql.functions import *
    columns = ["id","date"]
    vals = [(4,"2016-03-27 02:00:00")]
    df = sqlContext.createDataFrame(vals, columns)
    df.withColumn("hours", hour(col("date"))).show()

    +---+------------------+-----+
    | id|              date|hours|
    +---+------------------+-----+
    |  4|2016-03-27 2:00:00|    3|
    +---+------------------+-----+

It works as expected for other hours.
Also, if you change the year or month apparently it works well.

  was:
Hello,

I tried to get the hour out of a date and it works except if the hour is 2.
It works well in Scala but in PySpark it shows hour 3 instead of hour 2.

Example:

    from pyspark.sql.functions import *
    columns = ["id","date"]
    vals = [(4,"2016-03-27 02:00:00")]
    df = sqlContext.createDataFrame(vals, columns)
    df.withColumn("hours", hour(col("date"))).show()

    +---+------------------+-----+
    | id|              date|hours|
    +---+------------------+-----+
    |  4|2016-03-27 2:00:00|    3|
    +---+------------------+-----+

It works as expected for other hours.
Also, if you change the year apparently it works well.

> Function hour not working as expected for hour 2 in PySpark
> -----------------------------------------------------------
>
>                 Key: SPARK-24545
>                 URL: https://issues.apache.org/jira/browse/SPARK-24545
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API
>    Affects Versions: 2.2.1
>            Reporter: Eric Blanco
>            Priority: Minor
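One possible explanation, which the report itself does not confirm: in time zones that follow the EU daylight-saving rule, clocks jumped from 02:00 straight to 03:00 on 2016-03-27 (the last Sunday of March), so the wall-clock hour 2 does not exist locally on that date. That would fit the observation that other hours, years and months behave normally. The sketch below is plain Python, not Spark code; it assumes a Central European session time zone and a resolver that moves skipped times forward by one hour, and the helper names (`last_sunday_of_march`, `falls_in_cet_spring_gap`, `local_hour`) are hypothetical.

```python
from datetime import date, datetime, timedelta


def last_sunday_of_march(year: int) -> date:
    """Return the last Sunday of March, the day EU summer time begins."""
    d = date(year, 3, 31)
    # weekday(): Monday=0 ... Sunday=6; step back to the nearest Sunday.
    return d - timedelta(days=(d.weekday() + 1) % 7)


def falls_in_cet_spring_gap(ts: datetime) -> bool:
    """True if the naive wall-clock time lies in the skipped 02:00-03:00 hour.

    Assumption: a Central European zone, where local clocks jump
    from 02:00 to 03:00 on the transition day.
    """
    return ts.date() == last_sunday_of_march(ts.year) and ts.hour == 2


def local_hour(ts: datetime) -> int:
    """Hour as seen by a resolver that shifts skipped times forward (assumed)."""
    return ts.hour + 1 if falls_in_cet_spring_gap(ts) else ts.hour


if __name__ == "__main__":
    for text in ("2016-03-27 02:00:00",   # inside the gap
                 "2016-03-27 01:00:00",   # same day, before the gap
                 "2017-03-27 02:00:00",   # different year
                 "2016-04-27 02:00:00"):  # different month
        ts = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
        print(text, "->", local_hour(ts))
```

If this is the cause, the same input should return hour 2 when the session time zone has no transition on that date (UTC, for example); checking the time zone in effect on both the Scala and PySpark sides would be a reasonable next step.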