Gerhard Fiedler created SPARK-12683:
---------------------------------------
             Summary: SQL timestamp is wrong when accessed as Python datetime
                 Key: SPARK-12683
                 URL: https://issues.apache.org/jira/browse/SPARK-12683
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 1.6.0, 1.5.2, 1.5.1
         Environment: Windows 7 Pro x64
                      Python 3.4.3
                      py4j 0.9
            Reporter: Gerhard Fiedler


When accessing SQL timestamp data through {{.show()}}, it looks correct, but when accessing it (as a Python {{datetime}}) through {{.collect()}}, it is wrong.

{code}
from datetime import datetime

from pyspark import SparkContext
from pyspark.sql import SQLContext


if __name__ == "__main__":
    spark_context = SparkContext(appName='SparkBugTimestampHour')
    sql_context = SQLContext(spark_context)

    sql_text = """select cast('2100-09-09 12:11:10.09' as timestamp) as ts"""
    data_frame = sql_context.sql(sql_text)

    data_frame.show(truncate=False)
    # Result from .show() (as expected, looks correct):
    # +----------------------+
    # |ts                    |
    # +----------------------+
    # |2100-09-09 12:11:10.09|
    # +----------------------+

    rows = data_frame.collect()
    row = rows[0]
    ts = row[0]
    print('ts={ts}'.format(ts=ts))
    # Expected result from this print statement:
    # ts=2100-09-09 12:11:10.090000
    #
    # Actual, wrong result (note the hour being 18 instead of 12):
    # ts=2100-09-09 18:11:10.090000
    #
    # This error seems to depend on some characteristic of the system. We couldn't
    # reproduce it on all of our systems, and it is not clear what the relevant
    # differences are. One difference is the processor: it failed on an Intel Xeon
    # E5-2687W v2.

    assert isinstance(ts, datetime)
    assert ts.year == 2100 and ts.month == 9 and ts.day == 9
    assert ts.minute == 11 and ts.second == 10 and ts.microsecond == 90000
    if ts.hour != 12:
        print('hour is not correct; should be 12, is actually {hour}'.format(hour=ts.hour))

    spark_context.stop()
{code}
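A standalone check that may help narrow this down (a sketch, not a fix): it round-trips the same wall-clock time through the platform's epoch conversion, mimicking what PySpark's {{TimestampType.fromInternal}} is believed to do in the affected versions (see {{pyspark/sql/types.py}}; that it goes through {{datetime.fromtimestamp}} is an assumption here, so verify against the installed source). Since {{datetime.fromtimestamp}} and {{time.mktime}} delegate to the C runtime's {{localtime}}/{{mktime}}, this isolates the Python side from Spark entirely.

{code}
# Standalone diagnostic, no Spark required. Assumption: PySpark turns its
# internal microsecond value into a datetime via the platform-dependent
# datetime.fromtimestamp(), as TimestampType.fromInternal appears to do.
import time
from datetime import datetime

local = datetime(2100, 9, 9, 12, 11, 10, 90000)

# Wall-clock time -> seconds since the epoch via the C runtime's mktime,
# then to microseconds, mimicking Spark's internal representation.
micros = int(time.mktime(local.timetuple())) * 1000000 + local.microsecond

# Reverse conversion as PySpark is believed to perform it: the C runtime's
# localtime via fromtimestamp, plus the microsecond remainder.
roundtrip = datetime.fromtimestamp(micros // 1000000).replace(microsecond=micros % 1000000)

print('original :', local)      # original : 2100-09-09 12:11:10.090000
print('roundtrip:', roundtrip)  # an 18:11:10 here would implicate the C runtime
{code}

If the round trip is shifted on an affected machine, the C runtime's handling of far-future (post-2038) or DST-ambiguous dates is at fault; if it is clean, the discrepancy more likely sits between the JVM's timezone rules (used when Spark builds the internal value) and the C runtime's rules (used when Python decodes it).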
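As a possible interim workaround (again only a sketch, not an official fix): since {{.show()}} renders the value correctly, letting Spark cast the timestamp to a string and parsing it on the Python side bypasses the epoch conversion in {{.collect()}} entirely.

{code}
# Workaround sketch, reusing sql_context and datetime from the repro above:
# fetch the timestamp as a string so collect() never calls fromtimestamp().
sql_text = """select cast(cast('2100-09-09 12:11:10.09' as timestamp) as string) as ts"""
row = sql_context.sql(sql_text).collect()[0]

# %f accepts 1-6 fractional digits. Note: values with a zero fraction are
# rendered without the '.ff' part and would need a second format string.
ts = datetime.strptime(row[0], '%Y-%m-%d %H:%M:%S.%f')
print(ts)  # expected: 2100-09-09 12:11:10.090000
{code}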