Mind if I ask for a reproducer? It seems to return timestamps fine:

>>> from pyspark.sql.functions import *
>>> spark.range(1).select(to_timestamp(current_timestamp())).printSchema()
root
 |-- to_timestamp(current_timestamp()): timestamp (nullable = false)

>>> spark.range(1).select(to_timestamp(current_timestamp())).show()
+---------------------------------+
|to_timestamp(current_timestamp())|
+---------------------------------+
|             2018-03-19 14:45:...|
+---------------------------------+

>>> spark.range(1).select(current_timestamp().cast("timestamp")).printSchema()
root
 |-- CAST(current_timestamp() AS TIMESTAMP): timestamp (nullable = false)

>>> spark.range(1).select(current_timestamp().cast("timestamp")).show()
+--------------------------------------+
|CAST(current_timestamp() AS TIMESTAMP)|
+--------------------------------------+
|                  2018-03-19 14:45:...|
+--------------------------------------+

2018-03-16 9:00 GMT+09:00 Alan Featherston Lago <alanf...@gmail.com>:

> I'm a pretty new user of spark and I've run into this issue with the
> pyspark docs:
>
> The functions pyspark.sql.functions.to_date &&
> pyspark.sql.functions.to_timestamp
> behave in the same way. As in both functions convert a Column of
> pyspark.sql.types.StringType or pyspark.sql.types.TimestampType into
> pyspark.sql.types.DateType.
>
> Shouldn't the function `to_timestamp` return
> pyspark.sql.types.TimestampType?
> Also the to_timestamp docs say that "By default, it follows casting rules
> to pyspark.sql.types.TimestampType if the format is omitted (equivalent
> to col.cast("timestamp")). ", which doesn't seem to be right ie:
>
> to_timestamp(current_timestamp()) <> current_timestamp().cast("timestamp")
>
>
> This is wrong right? or am I missing something? (is this due to the
> underlying jvm data types?)
>
>
> Cheers,
> alan
>