Mind if I ask for a reproducer? It seems to return timestamps fine:

>>> from pyspark.sql.functions import *
>>> spark.range(1).select(to_timestamp(current_timestamp())).printSchema()
root
 |-- to_timestamp(current_timestamp()): timestamp (nullable = false)

>>> spark.range(1).select(to_timestamp(current_timestamp())).show()
+---------------------------------+
|to_timestamp(current_timestamp())|
+---------------------------------+
|             2018-03-19 14:45:...|
+---------------------------------+

>>> spark.range(1).select(current_timestamp().cast("timestamp")).printSchema()
root
 |-- CAST(current_timestamp() AS TIMESTAMP): timestamp (nullable = false)

>>> spark.range(1).select(current_timestamp().cast("timestamp")).show()
+--------------------------------------+
|CAST(current_timestamp() AS TIMESTAMP)|
+--------------------------------------+
|                  2018-03-19 14:45:...|
+--------------------------------------+
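For completeness, to_date does return DateType while to_timestamp returns TimestampType, so the docs' claim that both produce dates looks like a documentation issue rather than a behavior one. A quick sketch against the same session as above (output omitted; to_date should report "date" in the schema, to_timestamp "timestamp"):

>>> from pyspark.sql.functions import to_date, to_timestamp, current_timestamp
>>> spark.range(1).select(to_date(current_timestamp())).printSchema()
>>> spark.range(1).select(to_timestamp(current_timestamp())).printSchema()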

2018-03-16 9:00 GMT+09:00 Alan Featherston Lago <alanf...@gmail.com>:

> I'm a pretty new user of spark and I've run into this issue with the
> pyspark docs:
>
> The functions pyspark.sql.functions.to_date and
> pyspark.sql.functions.to_timestamp appear to behave in the same way:
> both convert a Column of pyspark.sql.types.StringType or
> pyspark.sql.types.TimestampType into pyspark.sql.types.DateType.
>
> Shouldn't the function `to_timestamp` return
> pyspark.sql.types.TimestampType?
> Also, the to_timestamp docs say that "By default, it follows casting rules
> to pyspark.sql.types.TimestampType if the format is omitted (equivalent
> to col.cast("timestamp"))", which doesn't seem to be right, i.e.:
>
> to_timestamp(current_timestamp()) <> current_timestamp().cast("timestamp")
>
>
> This is wrong, right? Or am I missing something? (Is this due to the
> underlying JVM data types?)
>
>
> Cheers,
> alan
>
