Seems like a bug we should fix? I agree some form of truncation makes more sense.
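To make the suggestion concrete, here is a minimal, standalone sketch of what truncating the fractional part to microsecond precision could look like. This is not the actual DateTimeUtils code; fractionToMicros is a hypothetical helper written only to illustrate the idea:

object TruncateToMicrosSketch {
  // Parse the fractional-second part of a timestamp string (e.g. "000000001")
  // into microseconds, keeping at most the first 6 digits and discarding the rest.
  def fractionToMicros(fraction: String): Long = {
    val digits = fraction.take(6)        // drop anything beyond microsecond precision
    val padded = digits.padTo(6, '0')    // right-pad so ".5" means 500000 micros
    padded.toLong
  }

  def main(args: Array[String]): Unit = {
    println(fractionToMicros("000000001"))  // 0      (1 ns truncates to 0 micros)
    println(fractionToMicros("123456789"))  // 123456 (extra digits are dropped)
    println(fractionToMicros("5"))          // 500000 (half a second)
  }
}

With something along these lines, '2015-01-02 00:00:00.000000001' would come out as '2015-01-02 00:00:00' rather than gaining a spurious microsecond.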
On Thu, Jun 1, 2017 at 1:17 AM, Anton Okolnychyi <anton.okolnyc...@gmail.com> wrote:
> Hi all,
>
> I would like to ask what the community thinks about the way Spark
> handles nanoseconds in the Timestamp type.
>
> As far as I can see in the code, Spark assumes microsecond precision.
> Therefore, I expect either a timestamp truncated to microseconds or an
> exception if I specify a timestamp with nanoseconds. However, the current
> implementation silently sets the nanoseconds as microseconds in [1], which
> results in a wrong timestamp. Consider the example below:
>
> spark.sql("SELECT cast('2015-01-02 00:00:00.000000001' as TIMESTAMP)").show(false)
> +------------------------------------------------+
> |CAST(2015-01-02 00:00:00.000000001 AS TIMESTAMP)|
> +------------------------------------------------+
> |2015-01-02 00:00:00.000001                      |
> +------------------------------------------------+
>
> This issue was already raised in SPARK-17914, but I do not see any decision
> there.
>
> [1] - org.apache.spark.sql.catalyst.util.DateTimeUtils, toJavaTimestamp,
> line 204
>
> Best regards,
> Anton