Hi, this is a question about timezone conversion with the from_utc_timestamp function. The observation is that the function returns different values for a zone ID and the equivalent zone offset, e.g. "America/Los_Angeles" vs. "-08:00".
System timezone: +05:30
Timestamp: 1519430400

withZoneId:      2018-02-23 21:30:00
withShortZoneId: 2018-02-23 21:30:00
withZoneOffset:  2018-02-24 05:30:00

Can someone explain?

Thanks,
Srinath.

================================================

scala> :paste
// Entering paste mode (ctrl-D to finish)

import org.apache.spark.sql._
import org.apache.spark.sql.types._
import org.apache.spark.sql.functions._

val df = spark.sparkContext.parallelize(Seq((1519430400, "ts1")))
val df2 = df.toDF()
  .withColumn("withZoneId", from_utc_timestamp(col("_1").cast(TimestampType), "America/Los_Angeles"))
  .withColumn("withZoneOffset", from_utc_timestamp(col("_1").cast(TimestampType), "-08:00"))
  .withColumn("withShortZoneId", from_utc_timestamp(col("_1").cast(TimestampType), "PST"))

df2.show

// Exiting paste mode, now interpreting.

+----------+---+-------------------+-------------------+-------------------+
|        _1| _2|         withZoneId|     withZoneOffset|    withShortZoneId|
+----------+---+-------------------+-------------------+-------------------+
|1519430400|ts1|2018-02-23 21:30:00|2018-02-24 05:30:00|2018-02-23 21:30:00|
+----------+---+-------------------+-------------------+-------------------+
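One thing that may be worth checking (this is an assumption about the cause, not confirmed against the Spark source here): java.util.TimeZone.getTimeZone only understands named zone IDs and custom IDs of the form "GMT-08:00"; for any ID it cannot parse, including a bare "-08:00", it silently falls back to GMT. If from_utc_timestamp resolves its zone argument that way, the "-08:00" column would effectively be a UTC-to-UTC conversion, which matches the 05:30 output in a +05:30 session. A minimal JVM check:

```java
import java.util.TimeZone;

public class TzIdCheck {
    public static void main(String[] args) {
        // A bare offset is not a valid java.util.TimeZone ID; unknown IDs fall back to GMT.
        System.out.println(TimeZone.getTimeZone("-08:00").getID());     // prints "GMT"

        // The custom-ID form with a GMT prefix is recognized.
        System.out.println(TimeZone.getTimeZone("GMT-08:00").getID());  // prints "GMT-08:00"

        // Named region IDs resolve as expected.
        System.out.println(TimeZone.getTimeZone("America/Los_Angeles").getID());
    }
}
```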