Anatoliy Plastinin created SPARK-12744: ------------------------------------------
Summary: Inconsistent behavior parsing JSON with unix timestamp values
Key: SPARK-12744
URL: https://issues.apache.org/jira/browse/SPARK-12744
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.6.0
Reporter: Anatoliy Plastinin
Priority: Minor

Consider the following JSON:

{code}
val rdd = sc.parallelize("""{"ts":1452386229}""" :: Nil)
{code}

Spark SQL casts an integer to a timestamp by treating the value as a number of seconds (see https://issues.apache.org/jira/browse/SPARK-11724):

{code}
scala> sqlContext.read.json(rdd).select($"ts".cast(TimestampType)).show
+--------------------+
|                  ts|
+--------------------+
|2016-01-10 01:37:...|
+--------------------+
{code}

However, parsing the same JSON with an explicit schema gives a different result:

{code}
scala> val schema = (new StructType).add("ts", TimestampType)
schema: org.apache.spark.sql.types.StructType = StructType(StructField(ts,TimestampType,true))

scala> sqlContext.read.schema(schema).json(rdd).show
+--------------------+
|                  ts|
+--------------------+
|1970-01-17 20:26:...|
+--------------------+
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
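The two outputs are consistent with the same value being interpreted as seconds in one code path and as milliseconds in the other. A minimal sketch with java.time that reproduces the arithmetic behind both dates (this illustrates the discrepancy only; it is not Spark's internal code path):

```scala
import java.time.Instant

object TimestampInterpretation extends App {
  val ts = 1452386229L

  // Interpreted as seconds since the epoch (the cast path):
  // prints 2016-01-10T00:37:09Z, shown as "2016-01-10 01:37:..." in a UTC+1 session
  println(Instant.ofEpochSecond(ts))

  // Interpreted as milliseconds since the epoch (the schema path):
  // prints 1970-01-17T19:26:26.229Z, shown as "1970-01-17 20:26:..." in a UTC+1 session
  println(Instant.ofEpochMilli(ts))
}
```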