[ https://issues.apache.org/jira/browse/SPARK-36604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17521999#comment-17521999 ]
YuanGuanhu commented on SPARK-36604:
------------------------------------

[~senthh] What is the session time zone? I tested with Spark 3.2.1 and also see the issue. The inserted value is '2021-08-15 15:30:01', while the min/max values in the column statistics differ from it by 8 hours.

scala> spark.sql("insert into c select '2021-08-15 15:30:01'")
22/04/14 09:23:36 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
res3: org.apache.spark.sql.DataFrame = []

scala> spark.sql("analyze table c compute statistics for columns a")
res4: org.apache.spark.sql.DataFrame = []

scala> spark.sql("desc formatted c a").show(true)
+--------------+--------------------+
|     info_name|          info_value|
+--------------+--------------------+
|      col_name|                   a|
|     data_type|           timestamp|
|       comment|                NULL|
|           min|2021-08-15 07:30:...|
|           max|2021-08-15 07:30:...|
|     num_nulls|                   0|
|distinct_count|                   1|
|   avg_col_len|                   8|
|   max_col_len|                   8|
|     histogram|                NULL|
+--------------+--------------------+

scala> sql("set spark.sql.session.timeZone").show
+--------------------+-------------+
|                 key|        value|
+--------------------+-------------+
|spark.sql.session...|Asia/Shanghai|
+--------------------+-------------+

> timestamp type column analyze result is wrong
> ---------------------------------------------
>
> Key: SPARK-36604
> URL: https://issues.apache.org/jira/browse/SPARK-36604
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.1.1, 3.1.2
> Environment: Spark 3.1.1
> Reporter: YuanGuanhu
> Priority: Major
>
> When a table is created with a timestamp column, the min and max values in
> the analyze result for that column are wrong.
> For example:
> {code}
> > select * from a;
> {code}
> {code}
> 2021-08-15 15:30:01
> Time taken: 2.789 seconds, Fetched 1 row(s)
> spark-sql> desc formatted a a;
> col_name a
> data_type timestamp
> comment NULL
> min 2021-08-15 07:30:01.000000
> max 2021-08-15 07:30:01.000000
> num_nulls 0
> distinct_count 1
> avg_col_len 8
> max_col_len 8
> histogram NULL
> Time taken: 0.278 seconds, Fetched 10 row(s)
> spark-sql> desc a;
> a timestamp NULL
> Time taken: 1.432 seconds, Fetched 1 row(s)
> {code}
>
> Steps to reproduce:
> {code}
> create table a(a timestamp);
> insert into a select '2021-08-15 15:30:01';
> analyze table a compute statistics for columns a;
> desc formatted a a;
> select * from a;
> {code}
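
The 8-hour gap between the inserted value and the reported min/max matches the UTC offset of the Asia/Shanghai session time zone shown in the comment. The sketch below only checks that arithmetic with java.time. It assumes (this thread does not confirm it) that the statistics are being rendered in UTC rather than in the session time zone. The names TimestampOffsetSketch and toUtcString are hypothetical and are not part of Spark.

```scala
import java.time.{LocalDateTime, ZoneId, ZoneOffset}
import java.time.format.DateTimeFormatter

// Hypothetical helper, not Spark code: checks the 8-hour arithmetic only.
object TimestampOffsetSketch {
  private val fmt = DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss")

  // Interpret a wall-clock string in `zone`, then render the same instant in UTC.
  def toUtcString(local: String, zone: String): String = {
    val instant = LocalDateTime.parse(local, fmt).atZone(ZoneId.of(zone)).toInstant
    fmt.format(instant.atOffset(ZoneOffset.UTC))
  }

  def main(args: Array[String]): Unit = {
    // Inserted value and session time zone taken from the comment above.
    println(toUtcString("2021-08-15 15:30:01", "Asia/Shanghai"))
  }
}
```

If the UTC assumption holds, the printed value lines up with the min/max of 2021-08-15 07:30:01 in the desc formatted output, which would point at the display of the statistics rather than at the stored data.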