[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/21775

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...
Github user sujith71955 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21775#discussion_r204287713

--- Diff: docs/sql-programming-guide.md ---
@@ -1843,6 +1843,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see

 ## Upgrading From Spark SQL 2.3 to 2.4

+  - Since Spark 2.4, Spark will display hive table description column `Last Access` value as `UNKNOWN` following the Hive system.
--- End diff --

done.
[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...
Github user sujith71955 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21775#discussion_r204286227

--- Diff: docs/sql-programming-guide.md ---
@@ -1843,6 +1843,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see

 ## Upgrading From Spark SQL 2.3 to 2.4

+  - Since Spark 2.4, Spark will display hive table description column `Last Access` value as `UNKNOWN` following the Hive system.
--- End diff --

Right, it's applicable to both types; I will update the message as per your comment. Thanks
[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21775#discussion_r204285948

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
@@ -114,7 +114,10 @@ case class CatalogTablePartition(
       map.put("Partition Parameters", s"{${parameters.map(p => p._1 + "=" + p._2).mkString(", ")}}")
     }
     map.put("Created Time", new Date(createTime).toString)
-    map.put("Last Access", new Date(lastAccessTime).toString)
+    val lastAccess = {
+      if (-1 == lastAccessTime) "UNKNOWN" else new Date(lastAccessTime).toString
+    }
+    map.put("Last Access", lastAccess)
--- End diff --

the current way is also fine
[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21775#discussion_r204285853

--- Diff: docs/sql-programming-guide.md ---
@@ -1843,6 +1843,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see

 ## Upgrading From Spark SQL 2.3 to 2.4

+  - Since Spark 2.4, Spark will display hive table description column `Last Access` value as `UNKNOWN` following the Hive system.
--- End diff --

This is applicable to both native and hive tables. How about changing it to

> Spark will display table description column `Last Access` value as `UNKNOWN` when the value was `Jan 01 1970`.
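The wording discussion above comes down to a sentinel value: an unset `lastAccessTime` of -1 milliseconds sits just before the Unix epoch, so `Date#toString` renders a Dec 31 1969 / Jan 01 1970 timestamp depending on the local time zone. A minimal sketch of the mapping the patch applies (the helper name `formatLastAccess` is illustrative, not from the PR):

```scala
import java.util.Date

// Illustrative helper (not the PR's actual method name): map the -1
// sentinel to "UNKNOWN" instead of formatting it as an epoch-adjacent date.
def formatLastAccess(lastAccessTime: Long): String =
  if (-1 == lastAccessTime) "UNKNOWN" else new Date(lastAccessTime).toString
```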
[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21775#discussion_r203921304

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -2250,6 +2251,22 @@ class HiveDDLSuite
     }
   }
+
+  test("desc formatted table for last access verification") {
--- End diff --

let's name it `SPARK-24812: desc formatted table for last access verification`
[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21775#discussion_r203920736

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -2250,6 +2251,22 @@ class HiveDDLSuite
     }
   }
+
+  test("desc formatted table for last access verification") {
+    withTable("t1") {
+      sql(s"create table" +
+        s" if not exists t1 (c1_int int, c2_string string, c3_float float)")
--- End diff --

nit:

```scala
sql(
  "CREATE TABLE IF NOT EXISTS t1 (c1_int INT, c2_string STRING, c3_float FLOAT)")
```
[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21775#discussion_r202701770

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -2248,4 +2249,20 @@ class HiveDDLSuite
       checkAnswer(spark.table("t4"), Row(0, 0))
     }
   }
+
+  test("desc formatted table for last access verification") {
+    withTable("t1") {
+      sql(s"create table" +
+        s" if not exists t1 (c1_int int, c2_string string, c3_float float)")
+      val desc = sql("DESC FORMATTED t1").collect().toSeq
+      val lastAcessField = desc.filter((r: Row) => r.getValuesMap(Seq("col_name"))
+        .get("col_name").getOrElse("").equals("Last Access"))
+      // Check whether lastAcessField key is exist
+      assert(!lastAcessField.isEmpty)
--- End diff --

lastAccessField.nonEmpty
[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21775#discussion_r202703129

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
@@ -114,7 +114,10 @@ case class CatalogTablePartition(
       map.put("Partition Parameters", s"{${parameters.map(p => p._1 + "=" + p._2).mkString(", ")}}")
     }
     map.put("Created Time", new Date(createTime).toString)
-    map.put("Last Access", new Date(lastAccessTime).toString)
+    val lastAccess = {
+      if (-1 == lastAccessTime) "UNKNOWN" else new Date(lastAccessTime).toString
+    }
+    map.put("Last Access", lastAccess)
--- End diff --

No need for the val lastAccess?

```
map.put("Last Access",
  if (-1 == lastAccessTime) "UNKNOWN" else new Date(lastAccessTime).toString)
```
[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21775#discussion_r202704259

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -2248,4 +2249,20 @@ class HiveDDLSuite
       checkAnswer(spark.table("t4"), Row(0, 0))
     }
   }
+
+  test("desc formatted table for last access verification") {
+    withTable("t1") {
+      sql(s"create table" +
+        s" if not exists t1 (c1_int int, c2_string string, c3_float float)")
+      val desc = sql("DESC FORMATTED t1").collect().toSeq
+      val lastAcessField = desc.filter((r: Row) => r.getValuesMap(Seq("col_name"))
+        .get("col_name").getOrElse("").equals("Last Access"))
+      // Check whether lastAcessField key is exist
+      assert(!lastAcessField.isEmpty)
+      val validLastAcessFieldValue = lastAcessField.filterNot((r: Row) => ((r
+        .getValuesMap(Seq("data_type"))
+        .get("data_type").contains(new Date(-1).toString
+      assert(lastAcessField.size!=0)
--- End diff --

code style nit: blank before and after '!='
[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21775#discussion_r202703948

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -2248,4 +2249,20 @@ class HiveDDLSuite
       checkAnswer(spark.table("t4"), Row(0, 0))
     }
   }
+
+  test("desc formatted table for last access verification") {
+    withTable("t1") {
+      sql(s"create table" +
+        s" if not exists t1 (c1_int int, c2_string string, c3_float float)")
+      val desc = sql("DESC FORMATTED t1").collect().toSeq
+      val lastAcessField = desc.filter((r: Row) => r.getValuesMap(Seq("col_name"))
+        .get("col_name").getOrElse("").equals("Last Access"))
+      // Check whether lastAcessField key is exist
+      assert(!lastAcessField.isEmpty)
+      val validLastAcessFieldValue = lastAcessField.filterNot((r: Row) => ((r
--- End diff --

where is the val `validLastAcessFieldValue` used?
[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21775#discussion_r202701870

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -2248,4 +2249,20 @@ class HiveDDLSuite
       checkAnswer(spark.table("t4"), Row(0, 0))
     }
   }
+
+  test("desc formatted table for last access verification") {
+    withTable("t1") {
+      sql(s"create table" +
+        s" if not exists t1 (c1_int int, c2_string string, c3_float float)")
+      val desc = sql("DESC FORMATTED t1").collect().toSeq
+      val lastAcessField = desc.filter((r: Row) => r.getValuesMap(Seq("col_name"))
--- End diff --

nit: lastAccessField
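The review comments above all target the same test logic: find the `Last Access` row in the `DESC FORMATTED` output and check its value is not the epoch-sentinel date. A Spark-free sketch of that check (plain `(col_name, data_type)` tuples stand in for Spark's `Row`, and the sample rows are illustrative, not real DESC output):

```scala
import java.util.Date

// Model DESC FORMATTED output as (col_name, data_type) pairs.
val desc = Seq(
  ("Created Time", new Date(0L).toString),
  ("Last Access", "UNKNOWN"))

// The "Last Access" row must exist...
val lastAccessField = desc.filter(_._1 == "Last Access")
assert(lastAccessField.nonEmpty)

// ...and its value must not be the -1-sentinel rendered as a date.
assert(lastAccessField.forall(_._2 != new Date(-1L).toString))
```

Matching on the tuple's first field directly also shows why the `getValuesMap(Seq("col_name"))` detour in the reviewed diff is more machinery than the check needs.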
[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...
GitHub user sujith71955 opened a pull request:

    https://github.com/apache/spark/pull/21775

[SPARK-24812][SQL] Last Access Time in the table description is not valid

## What changes were proposed in this pull request?

Last Access Time is always displayed as the wrong date, Wed Dec 31 15:59:59 PST 1969, when a user runs the DESC FORMATTED table command. Hive displays it as "UNKNOWN", which makes more sense than displaying a wrong date. This seems to be a limitation even in Hive as of now; it is better to follow the Hive behavior unless the limitation is resolved in Hive.

## How was this patch tested?

A UT has been added which makes sure that the wrong date "Wed Dec 31 15:59:59 PST 1969" is not added as the value of the Last Access property.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sujith71955/spark master_hive

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21775.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21775

commit 502bb924fdfd4a9593154ce72a128ca869c92fbe
Author: s71955
Date: 2018-07-16T07:40:53Z

    [SPARK-24812][SQL] Last Access Time in the table description is not valid
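For reference, the misleading timestamp quoted in the description is easy to reproduce: -1 ms is one millisecond before the Unix epoch, which in the Pacific time zone renders exactly as the 1969 date above. Pinning the default zone here is only for a deterministic illustration, not something the PR does:

```scala
import java.util.{Date, TimeZone}

// Pin the default zone so the output matches the PST string in the PR text.
TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"))

// -1 ms before the epoch: 1969-12-31T23:59:59.999Z, i.e. 15:59:59 PST.
println(new Date(-1L))  // Wed Dec 31 15:59:59 PST 1969
```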