[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...

2018-07-24 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21775


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...

2018-07-22 Thread sujith71955
Github user sujith71955 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21775#discussion_r204287713
  
--- Diff: docs/sql-programming-guide.md ---
@@ -1843,6 +1843,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
 
 ## Upgrading From Spark SQL 2.3 to 2.4
 
+  - Since Spark 2.4, Spark will display hive table description column `Last Access` value  as `UNKNOWN` following the Hive system.
--- End diff --

done.


---




[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...

2018-07-22 Thread sujith71955
Github user sujith71955 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21775#discussion_r204286227
  
--- Diff: docs/sql-programming-guide.md ---
@@ -1843,6 +1843,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
 
 ## Upgrading From Spark SQL 2.3 to 2.4
 
+  - Since Spark 2.4, Spark will display hive table description column `Last Access` value  as `UNKNOWN` following the Hive system.
--- End diff --

Right, it's applicable to both types. I will update the message as per your comment. Thanks.


---




[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...

2018-07-22 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21775#discussion_r204285948
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
@@ -114,7 +114,10 @@ case class CatalogTablePartition(
   map.put("Partition Parameters", s"{${parameters.map(p => p._1 + "=" + p._2).mkString(", ")}}")
 }
 map.put("Created Time", new Date(createTime).toString)
-map.put("Last Access", new Date(lastAccessTime).toString)
+val lastAccess = {
+  if (-1 == lastAccessTime) "UNKNOWN" else new Date(lastAccessTime).toString
+}
+map.put("Last Access", lastAccess)
--- End diff --

the current way is also fine


---




[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...

2018-07-22 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21775#discussion_r204285853
  
--- Diff: docs/sql-programming-guide.md ---
@@ -1843,6 +1843,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see
 
 ## Upgrading From Spark SQL 2.3 to 2.4
 
+  - Since Spark 2.4, Spark will display hive table description column `Last Access` value  as `UNKNOWN` following the Hive system.
--- End diff --

This applies to both native and Hive tables. How about changing it to:

> Spark will display the table description column `Last Access` value as `UNKNOWN` when the value was `Jan 01 1970`.


---




[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...

2018-07-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21775#discussion_r203921304
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -2250,6 +2251,22 @@ class HiveDDLSuite
 }
   }
 
+  test("desc formatted table for last access verification") {
--- End diff --

Let's name it `SPARK-24812: desc formatted table for last access verification`.


---




[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...

2018-07-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21775#discussion_r203920736
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -2250,6 +2251,22 @@ class HiveDDLSuite
 }
   }
 
+  test("desc formatted table for last access verification") {
+withTable("t1") {
+  sql(s"create table" +
+s" if not exists t1 (c1_int int, c2_string string, c3_float float)")
--- End diff --

nit:

```scala
sql(
  "CREATE TABLE IF NOT EXISTS t1 (c1_int INT, c2_string STRING, c3_float FLOAT)")
```


---




[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...

2018-07-16 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/21775#discussion_r202701770
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -2248,4 +2249,20 @@ class HiveDDLSuite
   checkAnswer(spark.table("t4"), Row(0, 0))
 }
   }
+
+  test("desc formatted table for last access verification") {
+withTable("t1") {
+  sql(s"create table" +
+s" if not exists t1 (c1_int int, c2_string string, c3_float float)")
+  val desc = sql("DESC FORMATTED t1").collect().toSeq
+  val lastAcessField = desc.filter((r: Row) => r.getValuesMap(Seq("col_name"))
+.get("col_name").getOrElse("").equals("Last Access"))
+  // Check whether lastAcessField key is exist
+  assert(!lastAcessField.isEmpty)
--- End diff --

nit: prefer `lastAccessField.nonEmpty` here.


---




[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...

2018-07-16 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/21775#discussion_r202703129
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
@@ -114,7 +114,10 @@ case class CatalogTablePartition(
   map.put("Partition Parameters", s"{${parameters.map(p => p._1 + "=" + p._2).mkString(", ")}}")
 }
 map.put("Created Time", new Date(createTime).toString)
-map.put("Last Access", new Date(lastAccessTime).toString)
+val lastAccess = {
+  if (-1 == lastAccessTime) "UNKNOWN" else new Date(lastAccessTime).toString
+}
+map.put("Last Access", lastAccess)
--- End diff --

No need for the `val lastAccess`?
```scala
map.put("Last Access",
  if (-1 == lastAccessTime) "UNKNOWN" else new Date(lastAccessTime).toString)
```
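
For reference, the same check can be sketched as a standalone helper, which makes the sentinel handling easy to exercise outside the catalog class. The object name `LastAccessFormat` and the method `describe` are hypothetical and used only for illustration; the `-1` sentinel check is the one from the diff above.

```scala
import java.util.Date

// Hypothetical standalone helper; not part of CatalogTablePartition.
object LastAccessFormat {
  // Renders a last-access timestamp (epoch millis), treating the -1
  // sentinel as "never recorded" rather than formatting it as a date.
  def describe(lastAccessTime: Long): String =
    if (lastAccessTime == -1L) "UNKNOWN" else new Date(lastAccessTime).toString
}
```

With this shape, `LastAccessFormat.describe(-1L)` yields `"UNKNOWN"`, while any other value is formatted with `Date.toString` in the JVM's default time zone.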


---




[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...

2018-07-16 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/21775#discussion_r202704259
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -2248,4 +2249,20 @@ class HiveDDLSuite
   checkAnswer(spark.table("t4"), Row(0, 0))
 }
   }
+
+  test("desc formatted table for last access verification") {
+withTable("t1") {
+  sql(s"create table" +
+s" if not exists t1 (c1_int int, c2_string string, c3_float float)")
+  val desc = sql("DESC FORMATTED t1").collect().toSeq
+  val lastAcessField = desc.filter((r: Row) => r.getValuesMap(Seq("col_name"))
+.get("col_name").getOrElse("").equals("Last Access"))
+  // Check whether lastAcessField key is exist
+  assert(!lastAcessField.isEmpty)
+  val validLastAcessFieldValue = lastAcessField.filterNot((r: Row) => ((r
+.getValuesMap(Seq("data_type"))
+.get("data_type").contains(new Date(-1).toString))))
+  assert(lastAcessField.size!=0)
--- End diff --

Code style nit: add a space before and after `!=`.


---




[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...

2018-07-16 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/21775#discussion_r202703948
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -2248,4 +2249,20 @@ class HiveDDLSuite
   checkAnswer(spark.table("t4"), Row(0, 0))
 }
   }
+
+  test("desc formatted table for last access verification") {
+withTable("t1") {
+  sql(s"create table" +
+s" if not exists t1 (c1_int int, c2_string string, c3_float float)")
+  val desc = sql("DESC FORMATTED t1").collect().toSeq
+  val lastAcessField = desc.filter((r: Row) => r.getValuesMap(Seq("col_name"))
+.get("col_name").getOrElse("").equals("Last Access"))
+  // Check whether lastAcessField key is exist
+  assert(!lastAcessField.isEmpty)
+  val validLastAcessFieldValue = lastAcessField.filterNot((r: Row) => ((r
--- End diff --

Where is the val `validLastAcessFieldValue` used? It is computed but never referenced afterwards.
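
One way the filtered value could feed the final assertion is sketched below. This is a Spark-free illustration, not the test in the PR: `DescRow` is a hypothetical stand-in for a row of `DESC FORMATTED` output (with `colName` and `dataType` mirroring the `col_name` and `data_type` columns), and `validLastAccess` is an assumed helper name.

```scala
import java.util.Date

object LastAccessCheck {
  // Hypothetical stand-in for one row of DESC FORMATTED output.
  final case class DescRow(colName: String, dataType: String)

  // Keeps the "Last Access" rows whose value is NOT the sentinel date string
  // produced by formatting -1 milliseconds since the epoch.
  def validLastAccess(rows: Seq[DescRow]): Seq[DescRow] = {
    val sentinel = new Date(-1L).toString
    rows
      .filter(_.colName == "Last Access")
      .filterNot(_.dataType.contains(sentinel))
  }

  def main(args: Array[String]): Unit = {
    val fixed = Seq(DescRow("Last Access", "UNKNOWN"))
    val broken = Seq(DescRow("Last Access", new Date(-1L).toString))
    // The assertion uses the filtered result, so the val is no longer dead code.
    assert(validLastAccess(fixed).nonEmpty)
    assert(validLastAccess(broken).isEmpty)
  }
}
```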


---




[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...

2018-07-16 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/21775#discussion_r202701870
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ---
@@ -2248,4 +2249,20 @@ class HiveDDLSuite
   checkAnswer(spark.table("t4"), Row(0, 0))
 }
   }
+
+  test("desc formatted table for last access verification") {
+withTable("t1") {
+  sql(s"create table" +
+s" if not exists t1 (c1_int int, c2_string string, c3_float float)")
+  val desc = sql("DESC FORMATTED t1").collect().toSeq
+  val lastAcessField = desc.filter((r: Row) => r.getValuesMap(Seq("col_name"))
--- End diff --

nit: typo, this should be `lastAccessField`.


---




[GitHub] spark pull request #21775: [SPARK-24812][SQL] Last Access Time in the table ...

2018-07-15 Thread sujith71955
GitHub user sujith71955 opened a pull request:

https://github.com/apache/spark/pull/21775

[SPARK-24812][SQL] Last Access Time in the table description is not valid

## What changes were proposed in this pull request?
Last Access Time is always displayed as the wrong date, Wed Dec 31 15:59:59 PST 1969, when a user runs the DESC FORMATTED table command.
In Hive it is displayed as "UNKNOWN", which makes more sense than displaying a wrong date. This seems to be a limitation as of now, even in Hive, so it is better to follow the Hive behavior until the limitation is resolved in Hive.
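
To illustrate where that date comes from, here is a minimal sketch assuming the stored last-access value is the `-1` sentinel checked in this patch. The object `EpochSentinelDemo` and the method `render` are hypothetical names; the pattern string mirrors the layout that `java.util.Date.toString` uses, with the time zone pinned explicitly instead of relying on the JVM default.

```scala
import java.text.SimpleDateFormat
import java.util.{Date, Locale, TimeZone}

object EpochSentinelDemo {
  // Formats epoch milliseconds in the same layout as Date.toString,
  // but in an explicitly chosen time zone.
  def render(millis: Long, zoneId: String): String = {
    val fmt = new SimpleDateFormat("EEE MMM dd HH:mm:ss zzz yyyy", Locale.US)
    fmt.setTimeZone(TimeZone.getTimeZone(zoneId))
    fmt.format(new Date(millis))
  }

  def main(args: Array[String]): Unit = {
    // -1 ms is one millisecond before the Unix epoch; in Pacific time this
    // is the date quoted in the description: Wed Dec 31 15:59:59 PST 1969
    println(render(-1L, "America/Los_Angeles"))
  }
}
```

The displayed value therefore depends on the time zone of the machine running the command, which is another reason a fixed `UNKNOWN` marker is easier to reason about than the formatted sentinel.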

## How was this patch tested?
A unit test has been added which makes sure that the wrong date "Wed Dec 31 15:59:59 PST 1969" is not used as the value of the Last Access property.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sujith71955/spark master_hive

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21775.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21775


commit 502bb924fdfd4a9593154ce72a128ca869c92fbe
Author: s71955 
Date:   2018-07-16T07:40:53Z

[SPARK-24812][SQL] Last Access Time in the table description is not valid

## What changes were proposed in this pull request?
Last Access Time is always displayed as the wrong date, Wed Dec 31 15:59:59 PST 1969, when a user runs the DESC FORMATTED table command.
In Hive it is displayed as "UNKNOWN", which makes more sense than displaying a wrong date. This seems to be a limitation as of now, so it is better to follow the Hive behavior until the limitation is resolved in Hive.

## How was this patch tested?
A unit test has been added which makes sure that the wrong date "Wed Dec 31 15:59:59 PST 1969" is not used as the value of the Last Access property.




---
