[GitHub] spark pull request #18252: [SPARK-17914][SQL] Fix parsing of timestamp strings with nanoseconds

2017-06-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18252





[GitHub] spark pull request #18252: [SPARK-17914][SQL] Fix parsing of timestamp strings with nanoseconds

2017-06-10 Thread aokolnychyi
Github user aokolnychyi commented on a diff in the pull request:

https://github.com/apache/spark/pull/18252#discussion_r121252397
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
@@ -399,13 +399,13 @@ object DateTimeUtils {
   digitsMilli += 1
 }
 
-if (!justTime && isInvalidDate(segments(0), segments(1), segments(2))) 
{
-  return None
+while (digitsMilli > 6) {
--- End diff --

@wzhfy done





[GitHub] spark pull request #18252: [SPARK-17914][SQL] Fix parsing of timestamp strings with nanoseconds

2017-06-10 Thread aokolnychyi
Github user aokolnychyi commented on a diff in the pull request:

https://github.com/apache/spark/pull/18252#discussion_r121251811
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
@@ -32,7 +32,7 @@ import org.apache.spark.unsafe.types.UTF8String
 * Helper functions for converting between internal and external date and time representations.
 * Dates are exposed externally as java.sql.Date and are represented internally as the number of
 * dates since the Unix epoch (1970-01-01). Timestamps are exposed externally as java.sql.Timestamp
- * and are stored internally as longs, which are capable of storing timestamps with 100 nanosecond
+ * and are stored internally as longs, which are capable of storing timestamps with microsecond
--- End diff --

Sure, but the previous comment, which was introduced in
[this](https://github.com/apache/spark/commit/6b7f2ceafdcbb014791909747c2210b527305df9)
commit, was no longer correct. The logic was changed in
[this](https://github.com/apache/spark/commit/a290814877308c6fa9b0f78b1a81145db7651ca4)
commit, and the precision is now microseconds.
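
For reference, a minimal sketch, assuming the convention the corrected comment
describes (microseconds since the Unix epoch in a Long); it shows how a
java.sql.Timestamp would map onto such a value and is not quoted from the PR:

```
import java.sql.Timestamp

// Microseconds since the Unix epoch: getTime supplies millisecond
// precision, and the remaining sub-millisecond microseconds come from
// getNanos. Digits finer than one microsecond are unrepresentable.
def toMicros(t: Timestamp): Long =
  t.getTime * 1000L + (t.getNanos / 1000) % 1000
```

Two Timestamps that differ only below one microsecond map to the same Long,
which is exactly the precision limit the corrected comment documents.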





[GitHub] spark pull request #18252: [SPARK-17914][SQL] Fix parsing of timestamp strings with nanoseconds

2017-06-09 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18252#discussion_r121246583
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
@@ -32,7 +32,7 @@ import org.apache.spark.unsafe.types.UTF8String
 * Helper functions for converting between internal and external date and time representations.
 * Dates are exposed externally as java.sql.Date and are represented internally as the number of
 * dates since the Unix epoch (1970-01-01). Timestamps are exposed externally as java.sql.Timestamp
- * and are stored internally as longs, which are capable of storing timestamps with 100 nanosecond
+ * and are stored internally as longs, which are capable of storing timestamps with microsecond
--- End diff --

100 ns is different from micro, isn't it?
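
(For the record, the unit arithmetic behind the question, not from the PR:)

```
// One second is 1e7 ticks of 100 ns but only 1e6 microseconds, so the
// old and new comments describe precisions that differ by a factor of 10.
val ticksPerSecondAt100ns = 10000000L
val microsPerSecond = 1000000L
println(ticksPerSecondAt100ns / microsPerSecond) // 10
```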





[GitHub] spark pull request #18252: [SPARK-17914][SQL] Fix parsing of timestamp strings with nanoseconds

2017-06-09 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/18252#discussion_r121240900
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
@@ -399,13 +399,13 @@ object DateTimeUtils {
   digitsMilli += 1
 }
 
-if (!justTime && isInvalidDate(segments(0), segments(1), segments(2))) 
{
-  return None
+while (digitsMilli > 6) {
--- End diff --

add a comment indicating we are truncating to microsecond precision and that
it's lossy?
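
For example, a self-contained sketch of that truncation with such a comment;
the loop body past the quoted `while` line is inferred from the final version
of the fix:

```
var digitsMilli = 9              // e.g. a ".123456789" fraction has 9 digits
val segments = Array.fill(7)(0)  // segments(6) holds the parsed fraction
segments(6) = 123456789

// We are truncating the nanosecond part, which results in loss of
// precision: digits below one microsecond are simply dropped.
while (digitsMilli > 6) {
  segments(6) /= 10
  digitsMilli -= 1
}
// segments(6) == 123456, digitsMilli == 6
```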





[GitHub] spark pull request #18252: [SPARK-17914][SQL] Fix parsing of timestamp strings with nanoseconds

2017-06-09 Thread aokolnychyi
GitHub user aokolnychyi opened a pull request:

https://github.com/apache/spark/pull/18252

[SPARK-17914][SQL] Fix parsing of timestamp strings with nanoseconds

The PR contains a tiny change to fix the way Spark parses string literals
into timestamps. Currently, some timestamps that contain nanoseconds are
corrupted during the conversion from UTF8Strings into the internal
timestamp representation.

Consider the following example:
```
spark.sql("SELECT cast('2015-01-02 00:00:00.1' as 
TIMESTAMP)").show(false)
++
|CAST(2015-01-02 00:00:00.1 AS TIMESTAMP)|
++
|2015-01-02 00:00:00.01  |
++
```

The fix was tested with existing tests. Also, there is a new test to cover 
cases that did not work previously.
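
For instance, a check in the spirit of those tests; the literal below is
illustrative rather than copied from the PR, and assumes a running `spark`
session as in the example above:

```
// After the fix, fractional digits beyond microsecond precision are
// truncated rather than corrupting the value: of the nine digits below,
// only the first six survive.
spark.sql("SELECT cast('2015-01-02 00:00:00.123456789' as TIMESTAMP)").show(false)
// Expected: 2015-01-02 00:00:00.123456
```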

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aokolnychyi/spark spark-17914

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18252.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18252


commit 2f232a7bda28fb42759ee35923044f886a1ff19e
Author: aokolnychyi 
Date:   2017-06-08T18:52:14Z

[SPARK-17914][SQL] Fix parsing of timestamp strings with nanoseconds



