[jira] [Commented] (PARQUET-323) INT96 should be marked as deprecated

2019-06-25 Thread Wes McKinney (JIRA)


[ 
https://issues.apache.org/jira/browse/PARQUET-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872653#comment-16872653
 ] 

Wes McKinney commented on PARQUET-323:
--

[~liusztc09] Support for nanoseconds in the INT64 type has been added:

https://github.com/apache/parquet-format/commit/b879065ac1bee3fe1d770eb3c4b60ab4267044d7

This is the recommended path forward.
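
For readers migrating existing data, here is a minimal sketch of turning a legacy INT96 timestamp into an INT64 nanosecond count. It assumes the layout commonly written by Impala, Hive and Spark (8 bytes of nanoseconds within the day followed by a 4-byte Julian day number, both little endian), which is not spelled out in this thread. The function and constant names are made up for illustration and are not part of any Parquet library.

```python
import struct

# Assumption: Julian day number of 1970-01-01, as used by legacy INT96 writers.
JULIAN_DAY_OF_UNIX_EPOCH = 2_440_588
NANOS_PER_DAY = 86_400 * 1_000_000_000


def int96_to_int64_nanos(raw: bytes) -> int:
    """Convert a 12-byte legacy INT96 timestamp to nanoseconds since the Unix epoch."""
    if len(raw) != 12:
        raise ValueError("INT96 values are exactly 12 bytes")
    # Little endian, no padding: 8 bytes nanoseconds of day, then 4 bytes Julian day.
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    nanos = (julian_day - JULIAN_DAY_OF_UNIX_EPOCH) * NANOS_PER_DAY + nanos_of_day
    # INT64 nanoseconds cover a narrower range than INT96, so reject values that do not fit.
    if not -(2**63) <= nanos < 2**63:
        raise OverflowError("timestamp does not fit in a signed 64-bit nanosecond count")
    return nanos


# One nanosecond after midnight on 1970-01-01.
raw = struct.pack("<qi", 1, JULIAN_DAY_OF_UNIX_EPOCH)
print(int96_to_int64_nanos(raw))  # prints 1
```

The range check matters because some values representable in INT96 fall outside what a signed 64-bit nanosecond count can hold.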

> INT96 should be marked as deprecated
> 
>
> Key: PARQUET-323
> URL: https://issues.apache.org/jira/browse/PARQUET-323
> Project: Parquet
> Issue Type: Bug
> Components: parquet-format
> Reporter: Cheng Lian
> Assignee: Lars Volker
>Priority: Major
> Fix For: format-2.5.0
>
>
> As discussed in the mailing list, {{INT96}} is only used to represent nanosecond 
> timestamps in Impala for historical reasons, and should be deprecated. 
> Since nanosecond precision is rarely a real requirement, one possible and simple 
> solution would be replacing {{INT96}} with {{INT64 (TIMESTAMP_MILLIS)}} or 
> {{INT64 (TIMESTAMP_MICROS)}}.
> Several projects (Impala, Hive, Spark, ...) support INT96.
> We need a clear spec of the replacement and the path to deprecation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PARQUET-323) INT96 should be marked as deprecated

2019-06-25 Thread Songzhi Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/PARQUET-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872626#comment-16872626
 ] 

Songzhi Liu commented on PARQUET-323:
-

[~lv] Hi Lars, is the community planning to deprecate INT96 support? If so, 
what will be used to support nanosecond precision? In some financial services 
use cases, nanosecond precision is critical. Could you please advise on the 
deprecation plan?
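
For context on what a 64-bit nanosecond timestamp can cover, the following is a rough back-of-the-envelope sketch. It assumes a signed 64-bit count of nanoseconds since the Unix epoch. The helper name is hypothetical, and Python's datetime only resolves microseconds, so the bounds are truncated accordingly.

```python
from datetime import datetime, timedelta, timezone

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def nanos_to_datetime(nanos: int) -> datetime:
    """Convert nanoseconds since the epoch to a datetime, floored to microseconds."""
    return EPOCH + timedelta(microseconds=nanos // 1_000)


earliest = nanos_to_datetime(INT64_MIN)
latest = nanos_to_datetime(INT64_MAX)
# Total span in years, using the mean Gregorian year length.
span_years = (INT64_MAX - INT64_MIN) / (365.2425 * 86_400 * 1e9)

print(earliest.date(), latest.date(), round(span_years))
```

Under these assumptions the representable range runs from 1677 to 2262, which is likely enough for most trading data but not for arbitrary historical or far-future dates.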

 



[jira] [Commented] (PARQUET-323) INT96 should be marked as deprecated

2018-03-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410496#comment-16410496
 ] 

ASF GitHub Bot commented on PARQUET-323:


lekv closed pull request #86: PARQUET-323: Mark INT96 as deprecated
URL: https://github.com/apache/parquet-format/pull/86
 
 
   

This PR was merged from a forked repository. Because GitHub hides the
original diff on merge, it is reproduced below for the sake of provenance:

diff --git a/Encodings.md b/Encodings.md
index 28429be7..b8905bf4 100644
--- a/Encodings.md
+++ b/Encodings.md
@@ -34,7 +34,7 @@ stores the data in the following format:
  - BOOLEAN: [Bit Packed](#RLE), LSB first
  - INT32: 4 bytes little endian
  - INT64: 8 bytes little endian
- - INT96: 12 bytes little endian
+ - INT96: 12 bytes little endian (deprecated)
  - FLOAT: 4 bytes IEEE little endian
  - DOUBLE: 8 bytes IEEE little endian
  - BYTE_ARRAY: length in 4 bytes little endian followed by the bytes contained in the array
diff --git a/src/main/thrift/parquet.thrift b/src/main/thrift/parquet.thrift
index 195ff908..4d2e7001 100644
--- a/src/main/thrift/parquet.thrift
+++ b/src/main/thrift/parquet.thrift
@@ -33,7 +33,7 @@ enum Type {
   BOOLEAN = 0;
   INT32 = 1;
   INT64 = 2;
-  INT96 = 3;
+  INT96 = 3;  // deprecated, only used by legacy implementations.
   FLOAT = 4;
   DOUBLE = 5;
   BYTE_ARRAY = 6;


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (PARQUET-323) INT96 should be marked as deprecated

2018-03-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16397196#comment-16397196
 ] 

ASF GitHub Bot commented on PARQUET-323:


rdblue commented on issue #86: PARQUET-323: Mark INT96 as deprecated
URL: https://github.com/apache/parquet-format/pull/86#issuecomment-372726185
 
 
   +1




[jira] [Commented] (PARQUET-323) INT96 should be marked as deprecated

2017-03-14 Thread Jeff Knupp (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924999#comment-15924999
 ] 

Jeff Knupp commented on PARQUET-323:


There is a PR against Spark that adds support for TIMESTAMP_MILLIS 
(https://github.com/apache/spark/pull/15332) but after a good deal of 
commentary and a working patch, it has sat untouched since October. Perhaps 
simply asking in a Spark JIRA ticket to re-review that PR would be the quickest 
way for Spark to support reading int64 TIMESTAMP_MILLIS?



[jira] [Commented] (PARQUET-323) INT96 should be marked as deprecated

2017-03-09 Thread Jeff Knupp (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904007#comment-15904007
 ] 

Jeff Knupp commented on PARQUET-323:


I'm reasonably sure INT96 is the only timestamp format that Spark currently 
supports (see https://github.com/apache/spark/pull/3820).



[jira] [Commented] (PARQUET-323) INT96 should be marked as deprecated

2017-03-08 Thread Lars Volker (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902218#comment-15902218
 ] 

Lars Volker commented on PARQUET-323:
-

We discussed this issue in today's Parquet sync and agreed to deprecate INT96. 
As a replacement for storing timestamps (the most common use for INT96), we will 
encourage all projects that currently use INT96 to switch to INT64 with either 
the TIMESTAMP_MILLIS or TIMESTAMP_MICROS logical type.

We will not fix the ordering issues around INT96 that resulted in parquet-mr 
writing wrong min/max statistics.
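
To illustrate why ordering is awkward for this type, here is a small sketch. It is not a reproduction of parquet-mr's comparator. It only shows that a generic byte-wise comparison of the 12-byte little-endian values disagrees with chronological order, assuming the usual legacy layout (8 bytes of nanoseconds within the day, then a 4-byte Julian day). The helper names and sample day numbers are invented for the example.

```python
import struct


def encode_int96(julian_day: int, nanos_of_day: int) -> bytes:
    """Pack a timestamp in the assumed legacy INT96 layout (little endian)."""
    return struct.pack("<qi", nanos_of_day, julian_day)


def chronological_key(raw: bytes) -> tuple:
    """Sort key that compares the day first, then the nanoseconds within the day."""
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    return (julian_day, nanos_of_day)


earlier = encode_int96(2_458_660, 2)  # 2 ns into the first day
later = encode_int96(2_458_661, 1)    # 1 ns into the following day
values = [earlier, later]

# Byte-wise comparison looks at the low-order nanosecond byte first,
# so it ranks the later timestamp below the earlier one.
bytewise_min = min(values)
chronological_min = min(values, key=chronological_key)

print(bytewise_min == later, chronological_min == earlier)
```

Any min/max statistics computed with the byte-wise ordering would therefore be unusable for pruning by time range, which is one reason to prefer the INT64 logical types, whose natural integer order matches time order.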



[jira] [Commented] (PARQUET-323) INT96 should be marked as deprecated

2016-08-03 Thread Ryan Blue (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406334#comment-15406334
 ] 

Ryan Blue commented on PARQUET-323:
---

+1



[jira] [Commented] (PARQUET-323) INT96 should be marked as deprecated

2016-08-03 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406330#comment-15406330
 ] 

Julien Le Dem commented on PARQUET-323:
---

I think we should deprecate it and discourage its use. For backward 
compatibility, it has to stay.
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md doesn't 
even refer to it.

