[jira] [Updated] (HIVE-9482) Hive parquet timestamp compatibility
[ https://issues.apache.org/jira/browse/HIVE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szehon Ho updated HIVE-9482:
    Labels: (was: TODOC1.2)

Done. Added a new section for [Parquet|https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Parquet] that mentions this property.

> Hive parquet timestamp compatibility
>
> Key: HIVE-9482
> URL: https://issues.apache.org/jira/browse/HIVE-9482
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 0.15.0
> Reporter: Szehon Ho
> Assignee: Szehon Ho
> Fix For: 1.2.0
> Attachments: HIVE-9482.2.patch, HIVE-9482.patch, HIVE-9482.patch, parquet_external_time.parq
>
> In the current Hive implementation, timestamps are stored in UTC (converted from the current timezone), based on the original Parquet timestamp spec.
> However, we find this is not compatible with other tools, and after some investigation it is not how the other file formats, or even some databases, behave (Hive Timestamp is closer to the 'timestamp without timezone' datatype).
> This is the first part of the fix, which will restore compatibility with Parquet timestamp files generated by external tools by skipping conversion on reading.
> A later fix will change the write path to not convert, and stop the read conversion even for files written by Hive itself.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
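The incompatibility described in the issue can be illustrated with a small Python model. This is only a sketch of the idea, not Hive's actual code: the function names, the fixed UTC-8 session time zone, and the sample timestamp are all assumptions made for illustration, and the `skip_conversion` flag merely mirrors the intent of the new property.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical session time zone (UTC-8); a fixed offset keeps the sketch self-contained.
LOCAL = timezone(timedelta(hours=-8))


def write_with_conversion(wall_clock: datetime) -> datetime:
    """Model of a writer that treats the naive value as local time and stores UTC."""
    return wall_clock.replace(tzinfo=LOCAL).astimezone(timezone.utc).replace(tzinfo=None)


def write_without_conversion(wall_clock: datetime) -> datetime:
    """Model of an external writer that stores the naive wall-clock value unchanged."""
    return wall_clock


def read(stored: datetime, skip_conversion: bool) -> datetime:
    """Model of the read path; skip_conversion=True returns the stored value as-is."""
    if skip_conversion:
        return stored
    # Otherwise assume the stored value is UTC and convert it back to local time.
    return stored.replace(tzinfo=timezone.utc).astimezone(LOCAL).replace(tzinfo=None)


original = datetime(2015, 1, 27, 12, 0, 0)  # arbitrary sample value

hive_file = write_with_conversion(original)
external_file = write_without_conversion(original)

hive_roundtrip = read(hive_file, skip_conversion=False)        # matches original
external_shifted = read(external_file, skip_conversion=False)  # off by the UTC offset
external_fixed = read(external_file, skip_conversion=True)     # matches original

print(hive_roundtrip, external_shifted, external_fixed)
```

In this model, a file written with conversion round-trips correctly only when the reader converts back, while a file written without conversion is shifted by the session's UTC offset unless the reader skips the conversion.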
[jira] [Updated] (HIVE-9482) Hive parquet timestamp compatibility
[ https://issues.apache.org/jira/browse/HIVE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szehon Ho updated HIVE-9482:
    Labels: TODOC1.2 (was: )

Adds property "hive.parquet.timestamp.skip.conversion", which needs to be documented.

> Hive parquet timestamp compatibility
>
> Key: HIVE-9482
> URL: https://issues.apache.org/jira/browse/HIVE-9482
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 0.15.0
> Reporter: Szehon Ho
> Assignee: Szehon Ho
> Labels: TODOC1.2
> Fix For: 1.2.0
> Attachments: HIVE-9482.2.patch, HIVE-9482.patch, HIVE-9482.patch, parquet_external_time.parq
[jira] [Updated] (HIVE-9482) Hive parquet timestamp compatibility
[ https://issues.apache.org/jira/browse/HIVE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szehon Ho updated HIVE-9482:
    Resolution: Fixed
    Fix Version/s: 1.2.0 (was: 0.15.0)
    Status: Resolved (was: Patch Available)

Committed to trunk. Thanks to Brock for the review.

> Hive parquet timestamp compatibility
>
> Key: HIVE-9482
> URL: https://issues.apache.org/jira/browse/HIVE-9482
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 0.15.0
> Reporter: Szehon Ho
> Assignee: Szehon Ho
> Labels: TODOC1.2
> Fix For: 1.2.0
> Attachments: HIVE-9482.2.patch, HIVE-9482.patch, HIVE-9482.patch, parquet_external_time.parq
[jira] [Updated] (HIVE-9482) Hive parquet timestamp compatibility
[ https://issues.apache.org/jira/browse/HIVE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szehon Ho updated HIVE-9482:
    Attachment: HIVE-9482.2.patch

Address review comments.

> Hive parquet timestamp compatibility
>
> Key: HIVE-9482
> URL: https://issues.apache.org/jira/browse/HIVE-9482
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 0.15.0
> Reporter: Szehon Ho
> Assignee: Szehon Ho
> Fix For: 0.15.0
> Attachments: HIVE-9482.2.patch, HIVE-9482.patch, HIVE-9482.patch, parquet_external_time.parq
[jira] [Updated] (HIVE-9482) Hive parquet timestamp compatibility
[ https://issues.apache.org/jira/browse/HIVE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szehon Ho updated HIVE-9482:
    Attachment: HIVE-9482.patch

Attaching again to trigger the tests.

> Hive parquet timestamp compatibility
>
> Key: HIVE-9482
> URL: https://issues.apache.org/jira/browse/HIVE-9482
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 0.15.0
> Reporter: Szehon Ho
> Assignee: Szehon Ho
> Fix For: 0.15.0
> Attachments: HIVE-9482.patch, HIVE-9482.patch, parquet_external_time.parq
[jira] [Updated] (HIVE-9482) Hive parquet timestamp compatibility
[ https://issues.apache.org/jira/browse/HIVE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szehon Ho updated HIVE-9482:
    Attachment: parquet_external_time.parq

Attaching the new data file, which is binary and cannot be displayed in the patch. This should go in /data/files

> Hive parquet timestamp compatibility
>
> Key: HIVE-9482
> URL: https://issues.apache.org/jira/browse/HIVE-9482
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 0.15.0
> Reporter: Szehon Ho
> Assignee: Szehon Ho
> Fix For: 0.15.0
> Attachments: HIVE-9482.patch, parquet_external_time.parq
[jira] [Updated] (HIVE-9482) Hive parquet timestamp compatibility
[ https://issues.apache.org/jira/browse/HIVE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szehon Ho updated HIVE-9482:
    Status: Patch Available (was: Open)

> Hive parquet timestamp compatibility
>
> Key: HIVE-9482
> URL: https://issues.apache.org/jira/browse/HIVE-9482
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 0.15.0
> Reporter: Szehon Ho
> Assignee: Szehon Ho
> Fix For: 0.15.0
> Attachments: HIVE-9482.patch, parquet_external_time.parq
[jira] [Updated] (HIVE-9482) Hive parquet timestamp compatibility
[ https://issues.apache.org/jira/browse/HIVE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szehon Ho updated HIVE-9482:
    Attachment: HIVE-9482.patch

> Hive parquet timestamp compatibility
>
> Key: HIVE-9482
> URL: https://issues.apache.org/jira/browse/HIVE-9482
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 0.15.0
> Reporter: Szehon Ho
> Assignee: Szehon Ho
> Fix For: 0.15.0
> Attachments: HIVE-9482.patch