[ 
https://issues.apache.org/jira/browse/HIVE-26612?focusedWorklogId=818842&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-818842
 ]

ASF GitHub Bot logged work on HIVE-26612:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Oct/22 15:43
            Start Date: 20/Oct/22 15:43
    Worklog Time Spent: 10m 
      Work Description: sonarcloud[bot] commented on PR #3651:
URL: https://github.com/apache/hive/pull/3651#issuecomment-1285779523

   Kudos, SonarCloud Quality Gate passed!    [![Quality Gate 
passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png
 'Quality Gate 
passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3651)
   
   
[![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png
 
'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3651&resolved=false&types=BUG)
 
[![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png
 
'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3651&resolved=false&types=BUG)
 [0 
Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3651&resolved=false&types=BUG)
  
   
[![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png
 
'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3651&resolved=false&types=VULNERABILITY)
 
[![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png
 
'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3651&resolved=false&types=VULNERABILITY)
 [0 
Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3651&resolved=false&types=VULNERABILITY)
  
   [![Security 
Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png
 'Security 
Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3651&resolved=false&types=SECURITY_HOTSPOT)
 
[![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png
 
'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3651&resolved=false&types=SECURITY_HOTSPOT)
 [0 Security 
Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3651&resolved=false&types=SECURITY_HOTSPOT)
  
   [![Code 
Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png
 'Code 
Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3651&resolved=false&types=CODE_SMELL)
 
[![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png
 
'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3651&resolved=false&types=CODE_SMELL)
 [0 Code 
Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3651&resolved=false&types=CODE_SMELL)
   
   [![No Coverage 
information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png
 'No Coverage 
information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3651&metric=coverage&view=list)
 No Coverage information  
   [![No Duplication 
information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png
 'No Duplication 
information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3651&metric=duplicated_lines_density&view=list)
 No Duplication information
   
   




Issue Time Tracking
-------------------

    Worklog Id:     (was: 818842)
    Time Spent: 50m  (was: 40m)

> Hive cannot read parquet files with int64 (TIMESTAMP_MILLIS)
> ------------------------------------------------------------
>
>                 Key: HIVE-26612
>                 URL: https://issues.apache.org/jira/browse/HIVE-26612
>             Project: Hive
>          Issue Type: Bug
>          Components: Database/Schema
>            Reporter: Steve Carlin
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> If a parquet file has a Type of "int64 eventtime (TIMESTAMP(MILLIS,true))", 
> the following error is produced:
> {noformat}
> java.lang.RuntimeException: java.io.IOException: 
> org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in 
> block 0 in file 
> file:/xxxx/hive/itests/qtest/target/tmp/parquet_format_ts_as_bigint/part-00000/timestamp_as_bigint.parquet
>       at 
> org.apache.hadoop.hive.ql.exec.FetchTask.executeInner(FetchTask.java:213)
>       at org.apache.hadoop.hive.ql.exec.FetchTask.execute(FetchTask.java:98)
>       at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:212)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:154)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:149)
> Caused by: java.io.IOException: 
> org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in 
> block 0 in file 
> file:/xxxx/hive/itests/qtest/target/tmp/parquet_format_ts_as_bigint/part-00000/timestamp_as_bigint.parquet
>       at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:624)
>       at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:531)
>       at 
> org.apache.hadoop.hive.ql.exec.FetchTask.executeInner(FetchTask.java:197)
>       ... 55 more
> Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read value 
> at 1 in block 0 in file 
> file:/home/stamatis/Projects/Apache/hive/itests/qtest/target/tmp/parquet_format_ts_as_bigint/part-00000/timestamp_as_bigint.parquet
>       at 
> org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:255)
>       at 
> org.apache.parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:207)
>       at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:87)
>       at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:89)
>       at 
> org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:771)
>       at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:335)
>       at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:562)
>       ... 57 more
> Caused by: java.lang.UnsupportedOperationException: 
> org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter$10$1
>       at 
> org.apache.parquet.io.api.PrimitiveConverter.addLong(PrimitiveConverter.java:105)
>       at 
> org.apache.parquet.column.impl.ColumnReaderBase$2$4.writeValue(ColumnReaderBase.java:301)
>       at 
> org.apache.parquet.column.impl.ColumnReaderBase.writeCurrentValueToConverter(ColumnReaderBase.java:410)
>       at 
> org.apache.parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:30)
>       at 
> org.apache.parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:406)
>       at 
> org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:230)
>       ... 63 more
> {noformat}
> The parquet file can be created with the following steps (through spark):
> spark.conf.set("spark.sql.parquet.outputTimestampType", "TIMESTAMP_MILLIS")
> spark.conf.set("spark.sql.legacy.parquet.int96RebaseModeInWrite", "LEGACY")
> spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInWrite", "LEGACY")
> spark.conf.set("spark.sql.legacy.parquet.int96RebaseModeInRead", "LEGACY")
> spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInRead", "LEGACY")
> [1]
> val df = Seq(
> (1, Timestamp.valueOf("2014-01-01 23:00:01")),
> (1, Timestamp.valueOf("2014-11-30 12:40:32")),
> (2, Timestamp.valueOf("2016-12-29 09:54:00")),
> (2, Timestamp.valueOf("2016-05-09 10:12:43"))
> ).toDF("typeid","eventtime")
> [2]
> [root@c4839-node3 test_parquet2]# parquet-tools schema 
> part-00001-6c90b794-90b9-4cc0-afc5-2e49a4e96bad-c000.snappy.parquet
> message spark_schema {
> required int32 typeid;
> optional int64 eventtime (TIMESTAMP(MILLIS,true));
> }
> [3]
> [root@c4839-node3 test_parquet1]# parquet-tools schema 
> part-00001-cb1aeebb-ec87-4273-82ec-911c4fb605b6-c000.snappy.parquet
> message spark_schema {
> required int32 typeid;
> optional int96 eventtime;
> }



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to