[ https://issues.apache.org/jira/browse/HIVE-26612?focusedWorklogId=818842&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-818842 ]
ASF GitHub Bot logged work on HIVE-26612:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Oct/22 15:43
            Start Date: 20/Oct/22 15:43
    Worklog Time Spent: 10m
      Work Description: sonarcloud[bot] commented on PR #3651:
URL: https://github.com/apache/hive/pull/3651#issuecomment-1285779523

   Kudos, SonarCloud Quality Gate passed!
   https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3651

   0 Bugs (rating A)
   0 Vulnerabilities (rating A)
   0 Security Hotspots (rating A)
   0 Code Smells (rating A)
   No Coverage information
   No Duplication information

Issue Time Tracking
-------------------

    Worklog Id:     (was: 818842)
    Time Spent: 50m  (was: 40m)

> Hive cannot read parquet files with int64 (TIMESTAMP_MILLIS)
> ------------------------------------------------------------
>
>                 Key: HIVE-26612
>                 URL: https://issues.apache.org/jira/browse/HIVE-26612
>             Project: Hive
>          Issue Type: Bug
>          Components: Database/Schema
>            Reporter: Steve Carlin
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> If a parquet file has a Type of "int64 eventtime (TIMESTAMP(MILLIS,true))",
> the following error is produced:
> {noformat}
> java.lang.RuntimeException: java.io.IOException: org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file file:/xxxx/hive/itests/qtest/target/tmp/parquet_format_ts_as_bigint/part-00000/timestamp_as_bigint.parquet
>     at org.apache.hadoop.hive.ql.exec.FetchTask.executeInner(FetchTask.java:213)
>     at org.apache.hadoop.hive.ql.exec.FetchTask.execute(FetchTask.java:98)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:212)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:154)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:149)
> Caused by: java.io.IOException: org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file file:/xxxx/hive/itests/qtest/target/tmp/parquet_format_ts_as_bigint/part-00000/timestamp_as_bigint.parquet
>     at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:624)
>     at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:531)
>     at org.apache.hadoop.hive.ql.exec.FetchTask.executeInner(FetchTask.java:197)
>     ... 55 more
> Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file file:/home/stamatis/Projects/Apache/hive/itests/qtest/target/tmp/parquet_format_ts_as_bigint/part-00000/timestamp_as_bigint.parquet
>     at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:255)
>     at org.apache.parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:207)
>     at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:87)
>     at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:89)
>     at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:771)
>     at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:335)
>     at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:562)
>     ... 57 more
> Caused by: java.lang.UnsupportedOperationException: org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter$10$1
>     at org.apache.parquet.io.api.PrimitiveConverter.addLong(PrimitiveConverter.java:105)
>     at org.apache.parquet.column.impl.ColumnReaderBase$2$4.writeValue(ColumnReaderBase.java:301)
>     at org.apache.parquet.column.impl.ColumnReaderBase.writeCurrentValueToConverter(ColumnReaderBase.java:410)
>     at org.apache.parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:30)
>     at org.apache.parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:406)
>     at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:230)
>     ... 63 more
> {noformat}
> The parquet file can be created with the following steps (through spark):
> spark.conf.set("spark.sql.parquet.outputTimestampType", "TIMESTAMP_MILLIS")
> spark.conf.set("spark.sql.legacy.parquet.int96RebaseModeInWrite", "LEGACY")
> spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInWrite", "LEGACY")
> spark.conf.set("spark.sql.legacy.parquet.int96RebaseModeInRead", "LEGACY")
> spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInRead", "LEGACY")
> [1]
> import java.sql.Timestamp
> val df = Seq(
>   (1, Timestamp.valueOf("2014-01-01 23:00:01")),
>   (1, Timestamp.valueOf("2014-11-30 12:40:32")),
>   (2, Timestamp.valueOf("2016-12-29 09:54:00")),
>   (2, Timestamp.valueOf("2016-05-09 10:12:43"))
> ).toDF("typeid","eventtime")
> [2]
> [root@c4839-node3 test_parquet2]# parquet-tools schema part-00001-6c90b794-90b9-4cc0-afc5-2e49a4e96bad-c000.snappy.parquet
> message spark_schema {
>   required int32 typeid;
>   optional int64 eventtime (TIMESTAMP(MILLIS,true));
> }
> [3]
> [root@c4839-node3 test_parquet1]# parquet-tools schema part-00001-cb1aeebb-ec87-4273-82ec-911c4fb605b6-c000.snappy.parquet
> message spark_schema {
>   required int32 typeid;
>   optional int96 eventtime;
> }

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
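
The innermost cause in the trace above is PrimitiveConverter.addLong throwing UnsupportedOperationException with the converter's class name as the message, which suggests that the converter chosen for the column does not override addLong. The sketch below illustrates that mechanism with simplified stand-in classes. BaseConverter and MillisTimestampConverter are hypothetical names, not Parquet or Hive classes, and this is not the change made in PR #3651. It assumes the int64 value is milliseconds since the Unix epoch in UTC, as the TIMESTAMP(MILLIS,true) annotation indicates.

```java
import java.time.Instant;

// Simplified stand-in for a Parquet primitive converter (hypothetical class):
// addLong rejects the value unless a subclass overrides it, and the exception
// message is the runtime class name, as seen in the stack trace.
class BaseConverter {
    public void addLong(long value) {
        throw new UnsupportedOperationException(getClass().getName());
    }
}

// Hypothetical converter for an int64 column annotated TIMESTAMP(MILLIS,true):
// the long is read as milliseconds since 1970-01-01T00:00:00Z.
class MillisTimestampConverter extends BaseConverter {
    private Instant current;

    @Override
    public void addLong(long value) {
        current = Instant.ofEpochMilli(value);
    }

    public Instant get() {
        return current;
    }
}

class TimestampMillisSketch {
    public static void main(String[] args) {
        // Illustrative value: epoch milliseconds for 2014-01-01T23:00:01Z.
        long millis = 1388617201000L;

        // Without an addLong override the read fails, as in the report.
        try {
            new BaseConverter().addLong(millis);
        } catch (UnsupportedOperationException e) {
            System.out.println("unsupported: " + e.getMessage());
        }

        // With the override the value decodes to a timestamp.
        MillisTimestampConverter converter = new MillisTimestampConverter();
        converter.addLong(millis);
        System.out.println(converter.get()); // prints 2014-01-01T23:00:01Z
    }
}
```

The sample value matches the first row of the Spark example only if the Spark session time zone is UTC, because Timestamp.valueOf interprets its argument in the local time zone.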