Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19769#discussion_r151557718

--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java ---
    @@ -298,7 +304,10 @@ private void decodeDictionaryIds(
               // TODO: Convert dictionary of Binaries to dictionary of Longs
               if (!column.isNullAt(i)) {
                 Binary v = dictionary.decodeToBinary(dictionaryIds.getDictId(i));
    -            column.putLong(i, ParquetRowConverter.binaryToSQLTimestamp(v));
    +            long rawTime = ParquetRowConverter.binaryToSQLTimestamp(v);
    +            long adjTime =
    +              convertTz == null ? rawTime : DateTimeUtils.convertTz(rawTime, convertTz, UTC);
    +            column.putLong(i, adjTime);
--- End diff --

Oh, good point. I suppose that to get test coverage for this, I'd have to try to generate a Parquet file without dictionary encoding from Impala ...
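For context, the ternary in the diff leaves the decoded timestamp untouched when no conversion zone is set, and otherwise shifts the raw epoch-microseconds between time zones. A minimal sketch of an equivalent wall-clock shift using only `java.time` (the `TzShiftSketch` class, the `shiftTz` helper, and the chosen zones are illustrative assumptions, not Spark's actual `DateTimeUtils.convertTz` implementation):

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class TzShiftSketch {
    // Reinterpret epoch-micros whose wall-clock reading is taken in fromTz
    // as the same wall-clock reading in toTz, returning adjusted epoch-micros.
    // This is the general shape of a timezone adjustment like the one in the
    // diff; Spark's DateTimeUtils may differ in details.
    static long shiftTz(long micros, ZoneId fromTz, ZoneId toTz) {
        Instant instant = Instant.ofEpochSecond(
            Math.floorDiv(micros, 1_000_000L),
            Math.floorMod(micros, 1_000_000L) * 1_000L);
        // Read the wall-clock time in the source zone ...
        LocalDateTime wallClock = LocalDateTime.ofInstant(instant, fromTz);
        // ... and pin that same wall-clock time to the target zone.
        Instant shifted = wallClock.atZone(toTz).toInstant();
        return shifted.getEpochSecond() * 1_000_000L
            + shifted.getNano() / 1_000L;
    }

    public static void main(String[] args) {
        long rawTime = 1_514_764_800L * 1_000_000L; // 2018-01-01 00:00:00 UTC
        // Reinterpreting UTC midnight as America/Los_Angeles midnight (UTC-8
        // in winter) moves the underlying instant 8 hours later.
        long adjTime = shiftTz(rawTime, ZoneOffset.UTC,
            ZoneId.of("America/Los_Angeles"));
        System.out.println((adjTime - rawTime) / 3_600_000_000L); // prints 8
    }
}
```

The `null` check in the diff matters because the conversion is only needed for files written by systems (like Impala) that store INT96 timestamps relative to a local zone; when `convertTz` is null the fast path writes the raw value unchanged.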