Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19769#discussion_r152049670

    --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java ---
    @@ -298,7 +304,10 @@ private void decodeDictionaryIds(
               // TODO: Convert dictionary of Binaries to dictionary of Longs
               if (!column.isNullAt(i)) {
                 Binary v = dictionary.decodeToBinary(dictionaryIds.getDictId(i));
    -            column.putLong(i, ParquetRowConverter.binaryToSQLTimestamp(v));
    +            long rawTime = ParquetRowConverter.binaryToSQLTimestamp(v);
    +            long adjTime =
    --- End diff --

    oh excellent point. we'd just need to store an additional `int -> long` map, but given that we've already got the dictionary this seems reasonable
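The `int -> long` map idea above could be sketched roughly as follows: since dictionary ids are dense integers, the per-entry decode-and-adjust work can be done once up front into a flat `long[]` table, so each row becomes a plain array lookup. This is a minimal illustration only; the class, method names, and the fake decode step below are hypothetical stand-ins, not Spark's or Parquet's actual API.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of caching timezone-adjusted timestamps per
// dictionary id, instead of converting on every row.
public class DictTimestampCache {

    // Stand-in for decoding a dictionary entry to raw micros-since-epoch
    // (in the real code this would be binaryToSQLTimestamp on a Binary).
    static long decodeToRawMicros(long[] dictionary, int dictId) {
        return dictionary[dictId];
    }

    // One-time pass over the dictionary: build the int -> long map as a
    // flat array indexed by dictionary id.
    static long[] buildAdjustedTable(long[] dictionary, long tzOffsetMicros) {
        long[] adjusted = new long[dictionary.length];
        for (int id = 0; id < dictionary.length; id++) {
            adjusted[id] = decodeToRawMicros(dictionary, id) + tzOffsetMicros;
        }
        return adjusted;
    }

    public static void main(String[] args) {
        long[] dictionary = {1_000_000L, 2_000_000L};   // raw micros values
        long offset = TimeUnit.HOURS.toMicros(1);       // example tz offset
        long[] table = buildAdjustedTable(dictionary, offset);

        // Per-row decode is now just an array lookup by dictionary id.
        int[] dictionaryIds = {0, 1, 0};
        for (int dictId : dictionaryIds) {
            System.out.println(table[dictId]);
        }
    }
}
```

The cost is one extra `long` per dictionary entry, which is small relative to the dictionary itself; that is the tradeoff the comment judges "reasonable".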