gustavoatt commented on issue #1138:
URL: https://github.com/apache/iceberg/issues/1138#issuecomment-675155272
> Currently, there is no adjustment to int96 timestamp values. And I believe that any adjustment Spark makes is based on the current session time zone. If there needs to be an adjustment to imported int96 timestamps for Impala or any other writer that was incorrect, then I think it makes sense to add a static offset for all int96 values, assuming that all of them were written the same way.
>
> I'm happy to not add this if no one needs it, but I think it is the remaining piece to solve any problems that might come up.

Yes, Spark makes the adjustment based on the current session timezone:
```scala
// PARQUET_INT96_TIMESTAMP_CONVERSION says to apply timezone conversions to int96 timestamps'
// *only* if the file was created by something other than "parquet-mr", so check the actual
// writer here for this file. We have to do this per-file, as each file in the table may
// have different writers.
// Define isCreatedByParquetMr as function to avoid unnecessary parquet footer reads.
def isCreatedByParquetMr: Boolean =
  footerFileMetaData.getCreatedBy().startsWith("parquet-mr")
val convertTz =
  if (timestampConversion && !isCreatedByParquetMr) {
    Some(DateTimeUtils.getTimeZone(sharedConf.get(SQLConf.SESSION_LOCAL_TIMEZONE.key)))
  } else {
    None
  }
```
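
For reference, the two settings driving that decision are `spark.sql.parquet.int96TimestampConversion` (surfaced above as `timestampConversion`) and `spark.sql.session.timeZone`. The following is only a minimal sketch, not Iceberg or Spark internals, of how the same decision could be reproduced against a plain `SparkSession`; it uses `java.util.TimeZone` instead of Spark's internal `DateTimeUtils`, and the `createdBy` strings are illustrative examples:

```scala
import java.util.TimeZone
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("int96-conversion-check")
  // Apply the time zone conversion only for files NOT written by parquet-mr.
  .config("spark.sql.parquet.int96TimestampConversion", "true")
  // Session time zone that the conversion is based on.
  .config("spark.sql.session.timeZone", "America/Los_Angeles")
  .getOrCreate()

// Mirrors the reader's decision: convert only when the feature is enabled
// and the file's created_by metadata is not parquet-mr (e.g. Impala files).
def convertTzFor(createdBy: String): Option[TimeZone] = {
  val timestampConversion =
    spark.conf.get("spark.sql.parquet.int96TimestampConversion").toBoolean
  val isCreatedByParquetMr = createdBy.startsWith("parquet-mr")
  if (timestampConversion && !isCreatedByParquetMr) {
    Some(TimeZone.getTimeZone(spark.conf.get("spark.sql.session.timeZone")))
  } else {
    None
  }
}

// Spark/Hive files carry a parquet-mr created_by, so no conversion is applied.
println(convertTzFor("parquet-mr version 1.10.1")) // None
// An Impala-written file would get the session time zone applied.
println(convertTzFor("impala version 3.4.0"))      // Some(<America/Los_Angeles>)
```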
On our end, we don't really need to add this offset, since we only have to deal with int96 timestamps written by either Spark or Hive.
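
For completeness, if a static offset for incorrectly written int96 values (e.g. from an older Impala writer) were ever needed, the adjustment could be as simple as adding a fixed number of microseconds after decoding. This is purely a hypothetical sketch; `writerOffsetMicros`, the `+7h` example, and the sign of the shift are assumptions, and nothing like this exists in Iceberg today:

```scala
import java.util.concurrent.TimeUnit

// Hypothetical sketch of a static int96 adjustment; none of these names exist in Iceberg.
object Int96StaticOffset {
  // Example: a writer that stored local wall-clock time for UTC-7 as if it were UTC
  // would need roughly +7 hours added back to recover the true UTC instant.
  val writerOffsetMicros: Long = TimeUnit.HOURS.toMicros(7)

  /** Apply the fixed offset to an int96 value already decoded to epoch microseconds. */
  def adjust(epochMicros: Long, applyOffset: Boolean): Long =
    if (applyOffset) epochMicros + writerOffsetMicros else epochMicros
}
```

Since all of our int96 data comes from parquet-mr based writers, that offset would effectively be zero for us.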