[GitHub] [hudi] s-sanjay commented on issue #1895: HUDI Dataset backed by Hive Metastore fails on Presto with Unknown converted type TIMESTAMP_MICROS

2020-09-05 Thread GitBox


s-sanjay commented on issue #1895:
URL: https://github.com/apache/hudi/issues/1895#issuecomment-687637620


   @FelixKJose did you check my comment? Hudi just uses Spark's 
[SchemaConverters](https://github.com/apache/spark/blob/master/external/avro/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala#L150),
 which does not support a user setting to choose millis or micros. We would 
have to fix it in Spark and then upgrade Hudi's Spark dependency to pick up 
that change. It is also not easy to copy that function into Hudi, because the 
code is called from multiple code paths. Fixing Presto makes the most sense, 
since even without Hudi, if someone were to use micros, the query would fail. 
   However, I do agree we need to document this, because even with the Presto 
fix it may not be possible for everyone to upgrade Presto or cherry-pick the fix.
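
   Until this is documented, one way to see which columns would trip Presto is 
to inspect the Avro schema stored with the file and look for the 
`timestamp-micros` logical type. The following is only a minimal sketch: the 
helper name `find_timestamp_micros_fields` and the example schema are 
hypothetical (not part of Hudi, Spark, or Presto), and it only checks top-level 
record fields.

```python
import json


def find_timestamp_micros_fields(avro_schema_json: str) -> list:
    """Return names of top-level fields whose Avro logical type is timestamp-micros.

    Hypothetical helper for illustration; nested records are not traversed.
    """
    schema = json.loads(avro_schema_json)
    hits = []
    for field in schema.get("fields", []):
        ftype = field["type"]
        # A nullable field is encoded as a union, i.e. a list of branch types.
        branches = ftype if isinstance(ftype, list) else [ftype]
        for branch in branches:
            if isinstance(branch, dict) and branch.get("logicalType") == "timestamp-micros":
                hits.append(field["name"])
                break
    return hits


# Made-up schema: one plain string field, one nullable timestamp-micros field.
EXAMPLE_SCHEMA = json.dumps({
    "type": "record",
    "name": "example_record",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "event_ts",
         "type": ["null", {"type": "long", "logicalType": "timestamp-micros"}]},
    ],
})

affected = find_timestamp_micros_fields(EXAMPLE_SCHEMA)
```

   Any field reported here is a candidate for the workaround discussed in this 
issue until the Presto fix is available.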



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] s-sanjay commented on issue #1895: HUDI Dataset backed by Hive Metastore fails on Presto with Unknown converted type TIMESTAMP_MICROS

2020-08-24 Thread GitBox


s-sanjay commented on issue #1895:
URL: https://github.com/apache/hudi/issues/1895#issuecomment-679406205


   @FelixKJose I have raised a 
[PR](https://github.com/prestodb/presto/pull/15074)







[GitHub] [hudi] s-sanjay commented on issue #1895: HUDI Dataset backed by Hive Metastore fails on Presto with Unknown converted type TIMESTAMP_MICROS

2020-08-06 Thread GitBox


s-sanjay commented on issue #1895:
URL: https://github.com/apache/hudi/issues/1895#issuecomment-669845428


   Right now Presto does not support reading the TIMESTAMP_MICROS type. This 
needs to be fixed on the Presto side, and I am working on a fix. (Presto only 
supports timestamps up to millisecond granularity, so the fix will simply 
convert microseconds to milliseconds.) I think 
`spark.sql.parquet.outputTimestampType` is not working because Hudi uses 
Spark's 
[SchemaConverters](https://github.com/apache/spark/blob/master/external/avro/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala#L150),
 which does not even look at this option. This might be because that property 
was meant to control the Parquet type, but Hudi uses the Avro format to store 
the schema of the file within Parquet.
   It would be very difficult to change this on the Hudi or Spark side. Right 
now the easiest option is to choose the double type, as mentioned above, until 
the fix merges into Presto. I will share the PR link here in a couple of days 
(I need to refactor it, since the Presto version I am working on is a custom 
internal version).
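
   The microsecond-to-millisecond conversion amounts to dropping the 
sub-millisecond part of the value. A minimal Python sketch of that arithmetic 
follows; the real fix lives in Presto's Java Parquet reader and may differ in 
detail (for example in how it rounds), and the function name is hypothetical.

```python
def micros_to_millis(micros: int) -> int:
    """Convert epoch microseconds to epoch milliseconds, discarding the remainder.

    Floor division rounds toward negative infinity, so pre-epoch (negative)
    values move to the earlier millisecond. This rounding choice is an
    assumption of the sketch, not something stated in the thread.
    """
    return micros // 1000


# 2020-08-06T07:46:40.123456Z expressed as epoch microseconds
example_millis = micros_to_millis(1_596_700_000_123_456)
```

   The last three digits of precision are lost, which is the trade-off of 
reading microsecond data into a millisecond-granularity timestamp.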


