leobiscassi commented on issue #6142:
URL: https://github.com/apache/hudi/issues/6142#issuecomment-1199808432

   Hey @qianchutao, I was able to fix this on my side, and maybe the solution 
will help you too.
   So, basically, this error happens because the column order in the schema 
stored inside the parquet files doesn't match the table schema DDL on Athena / 
Presto. This normally works on Athena because Athena maps parquet columns by 
name by default [1], while Presto maps them by column index by default [2]. 
So when you have schema evolution, or the column order differs between the 
parquet files and the table schema for any other reason, this error starts to 
show up. It is not related to Hudi itself.
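   To make the difference concrete, here is a minimal Python sketch of the 
two resolution strategies. It is a toy model, not Presto or Athena code: the 
function names and the sample schema are made up for illustration.

```python
# Illustrative only: a toy model of name-based vs. index-based column mapping.

# Column order as written in an (older) parquet file.
file_columns = ["id", "name", "price"]
file_row = [1, "widget", 9.99]

# Table DDL after schema evolution: same columns, different order.
table_schema = ["id", "price", "name"]


def read_by_name(table_schema, file_columns, row):
    """Resolve each table column by looking up its name in the file."""
    position = {col: i for i, col in enumerate(file_columns)}
    return {col: row[position[col]] if col in position else None
            for col in table_schema}


def read_by_index(table_schema, row):
    """Resolve each table column by its ordinal position in the file."""
    return {col: row[i] if i < len(row) else None
            for i, col in enumerate(table_schema)}


by_name = read_by_name(table_schema, file_columns, file_row)
by_index = read_by_index(table_schema, file_row)

print(by_name)   # {'id': 1, 'price': 9.99, 'name': 'widget'}
print(by_index)  # {'id': 1, 'price': 'widget', 'name': 9.99}
```

   With index-based mapping the `price` column receives a string, which is 
the kind of mix-up that surfaces as a type mismatch error in a real engine.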
   
   To fix this, add the config `hive.parquet.use-column-names=true` under the 
EMR config tab or at cluster start-up time. This updates the config files and 
restarts the Presto cluster. If you want to do this on an already running 
cluster, you'll need to apply the change on the master and all worker nodes 
and then restart Presto; otherwise the config won't take effect.
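   As a rough sketch of what that setting could look like at cluster start-up 
time, the snippet below builds an EMR configuration JSON. The classification 
name `presto-connector-hive` is an assumption on my part (it may differ for 
your EMR release or if you run Trino), so please check it against the EMR 
docs for your version.

```python
import json

# Assumption: "presto-connector-hive" is the EMR classification that maps to
# Presto's Hive catalog properties; verify the name for your EMR release.
emr_configurations = [
    {
        "Classification": "presto-connector-hive",
        "Properties": {"hive.parquet.use-column-names": "true"},
    }
]

# Serialize so it can be saved to a file or pasted into the EMR console.
config_json = json.dumps(emr_configurations, indent=2)
print(config_json)
```

   The printed JSON can be pasted into the software settings field when 
creating the cluster, or saved to a file and passed to `aws emr 
create-cluster` via `--configurations`.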
   
   Let me know if this helps 😁 
   
   [1] 
https://docs.aws.amazon.com/athena/latest/ug/handling-schema-updates-chapter.html
   [2] 
https://stackoverflow.com/questions/60183579/presto-fails-with-type-mismatch-errors


