[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892286#comment-16892286
 ] 

Tim Armstrong commented on IMPALA-8778:
---------------------------------------

I am not all that familiar with this code myself, but I know enough to get you 
started.

Impala doesn't use the same InputFormat classes as Hive. Rather, it recognises 
the Java class names and handles each format itself. E.g. Impala is aware of 
the "MapredParquetInputFormat" class and refers to it internally as the PARQUET 
file format - see 
https://github.com/apache/impala/blob/94652d74521e95e8606ea2d22aabcaddde6fc471/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java#L62
 

I think the first step would be to add Hudi to the list of known file formats 
in HdfsFileFormat.java. Then you could teach Impala to load the table by 
making isHdfsInputFormatClass() return true for Hudi's input format class here
 
https://github.com/apache/impala/blob/fc974f944a9266e68e6f1694eecdc2160fd52582/fe/src/main/java/org/apache/impala/catalog/Table.java#L327
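
To make the idea concrete, here is a rough, standalone sketch of the kind of 
lookup I mean. It is not the real HdfsFileFormat code: the enum name 
SketchFileFormat, the HUDI_PARQUET constant and the Hudi class name 
"org.apache.hudi.hadoop.HoodieParquetInputFormat" are assumptions for 
illustration only, so check them against the actual Impala and Hudi sources.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for Impala's HdfsFileFormat enum.
// It only shows the "map a Java InputFormat class name to an internal
// file format" idea; the real class carries more state than this.
enum SketchFileFormat {
    TEXT("org.apache.hadoop.mapred.TextInputFormat"),
    PARQUET("org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat"),
    // Assumed entry for Hudi; both the constant and the class name are guesses.
    HUDI_PARQUET("org.apache.hudi.hadoop.HoodieParquetInputFormat");

    private final String inputFormatClass;

    // Lookup table from InputFormat class name to internal format.
    private static final Map<String, SketchFileFormat> BY_CLASS = new HashMap<>();
    static {
        for (SketchFileFormat f : values()) {
            BY_CLASS.put(f.inputFormatClass, f);
        }
    }

    SketchFileFormat(String inputFormatClass) {
        this.inputFormatClass = inputFormatClass;
    }

    // True if the class name from the Hive metastore is a known format.
    static boolean isHdfsInputFormatClass(String className) {
        return BY_CLASS.containsKey(className);
    }

    // Returns the internal format, or null if the class name is unknown.
    static SketchFileFormat fromJavaClassName(String className) {
        return BY_CLASS.get(className);
    }
}
```

With an entry like HUDI_PARQUET in place, a table whose metastore entry names 
the Hudi input format would pass the isHdfsInputFormatClass() check instead of 
being rejected as an unknown format.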

Then you would need to teach Impala how to load the files and partitions for 
HdfsTable. If the partitioning is compatible, then maybe we just need to get 
the file metadata loading working. The file metadata is loaded here: 
https://github.com/apache/impala/blob/fc974f944a9266e68e6f1694eecdc2160fd52582/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L554

> Support read/write Apache Hudi tables
> -------------------------------------
>
>                 Key: IMPALA-8778
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8778
>             Project: IMPALA
>          Issue Type: New Feature
>            Reporter: Yuanbin Cheng
>            Assignee: Yuanbin Cheng
>            Priority: Major
>
> Apache Impala currently does not support Apache Hudi and cannot even pull 
> metadata from Hive.
> Related issue: 
> [https://github.com/apache/incubator-hudi/issues/179] 
> [https://issues.apache.org/jira/projects/HUDI/issues/HUDI-146|https://issues.apache.org/jira/projects/HUDI/issues/HUDI-146?filter=allopenissues]
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
