Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/14711 )
Change subject: IMPALA-8778: Support Apache Hudi Read Optimized Table
......................................................................

Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14711/16/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/14711/16/be/src/exec/hdfs-scan-node-base.cc@379
PS16, Line 379: HUDI_PARQUET
> I definitely agree with you that not changing anything on the backend would
Unfortunately, some of the terminology in Impala is misleading. By "frontend" we mean the parser and planner, which are written in Java. By "backend" we mean the code responsible for the actual query execution (scanning, joining, aggregating, etc.), which is written in C++.

After the Hudi filtering is done, we can tell the backend that it just needs to scan Parquet files. You already did this here https://gerrit.cloudera.org/#/c/14711/7/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java in line 180. I think you only need to restore that one line, and the "backend" will work just fine, thinking it's scanning Parquet.

--
To view, visit http://gerrit.cloudera.org:8080/14711
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf
Gerrit-Change-Number: 14711
Gerrit-PatchSet: 16
Gerrit-Owner: Yanjia Gary Li <yanjia.gary...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Norbert Luksa <norbert.lu...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Yanjia Gary Li <yanjia.gary...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Mon, 27 Jan 2020 16:31:19 +0000
Gerrit-HasComments: Yes