Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14711 )
Change subject: IMPALA-8778: Support Apache Hudi Read Optimized Table
......................................................................

Patch Set 16:

(2 comments)

Zoltan asked me to look at the data loading piece, and I have some opinions here. Hopefully it is not too difficult to move the data loading into our data loading framework.

http://gerrit.cloudera.org:8080/#/c/14711/16/testdata/bin/create-load-data.sh
File testdata/bin/create-load-data.sh:

http://gerrit.cloudera.org:8080/#/c/14711/16/testdata/bin/create-load-data.sh@530
PS16, Line 530: hadoop fs -rm -r /test-warehouse/hudicow
Loading data here is a bit of an anti-pattern because it prevents us from loading in parallel and also doesn't allow developers to load individual tables.

E.g. I can load functional_parquet.customer_multiblock like this:

  ./bin/load-data.py -w functional-query -f --table_formats=parquet/none --table_names=customer_multiblock

It would be preferable to do this for this table. I think it's similar to customer_multiblock:

https://github.com/apache/impala/blob/master/testdata/datasets/functional/functional_schema_template.sql#L2455
https://github.com/apache/impala/blob/master/testdata/datasets/functional/schema_constraints.csv#L58

Except you might need to specify a custom create statement like:

https://github.com/apache/impala/blob/master/testdata/datasets/functional/functional_schema_template.sql#L2555


http://gerrit.cloudera.org:8080/#/c/14711/16/testdata/data/hudicow/.hoodie/hoodie.properties
File testdata/data/hudicow/.hoodie/hoodie.properties:

http://gerrit.cloudera.org:8080/#/c/14711/16/testdata/data/hudicow/.hoodie/hoodie.properties@1
PS16, Line 1: Properties
> it would be good to have some more documentation on the format of this file
+1


--
To view, visit http://gerrit.cloudera.org:8080/14711
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf
Gerrit-Change-Number: 14711
Gerrit-PatchSet: 16
Gerrit-Owner: Yanjia Gary Li <yanjia.gary...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Norbert Luksa <norbert.lu...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Yanjia Gary Li <yanjia.gary...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Jan 2020 22:37:48 +0000
Gerrit-HasComments: Yes