Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/15395 )
Change subject: IMPALA-9484: Full ACID Milestone 1: properly scan files that has full ACID schema
......................................................................


Patch Set 8:

(7 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/15395/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15395/8//COMMIT_MSG@7
PS8, Line 7: IMPALA-9042
> Change this to IMPALA-9484?
Done


http://gerrit.cloudera.org:8080/#/c/15395/8/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15395/8/be/src/exec/hdfs-orc-scanner.cc@194
PS8, Line 194: "hive.acid.version"
> nit: make this a constant?
Done


http://gerrit.cloudera.org:8080/#/c/15395/8/be/src/exec/hdfs-orc-scanner.cc@197
PS8, Line 197: but file "
: "is not
> nit: explain that it doesn't have metadata "hive.acid.version" = "2"?
Done


http://gerrit.cloudera.org:8080/#/c/15395/8/be/src/exec/orc-metadata-utils.cc
File be/src/exec/orc-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/15395/8/be/src/exec/orc-metadata-utils.cc@87
PS8, Line 87: DCHECK(ValidateFullAcidFileSchema().ok()); // Should have already been validated.
> I like the idea, but I run into name clashes because we define KUDU_HEADERS
Thanks for reviewing it so quickly. I added the DCHECK_OK macro to common/logging.h.


http://gerrit.cloudera.org:8080/#/c/15395/8/tests/query_test/test_nested_types.py
File tests/query_test/test_nested_types.py:

http://gerrit.cloudera.org:8080/#/c/15395/8/tests/query_test/test_nested_types.py@213
PS8, Line 213: base_table = "functional_orc_def.complextypestbl_non_transactional"
> Do we have test coverage on reading nested types from full-ACID partitioned
Thanks for catching this. I created the ACID version of this test. I think I'll keep the current modifications as well to have coverage for non-ACID ORC tables.

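For reference, the two C++ replies above (the "hive.acid.version" constant and the DCHECK_OK macro) could be sketched roughly as follows. This is only an illustration, not the code in this patch set: the Status class, kHiveAcidVersionKey, kFullAcidVersion, and CheckAcidVersion are hypothetical stand-ins, and the macro added to common/logging.h may be defined differently.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a Status type; NOT Impala's real class.
class Status {
 public:
  static Status OK() { return Status(); }
  explicit Status(std::string msg) : ok_(false), msg_(std::move(msg)) {}
  bool ok() const { return ok_; }
  const std::string& msg() const { return msg_; }

 private:
  Status() : ok_(true) {}
  bool ok_;
  std::string msg_;
};

// Named constants instead of string literals at call sites.
// Both names are assumptions made for this sketch.
const char* const kHiveAcidVersionKey = "hive.acid.version";
const char* const kFullAcidVersion = "2";

// Hypothetical check: the file metadata must contain
// "hive.acid.version" = "2"; otherwise the error says so explicitly.
Status CheckAcidVersion(const std::map<std::string, std::string>& file_metadata) {
  auto it = file_metadata.find(kHiveAcidVersionKey);
  if (it == file_metadata.end() || it->second != kFullAcidVersion) {
    return Status(std::string("file does not have metadata \"") +
                  kHiveAcidVersionKey + "\" = \"" + kFullAcidVersion + "\"");
  }
  return Status::OK();
}

// Sketch of a DCHECK_OK-style macro: in debug builds it aborts with the
// status message if the expression is not OK; with NDEBUG defined it
// expands to nothing and the expression is not evaluated.
#ifndef NDEBUG
#define DCHECK_OK(expr)                                   \
  do {                                                    \
    const Status& dcheck_status_ = (expr);                \
    if (!dcheck_status_.ok()) {                           \
      std::fprintf(stderr, "DCHECK_OK failed: %s\n",      \
                   dcheck_status_.msg().c_str());         \
      std::abort();                                       \
    }                                                     \
  } while (false)
#else
#define DCHECK_OK(expr) do {} while (false)
#endif
```

With something like this, a call such as DCHECK_OK(CheckAcidVersion(metadata)) replaces the DCHECK(...ok()) pattern and reports the status message on failure instead of just the failed condition.
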

http://gerrit.cloudera.org:8080/#/c/15395/8/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/15395/8/tests/query_test/test_scanners.py@207
PS8, Line 207: functional_orc_def
> I think we should get the db name by
Done


http://gerrit.cloudera.org:8080/#/c/15395/8/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

http://gerrit.cloudera.org:8080/#/c/15395/8/tests/query_test/test_scanners_fuzz.py@181
PS8, Line 181: self.run_stmt_in_hive("insert into %s.%s select * from %s.%s" % (fuzz_db,
: fuzz_table, src_db, src_table))
> I'm not sure if this copies the data of complextypestbl correctly since I p
I checked and it seems fine to me. Note that in this CR I changed the data loading of 'complextypestbl'. We don't use the nullable/nonnullable.orc files for it (we only use them for complextypestbl_non_transactional).



--
To view, visit http://gerrit.cloudera.org:8080/15395
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2e2afec00c9a5cf87f1d61b5fe52b0085844bcb
Gerrit-Change-Number: 15395
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Norbert Luksa <norbert.lu...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Tue, 31 Mar 2020 16:05:22 +0000
Gerrit-HasComments: Yes