Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15395 )

Change subject: IMPALA-9042: Milestone 1: properly scan files that has full 
ACID schema
......................................................................


Patch Set 8:

(6 comments)

Thanks for adding so many tests! Left some comments for the tests. The changes 
for path resolution looks good to me. Haven't gone through the changes related 
to data load. I'll look into them tomorrow if still needs.

http://gerrit.cloudera.org:8080/#/c/15395/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15395/8//COMMIT_MSG@7
PS8, Line 7: IMPALA-9042
Change this to IMPALA-9484?


http://gerrit.cloudera.org:8080/#/c/15395/8/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15395/8/be/src/exec/hdfs-orc-scanner.cc@194
PS8, Line 194: "hive.acid.version"
nit: make this a constant?


http://gerrit.cloudera.org:8080/#/c/15395/8/be/src/exec/hdfs-orc-scanner.cc@197
PS8, Line 197: but file "
             :         "is not
nit: explain that it doesn't have metadata "hive.acid.version" = "2"?


http://gerrit.cloudera.org:8080/#/c/15395/8/tests/query_test/test_nested_types.py
File tests/query_test/test_nested_types.py:

http://gerrit.cloudera.org:8080/#/c/15395/8/tests/query_test/test_nested_types.py@213
PS8, Line 213:       base_table = 
"functional_orc_def.complextypestbl_non_transactional"
Do we have test coverage on reading nested types from full-ACID partitioned 
table?


http://gerrit.cloudera.org:8080/#/c/15395/8/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/15395/8/tests/query_test/test_scanners.py@207
PS8, Line 207: functional_orc_def
I think we should get the db name by

 "functional" + vector.get_value('table_format').db_suffix()

in case that the compression is not deflate, but snappy or others in exhaustive 
test.


http://gerrit.cloudera.org:8080/#/c/15395/8/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

http://gerrit.cloudera.org:8080/#/c/15395/8/tests/query_test/test_scanners_fuzz.py@181
PS8, Line 181:       self.run_stmt_in_hive("insert into %s.%s select * from 
%s.%s" % (fuzz_db,
             :           fuzz_table, src_db, src_table))
I'm not sure if this copies the data of complextypestbl correctly since I 
previously encountered a name resolution issue for struct fields: 
https://issues.apache.org/jira/browse/IMPALA-9410?focusedCommentId=17060613&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17060613

Could you verify it? If the issue still exists, we can copy data from the 
parquet table.



--
To view, visit http://gerrit.cloudera.org:8080/15395
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2e2afec00c9a5cf87f1d61b5fe52b0085844bcb
Gerrit-Change-Number: 15395
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Norbert Luksa <norbert.lu...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Tue, 31 Mar 2020 10:10:37 +0000
Gerrit-HasComments: Yes

Reply via email to