Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16383 )

Change subject: IMPALA-10115: Impala should check file schema as well to check 
full ACIDv2 files
......................................................................

IMPALA-10115: Impala should check file schema as well to check full ACIDv2 files

Currently Impala checks file metadata 'hive.acid.version' to decide the
full ACID schema. There are cases when Hive forgets to set this value
for full ACID files, e.g. query-based compactions.

So it's more robust to check the schema elements instead of the metadata
field. Also, sometimes Hive write the schema with different character
cases, e.g. originalTransaction vs originaltransaction, so we should
rather compare the column names in a case insensitive way.

Testing:
* added test for full ACID compaction
* added test_full_acid_schema_without_file_metadata_tag to test full
  ACID file without metadata 'hive.acid.version'

Change-Id: I52642c1755599efd28fa2c90f13396cfe0f5fa14
Reviewed-on: http://gerrit.cloudera.org:8080/16383
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
---
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/orc-metadata-utils.cc
M be/src/exec/orc-metadata-utils.h
M testdata/data/README
A testdata/data/full_acid_schema_but_no_acid_version.orc
M testdata/workloads/functional-query/queries/QueryTest/acid-compaction.test
M tests/query_test/test_acid.py
7 files changed, 88 insertions(+), 27 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I52642c1755599efd28fa2c90f13396cfe0f5fa14
Gerrit-Change-Number: 16383
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>

Reply via email to