[Impala-ASF-CR] IMPALA-9673: Add external warehouse dir variable in E2E test
Xiaomeng Zhang has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/15990 ) Change subject: IMPALA-9673: Add external warehouse dir variable in E2E test .. IMPALA-9673: Add external warehouse dir variable in E2E test Updated CDP build to 7.2.1.0-57 to include new Hive features such as HIVE-22995. In minicluster, we have default values of hive.create.as.acid and hive.create.as.insert.only which are false. So by default hive creates external type table located in external warehouse directory. Due to HIVE-22995, desc db returns external warehouse directory. With above reasons, we need use external warehouse dir in some tests. Also add a new test for "CREATE DATABASE ... LOCATION". Tested: Re-run failed test in minicluster. Run exhaustive tests. Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f --- M bin/impala-config.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/create-database.test M testdata/workloads/functional-query/queries/QueryTest/describe-db.test M testdata/workloads/functional-query/queries/QueryTest/describe-hive-db.test M tests/common/environ.py M tests/common/impala_test_suite.py M tests/query_test/test_compressed_formats.py 8 files changed, 78 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/15990/10 -- To view, visit http://gerrit.cloudera.org:8080/15990 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f Gerrit-Change-Number: 15990 Gerrit-PatchSet: 10 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9673: Add external warehouse dir variable in E2E test
Xiaomeng Zhang has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/15990 ) Change subject: IMPALA-9673: Add external warehouse dir variable in E2E test .. IMPALA-9673: Add external warehouse dir variable in E2E test Updated CDP build to 7.2.1.0-57 to include new Hive features such as HIVE-22995. In minicluster, we have default values of hive.create.as.acid and hive.create.as.insert.only which are false. So by default hive creates external type table located in external warehouse directory. Due to HIVE-22995, desc db returns external warehouse directory. With above reasons, we need use external warehouse dir in some tests. Also add a new test for "CREATE DATABASE ... LOCATION". Tested: Re-run failed test in minicluster. Run exhaustive tests. Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f --- M bin/impala-config.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/create-database.test M testdata/workloads/functional-query/queries/QueryTest/describe-db.test M testdata/workloads/functional-query/queries/QueryTest/describe-hive-db.test M tests/common/environ.py M tests/common/impala_test_suite.py M tests/query_test/test_compressed_formats.py 8 files changed, 78 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/15990/11 -- To view, visit http://gerrit.cloudera.org:8080/15990 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f Gerrit-Change-Number: 15990 Gerrit-PatchSet: 11 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9673: Add external warehouse dir variable in E2E test
Xiaomeng Zhang has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/15990 ) Change subject: IMPALA-9673: Add external warehouse dir variable in E2E test .. IMPALA-9673: Add external warehouse dir variable in E2E test Updated CDP build to 7.2.1.0-57 to include new Hive features such as HIVE-22995. In minicluster, we have default values of hive.create.as.acid and hive.create.as.insert.only which are false. So by default hive creates external type table located in external warehouse directory. Due to HIVE-22995, desc db returns external warehouse directory. With above reasons, we need use external warehouse dir in some tests. Also add a new test for "CREATE DATABASE ... LOCATION". Tested: Re-run failed test in minicluster. Run exhaustive tests. Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f --- M bin/impala-config.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/create-database.test M testdata/workloads/functional-query/queries/QueryTest/describe-db.test M testdata/workloads/functional-query/queries/QueryTest/describe-hive-db.test M tests/common/environ.py M tests/common/impala_test_suite.py M tests/query_test/test_compressed_formats.py 8 files changed, 78 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/15990/8 -- To view, visit http://gerrit.cloudera.org:8080/15990 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f Gerrit-Change-Number: 15990 Gerrit-PatchSet: 8 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9673: Add external warehouse dir variable in E2E test
Xiaomeng Zhang has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/15990 ) Change subject: IMPALA-9673: Add external warehouse dir variable in E2E test .. IMPALA-9673: Add external warehouse dir variable in E2E test Updated CDP build to 7.2.1.0-57 to include new Hive features such as HIVE-22995. In minicluster, we have default values of hive.create.as.acid and hive.create.as.insert.only which are false. So by default hive creates external type table located in external warehouse directory. Due to HIVE-22995, desc db returns external warehouse directory. With above reasons, we need use external warehouse dir in some tests. Also add a new test for "CREATE DATABASE ... LOCATION". Tested: Re-run failed test in minicluster. Run exhaustive tests. Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f --- M bin/impala-config.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/create-database.test M testdata/workloads/functional-query/queries/QueryTest/describe-db.test M testdata/workloads/functional-query/queries/QueryTest/describe-hive-db.test M tests/common/environ.py M tests/common/impala_test_suite.py M tests/query_test/test_compressed_formats.py 8 files changed, 76 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/15990/7 -- To view, visit http://gerrit.cloudera.org:8080/15990 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f Gerrit-Change-Number: 15990 Gerrit-PatchSet: 7 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9673: Add external warehouse dir variable in E2E test
Xiaomeng Zhang has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/15990 ) Change subject: IMPALA-9673: Add external warehouse dir variable in E2E test .. IMPALA-9673: Add external warehouse dir variable in E2E test Updated CDP build to 7.2.1.0-57 to include new Hive features such as HIVE-22995. In minicluster, we have default values of hive.create.as.acid and hive.create.as.insert.only which are false. So by default hive creates external type table located in external warehouse directory. Due to HIVE-22995, desc db returns external warehouse directory. With above reasons, we need use external warehouse dir in some tests. Also add a new test for "CREATE DATABASE ... LOCATION". Tested: Re-run failed test in minicluster. Run exhaustive tests. Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f --- M bin/impala-config.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/create-database.test M testdata/workloads/functional-query/queries/QueryTest/describe-db.test M testdata/workloads/functional-query/queries/QueryTest/describe-hive-db.test M tests/common/environ.py M tests/common/impala_test_suite.py M tests/query_test/test_compressed_formats.py 8 files changed, 73 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/15990/6 -- To view, visit http://gerrit.cloudera.org:8080/15990 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f Gerrit-Change-Number: 15990 Gerrit-PatchSet: 6 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9673: Add external warehouse dir variable in E2E test
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15990 )
Change subject: IMPALA-9673: Add external warehouse dir variable in E2E test
..
Patch Set 5:
(4 comments)
http://gerrit.cloudera.org:8080/#/c/15990/4//COMMIT_MSG
Commit Message:
http://gerrit.cloudera.org:8080/#/c/15990/4//COMMIT_MSG@12
PS4, Line 12: HIVE-2299
> this isn't an external JIRA, so we can't reference it here. you probably wa
Done
http://gerrit.cloudera.org:8080/#/c/15990/4//COMMIT_MSG@18
PS4, Line 18: exhaustive tests.
> this isn't a public Jenkins job. You can just say "Ran exhaustive tests"
Done
http://gerrit.cloudera.org:8080/#/c/15990/4/testdata/datasets/functional/functional_schema_template.sql
File testdata/datasets/functional/functional_schema_template.sql:
http://gerrit.cloudera.org:8080/#/c/15990/4/testdata/datasets/functional/functional_schema_template.sql@2805
PS4, Line 2805: CREATE MATERIALIZED VIEW IF NOT EXISTS {db_name}{db_suffix}.{table_name}
> we should probably mention why this is necessary - I'm not sure if there is
The command existed previously; I just moved it to the end of the file, as Joe has already done downstream. It has to run after insert_only_transactional_table is created. However, it looks like the SQL commands do not run in order: it didn't work when I put this statement in the middle of the file, so I had to put it at the end.
http://gerrit.cloudera.org:8080/#/c/15990/4/tests/query_test/test_compressed_formats.py
File tests/query_test/test_compressed_formats.py:
http://gerrit.cloudera.org:8080/#/c/15990/4/tests/query_test/test_compressed_formats.py@104
PS4, Line 104: dest_base_dir = '/{0}'.format(EXTERNAL_WAREHOUSE_DIR)
> I think we should fix this a different way. After discussing with Naveen, i
Makes sense. I'll update the patch.
-- To view, visit http://gerrit.cloudera.org:8080/15990 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f
Gerrit-Change-Number: 15990
Gerrit-PatchSet: 5
Gerrit-Owner: Xiaomeng Zhang
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Sahil Takiar
Gerrit-Reviewer: Xiaomeng Zhang
Gerrit-Comment-Date: Sun, 31 May 2020 16:52:49 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9673: Add external warehouse dir variable in E2E test
Xiaomeng Zhang has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/15990 ) Change subject: IMPALA-9673: Add external warehouse dir variable in E2E test .. IMPALA-9673: Add external warehouse dir variable in E2E test In minicluster, we have default values of hive.create.as.acid and hive.create.as.insert.only which are false. So by default hive creates external type table located in external warehouse directory. Due to HIVE-22995, desc db returns external warehouse directory. With above reasons, we need use external warehouse dir in some tests. Tested: Re-run failed test in minicluster. Run exhaustive tests. Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f --- M bin/impala-config.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/create-database.test M testdata/workloads/functional-query/queries/QueryTest/describe-db.test M testdata/workloads/functional-query/queries/QueryTest/describe-hive-db.test M tests/common/environ.py M tests/common/impala_test_suite.py M tests/query_test/test_compressed_formats.py 8 files changed, 40 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/15990/5 -- To view, visit http://gerrit.cloudera.org:8080/15990 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f Gerrit-Change-Number: 15990 Gerrit-PatchSet: 5 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-9673: Add external warehouse dir variable in E2E test
Xiaomeng Zhang has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/15990 ) Change subject: IMPALA-9673: Add external warehouse dir variable in E2E test .. IMPALA-9673: Add external warehouse dir variable in E2E test In minicluster, we have default values of hive.create.as.acid and hive.create.as.insert.only which are false. So by default hive creates external type table located in external warehouse directory. Due to CDPD-8248, desc db returns external warehouse directory. With above reasons, we need use external warehouse dir in some tests. Tested: Re-run failed test in minicluster. Run impala-private-parameterized job exhaustively. Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f --- M bin/impala-config.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/create-database.test M testdata/workloads/functional-query/queries/QueryTest/describe-db.test M testdata/workloads/functional-query/queries/QueryTest/describe-hive-db.test M tests/common/environ.py M tests/common/impala_test_suite.py M tests/query_test/test_compressed_formats.py 8 files changed, 38 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/15990/4 -- To view, visit http://gerrit.cloudera.org:8080/15990 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f Gerrit-Change-Number: 15990 Gerrit-PatchSet: 4 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-9673: Add external warehouse dir variable in E2E test
Xiaomeng Zhang has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/15990 ) Change subject: IMPALA-9673: Add external warehouse dir variable in E2E test .. IMPALA-9673: Add external warehouse dir variable in E2E test In minicluster, we have default values of hive.create.as.acid and hive.create.as.insert.only which are false. So by default hive creates external type table located in external warehouse directory. Due to CDPD-8248, desc db returns external warehouse directory. With above reasons, we need use external warehouse dir in some tests. Tested: Re-run failed test in minicluster. Run impala-private-parameterized job exhaustively. Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f --- M bin/impala-config.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/create-database.test M testdata/workloads/functional-query/queries/QueryTest/describe-db.test M testdata/workloads/functional-query/queries/QueryTest/describe-hive-db.test M tests/common/environ.py M tests/common/impala_test_suite.py M tests/query_test/test_compressed_formats.py 8 files changed, 38 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/15990/3 -- To view, visit http://gerrit.cloudera.org:8080/15990 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f Gerrit-Change-Number: 15990 Gerrit-PatchSet: 3 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-9673: Add external warehouse dir variable in E2E test
Xiaomeng Zhang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15990 Change subject: IMPALA-9673: Add external warehouse dir variable in E2E test .. IMPALA-9673: Add external warehouse dir variable in E2E test In minicluster, we have default values of hive.create.as.acid and hive.create.as.insert.only which are false. So by default hive creates external type table located in external warehouse directory. Due to CDPD-8248, desc db returns external warehouse directory. With above reasons, we need use external warehouse dir in some tests. Tested: Re-run failed test in minicluster. Run impala-private-parameterized job exhaustively. Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f --- M bin/impala-config.sh M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/create-database.test M testdata/workloads/functional-query/queries/QueryTest/describe-db.test M testdata/workloads/functional-query/queries/QueryTest/describe-hive-db.test M tests/common/environ.py M tests/common/impala_test_suite.py M tests/query_test/test_compressed_formats.py 8 files changed, 35 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/15990/2 -- To view, visit http://gerrit.cloudera.org:8080/15990 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f Gerrit-Change-Number: 15990 Gerrit-PatchSet: 2 Gerrit-Owner: Xiaomeng Zhang
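For readers following the IMPALA-9673 thread, here is a minimal sketch of the idea behind an external warehouse dir variable in E2E tests: expected locations are written with a placeholder, and the test harness expands it from the environment. This is not the code from the patch. The `EXTERNAL_WAREHOUSE_DIR` name comes from the review comment on test_compressed_formats.py; the `$EXTERNAL_WAREHOUSE_DIR` placeholder syntax, the helper name `expand_test_vars`, and the fallback value are assumptions made for illustration only.

```python
import os

# Assumption: the directory is exported as an environment variable.
# The fallback value is purely illustrative, not the real minicluster default.
EXTERNAL_WAREHOUSE_DIR = os.environ.get("EXTERNAL_WAREHOUSE_DIR", "test-warehouse")


def expand_test_vars(text, external_warehouse_dir=EXTERNAL_WAREHOUSE_DIR):
    """Expand the hypothetical $EXTERNAL_WAREHOUSE_DIR placeholder in test text.

    Expected results can then stay the same whether Hive places external
    tables in the managed or in the external warehouse directory.
    """
    return text.replace("$EXTERNAL_WAREHOUSE_DIR", external_warehouse_dir)


if __name__ == "__main__":
    expected = "hdfs://localhost:20500/$EXTERNAL_WAREHOUSE_DIR/functional.db"
    print(expand_test_vars(expected, "test-warehouse"))
```

Keeping the directory in one variable means a change in where "desc db" reports the database location only requires updating the environment, not every expected-result file.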
[Impala-ASF-CR] [WIP] Enable event polling by default in dockerized tests
Xiaomeng Zhang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15823 Change subject: [WIP] Enable event polling by default in dockerized tests .. [WIP] Enable event polling by default in dockerized tests Change-Id: Ie6b5742d504e8bce622bab8669f895a94c76fc00 --- M docker/catalogd/Dockerfile 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/15823/2 -- To view, visit http://gerrit.cloudera.org:8080/15823 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ie6b5742d504e8bce622bab8669f895a94c76fc00 Gerrit-Change-Number: 15823 Gerrit-PatchSet: 2 Gerrit-Owner: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9663: Fix for NPE in fire listener events
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15745 ) Change subject: IMPALA-9663: Fix for NPE in fire listener events .. Patch Set 2: Code-Review+1 Thanks Vihang, LGTM -- To view, visit http://gerrit.cloudera.org:8080/15745 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibfcc5acd598fb0354a5a8288df7c495359f9e53d Gerrit-Change-Number: 15745 Gerrit-PatchSet: 2 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Fri, 17 Apr 2020 01:35:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Xiaomeng Zhang has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/15648 ) Change subject: IMPALA-8632: Add support for self-event detection for insert events .. IMPALA-8632: Add support for self-event detection for insert events In case of INSERT_EVENTS if Impala inserts into a table it causes a refresh to the underlying table/partition. This could be unnecessary when there is only one Impala cluster in the system. We can detect a self-event in such cases when the HMS API to fire a listener event returns the event id. This is used by EventProcessor to ignore the event when it is fetched later in the next polling cycle. Testing: Add testInsertFromImpala() in MetastoreEventsProcessorTest.java to test insert event self-event detection when insert into table and partition. Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/custom_cluster/test_event_processing.py 17 files changed, 504 insertions(+), 171 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/15648/9 -- To view, visit http://gerrit.cloudera.org:8080/15648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf Gerrit-Change-Number: 15648 Gerrit-PatchSet: 9 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Xiaomeng Zhang has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/15648 ) Change subject: IMPALA-8632: Add support for self-event detection for insert events .. IMPALA-8632: Add support for self-event detection for insert events In case of INSERT_EVENTS if Impala inserts into a table it causes a refresh to the underlying table/partition. This could be unnecessary when there is only one Impala cluster in the system. We can detect a self-event in such cases when the HMS API to fire a listener event returns the event id. This is used by EventProcessor to ignore the event when it is fetched later in the next polling cycle. Testing: Add testInsertFromImpala() in MetastoreEventsProcessorTest.java to test insert event self-event detection when insert into table and partition. Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M bin/impala-config.sh M common/thrift/BackendGflags.thrift M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M tests/custom_cluster/test_event_processing.py 18 files changed, 487 insertions(+), 171 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/15648/5 -- To view, visit http://gerrit.cloudera.org:8080/15648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf Gerrit-Change-Number: 15648 Gerrit-PatchSet: 5 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Xiaomeng Zhang has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/15648 ) Change subject: IMPALA-8632: Add support for self-event detection for insert events .. IMPALA-8632: Add support for self-event detection for insert events In case of INSERT_EVENTS if Impala inserts into a table it causes a refresh to the underlying table/partition. This could be unnecessary when there is only one Impala cluster in the system. We can detect a self-event in such cases when the HMS API to fire a listener event returns the event id. This is used by EventProcessor to ignore the event when it is fetched later in the next polling cycle. Testing: Add testInsertFromImpala() in MetastoreEventsProcessorTest.java to test insert event self-event detection when insert into table and partition. Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M bin/impala-config.sh M common/thrift/BackendGflags.thrift M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 17 files changed, 506 insertions(+), 205 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/15648/4 -- To view, visit http://gerrit.cloudera.org:8080/15648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf Gerrit-Change-Number: 15648 Gerrit-PatchSet: 4 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15648 )
Change subject: IMPALA-8632: Add support for self-event detection for insert events
..
Patch Set 4:
(5 comments)
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@286
PS3, Line 286:*
> guess this will be removed once we bump up the CDP_BUILD right?
Yes. It will be removed after I have the Hive jar updated.
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@849
PS3, Line 849: versionNumber == -1
> Why would this be a case?
Since we check for -1 on the version number, I am checking for -1 on eventId as well. Is it guaranteed that eventId will never be less than 0?
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@897
PS3, Line 897: Preconditions.checkState(inFlightEvents_.size(false) == 0);
> do we need a similar check for inFlightEvents_.size(false)?
Done
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4411
PS3, Line 4411: ram isInsertOverwrite indicates if the operation was an inse
> I think this check should be ((!catalog_.isEventProcessingActive() && isIns
Done
http://gerrit.cloudera.org:8080/#/c/15648/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4466
PS3, Line 4466: sMapBeforeInsert.entrySet().iterator().next();
> I think we need to do this via MetastoreShim since otherwise the response w
Done
-- To view, visit http://gerrit.cloudera.org:8080/15648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf
Gerrit-Change-Number: 15648
Gerrit-PatchSet: 4
Gerrit-Owner: Xiaomeng Zhang
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Vihang Karajgaonkar
Gerrit-Reviewer: Xiaomeng Zhang
Gerrit-Comment-Date: Tue, 07 Apr 2020 21:41:18 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8632: Add support for self-event detection for insert events
Xiaomeng Zhang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15648 Change subject: IMPALA-8632: Add support for self-event detection for insert events .. IMPALA-8632: Add support for self-event detection for insert events In case of INSERT_EVENTS if Impala inserts into a table it causes a refresh to the underlying table/partition. This could be unnecessary when there is only one Impala cluster in the system. We can detect a self-event in such cases when the HMS API to fire a listener event returns the event id. This is used by EventProcessor to ignore the event when it is fetched later in the next polling cycle. Testing: Add testInsertEvents() in MetastoreEventsProcessorTest.java to test insert event self-event detection when insert into table and partition. Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Db.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/Table.java A fe/src/main/java/org/apache/impala/catalog/events/EventId.java M fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 10 files changed, 297 insertions(+), 83 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/15648/1 -- To view, visit http://gerrit.cloudera.org:8080/15648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7873fbb2c159343690f93b9d120f6b425b983dcf Gerrit-Change-Number: 15648 Gerrit-PatchSet: 1 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Xiaomeng Zhang
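The self-event idea described in the IMPALA-8632 change can be illustrated with a small sketch. This is not the code from the patch (which is Java, in files such as InFlightEvents.java); the class and function names below are hypothetical, and the sketch only assumes what the commit message states: the event id returned when the insert event is fired is remembered, and the matching event is skipped when it shows up in a later polling cycle.

```python
class InFlightInsertEvents:
    """Hypothetical tracker for event ids produced by this process's own INSERTs."""

    def __init__(self):
        self._pending_ids = set()

    def record(self, event_id):
        # Called with the id returned when the insert event is fired.
        self._pending_ids.add(event_id)

    def consume_if_self_event(self, event_id):
        # True only the first time a recorded id is seen; the id is then dropped.
        if event_id in self._pending_ids:
            self._pending_ids.remove(event_id)
            return True
        return False


def process_polled_events(tracker, events, refresh):
    """Calls refresh(table) for every polled event that is not a self-event.

    `events` is an iterable of (event_id, table_name) pairs. Returns the list
    of tables that were refreshed.
    """
    refreshed = []
    for event_id, table in events:
        if tracker.consume_if_self_event(event_id):
            continue  # our own insert: the table is already up to date
        refresh(table)
        refreshed.append(table)
    return refreshed
```

An id is consumed once it has been matched, so an unrelated later event that happens to reuse the tracker is still refreshed normally.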
[Impala-ASF-CR] IMPALA-9483 Add logs for debugging builtin functions throw unknown exception randomly
Xiaomeng Zhang has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/15607 ) Change subject: IMPALA-9483 Add logs for debugging builtin functions throw unknown exception randomly .. IMPALA-9483 Add logs for debugging builtin functions throw unknown exception randomly In secure env with high concurrency, queries that call builtin function randomly fail when trying to find the function. For example, "AnalysisException: trim() unknown". Adding more info in exception message to help debugging when it happens again. Change-Id: I30d6eb697695da8d2521acb76d8310ec8f1bbda9 --- M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java 1 file changed, 2 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/15607/3 -- To view, visit http://gerrit.cloudera.org:8080/15607 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I30d6eb697695da8d2521acb76d8310ec8f1bbda9 Gerrit-Change-Number: 15607 Gerrit-PatchSet: 3 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9483 Add logs for debugging builtin functions throw unknown exception randomly
Xiaomeng Zhang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15607 Change subject: IMPALA-9483 Add logs for debugging builtin functions throw unknown exception randomly .. IMPALA-9483 Add logs for debugging builtin functions throw unknown exception randomly In secure env with high concurrency, queries that call builtin function randomly fails with function unknown error. For example, "AnalysisException: trim() unknown". Adding more info in exception message to help debugging when it happens again. Change-Id: I30d6eb697695da8d2521acb76d8310ec8f1bbda9 --- M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java 1 file changed, 2 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/15607/1 -- To view, visit http://gerrit.cloudera.org:8080/15607 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I30d6eb697695da8d2521acb76d8310ec8f1bbda9 Gerrit-Change-Number: 15607 Gerrit-PatchSet: 1 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman
[Impala-ASF-CR] IMPALA-9451: Fix test_hive_text_codec_interop.py failure in CDP build
Xiaomeng Zhang has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/15520 ) Change subject: IMPALA-9451: Fix test_hive_text_codec_interop.py failure in CDP build .. IMPALA-9451: Fix test_hive_text_codec_interop.py failure in CDP build In CDP build we use Hive3 which has a bug HIVE-22371 (CTAS puts files in the wrong place). It causes failure of newly added test as CTAS creates empty table. Workaround by explicitly creating an external table when hive version >= 3. Tested: Run this test in newest CDP build using job impala-private-basic-parameterized. Change-Id: Ief8e583aae82f548754f41e07efac5d7bca4b930 --- M tests/custom_cluster/test_hive_text_codec_interop.py 1 file changed, 17 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/15520/5 -- To view, visit http://gerrit.cloudera.org:8080/15520 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ief8e583aae82f548754f41e07efac5d7bca4b930 Gerrit-Change-Number: 15520 Gerrit-PatchSet: 5 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Xiaomeng Zhang
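The workaround described in IMPALA-9451 (avoid CTAS on Hive 3 and create an external table explicitly) might look roughly like the sketch below. The function name, its parameters, and the exact statements are assumptions made for illustration; the real change lives in tests/custom_cluster/test_hive_text_codec_interop.py and may build its statements differently.

```python
def build_copy_statements(hive_major_version, new_table, source_table):
    """Returns the Hive statements used to copy source_table into new_table.

    Hypothetical helper: on Hive 3 and later it sidesteps CTAS (affected by
    HIVE-22371 according to the commit message) by creating an external
    table first and inserting into it afterwards.
    """
    if hive_major_version >= 3:
        return [
            "create external table {0} like {1}".format(new_table, source_table),
            "insert into {0} select * from {1}".format(new_table, source_table),
        ]
    # Older Hive versions can use a plain CTAS.
    return [
        "create table {0} as select * from {1}".format(new_table, source_table),
    ]
```

Returning the statements instead of executing them keeps the version check easy to test without a running Hive instance.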
[Impala-ASF-CR] IMPALA-9451: Fix test_hive_text_codec_interop.py failure in CDP build
Xiaomeng Zhang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15520 Change subject: IMPALA-9451: Fix test_hive_text_codec_interop.py failure in CDP build .. IMPALA-9451: Fix test_hive_text_codec_interop.py failure in CDP build In CDP build we use Hive3 which has a bug HIVE-22371 (CTAS puts files in the wrong place). It causes failure of newly added test as CTAS creates empty table. Workaround by explicitly creating an external table when hive version >= 3. Tested: Run this test in newest CDP build using job impala-private-basic-parameterized. Change-Id: Ief8e583aae82f548754f41e07efac5d7bca4b930 --- M tests/custom_cluster/test_hive_text_codec_interop.py 1 file changed, 12 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/15520/4 -- To view, visit http://gerrit.cloudera.org:8080/15520 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ief8e583aae82f548754f41e07efac5d7bca4b930 Gerrit-Change-Number: 15520 Gerrit-PatchSet: 4 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-9451: Fix test_hive_text_codec_interop.py failure in CDP build
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15520 ) Change subject: IMPALA-9451: Fix test_hive_text_codec_interop.py failure in CDP build .. Patch Set 4: The Jenkins job in which the test passes: https://master-02.jenkins.cloudera.com/view/Impala/view/Private/job/impala-private-basic-parameterized/189/ -- To view, visit http://gerrit.cloudera.org:8080/15520 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ief8e583aae82f548754f41e07efac5d7bca4b930 Gerrit-Change-Number: 15520 Gerrit-PatchSet: 4 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Sun, 22 Mar 2020 01:49:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9446: Fix bug that impala failed to read zstd file on s3
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15391 ) Change subject: IMPALA-9446: Fix bug that impala failed to read zstd file on s3 .. Patch Set 2: The s3 job link https://master-02.jenkins.cloudera.com/view/Impala/view/Private/job/impala-private-s3-parameterized/38/ Failed test also appears on impala build dashboard, so I think it's unrelated to my commit. -- To view, visit http://gerrit.cloudera.org:8080/15391 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I13f8837fda7454ddb4bb47d20a675d6315a3462d Gerrit-Change-Number: 15391 Gerrit-PatchSet: 2 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Thu, 12 Mar 2020 23:08:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9446: Fix bug that impala failed to read zstd file on s3
Xiaomeng Zhang has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/15391 ) Change subject: IMPALA-9446: Fix bug that impala failed to read zstd file on s3 .. IMPALA-9446: Fix bug that impala failed to read zstd file on s3 S3 filesystem doesn't have a block concept, so scheduler splits each file into a smaller scan ranges. Adding ZSTD as a case in the switch statement to add support for zstd. Testing done: In mini-cluster, read external table located on s3 zstd file. Run end to end test successfully on s3. Change-Id: I13f8837fda7454ddb4bb47d20a675d6315a3462d --- M be/src/util/flat_buffer.cc 1 file changed, 3 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/15391/2 -- To view, visit http://gerrit.cloudera.org:8080/15391 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I13f8837fda7454ddb4bb47d20a675d6315a3462d Gerrit-Change-Number: 15391 Gerrit-PatchSet: 2 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9446 Fix bug that impala failed to read zstd file on s3
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15391 ) Change subject: IMPALA-9446 Fix bug that impala failed to read zstd file on s3 .. Patch Set 1: I got an S3 Jenkins job from Joe and will add the result after the job finishes. -- To view, visit http://gerrit.cloudera.org:8080/15391 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I13f8837fda7454ddb4bb47d20a675d6315a3462d Gerrit-Change-Number: 15391 Gerrit-PatchSet: 1 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Wed, 11 Mar 2020 23:06:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9357: Fix race condition in alter_database event
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15260 ) Change subject: IMPALA-9357: Fix race condition in alter_database event .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/15260/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/15260/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@850 PS8, Line 850: tryLockDb(db); Do we need to check whether the lock was acquired successfully here? -- To view, visit http://gerrit.cloudera.org:8080/15260 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I472fd8a55740769ee5cdb84e48422a4ab39a8d1e Gerrit-Change-Number: 15260 Gerrit-PatchSet: 8 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Wed, 11 Mar 2020 00:45:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9446 Fix bug that impala failed to read zstd file on s3
Xiaomeng Zhang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15391 Change subject: IMPALA-9446 Fix bug that impala failed to read zstd file on s3 .. IMPALA-9446 Fix bug that impala failed to read zstd file on s3 S3 filesystem doesn't have a block concept, so the scheduler splits each file into smaller scan ranges. Adding ZSTD as a case in the switch statement fixes this bug. Testing done: In mini-cluster, read an external table located on an S3 zstd file. Change-Id: I13f8837fda7454ddb4bb47d20a675d6315a3462d --- M be/src/util/flat_buffer.cc 1 file changed, 3 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/15391/1 -- To view, visit http://gerrit.cloudera.org:8080/15391 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I13f8837fda7454ddb4bb47d20a675d6315a3462d Gerrit-Change-Number: 15391 Gerrit-PatchSet: 1 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15304 ) Change subject: IMPALA-9389: [DOCS] Support reading zstd text files .. Patch Set 5: (1 comment) I am good once the file extension problem is fixed. http://gerrit.cloudera.org:8080/#/c/15304/5/docs/topics/impala_txtfile.xml File docs/topics/impala_txtfile.xml: http://gerrit.cloudera.org:8080/#/c/15304/5/docs/topics/impala_txtfile.xml@713 PS5, Line 713: 85 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/csv_compressed_zstd.csv.gz I think the file name should be csv_compressed_zstd.csv.zst. -- To view, visit http://gerrit.cloudera.org:8080/15304 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce Gerrit-Change-Number: 15304 Gerrit-PatchSet: 5 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Fri, 28 Feb 2020 22:17:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15304 ) Change subject: IMPALA-9389: [DOCS] Support reading zstd text files .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/15304/3/docs/topics/impala_txtfile.xml File docs/topics/impala_txtfile.xml: http://gerrit.cloudera.org:8080/#/c/15304/3/docs/topics/impala_txtfile.xml@650 PS3, Line 650: capability. Impala can read compressed text files written by Hive or compressed by the > I did some experimentation and while Impala can read text files compressed Yes, I agree, we don't have to include "streaming" or "block". From user point, I think it only matters what source of compressed file impala can read, so for text, impala can read Hive compressed zstd file and standard library compressed zstd file. For standard library compressed file testing, we only have zstd for text. For hive compressed file testing, we have coverage for all codec. So we can say all hive written codec text, impala can read. But for standard library, we only verified zstd. -- To view, visit http://gerrit.cloudera.org:8080/15304 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce Gerrit-Change-Number: 15304 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Fri, 28 Feb 2020 18:51:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9389: [DOCS] Support reading zstd text files
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15304 ) Change subject: IMPALA-9389: [DOCS] Support reading zstd text files .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/15304/4/docs/topics/impala_txtfile.xml File docs/topics/impala_txtfile.xml: http://gerrit.cloudera.org:8080/#/c/15304/4/docs/topics/impala_txtfile.xml@633 PS4, Line 633: Using bzip2, gzip, Snappy-Compressed, or zstd Text Files I saw the other code review https://gerrit.cloudera.org/c/15310/, do we need to add deflate here as well? http://gerrit.cloudera.org:8080/#/c/15304/4/docs/topics/impala_txtfile.xml@653 PS4, Line 653: or zstd-compressed text file is processed, the node doing the : work reads the entire file into memory and then decompresses it. Therefore, the node must : have enough memory to hold both the compressed and uncompressed data from the text file For text zstd decompression, we're using streaming, which doesn't load the whole file at once; it decompresses as it reads. Note that this is not true for Parquet, where we're still using block decompression. -- To view, visit http://gerrit.cloudera.org:8080/15304 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic83137bd2c3a49398fb60cf1901f8b74ed111fce Gerrit-Change-Number: 15304 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Fri, 28 Feb 2020 00:00:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9075: Add support for reading zstd text files
Xiaomeng Zhang has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/15023 ) Change subject: IMPALA-9075: Add support for reading zstd text files .. IMPALA-9075: Add support for reading zstd text files In this patch, we add support for reading zstd encoded text files. This includes: 1. support reading zstd file written by Hive which uses streaming. 2. support reading zstd file compressed by standard zstd library which uses block. To support decompressing both formats, a function ProcessBlockStreaming is added in zstd decompressor. Testing done: Added two backend tests: 1. streaming decompress test. 2. large data test for both block and streaming decompress. Added two end to end tests: 1. hive and impala integration. For four compression codecs, write in hive and read from impala. 2. zstd library and impala integration. Copy a zstd lib compressed file to HDFS, and read from impala. Change-Id: I2adce9fe00190558525fa5cd3d50cf5e0f0b0aa4 --- M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h M be/src/util/compress.h M be/src/util/decompress-test.cc M be/src/util/decompress.cc M be/src/util/decompress.h M bin/rat_exclude_files.txt A testdata/data/text_large_zstd.txt A testdata/data/text_large_zstd.zst A tests/custom_cluster/test_hive_text_codec_interop.py M tests/query_test/test_compressed_formats.py 11 files changed, 10,000,275 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/15023/8 -- To view, visit http://gerrit.cloudera.org:8080/15023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2adce9fe00190558525fa5cd3d50cf5e0f0b0aa4 Gerrit-Change-Number: 15023 Gerrit-PatchSet: 8 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9075: Add support for reading zstd text files
Xiaomeng Zhang has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/15023 ) Change subject: IMPALA-9075: Add support for reading zstd text files .. IMPALA-9075: Add support for reading zstd text files In this patch, we add support for reading zstd encoded text files. This includes: 1. support reading zstd file written by Hive which uses streaming. 2. support reading zstd file compressed by standard zstd library which uses block. To support decompressing both formats, a function ProcessBlockStreaming is added in zstd decompressor. Testing done: Added two backend tests: 1. streaming decompress test. 2. large data test for both block and streaming decompress. Added two end to end tests: 1. hive and impala integration. For four compression codecs, write in hive and read from impala. 2. zstd library and impala integration. Copy a zstd lib compressed file to HDFS, and read from impala. Change-Id: I2adce9fe00190558525fa5cd3d50cf5e0f0b0aa4 --- M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h M be/src/util/compress.h M be/src/util/decompress-test.cc M be/src/util/decompress.cc M be/src/util/decompress.h M bin/rat_exclude_files.txt A testdata/data/text_large_zstd.txt A testdata/data/text_large_zstd.zst A tests/custom_cluster/test_hive_text_codec_interop.py M tests/query_test/test_compressed_formats.py 11 files changed, 10,000,278 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/15023/7 -- To view, visit http://gerrit.cloudera.org:8080/15023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2adce9fe00190558525fa5cd3d50cf5e0f0b0aa4 Gerrit-Change-Number: 15023 Gerrit-PatchSet: 7 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9075: Add support for reading zstd text files
Xiaomeng Zhang has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/15023 ) Change subject: IMPALA-9075: Add support for reading zstd text files .. IMPALA-9075: Add support for reading zstd text files In this patch, we add support for reading zstd encoded text files. This includes: 1. support reading zstd file written by Hive which uses streaming. 2. support reading zstd file compressed by standard zstd library which uses block. To support decompressing both formats, a function ProcessBlockStreaming is added in zstd decompressor. Testing done: Added two backend tests: 1. streaming decompress test. 2. large data test for both block and streaming decompress. Added two end to end tests: 1. hive and impala integration. For four compression codecs, write in hive and read from impala. 2. zstd library and impala integration. Copy a zstd lib compressed file to HDFS, and read from impala. Change-Id: I2adce9fe00190558525fa5cd3d50cf5e0f0b0aa4 --- M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h M be/src/util/compress.h M be/src/util/decompress-test.cc M be/src/util/decompress.cc M be/src/util/decompress.h M bin/rat_exclude_files.txt A testdata/data/text_large_zstd.txt A testdata/data/text_large_zstd.zst A tests/custom_cluster/test_hive_text_codec_interop.py M tests/query_test/test_compressed_formats.py 11 files changed, 10,000,275 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/15023/6 -- To view, visit http://gerrit.cloudera.org:8080/15023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2adce9fe00190558525fa5cd3d50cf5e0f0b0aa4 Gerrit-Change-Number: 15023 Gerrit-PatchSet: 6 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9075: Add support for reading zstd text files
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15023 ) Change subject: IMPALA-9075: Add support for reading zstd text files .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/15023/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15023/5//COMMIT_MSG@9 PS5, Line 9: In this patch, we add support for reading zstd encoded text files. > We should open a docs JIRA to update the docs for this new feature. https://issues.apache.org/jira/browse/IMPALA-9389 http://gerrit.cloudera.org:8080/#/c/15023/5/be/src/util/decompress.cc File be/src/util/decompress.cc: http://gerrit.cloudera.org:8080/#/c/15023/5/be/src/util/decompress.cc@616 PS5, Line 616: Status ZstandardDecompressor::Init() { > Discussed offline with Xiaomeng. For now we decided to move the call to `ZS Done http://gerrit.cloudera.org:8080/#/c/15023/5/be/src/util/decompress.cc@668 PS5, Line 668: return Status(TErrorCode::ZSTD_ERROR, "ZSTD_decompress", > Use ZSTD_decompressStream in the error message. Done http://gerrit.cloudera.org:8080/#/c/15023/5/tests/custom_cluster/test_hive_text_codec_interop.py File tests/custom_cluster/test_hive_text_codec_interop.py: http://gerrit.cloudera.org:8080/#/c/15023/5/tests/custom_cluster/test_hive_text_codec_interop.py@28 PS5, Line 28: TEXT_CODECS = ['snappy', 'gzip', 'zstd', 'lzo'] > I think we should also add Default, Bzip2 and Deflate to the list: Done -- To view, visit http://gerrit.cloudera.org:8080/15023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2adce9fe00190558525fa5cd3d50cf5e0f0b0aa4 Gerrit-Change-Number: 15023 Gerrit-PatchSet: 5 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Sat, 15 Feb 2020 01:16:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9075: Add support for reading zstd text files
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15023 ) Change subject: IMPALA-9075: Add support for reading zstd text files .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/15023/5/be/src/util/decompress.cc File be/src/util/decompress.cc: http://gerrit.cloudera.org:8080/#/c/15023/5/be/src/util/decompress.cc@616 PS5, Line 616: Status ZstandardDecompressor::Init() { > Init() gets called for both block and streaming code paths. Since it's only Do you have a suggestion for where it should be initialized? I can't put it inside ProcessBlockStreaming because that function might be called multiple times for one stream, and we should use one dctx per stream. -- To view, visit http://gerrit.cloudera.org:8080/15023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2adce9fe00190558525fa5cd3d50cf5e0f0b0aa4 Gerrit-Change-Number: 15023 Gerrit-PatchSet: 5 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Thu, 13 Feb 2020 19:45:45 + Gerrit-HasComments: Yes
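The point made in the comment above (one decompression context per stream, with the streaming function called repeatedly on successive chunks) can be sketched as follows. This is an illustration only: it uses Python's zlib as a stand-in because zstd is not assumed to be available in the standard library, and the class and method names are hypothetical rather than taken from decompress.cc.

```python
import zlib


class StreamingDecompressor:
    """Hypothetical wrapper that keeps one decompression context per stream."""

    def __init__(self):
        # Created once per stream; every chunk of the stream reuses it,
        # which is why it cannot be created inside process_chunk().
        self._ctx = zlib.decompressobj()

    def process_chunk(self, chunk):
        """Decompresses one chunk and returns (output_bytes, stream_end)."""
        output = self._ctx.decompress(chunk)
        return output, self._ctx.eof


def decompress_in_chunks(data, chunk_size):
    """Feeds `data` to a single StreamingDecompressor in fixed-size chunks."""
    decompressor = StreamingDecompressor()
    parts = []
    stream_end = False
    for start in range(0, len(data), chunk_size):
        output, stream_end = decompressor.process_chunk(data[start:start + chunk_size])
        parts.append(output)
    return b"".join(parts), stream_end
```

Creating a new context inside `process_chunk` would discard the state carried over from earlier chunks, so later chunks could not be decoded.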
[Impala-ASF-CR] IMPALA-9075: Add support for reading zstd text files
Xiaomeng Zhang has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/15023 ) Change subject: IMPALA-9075: Add support for reading zstd text files .. IMPALA-9075: Add support for reading zstd text files In this patch, we add support for reading zstd encoded text files. This includes: 1. support reading zstd file written by Hive which uses streaming. 2. support reading zstd file compressed by standard zstd library which uses block. To support decompressing both formats, a function ProcessBlockStreaming is added in zstd decompressor. Testing done: Added two backend tests: 1. streaming decompress test. 2. large data test for both block and streaming decompress. Added two end to end tests: 1. hive and impala integration. For four compression codecs, write in hive and read from impala. 2. zstd library and impala integration. Copy a zstd lib compressed file to HDFS, and read from impala. Change-Id: I2adce9fe00190558525fa5cd3d50cf5e0f0b0aa4 --- M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h M be/src/util/compress.h M be/src/util/decompress-test.cc M be/src/util/decompress.cc M be/src/util/decompress.h M bin/rat_exclude_files.txt A testdata/data/text_large_zstd.zst A tests/custom_cluster/test_hive_text_codec_interop.py M tests/query_test/test_compressed_formats.py 10 files changed, 247 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/15023/5 -- To view, visit http://gerrit.cloudera.org:8080/15023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2adce9fe00190558525fa5cd3d50cf5e0f0b0aa4 Gerrit-Change-Number: 15023 Gerrit-PatchSet: 5 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9075: Add support for reading zstd text files
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15023 ) Change subject: IMPALA-9075: Add support for reading zstd text files .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/15023/4/be/src/util/decompress.cc File be/src/util/decompress.cc: http://gerrit.cloudera.org:8080/#/c/15023/4/be/src/util/decompress.cc@665 PS4, Line 665: *stream_end = false; > It seems to me that this line is redundant. The reason I add it here is that other stream decompressing functions write in this way https://github.infra.cloudera.com/CDH/Impala/blob/cdpd-master/be/src/util/decompress.cc#L103 and https://github.infra.cloudera.com/CDH/Impala/blob/cdpd-master/be/src/util/decompress.cc#L375 -- To view, visit http://gerrit.cloudera.org:8080/15023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2adce9fe00190558525fa5cd3d50cf5e0f0b0aa4 Gerrit-Change-Number: 15023 Gerrit-PatchSet: 4 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Wed, 29 Jan 2020 23:50:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9075: Add support for reading zstd text files
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15023 ) Change subject: IMPALA-9075: Add support for reading zstd text files .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/15023/3/tests/custom_cluster/test_hive_text_codec_interop.py File tests/custom_cluster/test_hive_text_codec_interop.py: http://gerrit.cloudera.org:8080/#/c/15023/3/tests/custom_cluster/test_hive_text_codec_interop.py@26 PS3, Line 26: > flake8: F401 'tests.util.filesystem_utils.get_fs_path' imported but unused Done http://gerrit.cloudera.org:8080/#/c/15023/3/tests/custom_cluster/test_hive_text_codec_interop.py@63 PS3, Line 63: > flake8: E501 line too long (106 > 90 characters) Done http://gerrit.cloudera.org:8080/#/c/15023/3/tests/query_test/test_compressed_formats.py File tests/query_test/test_compressed_formats.py: http://gerrit.cloudera.org:8080/#/c/15023/3/tests/query_test/test_compressed_formats.py@264 PS3, Line 264: > flake8: E302 expected 2 blank lines, found 1 Done -- To view, visit http://gerrit.cloudera.org:8080/15023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2adce9fe00190558525fa5cd3d50cf5e0f0b0aa4 Gerrit-Change-Number: 15023 Gerrit-PatchSet: 4 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Tue, 28 Jan 2020 19:13:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9075: Add support for reading zstd text files
Xiaomeng Zhang has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/15023 ) Change subject: IMPALA-9075: Add support for reading zstd text files .. IMPALA-9075: Add support for reading zstd text files In this patch, we add support for reading zstd encoded text files. This includes: 1. support reading zstd file written by Hive which uses streaming. 2. support reading zstd file compressed by standard zstd library which uses block. To support decompressing both formats, a function ProcessBlockStreaming is added in zstd decompressor. Testing done: Added two backend tests: 1. streaming decompress test. 2. large data test for both block and streaming decompress. Added two end to end tests: 1. hive and impala integration. For four compression codecs, write in hive and read from impala. 2. zstd library and impala integration. Copy a zstd lib compressed file to HDFS, and read from impala. Change-Id: I2adce9fe00190558525fa5cd3d50cf5e0f0b0aa4 --- M be/src/exec/hdfs-text-scanner.cc M be/src/exec/hdfs-text-scanner.h M be/src/util/compress.h M be/src/util/decompress-test.cc M be/src/util/decompress.cc M be/src/util/decompress.h M bin/rat_exclude_files.txt A testdata/data/text_large_zstd.zst A tests/custom_cluster/test_hive_text_codec_interop.py M tests/query_test/test_compressed_formats.py 10 files changed, 248 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/15023/4 -- To view, visit http://gerrit.cloudera.org:8080/15023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2adce9fe00190558525fa5cd3d50cf5e0f0b0aa4 Gerrit-Change-Number: 15023 Gerrit-PatchSet: 4 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9075: Add support for reading zstd text files
Xiaomeng Zhang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15023 )

Change subject: IMPALA-9075: Add support for reading zstd text files

IMPALA-9075: Add support for reading zstd text files

In this patch, we add support for reading zstd encoded text files.
This includes:
1. support reading zstd file written by Hive which uses streaming.
2. support reading zstd file compressed by standard zstd library which
   uses block.
To support decompressing both formats, a function ProcessBlockStreaming
is added in zstd decompressor.

Testing done:
Added two backend tests:
1. streaming decompress test.
2. large data test for both block and streaming decompress.
Added two end to end tests:
1. hive and impala integration. For four compression codecs, write in
   hive and read from impala.
2. zstd library and impala integration. Copy a zstd lib compressed file
   to HDFS, and read from impala.

Change-Id: I2adce9fe00190558525fa5cd3d50cf5e0f0b0aa4
---
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
M be/src/util/compress.h
M be/src/util/decompress-test.cc
M be/src/util/decompress.cc
M be/src/util/decompress.h
M bin/rat_exclude_files.txt
A testdata/data/text_large_zstd.zst
A tests/custom_cluster/test_hive_text_codec_interop.py
M tests/query_test/test_compressed_formats.py
10 files changed, 248 insertions(+), 9 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/15023/3
--
To view, visit http://gerrit.cloudera.org:8080/15023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2adce9fe00190558525fa5cd3d50cf5e0f0b0aa4
Gerrit-Change-Number: 15023
Gerrit-PatchSet: 3
Gerrit-Owner: Xiaomeng Zhang
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-8065: Add OS distribution name in OSInfo
Xiaomeng Zhang has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/14531 )

Change subject: IMPALA-8065: Add OS distribution name in OSInfo

IMPALA-8065: Add OS distribution name in OSInfo

Before this change OsInfo::DebugString() would print two lines:
- OS version: the long name of the Linux kernel from /proc/version
- Clock: the type of clock used

After this change OsInfo::DebugString() will print three lines:
- OS distribution: the short name of the OS release. If Docker is being
  used this is the name of the Container OS.
- OS version: the long name of the Linux kernel from /proc/version. If
  Docker is being used this is the description of the Host Kernel.
- Clock: the type of clock used.

Tested locally, the displayed OS Info in Ubuntu16 dev box is:
OS distribution: Ubuntu 16.04.6 LTS
OS version: Linux version 4.15.0-65-generic (buildd@lcy01-amd64-017) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10))
Clock: clocksource: 'tsc', clockid_t: CLOCK_MONOTONIC

Also checked with diff OS in docker: centos, redhat, ubuntu, oracle,
debian to make sure /etc/os-release exists and PRETTY_NAME in that file.
Each OS picked one version to test.
Specially for centos6 and redhat6, which have redhat-release instead of
os-release, copied redhat-release into Ubuntu16 dev box and verified os
version in mini-cluster.

Added new backend test os-info-test.cc.

Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a
---
M be/src/util/CMakeLists.txt
A be/src/util/os-info-test.cc
M be/src/util/os-info.cc
M be/src/util/os-info.h
4 files changed, 82 insertions(+), 8 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/14531/9
--
To view, visit http://gerrit.cloudera.org:8080/14531
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a
Gerrit-Change-Number: 14531
Gerrit-PatchSet: 9
Gerrit-Owner: Xiaomeng Zhang
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Xiaomeng Zhang
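The PRETTY_NAME lookup that the IMPALA-8065 commit message describes can be illustrated with a small sketch. This is not the C++ code from os-info.cc; the helper name `parse_pretty_name` and the sample file contents are assumptions made for illustration, and the redhat-release fallback mentioned above is not covered.

```python
from typing import Optional


def parse_pretty_name(os_release_text: str) -> Optional[str]:
    """Return the PRETTY_NAME value from os-release style text, or None."""
    for line in os_release_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "PRETTY_NAME":
            # Values are usually double-quoted; drop the surrounding quotes.
            return value.strip().strip('"')
    return None


# Illustrative sample modelled on an Ubuntu 16.04 /etc/os-release file.
sample = (
    'NAME="Ubuntu"\n'
    'VERSION="16.04.6 LTS (Xenial Xerus)"\n'
    'PRETTY_NAME="Ubuntu 16.04.6 LTS"\n'
)
print(parse_pretty_name(sample))
```

Stripping the quotes matches the later patch sets in this thread, where the displayed value changes from "Ubuntu 16.04.6 LTS" (quoted) to Ubuntu 16.04.6 LTS.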
[Impala-ASF-CR] IMPALA-9090: Add name of table being scanned in scan node profile
Xiaomeng Zhang has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/14660 )

Change subject: IMPALA-9090: Add name of table being scanned in scan node profile

IMPALA-9090: Add name of table being scanned in scan node profile

Before this change, the only way to figure out the table being scanned
by a scan node in the profile is to pull the string out of the explain
plan or execsummary. This is awkward, both for manual and automated
analysis of the profiles. We should include the table name as a string
in the SCAN_NODE implementation.
After this change, a new line "Table Name: database.table" will be added
in first line of scan node profile.

Also fix a bug that frontend pass hbase and kudu table name incorrectly
to thrift.
Before this change, native name of hbase and kudu table are passed in
and there is no way to get hms table name from TableDescriptor in
backend.
After this change, for HBaseTableDescriptor and KuduTableDescriptor,
function name() would return hms table name, function table_name() would
return hbase or kudu native table name.

Manually tested on mini-cluster with:
1. hdfs and s3 table with file format text and parquet.
2. hbase table.
3. kudu table.

Change-Id: If5da1112bcf38ae55b89eccfd7c7fad860819a99
---
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-table-writer.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/kudu-scan-node-base.cc
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
6 files changed, 8 insertions(+), 4 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/14660/4
--
To view, visit http://gerrit.cloudera.org:8080/14660
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If5da1112bcf38ae55b89eccfd7c7fad860819a99
Gerrit-Change-Number: 14660
Gerrit-PatchSet: 4
Gerrit-Owner: Xiaomeng Zhang
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Xiaomeng Zhang
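The commit message for IMPALA-9090 argues that a dedicated "Table Name:" line makes automated profile analysis easier. A minimal sketch of such an analysis step might look like the following. The helper name `scanned_tables` and the sample profile text are hypothetical; the exact layout of a real Impala profile is an assumption here, and only the "Table Name: database.table" line comes from the commit message.

```python
from typing import List


def scanned_tables(profile_text: str) -> List[str]:
    """Collect the values of every 'Table Name:' line in a profile dump."""
    prefix = "Table Name:"
    tables = []
    for line in profile_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            tables.append(stripped[len(prefix):].strip())
    return tables


# Hypothetical profile excerpt; real profiles contain many more counters.
sample_profile = """
HDFS_SCAN_NODE (id=0):
  Table Name: test.hdfs_table
  Hdfs split stats: (illustrative placeholder)
KUDU_SCAN_NODE (id=1):
  Table Name: test.kudu_table
"""
print(scanned_tables(sample_profile))
```

Without the dedicated line, the same information would have to be pulled out of the explain plan or exec summary text, which is the awkwardness the change addresses.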
[Impala-ASF-CR] IMPALA-9090 Add name of table being scanned in scan node profile
Xiaomeng Zhang has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/14660 )

Change subject: IMPALA-9090 Add name of table being scanned in scan node profile

IMPALA-9090 Add name of table being scanned in scan node profile

Before this change, the only way to figure out the table being scanned
by a scan node in the profile is to pull the string out of the explain
plan or execsummary. This is awkward, both for manual and automated
analysis of the profiles. We should include the table name as a string
in the SCAN_NODE implementation.
After this change, a new line "Table Name: database.table" will be added
in first line of scan node profile.

Manually tested on mini-cluster with:
1. hdfs and s3 table with file format text and parquet
   "test.hdfs_table", it would show as "test.hdfs_table".
2. hbase table "test.hbase_table", if create with TBLPROPERTIES
   ("hbase.table.name" = "xyz", "hbase.mapred.output.outputtable" = "xyz")
   it would show as "test.xyz"; if not, it would show as
   "test.test.hbase_table".
3. kudu table "test.kudu_table", it would show as
   "test.impala::test.kudu_table".

Change-Id: If5da1112bcf38ae55b89eccfd7c7fad860819a99
---
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/kudu-scan-node-base.cc
3 files changed, 4 insertions(+), 0 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/14660/2
--
To view, visit http://gerrit.cloudera.org:8080/14660
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If5da1112bcf38ae55b89eccfd7c7fad860819a99
Gerrit-Change-Number: 14660
Gerrit-PatchSet: 2
Gerrit-Owner: Xiaomeng Zhang
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9090 Add name of table being scanned in HDFS scan node profile
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14660 ) Change subject: IMPALA-9090 Add name of table being scanned in HDFS scan node profile .. Patch Set 1: Thanks Tim, I will update patch with Hbase and Kudu change. -- To view, visit http://gerrit.cloudera.org:8080/14660 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5da1112bcf38ae55b89eccfd7c7fad860819a99 Gerrit-Change-Number: 14660 Gerrit-PatchSet: 1 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Wed, 13 Nov 2019 18:40:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9090 Add name of table being scanned in HDFS scan node profile
Xiaomeng Zhang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14660 )

Change subject: IMPALA-9090 Add name of table being scanned in HDFS scan node profile

IMPALA-9090 Add name of table being scanned in HDFS scan node profile

Before this change, the only way to figure out the table being scanned
by a scan node in the profile is to pull the string out of the explain
plan or execsummary. This is awkward, both for manual and automated
analysis of the profiles. We should include the table name as a string
in the SCAN_NODE implementation.
After this change, a new line "Table Name: database.table" will be added
between line HDFS_SCAN_NODE (id=0) and Hdfs split stats.

Manually tested on mini-cluster with hdfs and s3 with file format text
and parquet. All have Table Name in HDFS scan node profile.

Change-Id: If5da1112bcf38ae55b89eccfd7c7fad860819a99
---
M be/src/exec/hdfs-scan-node-base.cc
1 file changed, 2 insertions(+), 0 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/14660/1
--
To view, visit http://gerrit.cloudera.org:8080/14660
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If5da1112bcf38ae55b89eccfd7c7fad860819a99
Gerrit-Change-Number: 14660
Gerrit-PatchSet: 1
Gerrit-Owner: Xiaomeng Zhang
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8065 Change the format OS version and Kernel version displayed in OSInfo
Xiaomeng Zhang has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/14531 )

Change subject: IMPALA-8065 Change the format OS version and Kernel version displayed in OSInfo

IMPALA-8065 Change the format OS version and Kernel version displayed in OSInfo

Before this change OsInfo::DebugString() would print two lines:
- OS version: the long name of the Linux kernel from /proc/version
- Clock: the type of clock used

After this change OsInfo::DebugString() will print three lines:
- OS version: the short name of the OS release. If Docker is being used
  this is the name of the Container OS
- Kernel version: the long name of the Linux kernel from /proc/version.
  If Docker is being used this is the description of the Host Kernel.
- Clock: the type of clock used.

Tested locally, the displayed OS Info in Ubuntu16 dev box is:
OS version: Ubuntu 16.04.6 LTS
Kernel version: Linux version 4.15.0-65-generic (buildd@lcy01-amd64-017) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10))
Clock: clocksource: 'tsc', clockid_t: CLOCK_MONOTONIC

Also checked with diff OS in docker: centos, redhat, ubuntu, oracle,
debian to make sure /etc/os-release exists and PRETTY_NAME in that file.
Each OS picked one version to test.
Specially for centos6, which has centos-release instead of os-release,
copied centos-release into Ubuntu16 dev box and verified os version in
mini-cluster.

Added new backend test os-info-test.cc.

Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a
---
M be/src/util/CMakeLists.txt
A be/src/util/os-info-test.cc
M be/src/util/os-info.cc
M be/src/util/os-info.h
4 files changed, 80 insertions(+), 7 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/14531/8
--
To view, visit http://gerrit.cloudera.org:8080/14531
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a
Gerrit-Change-Number: 14531
Gerrit-PatchSet: 8
Gerrit-Owner: Xiaomeng Zhang
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14600 )

Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI

Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14600/6/be/src/catalog/catalog-server.h
File be/src/catalog/catalog-server.h:

http://gerrit.cloudera.org:8080/#/c/14600/6/be/src/catalog/catalog-server.h@215
PS6, Line 215: ///"long_75_loading_time": 12361844,
Does it mean 75 percentage loading time? Maybe better to add word "percent" in string? Why adding "long" as prefix?

--
To view, visit http://gerrit.cloudera.org:8080/14600
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf
Gerrit-Change-Number: 14600
Gerrit-PatchSet: 6
Gerrit-Owner: Jiawei Wang
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jiawei Wang
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Xiaomeng Zhang
Gerrit-Reviewer: Yongzhi Chen
Gerrit-Comment-Date: Tue, 05 Nov 2019 01:21:31 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8065 Change the format OS version and Kernel version displayed in OSInfo
Xiaomeng Zhang has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/14531 )

Change subject: IMPALA-8065 Change the format OS version and Kernel version displayed in OSInfo

IMPALA-8065 Change the format OS version and Kernel version displayed in OSInfo

Before this change OsInfo::DebugString() would print two lines:
- OS version: the long name of the Linux kernel from /proc/version
- Clock: the type of clock used

After this change OsInfo::DebugString() will print three lines:
- OS version: the short name of the OS release. If Docker is being used
  this is the name of the Container OS
- Kernel version: the long name of the Linux kernel from /proc/version.
  If Docker is being used this is the description of the Host Kernel.
- Clock: the type of clock used.

Tested locally, the displayed OS Info in Ubuntu16 dev box is:
OS version: Ubuntu 16.04.6 LTS
Kernel version: Linux version 4.15.0-65-generic (buildd@lcy01-amd64-017) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10))
Clock: clocksource: 'tsc', clockid_t: CLOCK_MONOTONIC

Also tested with diff OS in docker: centos, redhat, ubuntu, oracle,
debian. Each OS picked one version to test.

Added new backend test os-info-test.cc.

Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a
---
M be/src/util/CMakeLists.txt
A be/src/util/os-info-test.cc
M be/src/util/os-info.cc
M be/src/util/os-info.h
4 files changed, 75 insertions(+), 3 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/14531/7
--
To view, visit http://gerrit.cloudera.org:8080/14531
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a
Gerrit-Change-Number: 14531
Gerrit-PatchSet: 7
Gerrit-Owner: Xiaomeng Zhang
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-8065 Change the format OS version and Kernel version displayed in OSInfo
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14531 )

Change subject: IMPALA-8065 Change the format OS version and Kernel version displayed in OSInfo

Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14531/6/be/src/util/os-info.cc
File be/src/util/os-info.cc:

http://gerrit.cloudera.org:8080/#/c/14531/6/be/src/util/os-info.cc@72
PS6, Line 72: if (fields[0].compare("PRETTY_NAME") == 0) {
> Will this work on Centos6?
Sorry, no, I'll update with a fix.

--
To view, visit http://gerrit.cloudera.org:8080/14531
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a
Gerrit-Change-Number: 14531
Gerrit-PatchSet: 6
Gerrit-Owner: Xiaomeng Zhang
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Xiaomeng Zhang
Gerrit-Comment-Date: Fri, 01 Nov 2019 23:36:04 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8065 Edit OS version and Kernel version in OSInfo
Xiaomeng Zhang has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/14531 )

Change subject: IMPALA-8065 Edit OS version and Kernel version in OSInfo

IMPALA-8065 Edit OS version and Kernel version in OSInfo

Before this change OsInfo::DebugString() would print two lines:
- OS version: the long name of the Linux kernel from /proc/version
- Clock: the type of clock used

After this change OsInfo::DebugString() will print three lines:
- OS version: the short name of the OS release. If Docker is being used
  this is the name of the Container OS
- Kernel version: the long name of the Linux kernel from /proc/version.
  If Docker is being used this is the description of the Host Kernel.
- Clock: the type of clock used.

Tested locally, the displayed OS Info in Ubuntu16 dev box is:
OS version: Ubuntu 16.04.6 LTS
Kernel version: Linux version 4.15.0-65-generic (buildd@lcy01-amd64-017) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10))
Clock: clocksource: 'tsc', clockid_t: CLOCK_MONOTONIC

Also tested with diff OS in docker: centos, redhat, ubuntu, oracle,
debian. Each OS picked one version to test.

Added new backend test os-info-test.cc.

Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a
---
M be/src/util/CMakeLists.txt
A be/src/util/os-info-test.cc
M be/src/util/os-info.cc
M be/src/util/os-info.h
4 files changed, 77 insertions(+), 3 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/14531/6
--
To view, visit http://gerrit.cloudera.org:8080/14531
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a
Gerrit-Change-Number: 14531
Gerrit-PatchSet: 6
Gerrit-Owner: Xiaomeng Zhang
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-8065 Edit OS version and Kernel version in OSInfo
Xiaomeng Zhang has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/14531 ) Change subject: IMPALA-8065 Edit OS version and Kernel version in OSInfo .. IMPALA-8065 Edit OS version and Kernel version in OSInfo Before this change OsInfo::DebugString() would print two lines: - OS version: the long name of the Linux kernel from /proc/version - Clock: the type of clock used After this change OsInfo::DebugString() will print three lines: - OS version: the short name of the OS release. If Docker is being used this is the name of the Container OS - Kernel version: the long name of the Linux kernel from /proc/version. If Docker is being used this is the description of the Host Kernel. - Clock: the type of clock used. Tested locally, the displayed OS Info in Ubuntu16 dev box is: OS version: "Ubuntu 16.04.6 LTS" Kernel version: Linux version 4.15.0-65-generic (buildd@lcy01-amd64-017) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10)) Clock: clocksource: 'tsc', clockid_t: CLOCK_MONOTONIC Also tested with diff OS in docker: centos, redhat, ubuntu, oracle, debian. Each OS picked one version to test. Added new backend test os-info-test.cc. Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a --- M be/src/util/CMakeLists.txt A be/src/util/os-info-test.cc M be/src/util/os-info.cc M be/src/util/os-info.h 4 files changed, 77 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/14531/5 -- To view, visit http://gerrit.cloudera.org:8080/14531 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a Gerrit-Change-Number: 14531 Gerrit-PatchSet: 5 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-8065 Edit OS version and Kernel version in OSInfo
Xiaomeng Zhang has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/14531 ) Change subject: IMPALA-8065 Edit OS version and Kernel version in OSInfo .. IMPALA-8065 Edit OS version and Kernel version in OSInfo Before this change OsInfo::DebugString() would print two lines: - OS version: the long name of the Linux kernel from /proc/version - Clock: the type of clock used After this change OsInfo::DebugString() will print three lines: - OS version: the short name of the OS release. If Docker is being used this is the name of the Container OS - Kernel version: the long name of the Linux kernel from /proc/version. If Docker is being used this is the description of the Host Kernel. - Clock: the type of clock used. Tested locally, the displayed OS Info in Ubuntu16 dev box is: OS version: "Ubuntu 16.04.6 LTS" Kernel version: Linux version 4.15.0-65-generic (buildd@lcy01-amd64-017) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10)) Clock: clocksource: 'tsc', clockid_t: CLOCK_MONOTONIC Also tested with diff OS in docker: centos, redhat, ubuntu, oracle, debian. Each OS picked one version to test. Added new backend test os-info-test.cc. Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a --- M be/src/util/CMakeLists.txt A be/src/util/os-info-test.cc M be/src/util/os-info.cc M be/src/util/os-info.h 4 files changed, 75 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/14531/4 -- To view, visit http://gerrit.cloudera.org:8080/14531 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a Gerrit-Change-Number: 14531 Gerrit-PatchSet: 4 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-8065 Add OS version and Kernel version in OSInfo
Xiaomeng Zhang has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/14531 )

Change subject: IMPALA-8065 Add OS version and Kernel version in OSInfo

IMPALA-8065 Add OS version and Kernel version in OSInfo

Originally we displayed the contents of /proc/version as the OS version,
while it is actually the kernel version. We should relabel it as Kernel
version, and display the OS version from /etc/os-release (for centos6
it's /etc/centos-release).

Tested locally, the displayed OS Info in Ubuntu16 dev box is:
OS version: "Ubuntu 16.04.6 LTS"
Kernel version: Linux version 4.15.0-65-generic (buildd@lcy01-amd64-017) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10))
Clock: clocksource: 'tsc', clockid_t: CLOCK_MONOTONIC

Also tested with diff OS in docker: centos, redhat, ubuntu, oracle,
debian. Each OS picked one version to test.

Added new backend test os-info-test.cc.

Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a
---
M be/src/util/CMakeLists.txt
A be/src/util/os-info-test.cc
M be/src/util/os-info.cc
M be/src/util/os-info.h
4 files changed, 71 insertions(+), 3 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/14531/3
--
To view, visit http://gerrit.cloudera.org:8080/14531
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a
Gerrit-Change-Number: 14531
Gerrit-PatchSet: 3
Gerrit-Owner: Xiaomeng Zhang
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-8065 OSInfo produces somewhat misleading output when running in container
Xiaomeng Zhang has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/14531 )

Change subject: IMPALA-8065 OSInfo produces somewhat misleading output when running in container

IMPALA-8065 OSInfo produces somewhat misleading output when running in container

Originally we displayed the contents of /proc/version as the OS version,
while it is actually the kernel version. We should relabel it as Kernel
version, and display the OS version from /etc/os-release (for centos6
it's /etc/centos-release).

Tested locally, the displayed OS Info in localhost:25020 is:
OS version: "Ubuntu 16.04.6 LTS"
Kernel version: Linux version 4.15.0-65-generic (buildd@lcy01-amd64-017) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10))
Clock: clocksource: 'tsc', clockid_t: CLOCK_MONOTONIC

Also tested with diff OS in docker: centos, redhat, ubuntu, oracle,
debian. Each OS picked one version to test.

Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a
---
M be/src/util/os-info.cc
M be/src/util/os-info.h
2 files changed, 39 insertions(+), 3 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/14531/2
--
To view, visit http://gerrit.cloudera.org:8080/14531
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a
Gerrit-Change-Number: 14531
Gerrit-PatchSet: 2
Gerrit-Owner: Xiaomeng Zhang
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-8065 OSInfo produces somewhat misleading output when running in container
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14531 )

Change subject: IMPALA-8065 OSInfo produces somewhat misleading output when running in container

Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14531/1/be/src/util/os-info.cc
File be/src/util/os-info.cc:

http://gerrit.cloudera.org:8080/#/c/14531/1/be/src/util/os-info.cc@56
PS1, Line 56: ifstream os_version("/etc/os-release", ios::in);
> I think this file is not present on Centos6.
Thanks for finding this. I picked one version of each OS for testing, which missed the special case of centos6. Added a fallback for the case where /etc/os-release doesn't exist.

http://gerrit.cloudera.org:8080/#/c/14531/1/be/src/util/os-info.cc@99
PS1, Line 99: << "Kernel version: " << kernel_version_ << endl
> If I were reading this I would be confused: what is OS version, what is Ker
OS version is the distribution version, and kernel version is the Linux version. I don't have a better idea for distinguishing them.
For the Docker display, I searched online and there is no guaranteed way to detect that we are running in a Docker container. There are all kinds of situations and we don't want to provide wrong info.

--
To view, visit http://gerrit.cloudera.org:8080/14531
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a
Gerrit-Change-Number: 14531
Gerrit-PatchSet: 1
Gerrit-Owner: Xiaomeng Zhang
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Xiaomeng Zhang
Gerrit-Comment-Date: Tue, 29 Oct 2019 21:44:51 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7504/KUDU-2979 ParseKerberosPrincipal() should use krb5_parse_name() instead
Xiaomeng Zhang has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/14433 )

Change subject: IMPALA-7504/KUDU-2979 ParseKerberosPrincipal() should use krb5_parse_name() instead

IMPALA-7504/KUDU-2979 ParseKerberosPrincipal() should use krb5_parse_name() instead

We want to use krb5_parse_name() to parse the principal instead of
using custom code.

When kerberos is initialized in Impala's copy of Kudu code, it stores a
global context which is used when accessing the Krb5 library. To use
this global context the code that parses the principal name is moved
into the Impala Kudu code. This new code is then called from the
existing ParseKerberosPrincipal method.

Test done:
Add two tests to authentication-test, one to verify parsing a principal
containing a special character. The other to verify exception when
parsing bad format principal, new error code is 2 instead of original
112 which is BAD_PRINCIPAL_FORMAT error code.
Run impala-private-parameterized tests.

Change-Id: Ifddafa7aae25d66ed7d9fa0306f17501a191cdac
Reviewed-on: http://gerrit.cloudera.org:8080/14520
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin
Change-Id: I0e64ebdc10f102dbdc5b87f6fe3f2a0310b1be24
---
M be/src/kudu/security/init.cc
M be/src/kudu/security/init.h
M be/src/kudu/security/test/mini_kdc-test.cc
M be/src/rpc/authentication-test.cc
M be/src/util/auth-util.cc
5 files changed, 61 insertions(+), 15 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/14433/7
--
To view, visit http://gerrit.cloudera.org:8080/14433
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0e64ebdc10f102dbdc5b87f6fe3f2a0310b1be24
Gerrit-Change-Number: 14433
Gerrit-PatchSet: 7
Gerrit-Owner: Xiaomeng Zhang
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Xiaomeng Zhang
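For readers unfamiliar with the principal format, the three components that ParseKerberosPrincipal extracts (service name, hostname, realm) can be illustrated with a deliberately naive splitter. Everything here is hypothetical: `Principal` and `naive_parse_principal` are names invented for this sketch, and the code is neither the custom parser that the change removes nor a substitute for krb5_parse_name(). In particular it makes no attempt to handle special or escaped characters, which the change leaves to the Kerberos library.

```python
from typing import NamedTuple


class Principal(NamedTuple):
    service_name: str
    hostname: str
    realm: str


def naive_parse_principal(principal: str) -> Principal:
    """Split 'service/host@REALM' on its last '@' and its first '/'.

    Simplified illustration only: no escaping rules are applied, and any
    missing component is reported as a bad format.
    """
    name, at, realm = principal.rpartition("@")
    service, slash, host = name.partition("/")
    if not (at and slash and service and host and realm):
        raise ValueError("bad principal format: %r" % principal)
    return Principal(service, host, realm)


# Example values are placeholders, not taken from the change under review.
print(naive_parse_principal("impala/host.example.com@EXAMPLE.COM"))
```

A blank principal such as the " " used in the BadPrincipalFormat test raises `ValueError` in this sketch, whereas the real code reports a Kerberos library error code.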
[Impala-ASF-CR] IMPALA-7504/KUDU-2979 ParseKerberosPrincipal() should use krb5_parse_name() instead
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14433 )

Change subject: IMPALA-7504/KUDU-2979 ParseKerberosPrincipal() should use krb5_parse_name() instead

Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14433/6/be/src/rpc/authentication-test.cc
File be/src/rpc/authentication-test.cc:

http://gerrit.cloudera.org:8080/#/c/14433/6/be/src/rpc/authentication-test.cc@200
PS6, Line 200: EXPECT_ERROR(sa.InitKerberos(" ", "/etc/hosts"), 2);
> This says we will get an error, but do we know it is the right error?
Expected error code is 2, this line is doing the comparison.

--
To view, visit http://gerrit.cloudera.org:8080/14433
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e64ebdc10f102dbdc5b87f6fe3f2a0310b1be24
Gerrit-Change-Number: 14433
Gerrit-PatchSet: 6
Gerrit-Owner: Xiaomeng Zhang
Gerrit-Reviewer: Andrew Sherman
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Xiaomeng Zhang
Gerrit-Comment-Date: Mon, 28 Oct 2019 22:05:45 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7504/KUDU-2979 ParseKerberosPrincipal() should use krb5_parse_name() instead
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14433 ) Change subject: IMPALA-7504/KUDU-2979 ParseKerberosPrincipal() should use krb5_parse_name() instead .. Patch Set 6: (7 comments) http://gerrit.cloudera.org:8080/#/c/14433/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14433/5//COMMIT_MSG@10 PS5, Line 10: using custom code. > instead of using custom code. Done http://gerrit.cloudera.org:8080/#/c/14433/5//COMMIT_MSG@12 PS5, Line 12: When kerberos is initialized in Impala's copy of Kudu code, it stores a > The reader of commit messages is developers like you and me. Done http://gerrit.cloudera.org:8080/#/c/14433/5//COMMIT_MSG@18 PS5, Line 18: Test done: > to verify parsing a principal nae containing a special character. Done http://gerrit.cloudera.org:8080/#/c/14433/5//COMMIT_MSG@19 PS5, Line 19: Add two authentication-test, one to verify parsing a principal containing > Is it possible to have an automated test for this? Yes, added a new test BadPrincipalFormat http://gerrit.cloudera.org:8080/#/c/14433/5//COMMIT_MSG@21 PS5, Line 21: format principal, new error code is 2 instead of original 112 > Do you want to mention running end-to-end tests? What kind of end-to-end test? The one included in impala-private-parameterized? http://gerrit.cloudera.org:8080/#/c/14433/5/be/src/kudu/security/init.h File be/src/kudu/security/init.h: http://gerrit.cloudera.org:8080/#/c/14433/5/be/src/kudu/security/init.h@39 PS5, Line 39: // Parses the given Kerberos principal into service name, hostname, and realm. 
> Maybe: "Parse a kerberos principal name and extract the ervice_name, hostna Done http://gerrit.cloudera.org:8080/#/c/14433/5/be/src/util/auth-util.cc File be/src/util/auth-util.cc: http://gerrit.cloudera.org:8080/#/c/14433/5/be/src/util/auth-util.cc@92 PS5, Line 92: hostname, realm), strings::Substitute("bad principal format $0", principal)); > It is more idiomatic and marginally more efficient to use: Done -- To view, visit http://gerrit.cloudera.org:8080/14433 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e64ebdc10f102dbdc5b87f6fe3f2a0310b1be24 Gerrit-Change-Number: 14433 Gerrit-PatchSet: 6 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Wed, 23 Oct 2019 01:13:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7504/KUDU-2979 ParseKerberosPrincipal() should use krb5_parse_name() instead
Xiaomeng Zhang has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/14433 ) Change subject: IMPALA-7504/KUDU-2979 ParseKerberosPrincipal() should use krb5_parse_name() instead .. IMPALA-7504/KUDU-2979 ParseKerberosPrincipal() should use krb5_parse_name() instead We want to use krb5_parse_name() to parse the principal instead of using custom code. When kerberos is initialized in Impala's copy of Kudu code, it stores a global context which is used when accessing the Krb5 library. To use this global context the code that parses the principal name is moved into the Impala Kudu code. This new code is then called from the existing ParseKerberosPrincipal method. Test done: Add two authentication-test, one to verify parsing a principal containing a special character. The other to verify exception when parsing bad format principal, new error code is 2 instead of original 112 which is BAD_PRINCIPAL_FORMAT error code. Change-Id: I0e64ebdc10f102dbdc5b87f6fe3f2a0310b1be24 --- M be/src/kudu/security/init.cc M be/src/kudu/security/init.h M be/src/rpc/authentication-test.cc M be/src/util/auth-util.cc 4 files changed, 41 insertions(+), 15 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/14433/6 -- To view, visit http://gerrit.cloudera.org:8080/14433 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0e64ebdc10f102dbdc5b87f6fe3f2a0310b1be24 Gerrit-Change-Number: 14433 Gerrit-PatchSet: 6 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-8065 OSInfo produces somewhat misleading output when running in container
Xiaomeng Zhang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14531 ) Change subject: IMPALA-8065 OSInfo produces somewhat misleading output when running in container .. IMPALA-8065 OSInfo produces somewhat misleading output when running in container Originally, the contents of /proc/version were displayed as the OS version, although they are actually the kernel version. We should label that value as the kernel version and display the OS version from /etc/os-release. Tested locally, the displayed OS Info in localhost:25020 is: OS version: "Ubuntu 16.04.6 LTS" Kernel version: Linux version 4.15.0-65-generic (buildd@lcy01-amd64-017) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10)) Clock: clocksource: 'tsc', clockid_t: CLOCK_MONOTONIC Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a --- M be/src/util/os-info.cc M be/src/util/os-info.h 2 files changed, 29 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/14531/1 -- To view, visit http://gerrit.cloudera.org:8080/14531 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I848c9e53ee4e0bf8ae0874bb6da28e8efa7f7c8a Gerrit-Change-Number: 14531 Gerrit-PatchSet: 1 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
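The IMPALA-8065 message above describes reading the OS version from /etc/os-release rather than /proc/version. A minimal sketch of that idea follows; it is not the code in be/src/util/os-info.cc. The function name ParsePrettyName, the choice of the PRETTY_NAME key, and the quote stripping are all assumptions made for illustration.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the Impala patch): scans os-release style
// "KEY=value" text and returns the PRETTY_NAME value, or an empty string if
// the key is not present.
std::string ParsePrettyName(std::istream& in) {
  const std::string key = "PRETTY_NAME=";
  std::string line;
  while (std::getline(in, line)) {
    // Skip every line that does not start with the key.
    if (line.compare(0, key.size(), key) != 0) continue;
    std::string value = line.substr(key.size());
    // os-release values are commonly wrapped in double quotes; strip one pair.
    if (value.size() >= 2 && value.front() == '"' && value.back() == '"') {
      value = value.substr(1, value.size() - 2);
    }
    return value;
  }
  return "";
}

// Usage example on in-memory text; a real caller would pass a std::ifstream
// opened on /etc/os-release instead.
inline void ExampleUsage() {
  std::istringstream in("NAME=\"Ubuntu\"\nPRETTY_NAME=\"Ubuntu 16.04.6 LTS\"\n");
  assert(ParsePrettyName(in) == "Ubuntu 16.04.6 LTS");
}
```

The sample output quoted in the message shows the OS version with its double quotes intact, so the actual change may keep them; stripping them here is only a choice made in this sketch.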
[Impala-ASF-CR] IMPALA-7504/KUDU-2979 ParseKerberosPrincipal() should use krb5_parse_name() instead
Xiaomeng Zhang has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/14433 ) Change subject: IMPALA-7504/KUDU-2979 ParseKerberosPrincipal() should use krb5_parse_name() instead .. IMPALA-7504/KUDU-2979 ParseKerberosPrincipal() should use krb5_parse_name() instead We want to use krb5_parse_name() to parse the principal instead of writing our own parser. Because src/kudu/security/init.cc already initializes g_krb5_ctx, we want to reuse that Kudu code and add a wrapper function that can be called from Impala: Krb5parseName(const string& principal, string* service_name, string* hostname, string* realm) Test done: Add an authentication-test to verify a principal with a special character. Manually tested with a badly formatted principal, which returns error code 2 instead of the original 112, the BAD_PRINCIPAL_FORMAT error code. Change-Id: I0e64ebdc10f102dbdc5b87f6fe3f2a0310b1be24 --- M be/src/kudu/security/init.cc M be/src/kudu/security/init.h M be/src/rpc/authentication-test.cc M be/src/util/auth-util.cc 4 files changed, 36 insertions(+), 15 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/14433/5 -- To view, visit http://gerrit.cloudera.org:8080/14433 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0e64ebdc10f102dbdc5b87f6fe3f2a0310b1be24 Gerrit-Change-Number: 14433 Gerrit-PatchSet: 5 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-7504/KUDU-2979 ParseKerberosPrincipal() should use krb5_parse_name() instead
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14433 ) Change subject: IMPALA-7504/KUDU-2979 ParseKerberosPrincipal() should use krb5_parse_name() instead .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/14433/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14433/4//COMMIT_MSG@7 PS4, Line 7: IMPALA-7504 ParseKerberosPrincipal() should use krb5_parse_name() instead > See comments on commit msg in patch set 2. Sorry, I missed these comments. Fixed in the new patch. -- To view, visit http://gerrit.cloudera.org:8080/14433 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e64ebdc10f102dbdc5b87f6fe3f2a0310b1be24 Gerrit-Change-Number: 14433 Gerrit-PatchSet: 4 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Mon, 21 Oct 2019 22:46:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7504 ParseKerberosPrincipal() should use krb5_parse_name() instead
Xiaomeng Zhang has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/14433 ) Change subject: IMPALA-7504 ParseKerberosPrincipal() should use krb5_parse_name() instead .. IMPALA-7504 ParseKerberosPrincipal() should use krb5_parse_name() instead Change-Id: I0e64ebdc10f102dbdc5b87f6fe3f2a0310b1be24 --- M be/src/kudu/security/init.cc M be/src/kudu/security/init.h M be/src/rpc/authentication-test.cc M be/src/util/auth-util.cc 4 files changed, 36 insertions(+), 15 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/14433/4 -- To view, visit http://gerrit.cloudera.org:8080/14433 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0e64ebdc10f102dbdc5b87f6fe3f2a0310b1be24 Gerrit-Change-Number: 14433 Gerrit-PatchSet: 4 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Xiaomeng Zhang
[Impala-ASF-CR] IMPALA-7504 ParseKerberosPrincipal() should use krb5_parse_name() instead
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14433 ) Change subject: IMPALA-7504 ParseKerberosPrincipal() should use krb5_parse_name() instead .. Patch Set 3: (8 comments) Patch set 3 has jenkins test failure "Bad SAM flags in obtain_sam_padata" on CHECK_EQ(krb5_get_default_realm(krb5_ctx, &unused_realm), 0); http://gerrit.cloudera.org:8080/#/c/14433/2/be/src/util/auth-util.cc File be/src/util/auth-util.cc: http://gerrit.cloudera.org:8080/#/c/14433/2/be/src/util/auth-util.cc@90 PS2, Line 90: krb5_context krb5_ctx; > Maybe 'krb5_ctx' is a clearer name? Yes, it should be. http://gerrit.cloudera.org:8080/#/c/14433/2/be/src/util/auth-util.cc@93 PS2, Line 93: CHECK_EQ(krb5_get_default_realm(krb5_ctx, &unused_realm), 0); > What is the performance impact of these extra allocation/deallocations? It has negative impact on perf. I added this initialization based on comments in init.cc InitKrb5Ctx "Work around the lack of thread safety in krb5_parse_name() by implicitly initializing g_krb5_ctx->default_realm once." http://gerrit.cloudera.org:8080/#/c/14433/2/be/src/util/auth-util.cc@97 PS2, Line 97: krb5_error_code code = krb5_parse_name(krb5_ctx, principal.c_str(), &princ); > You could join with the next line: Done http://gerrit.cloudera.org:8080/#/c/14433/2/be/src/util/auth-util.cc@99 PS2, Line 99: *realm = princ->realm.data; > Is there a test that generates TErrorCode::BAD_PRINCIPAL_FORMAT? No, I will add one. http://gerrit.cloudera.org:8080/#/c/14433/2/be/src/util/auth-util.cc@100 PS2, Line 100: krb5_data* data = princ->data; > Maybe need to call Done http://gerrit.cloudera.org:8080/#/c/14433/2/be/src/util/auth-util.cc@105 PS2, Line 105: krb5_free_principal(krb5_ctx, princ); > I think you can say Good to know! 
http://gerrit.cloudera.org:8080/#/c/14433/2/be/src/util/auth-util.cc@111 PS2, Line 111: > *hostname = data->data; Done http://gerrit.cloudera.org:8080/#/c/14433/2/be/src/util/auth-util.cc@114 PS2, Line 114: > Maybe need to call Done -- To view, visit http://gerrit.cloudera.org:8080/14433 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e64ebdc10f102dbdc5b87f6fe3f2a0310b1be24 Gerrit-Change-Number: 14433 Gerrit-PatchSet: 3 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Comment-Date: Mon, 21 Oct 2019 20:14:12 + Gerrit-HasComments: Yes
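The review thread above weighs creating a Kerberos context on every call against reusing a context that is initialized once. The following is a rough sketch of the one-time initialization pattern only, using std::call_once. FakeContext, GetSharedContext, InitCount, and the "EXAMPLE.COM" realm are hypothetical stand-ins invented for illustration; this is not the Kudu InitKrb5Ctx code and it makes no krb5 library calls.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical stand-in for a library context; NOT krb5_context.
struct FakeContext {
  std::string default_realm;
};

namespace {
std::once_flag g_ctx_once;       // guards the one-time setup below
FakeContext* g_ctx = nullptr;    // shared context, created on first use
int g_init_count = 0;            // how many times the setup actually ran
}  // namespace

// Returns the shared context. The setup lambda runs exactly once, even if
// several threads call this function concurrently; later calls only pay for
// the once_flag check instead of a fresh allocation and teardown.
FakeContext* GetSharedContext() {
  std::call_once(g_ctx_once, [] {
    g_ctx = new FakeContext();            // intentionally kept for process lifetime
    g_ctx->default_realm = "EXAMPLE.COM"; // placeholder value for illustration
    ++g_init_count;
  });
  return g_ctx;
}

// Exposes the setup counter so callers can check that setup ran only once.
int InitCount() { return g_init_count; }
```

Calling GetSharedContext() repeatedly returns the same pointer, which is the property the thread relies on when it moves the parsing code next to the already initialized global context.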
[Impala-ASF-CR] IMPALA-7504 ParseKerberosPrincipal() should use krb5_parse_name() instead
Xiaomeng Zhang has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/14433 ) Change subject: IMPALA-7504 ParseKerberosPrincipal() should use krb5_parse_name() instead .. IMPALA-7504 ParseKerberosPrincipal() should use krb5_parse_name() instead Change-Id: I0e64ebdc10f102dbdc5b87f6fe3f2a0310b1be24 --- M be/src/util/auth-util.cc 1 file changed, 17 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/14433/3 -- To view, visit http://gerrit.cloudera.org:8080/14433 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0e64ebdc10f102dbdc5b87f6fe3f2a0310b1be24 Gerrit-Change-Number: 14433 Gerrit-PatchSet: 3 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-7504 ParseKerberosPrincipal() should use krb5_parse_name() instead
Hello Michael Ho, Andrew Sherman, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14433 to look at the new patch set (#2). Change subject: IMPALA-7504 ParseKerberosPrincipal() should use krb5_parse_name() instead .. IMPALA-7504 ParseKerberosPrincipal() should use krb5_parse_name() instead Change-Id: I0e64ebdc10f102dbdc5b87f6fe3f2a0310b1be24 --- M be/src/util/auth-util.cc 1 file changed, 24 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/14433/2 -- To view, visit http://gerrit.cloudera.org:8080/14433 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0e64ebdc10f102dbdc5b87f6fe3f2a0310b1be24 Gerrit-Change-Number: 14433 Gerrit-PatchSet: 2 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho