[jira] [Created] (IMPALA-10159) Support ORC file format for Iceberg table
WangSheng created IMPALA-10159:

Summary: Support ORC file format for Iceberg table
Key: IMPALA-10159
URL: https://issues.apache.org/jira/browse/IMPALA-10159
Project: IMPALA
Issue Type: Sub-task
Reporter: WangSheng
Assignee: WangSheng

Impala can now query Iceberg tables stored in the PARQUET file format. Since some of the groundwork was already done in IMPALA-9741, we can continue the work on ORC file format support in this Jira.

-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-10159) Support ORC file format for Iceberg table
[ https://issues.apache.org/jira/browse/IMPALA-10159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192751#comment-17192751 ]

WangSheng commented on IMPALA-10159:

Hi [~boroknagyz], [~tarmstrong], supporting the ORC file format for Iceberg tables is quite simple on top of IMPALA-9741. The main work is constructing test cases, and there we run into the problem described in IMPALA-9967. My previous test file was generated by Spark, and I found that Spark does not support timestamp without time zone fields. So I think we could generate test files without the Timestamp type and explain this in the code. What do you think?

> Support ORC file format for Iceberg table
>
> Key: IMPALA-10159
> URL: https://issues.apache.org/jira/browse/IMPALA-10159
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: WangSheng
> Assignee: WangSheng
> Priority: Minor
>
> Impala can now query Iceberg tables stored in the PARQUET file format. Since some of the groundwork was already done in IMPALA-9741, we can continue the work on ORC file format support in this Jira.
[jira] [Updated] (IMPALA-10159) Support ORC file format for Iceberg table
[ https://issues.apache.org/jira/browse/IMPALA-10159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltán Borók-Nagy updated IMPALA-10159:

Labels: impala-iceberg (was: )
[jira] [Commented] (IMPALA-10159) Support ORC file format for Iceberg table
[ https://issues.apache.org/jira/browse/IMPALA-10159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192815#comment-17192815 ]

Zoltán Borók-Nagy commented on IMPALA-10159:

Hey [~skyyws], could you tell me which version of Spark/ORC you use? An alternative is to create the files with an older ORC library. If that's too much trouble, then maybe we can omit timestamps as you propose, and open a Jira only for the new timestamp types.
[jira] [Comment Edited] (IMPALA-10159) Support ORC file format for Iceberg table
[ https://issues.apache.org/jira/browse/IMPALA-10159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192815#comment-17192815 ]

Zoltán Borók-Nagy edited comment on IMPALA-10159 at 9/9/20, 12:19 PM:

Hey [~skyyws], could you tell me which version of Spark/ORC you use? An alternative is to create the files with an older ORC library. If that's too much trouble, then maybe we can omit timestamps as you propose, and -open a Jira only for the new timestamp types.- we have IMPALA-9967 to track the timestamp issue.

was (Author: boroknagyz):
Hey [~skyyws], could you tell me which version of Spark/ORC you use? An alternative is to create the files with an older ORC library. If that's too much trouble, then maybe we can omit timestamps as you propose, and open a Jira only for the new timestamp types.
[jira] [Commented] (IMPALA-10159) Support ORC file format for Iceberg table
[ https://issues.apache.org/jira/browse/IMPALA-10159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192834#comment-17192834 ]

WangSheng commented on IMPALA-10159:

Hi [~boroknagyz], I use spark-shell to generate the test files. My Spark client version is 2.4.5 and the ORC jars in this client are 1.5.5; even after replacing these ORC jars with 1.6.3, it doesn't work. Here is the code used to generate the test files:

{code:java}
// Scala, meant for spark-shell (which provides `spark` and `sc`).
import java.sql.Timestamp
import org.apache.hadoop.conf.Configuration
import org.apache.iceberg.PartitionSpec
import org.apache.iceberg.hadoop.HadoopTables
import org.apache.iceberg.spark.SparkSchemaUtil
import org.apache.iceberg.types.Types
import org.apache.spark.sql.functions.to_timestamp
import org.apache.spark.sql.types._

val conf = new Configuration()
val tblLoc = "/test-warehouse/iceberg_test/iceberg_partitioned_orc"
val catalog = new HadoopTables(conf)

// event_time uses Iceberg's timestamp-without-zone type.
val sparkSchema = StructType(List(
  StructField("id", IntegerType, true),
  StructField("user", StringType, false),
  StructField("action", StringType, false),
  StructField("event_time", SparkSchemaUtil.convert(Types.TimestampType.withoutZone()), false)))
val icebergSchema = SparkSchemaUtil.convert(sparkSchema)
val spec = PartitionSpec.builderFor(icebergSchema).hour("event_time").identity("action").build()
val table = catalog.create(icebergSchema, spec, tblLoc)

val data_df = spark.createDataFrame(
  Seq((1, "Alex", "view", Timestamp.valueOf("2020-01-01 08:00:00")))).toDF("id", "user", "action", "ts")
val array = data_df.select(
  data_df("id"), data_df("user"), data_df("action"), to_timestamp(data_df("ts"))).collect()
val df = spark.createDataFrame(sc.makeRDD(array), sparkSchema)

// Write one ORC-backed row into the Iceberg table, then read it back.
df.write.format("iceberg").option("write-format", "orc").mode("append").save(tblLoc)
spark.read.format("iceberg").load(tblLoc).show
{code}

This code throws the exception "java.lang.UnsupportedOperationException: Spark does not support timestamp without time zone fields". If we replace "SparkSchemaUtil.convert(Types.TimestampType.withoutZone())" with "TimestampType", the test files are generated normally, but querying them in Impala hits the problem described in IMPALA-9967.

And here is the create statement:

{code:sql}
CREATE EXTERNAL TABLE default.iceberg_partitioned_orc
STORED AS ICEBERG
LOCATION 'hdfs://localhost:20500/test-warehouse/iceberg_test/iceberg_partitioned_orc'
TBLPROPERTIES('iceberg_file_format'='orc');
{code}
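For readers less familiar with the distinction behind that exception, here is a minimal sketch of the two timestamp semantics using only java.time. It is not Spark, Iceberg, or Impala code, and the class and method names are made up for illustration. A timestamp without time zone behaves like a LocalDateTime: it names a wall-clock reading and only becomes a point in time once some zone is assumed.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

final class TimestampSemantics {
  // A wall-clock value with no zone attached, analogous to "timestamp without time zone".
  static final LocalDateTime WALL_CLOCK = LocalDateTime.of(2020, 1, 1, 8, 0, 0);

  // Pins the wall-clock value to a point in time by assuming it was observed in `zone`.
  static Instant toInstant(LocalDateTime wallClock, String zone) {
    return wallClock.atZone(ZoneId.of(zone)).toInstant();
  }

  public static void main(String[] args) {
    // The same wall-clock reading yields two different instants.
    System.out.println(toInstant(WALL_CLOCK, "UTC"));
    System.out.println(toInstant(WALL_CLOCK, "Asia/Shanghai"));
  }
}
```

The two printed instants differ by eight hours, so a writer that only supports the zone-adjusted type has to pick a zone for values that were never meant to carry one.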
[jira] [Commented] (IMPALA-10158) test_iceberg_query and test_iceberg_profile fail after IMPALA-9741
[ https://issues.apache.org/jira/browse/IMPALA-10158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192954#comment-17192954 ]

ASF subversion and git services commented on IMPALA-10158:

Commit efc627d050caeb9947af2dfd3fc8a02236c44d0e in impala's branch refs/heads/master from Fang-Yu Rao
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=efc627d ]

IMPALA-10158: Set timezone to UTC for Iceberg-related E2E tests

We found that the tests of test_iceberg_query and test_iceberg_profile fail after the patch for IMPALA-9741 has been merged and that it is due to the default timezone of Impala not being UTC. This patch fixes the issue by adding "SET TIMEZONE=UTC;" before those test queries are run.

Testing:
- Verified in a local development environment that the tests of test_iceberg_query and test_iceberg_profile could pass after applying this patch.

Change-Id: Ie985519e8ded04f90465e141488bd2dda78af6c3
Reviewed-on: http://gerrit.cloudera.org:8080/16425
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> test_iceberg_query and test_iceberg_profile fail after IMPALA-9741
>
> Key: IMPALA-10158
> URL: https://issues.apache.org/jira/browse/IMPALA-10158
> Project: IMPALA
> Issue Type: Bug
> Reporter: Fang-Yu Rao
> Assignee: Fang-Yu Rao
> Priority: Major
>
> We found that the tests of {{test_iceberg_query}} and {{test_iceberg_profile}} fail after the patch for IMPALA-9741 has been merged.
> After some investigation with the help of [~boroknagyz] and [~csringhofer], we found that it is a timezone-related issue and that we should add {{SET TIMEZONE=UTC}} in the corresponding test files.
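As a rough illustration of why expected test output can depend on the session time zone, the sketch below renders one fixed instant under two zones with java.time. It is not Impala code and does not model Impala's timestamp handling; the class name, the formatter pattern, and the choice of America/Los_Angeles are assumptions made for the example.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

final class TimezoneRendering {
  private static final DateTimeFormatter FMT =
      DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

  // Formats the same point in time as it would be displayed in the given zone.
  static String render(Instant instant, String zone) {
    return FMT.withZone(ZoneId.of(zone)).format(instant);
  }

  public static void main(String[] args) {
    Instant eventTime = Instant.parse("2020-01-01T08:00:00Z");
    System.out.println(render(eventTime, "UTC"));                 // 2020-01-01 08:00:00
    System.out.println(render(eventTime, "America/Los_Angeles")); // 2020-01-01 00:00:00
  }
}
```

A test that compares against the first string only passes when the zone is pinned, which is the role "SET TIMEZONE=UTC;" plays in the patch described above.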
[jira] [Commented] (IMPALA-7961) Concurrent catalog heavy workloads can cause queries with SYNC_DDL to fail fast
[ https://issues.apache.org/jira/browse/IMPALA-7961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192953#comment-17192953 ]

ASF subversion and git services commented on IMPALA-7961:

Commit 0c89a9d562c280507a6e842898bf3e41cadc3ff1 in impala's branch refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0c89a9d ]

IMPALA-10140: Fix CatalogExeception for creating database with sync_ddl as true

IMPALA-7961 handles the case of the query "create table if not exists" with sync_ddl as true. Customers reported a similar issue for the query "create database if not exists" with sync_ddl as true. This patch applies a fix similar to the one for IMPALA-7961 to the function CatalogOpExecutor.createDatabase().

Testing:
- Manual tests. Since this is a racy bug, I could only reproduce it by forcing frequent topicUpdateLog GCs along with a specific sequence of actions, like: run some DDLs and REFRESHs to trigger a GC in topicUpdateLog, then run query "create database if not exists" with sync_ddl as true. Verified that the issue couldn't be reproduced after applying this patch.
- Passed exhaustive test.

Change-Id: Id623118f8938f416414c45d93404fb70d036a9df
Reviewed-on: http://gerrit.cloudera.org:8080/16421
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> Concurrent catalog heavy workloads can cause queries with SYNC_DDL to fail fast
>
> Key: IMPALA-7961
> URL: https://issues.apache.org/jira/browse/IMPALA-7961
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 2.12.0, Impala 3.1.0
> Reporter: Bharath Vissapragada
> Assignee: Bharath Vissapragada
> Priority: Critical
> Fix For: Impala 3.2.0
> Attachments: 0001-Repro-of-IMPALA-7961.patch
>
> When catalog server is under heavy load with concurrent updates to objects, queries with SYNC_DDL can fail with the following message.
>
> *User facing error message:*
> {noformat}
> ERROR: CatalogException: Couldn't retrieve the catalog topic version for the SYNC_DDL operation after 3 attempts.The operation has been successfully executed but its effects may have not been broadcast to all the coordinators.
> {noformat}
> *Exception from the catalog server log:*
> {noformat}
> I1031 00:00:49.168761 1127039 CatalogServiceCatalog.java:1903] Operation using SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic version (msec): 1088
> I1031 00:00:49.168824 1125528 CatalogServiceCatalog.java:1903] Operation using SYNC_DDL is waiting for catalog topic version: 236535. Time to identify topic version (msec): 12625
> I1031 00:00:49.168851 1131986 jni-util.cc:230] org.apache.impala.catalog.CatalogException: Couldn't retrieve the catalog topic version for the SYNC_DDL operation after 3 attempts.The operation has been successfully executed but its effects may have not been broadcast to all the coordinators.
> at org.apache.impala.catalog.CatalogServiceCatalog.waitForSyncDdlVersion(CatalogServiceCatalog.java:1891)
> at org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:336)
> at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:146)
> {noformat}
> *What this means*
> The Catalog operation is actually successful (the change has been committed to HMS and Catalog server cache) but the Catalog server noticed that it is taking longer than expected to broadcast the changes (for whatever reason) and instead of hanging in there, it fails fast. The coordinators are expected to eventually sync up in the background.
>
> *Problem*
> - This violates the contract of the SYNC_DDL query option since the query returns early.
> - This is a behavioral regression from the pre IMPALA-5058 state, where the queries would wait forever for SYNC_DDL based changes to propagate.
>
> *Notes*
> - Introduced by IMPALA-5058
> - Based on the occurrences of this issue, we narrowed it down to a specific kind of DDLs (see Jira comments).
> - My understanding is that this also applies to the Catalog V2 (or LocalCatalog mode) since we still rely on the CatalogServer for DDL orchestration and hence it takes this codepath.
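The fail-fast behavior described in this issue can be pictured with a small bounded-retry loop. This is only a sketch of the general pattern, assuming a lookup that may not know the topic version yet; it is not the code of CatalogServiceCatalog.waitForSyncDdlVersion, and SyncDdlWaitSketch, UNKNOWN_VERSION, and the exception type are hypothetical names chosen for the example.

```java
import java.util.function.LongSupplier;

final class SyncDdlWaitSketch {
  // Hypothetical sentinel meaning "the topic version is not known yet".
  static final long UNKNOWN_VERSION = -1L;

  // Polls `lookup` at most `maxAttempts` times and gives up instead of waiting forever.
  static long waitForTopicVersion(LongSupplier lookup, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      long version = lookup.getAsLong();
      if (version != UNKNOWN_VERSION) {
        return version;
      }
    }
    // Fail fast: the DDL itself may already have succeeded at this point.
    throw new IllegalStateException(
        "Couldn't retrieve the catalog topic version after " + maxAttempts + " attempts");
  }

  public static void main(String[] args) {
    int[] calls = {0};
    // Becomes known on the third lookup, so three attempts are just enough.
    long version = waitForTopicVersion(() -> ++calls[0] < 3 ? UNKNOWN_VERSION : 236535L, 3);
    System.out.println("topic version: " + version);
  }
}
```

The trade-off is the one the issue spells out: a bounded loop never hangs, but when it gives up the caller sees an error for an operation that has in fact been applied.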
[jira] [Commented] (IMPALA-9741) Support query iceberg table by impala
[ https://issues.apache.org/jira/browse/IMPALA-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192955#comment-17192955 ]

ASF subversion and git services commented on IMPALA-9741:

Commit efc627d050caeb9947af2dfd3fc8a02236c44d0e in impala's branch refs/heads/master from Fang-Yu Rao
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=efc627d ]

IMPALA-10158: Set timezone to UTC for Iceberg-related E2E tests

We found that the tests of test_iceberg_query and test_iceberg_profile fail after the patch for IMPALA-9741 has been merged and that it is due to the default timezone of Impala not being UTC. This patch fixes the issue by adding "SET TIMEZONE=UTC;" before those test queries are run.

Testing:
- Verified in a local development environment that the tests of test_iceberg_query and test_iceberg_profile could pass after applying this patch.

Change-Id: Ie985519e8ded04f90465e141488bd2dda78af6c3
Reviewed-on: http://gerrit.cloudera.org:8080/16425
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> Support query iceberg table by impala
>
> Key: IMPALA-9741
> URL: https://issues.apache.org/jira/browse/IMPALA-9741
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: WangSheng
> Assignee: WangSheng
> Priority: Major
> Labels: impala-iceberg
> Attachments: select-iceberg.jpg
>
> Since we have submitted a patch for creating Iceberg tables from Impala in IMPALA-9688, we are preparing to implement querying Iceberg tables from Impala. But we need to read the Impala and Iceberg code in depth to determine how to do this.
[jira] [Commented] (IMPALA-10129) Data race in MemTracker::GetTopNQueriesAndUpdatePoolStats
[ https://issues.apache.org/jira/browse/IMPALA-10129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192950#comment-17192950 ] ASF subversion and git services commented on IMPALA-10129: -- Commit 9f51673a40d61cf087dd72c6e50719ed522ac851 in impala's branch refs/heads/master from Qifan Chen [ https://gitbox.apache.org/repos/asf?p=impala.git;h=9f51673 ] IMPALA-10129 Data race in MemTracker::GetTopNQueriesAndUpdatePoolStats This work addresses a data race condition in admission controller by providing the initializing values for two data members ( is_query_mem_tracker_ and query_id_) in a constructor for the MemTracker class. Without doing so, the two data members are set, without lock protection, after the object is constructed, which allows other threads to modify either of them at the same time. Testing: 1. Ran the python admission controller test successfully with a tsan build. Data race was not observed with the enhancement. Data race was observed without the enhancement. 2. Ran the core test. 
Change-Id: I9c4ffe8064d3e099a525cc48c218ef73112fb67b Reviewed-on: http://gerrit.cloudera.org:8080/16408 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Data race in MemTracker::GetTopNQueriesAndUpdatePoolStats > - > > Key: IMPALA-10129 > URL: https://issues.apache.org/jira/browse/IMPALA-10129 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Sahil Takiar >Assignee: Qifan Chen >Priority: Major > > TSAN is reporting a data race in > {{MemTracker::GetTopNQueriesAndUpdatePoolStats}} > {code} > WARNING: ThreadSanitizer: data race (pid=6436) > Read of size 1 at 0x7b480017aaa8 by thread T320 (mutexes: write > M861448892003377216, write M862574791910219632, write M623321199144890016, > write M1054540811927503496): > #0 > impala::MemTracker::GetTopNQueriesAndUpdatePoolStats(std::priority_queue std::vector std::allocator >, > std::greater >&, int, impala::TPoolStats&) > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/runtime/mem-tracker.cc:453:19 > (impalad+0x20b13b1) > #1 impala::MemTracker::UpdatePoolStatsForQueries(int, > impala::TPoolStats&) > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/runtime/mem-tracker.cc:432:3 > (impalad+0x20b123d) > #2 impala::AdmissionController::PoolStats::UpdateMemTrackerStats() > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/admission-controller.cc:1642:14 > (impalad+0x21c9d10) > #3 > impala::AdmissionController::AddPoolUpdates(std::vector std::allocator >*) > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/admission-controller.cc:1662:18 > (impalad+0x21c7053) > #4 > impala::AdmissionController::UpdatePoolStats(std::map std::char_traits, std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator >*) > 
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/admission-controller.cc:1355:5 > (impalad+0x21c6d7d) > #5 > impala::AdmissionController::Init()::$_4::operator()(std::map std::char_traits, std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator >*) const > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/admission-controller.cc:643:45 > (impalad+0x21ce0e1) > #6 > boost::detail::function::void_function_obj_invoker2 void, std::map, > std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator > >*>::invoke(boost::detail::function::function_buffer&, > std::map, > std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator >*) > /data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:159:11 > (impalad+0x21cdf2c) > #7 boost::function2 std::char_traits, std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator > >*>::operator()(std::map std::char_traits, std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std
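The fix described in the IMPALA-10129 commit message above, initializing the two members in the constructor instead of setting them afterwards without a lock, can be sketched in Java as follows. Impala's MemTracker is C++ and its real constructor is not shown in this thread, so this is only an analogy: TrackerSketch, TrackerRegistry, and their methods are hypothetical, and Java's final fields stand in for constructor-initialized members.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class TrackerSketch {
  // Both fields are assigned exactly once, in the constructor, so there is no
  // later unsynchronized write that could race with readers on other threads.
  private final boolean isQueryTracker;
  private final String queryId;

  TrackerSketch(boolean isQueryTracker, String queryId) {
    this.isQueryTracker = isQueryTracker;
    this.queryId = queryId;
  }

  boolean isQueryTracker() { return isQueryTracker; }
  String queryId() { return queryId; }
}

final class TrackerRegistry {
  private final Map<String, TrackerSketch> trackers = new ConcurrentHashMap<>();

  // The tracker is fully initialized before it becomes visible through the map.
  TrackerSketch register(String queryId) {
    return trackers.computeIfAbsent(queryId, id -> new TrackerSketch(true, id));
  }

  // A reader, such as a stats-collecting thread, only ever sees initialized trackers.
  int countQueryTrackers() {
    int count = 0;
    for (TrackerSketch tracker : trackers.values()) {
      if (tracker.isQueryTracker()) count++;
    }
    return count;
  }

  public static void main(String[] args) {
    TrackerRegistry registry = new TrackerRegistry();
    registry.register("q1");
    System.out.println(registry.countQueryTrackers());
  }
}
```

The racy variant would construct the object first, publish it, and only then call setters for the flag and the id, leaving a window in which another thread reads or writes the same members.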
[jira] [Commented] (IMPALA-10140) Throw CatalogException for query "create database if not exist" with sync_ddl as true
[ https://issues.apache.org/jira/browse/IMPALA-10140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192951#comment-17192951 ]

ASF subversion and git services commented on IMPALA-10140:

Commit 0c89a9d562c280507a6e842898bf3e41cadc3ff1 in impala's branch refs/heads/master from wzhou-code
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0c89a9d ]

IMPALA-10140: Fix CatalogExeception for creating database with sync_ddl as true

IMPALA-7961 handles the case of the query "create table if not exists" with sync_ddl as true. Customers reported a similar issue for the query "create database if not exists" with sync_ddl as true. This patch applies a fix similar to the one for IMPALA-7961 to the function CatalogOpExecutor.createDatabase().

Testing:
- Manual tests. Since this is a racy bug, I could only reproduce it by forcing frequent topicUpdateLog GCs along with a specific sequence of actions, like: run some DDLs and REFRESHs to trigger a GC in topicUpdateLog, then run query "create database if not exists" with sync_ddl as true. Verified that the issue couldn't be reproduced after applying this patch.
- Passed exhaustive test.

Change-Id: Id623118f8938f416414c45d93404fb70d036a9df
Reviewed-on: http://gerrit.cloudera.org:8080/16421
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> Throw CatalogException for query "create database if not exist" with sync_ddl as true
>
> Key: IMPALA-10140
> URL: https://issues.apache.org/jira/browse/IMPALA-10140
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog, Frontend
> Affects Versions: Impala 3.2.0
> Reporter: Wenzhe Zhou
> Assignee: Wenzhe Zhou
> Priority: Critical
>
> A customer randomly hit the following error message when running the following query on impalad version 3.2.0-cdh6.3.2 RELEASE.
> set sync_ddl =true ; create database if not exists $dbname;
> I0715 11:52:28.496253 51943 client-request-state.cc:187] a246b430fe450786:81647bd6] CatalogException: Couldn't retrieve the catalog topic version for the SYNC_DDL operation after 5 attempts.The operation has been successfully executed but its effects may have not been broadcast to all the coordinators.
>
> The Catalog server log shows the following error message as well.
> I0715 11:01:50.143303 220286 jni-util.cc:256] org.apache.impala.catalog.CatalogException: Couldn't retrieve the catalog topic version for the SYNC_DDL operation after 5 attempts.The operation has been successfully executed but its effects may have not been broadcast to all the coordinators.
> at org.apache.impala.catalog.CatalogServiceCatalog.waitForSyncDdlVersion(CatalogServiceCatalog.java:2474)
> at org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:374)
> at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:154)
> This looks to be another variation of the conditions described in IMPALA-7961. But the difference here is that this case is with "CREATE DATABASE ... IF NOT EXISTS".
> The fix in IMPALA-7961 specifically targets the "CREATE TABLE ... IF NOT EXISTS" use case.
> To fix the issue, we should port the change in patch [https://gerrit.cloudera.org/#/c/12428/] to the createDatabase() function.
[jira] [Resolved] (IMPALA-10124) admission-controller-test fails with no such file or directory error
[ https://issues.apache.org/jira/browse/IMPALA-10124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Qifan Chen resolved IMPALA-10124.

Fix Version/s: Impala 4.0
Resolution: Fixed

> admission-controller-test fails with no such file or directory error
>
> Key: IMPALA-10124
> URL: https://issues.apache.org/jira/browse/IMPALA-10124
> Project: IMPALA
> Issue Type: Bug
> Reporter: Yongzhi Chen
> Assignee: Qifan Chen
> Priority: Major
> Fix For: Impala 4.0
>
> In master-core-ubsan, the admission-controller-test fails:
> 03:12:04 /data/jenkins/workspace/impala-asf-master-core-ubsan/repos/Impala/be/build/debug//scheduling/admission-controller-test: line 10: 29380 Segmentation fault (core dumped) ${IMPALA_HOME}/bin/run-jvm-binary.sh ${IMPALA_HOME}/be/build/latest/service/unifiedbetests --gtest_filter=${GTEST_FILTER} --gtest_output=xml:${IMPALA_BE_TEST_LOGS_DIR}/${TEST_EXEC_NAME}.xml -log_filename="${TEST_EXEC_NAME}" "$@"
> 03:12:04 Traceback (most recent call last):
> 03:12:04 File "/data/jenkins/workspace/impala-asf-master-core-ubsan/repos/Impala/bin/junitxml_prune_notrun.py", line 71, in
> 03:12:04 if __name__ == "__main__": main()
> 03:12:04 File "/data/jenkins/workspace/impala-asf-master-core-ubsan/repos/Impala/bin/junitxml_prune_notrun.py", line 68, in main
> 03:12:04 junitxml_prune_notrun(options.filename)
> 03:12:04 File "/data/jenkins/workspace/impala-asf-master-core-ubsan/repos/Impala/bin/junitxml_prune_notrun.py", line 31, in junitxml_prune_notrun
> 03:12:04 root = tree.parse(junitxml_filename)
> 03:12:04 File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 647, in parse
> 03:12:04 source = open(source, "rb")
> 03:12:04 IOError: [Errno 2] No such file or directory: '/data/jenkins/workspace/impala-asf-master-core-ubsan/repos/Impala/logs/be_tests/admission-controller-test.xml'
> ...
> 03:18:30 The following tests FAILED:
> 03:18:30 57 - admission-controller-test (Failed)
> 03:18:30 Errors while running CTest
> 03:18:30 make: *** [test] Error 8
> 03:18:30 ERROR in /data/jenkins/workspace/impala-asf-master-core-ubsan/repos/Impala/bin/run-backend-tests.sh at line 43: "${MAKE_CMD:-make}" test ARGS="${BE_TEST_ARGS}"
[jira] [Resolved] (IMPALA-10129) Data race in MemTracker::GetTopNQueriesAndUpdatePoolStats
[ https://issues.apache.org/jira/browse/IMPALA-10129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen resolved IMPALA-10129. - Fix Version/s: Impala 4.0 Resolution: Fixed > Data race in MemTracker::GetTopNQueriesAndUpdatePoolStats > - > > Key: IMPALA-10129 > URL: https://issues.apache.org/jira/browse/IMPALA-10129 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Sahil Takiar >Assignee: Qifan Chen >Priority: Major > Fix For: Impala 4.0 > > > TSAN is reporting a data race in > {{MemTracker::GetTopNQueriesAndUpdatePoolStats}} > {code} > WARNING: ThreadSanitizer: data race (pid=6436) > Read of size 1 at 0x7b480017aaa8 by thread T320 (mutexes: write > M861448892003377216, write M862574791910219632, write M623321199144890016, > write M1054540811927503496): > #0 > impala::MemTracker::GetTopNQueriesAndUpdatePoolStats(std::priority_queue std::vector std::allocator >, > std::greater >&, int, impala::TPoolStats&) > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/runtime/mem-tracker.cc:453:19 > (impalad+0x20b13b1) > #1 impala::MemTracker::UpdatePoolStatsForQueries(int, > impala::TPoolStats&) > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/runtime/mem-tracker.cc:432:3 > (impalad+0x20b123d) > #2 impala::AdmissionController::PoolStats::UpdateMemTrackerStats() > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/admission-controller.cc:1642:14 > (impalad+0x21c9d10) > #3 > impala::AdmissionController::AddPoolUpdates(std::vector std::allocator >*) > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/admission-controller.cc:1662:18 > (impalad+0x21c7053) > #4 > impala::AdmissionController::UpdatePoolStats(std::map std::char_traits, std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator 
>*) > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/admission-controller.cc:1355:5 > (impalad+0x21c6d7d) > #5 > impala::AdmissionController::Init()::$_4::operator()(std::map std::char_traits, std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator >*) const > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/admission-controller.cc:643:45 > (impalad+0x21ce0e1) > #6 > boost::detail::function::void_function_obj_invoker2 void, std::map, > std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator > >*>::invoke(boost::detail::function::function_buffer&, > std::map, > std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator >*) > /data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:159:11 > (impalad+0x21cdf2c) > #7 boost::function2 std::char_traits, std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator > >*>::operator()(std::map std::char_traits, std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator >*) const > /data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14 > (impalad+0x23fa960) > #8 > 
impala::StatestoreSubscriber::UpdateState(std::map std::char_traits, std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, impala::TUniqueId const&, std::vector std::allocator >*, bool*) > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/statestore/statestore-subscriber.cc:471:7 > (impalad+0x23f7899) > #9 > impala::StatestoreSubscriberThriftIf::UpdateState(impala::TUpdateStateResponse&, > impala::TUpdateStateRequest const&) > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/statestore/statestore-subscriber.cc:110:18 >
[jira] [Resolved] (IMPALA-10155) Apparent data race in GetTopNQueriesAndUpdatePoolStats
[ https://issues.apache.org/jira/browse/IMPALA-10155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen resolved IMPALA-10155. - Fix Version/s: Impala 4.0 Resolution: Duplicate > Apparent data race in GetTopNQueriesAndUpdatePoolStats > -- > > Key: IMPALA-10155 > URL: https://issues.apache.org/jira/browse/IMPALA-10155 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 4.0 >Reporter: Thomas Tauber-Marshall >Assignee: Qifan Chen >Priority: Blocker > Fix For: Impala 4.0 > > > From a tsan build: > {noformat} > WARNING: ThreadSanitizer: data race (pid=6487) > Read of size 1 at 0x7b48001c2c28 by thread T320 (mutexes: write > M866233966607478128, write M867078391537609888, write M627824798772259232, > write M451058461859238408): > #0 > impala::MemTracker::GetTopNQueriesAndUpdatePoolStats(std::priority_queue std::vector std::allocator >, > std::greater >&, int, impala::TPoolStats&) > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/runtime/mem-tracker.cc:453:19 > (impalad+0x20b2e51) > #1 impala::MemTracker::UpdatePoolStatsForQueries(int, > impala::TPoolStats&) > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/runtime/mem-tracker.cc:432:3 > (impalad+0x20b2cdd) > #2 impala::AdmissionController::PoolStats::UpdateMemTrackerStats() > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/admission-controller.cc:1642:14 > (impalad+0x21cb7b0) > #3 > impala::AdmissionController::AddPoolUpdates(std::vector std::allocator >*) > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/admission-controller.cc:1662:18 > (impalad+0x21c8af3) > #4 > impala::AdmissionController::UpdatePoolStats(std::map std::char_traits, std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator >*) > 
/data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/admission-controller.cc:1355:5 > (impalad+0x21c881d) > #5 > impala::AdmissionController::Init()::$_4::operator()(std::map std::char_traits, std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator >*) const > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/scheduling/admission-controller.cc:643:45 > (impalad+0x21cfb81) > #6 > boost::detail::function::void_function_obj_invoker2 void, std::map, > std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator > >*>::invoke(boost::detail::function::function_buffer&, > std::map, > std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator >*) > /data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:159:11 > (impalad+0x21cf9cc) > #7 boost::function2 std::char_traits, std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator > >*>::operator()(std::map std::char_traits, std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, std::vector std::allocator >*) const > /data/jenkins/workspace/impala-asf-master-core-tsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14 > (impalad+0x23fc400) > #8 > 
impala::StatestoreSubscriber::UpdateState(std::map std::char_traits, std::allocator >, impala::TTopicDelta, > std::less, > std::allocator > >, > std::allocator std::char_traits, std::allocator > const, impala::TTopicDelta> > > > const&, impala::TUniqueId const&, std::vector std::allocator >*, bool*) > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/statestore/statestore-subscriber.cc:471:7 > (impalad+0x23f9339) > #9 > impala::StatestoreSubscriberThriftIf::UpdateState(impala::TUpdateStateResponse&, > impala::TUpdateStateRequest const&) > /data/jenkins/workspace/impala-asf-master-core-tsan/repos/Impala/be/src/statestore/statestore-subscriber.cc:110:18 > (impalad+0x23fc65f) > #10 impala::StatestoreSu
[jira] [Commented] (IMPALA-10155) Apparent data race in GetTopNQueriesAndUpdatePoolStats
[ https://issues.apache.org/jira/browse/IMPALA-10155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193007#comment-17193007 ]

Qifan Chen commented on IMPALA-10155:

This issue is a duplicate of IMPALA-10129, which has been resolved.

> Apparent data race in GetTopNQueriesAndUpdatePoolStats
>
> Key: IMPALA-10155
> URL: https://issues.apache.org/jira/browse/IMPALA-10155
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 4.0
> Reporter: Thomas Tauber-Marshall
> Assignee: Qifan Chen
> Priority: Blocker
[jira] [Created] (IMPALA-10160) kernel_stack_watchdog cannot print user stack
Sahil Takiar created IMPALA-10160:

Summary: kernel_stack_watchdog cannot print user stack
Key: IMPALA-10160
URL: https://issues.apache.org/jira/browse/IMPALA-10160
Project: IMPALA
Issue Type: Bug
Components: Backend
Reporter: Sahil Takiar

I've seen this a few times now. The kernel_stack_watchdog is used in a few places in the KRPC code, and it prints out the kernel + user stack whenever a thread is stuck in some method call for too long. The issue is that the user stack does not get printed:
{code}
W0908 17:15:00.365721 6605 kernel_stack_watchdog.cc:198] Thread 6612 stuck at outbound_call.cc:273 for 120ms:
Kernel stack:
[] futex_wait_queue_me+0xc6/0x130
[] futex_wait+0x17b/0x280
[] do_futex+0x106/0x5a0
[] SyS_futex+0x80/0x180
[] system_call_fastpath+0x16/0x1b
[] 0x
User stack:
{code}
The watchdog reports that the signal handler used for taking the thread stack is unavailable.
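To make the watchdog behaviour described above concrete, here is a minimal Python sketch of the same idea: threads register when they enter a watched scope, and a checker reports any thread that has stayed there longer than a threshold, together with its user stack. This is only an illustration. The class name StuckThreadWatchdog and its methods are hypothetical, and the stack is read through CPython's sys._current_frames() rather than through the signal handler that the real kernel_stack_watchdog relies on.

```python
import sys
import threading
import time
import traceback
from contextlib import contextmanager


class StuckThreadWatchdog:
    """Toy watchdog (hypothetical): reports threads that stay in a watched scope too long."""

    def __init__(self, threshold_s):
        self.threshold_s = threshold_s
        self._lock = threading.Lock()
        self._watched = {}  # thread id -> (label, monotonic start time)

    @contextmanager
    def watch(self, label):
        """Mark the calling thread as being inside `label` until the block exits."""
        tid = threading.get_ident()
        with self._lock:
            self._watched[tid] = (label, time.monotonic())
        try:
            yield
        finally:
            with self._lock:
                self._watched.pop(tid, None)

    def stuck_reports(self, now=None):
        """Return one report per watched thread that exceeded the threshold."""
        now = time.monotonic() if now is None else now
        frames = sys._current_frames()  # CPython-specific: thread id -> current frame
        with self._lock:
            watched = dict(self._watched)
        reports = []
        for tid, (label, start) in watched.items():
            elapsed = now - start
            if elapsed < self.threshold_s:
                continue
            frame = frames.get(tid)
            stack = "".join(traceback.format_stack(frame)) if frame else "<unavailable>\n"
            reports.append(
                f"Thread {tid} stuck at {label} for {elapsed * 1000:.0f}ms:\n"
                f"User stack:\n{stack}"
            )
        return reports
```

In this sketch an empty "User stack:" section would correspond to the `frames.get(tid)` lookup failing, which is the analogue of the symptom reported here.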
[jira] [Resolved] (IMPALA-10140) Throw CatalogException for query "create database if not exist" with sync_ddl as true
[ https://issues.apache.org/jira/browse/IMPALA-10140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenzhe Zhou resolved IMPALA-10140.
Fix Version/s: Impala 4.0
Resolution: Fixed

> Throw CatalogException for query "create database if not exist" with sync_ddl as true
>
> Key: IMPALA-10140
> URL: https://issues.apache.org/jira/browse/IMPALA-10140
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog, Frontend
> Affects Versions: Impala 3.2.0
> Reporter: Wenzhe Zhou
> Assignee: Wenzhe Zhou
> Priority: Critical
> Fix For: Impala 4.0
>
> A customer intermittently hit the following error message when running the following query on impalad version 3.2.0-cdh6.3.2 RELEASE:
> set sync_ddl=true; create database if not exists $dbname;
> I0715 11:52:28.496253 51943 client-request-state.cc:187] a246b430fe450786:81647bd6] CatalogException: Couldn't retrieve the catalog topic version for the SYNC_DDL operation after 5 attempts.The operation has been successfully executed but its effects may have not been broadcast to all the coordinators.
>
> The catalog server log shows the same error:
> I0715 11:01:50.143303 220286 jni-util.cc:256] org.apache.impala.catalog.CatalogException: Couldn't retrieve the catalog topic version for the SYNC_DDL operation after 5 attempts.The operation has been successfully executed but its effects may have not been broadcast to all the coordinators.
> at org.apache.impala.catalog.CatalogServiceCatalog.waitForSyncDdlVersion(CatalogServiceCatalog.java:2474)
> at org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:374)
> at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:154)
> This looks like another variation of the conditions described in IMPALA-7961. The difference is that this case involves "CREATE DATABASE ... IF NOT EXISTS", whereas the fix in IMPALA-7961 specifically targets the "CREATE TABLE ... IF NOT EXISTS" use case.
> To fix the issue, we should port the change in patch [https://gerrit.cloudera.org/#/c/12428/] to the createDatabase() function.
[jira] [Created] (IMPALA-10161) User LDAP search bind support
Tamas Mate created IMPALA-10161:

Summary: User LDAP search bind support
Key: IMPALA-10161
URL: https://issues.apache.org/jira/browse/IMPALA-10161
Project: IMPALA
Issue Type: Improvement
Components: Backend, Security
Affects Versions: Impala 3.4.0
Reporter: Tamas Mate
Assignee: Tamas Mate

Currently Impala supports only the simple direct bind mechanism to authenticate a user. Other components allow administrators to specify a user search base DN together with an administrator bind DN and bind password, and then search for the user under the user search base. This method is especially useful for larger organizations where the directory structure is wide.

Given the following two DNs:
{code:java}
uid=alice,ou=Engineering,ou=People,dc=mycompany,dc=com
uid=bob,ou=Accounting,ou=People,dc=mycompany,dc=com
{code}
If the administrator would like to allow both Engineering and Accounting users to authenticate, neither the ldap_baseDN nor the ldap_bind_pattern configuration gives the flexibility to do so:
* ldap_baseDN takes the configured baseDN and prefixes it with _uid=_
* ldap_bind_pattern gives the option to specify a pattern with a parameter, such as _user=#UID,OU=foo,CN=bar_

The convenient solution would be to specify a base DN and execute a search under it instead of prefixing it with uid, because where a user's entry lives depends on the LDAP directory structure. LDAP search has already been implemented for groups; it should be implemented for users as well.
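The limitation can be illustrated with a small Python sketch that builds bind DNs the way the two existing options are described above and compares that with a lookup under a search base. The function names are hypothetical, and the in-memory DIRECTORY list merely stands in for an LDAP server; a real search bind would first bind as the administrator and then issue an LDAP search.

```python
# Entries from the example above; a stand-in for a real LDAP directory.
DIRECTORY = [
    "uid=alice,ou=Engineering,ou=People,dc=mycompany,dc=com",
    "uid=bob,ou=Accounting,ou=People,dc=mycompany,dc=com",
]


def dn_from_base_dn(username, base_dn):
    """ldap_baseDN style: prefix the configured base DN with uid=<username>."""
    return f"uid={username},{base_dn}"


def dn_from_bind_pattern(username, pattern):
    """ldap_bind_pattern style: substitute the username for #UID in a fixed pattern."""
    return pattern.replace("#UID", username)


def dn_from_search(username, search_base, directory):
    """Search bind (sketch): find the single uid=<username> entry anywhere under search_base."""
    wanted_prefix = f"uid={username},".lower()
    wanted_suffix = f",{search_base}".lower()
    matches = [
        dn for dn in directory
        if dn.lower().startswith(wanted_prefix) and dn.lower().endswith(wanted_suffix)
    ]
    return matches[0] if len(matches) == 1 else None
```

With the base DN ou=People,dc=mycompany,dc=com, dn_from_base_dn yields uid=alice,ou=People,dc=mycompany,dc=com, which does not exist in the directory, and any single bind pattern can cover only one of the two organizational units. dn_from_search resolves both alice and bob from the same search base.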
[jira] [Created] (IMPALA-10162) Support additional LDAP filter options
Thomas Tauber-Marshall created IMPALA-10162:

Summary: Support additional LDAP filter options
Key: IMPALA-10162
URL: https://issues.apache.org/jira/browse/IMPALA-10162
Project: IMPALA
Issue Type: Task
Components: Security
Affects Versions: Impala 4.0
Reporter: Thomas Tauber-Marshall
Assignee: Thomas Tauber-Marshall

IMPALA-2563 added support for user and group filters on LDAP, with options modeled after those in Hive, but they are somewhat restrictive: they only allow specifying particular parts of the LDAP search filter used.

There are additional, more general LDAP filter options that Impala should also support, which allow specifying arbitrary search filters. This would, for example, enable an LDAP configuration where the authenticated usernames are not part of the user's DN.

We should model these configs after the equivalent options in HDFS; see in particular 'hadoop.security.group.mapping.ldap.search.filter.user' and 'hadoop.security.group.mapping.ldap.search.filter.group'.
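As a rough illustration of what an arbitrary search filter option involves, the Python sketch below substitutes a username into a filter template after escaping it according to RFC 4515. The function names are hypothetical, and the {0} placeholder follows the style of the Hadoop user filter option named above; the syntax Impala would adopt is an assumption here.

```python
# Characters that RFC 4515 requires to be escaped inside a filter assertion value.
_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\0": r"\00",
}


def escape_filter_value(value):
    """Escape a value so it cannot change the structure of the search filter."""
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def render_filter(template, username):
    """Fill the {0} placeholder of a filter template with the escaped username."""
    return template.replace("{0}", escape_filter_value(username))
```

For example, render_filter("(&(objectClass=user)(sAMAccountName={0}))", "alice") matches a user by an attribute that need not appear in the user's DN, which is the kind of configuration this issue is meant to enable.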
[jira] [Resolved] (IMPALA-10099) Push down DISTINCT aggregation for EXCEPT/INTERSECT
[ https://issues.apache.org/jira/browse/IMPALA-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shant Hovsepian resolved IMPALA-10099.
Fix Version/s: Impala 4.0
Resolution: Fixed

> Push down DISTINCT aggregation for EXCEPT/INTERSECT
>
> Key: IMPALA-10099
> URL: https://issues.apache.org/jira/browse/IMPALA-10099
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Shant Hovsepian
> Assignee: Shant Hovsepian
> Priority: Major
> Fix For: Impala 4.0
>
> The implementation of SetOperations for EXCEPT/INTERSECT in IMPALA-9943 produced query rewrites that apply DISTINCT aggregation after exchanges for distributed plans. In cases where the query can be directly rewritten to apply the DISTINCT to the set operation operands, doing so would result in better performance for most large queries.
> This should help the performance of TPC-DS Q14, which does an INTERSECT of queries with large result sets that contain many duplicates.
> In general it would be better to have an optimization phase during planning that moves DISTINCT around, which would handle this case as well as many others.
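The equivalence behind this rewrite can be checked with a small Python sketch. Plain lists stand in for the operand result sets and the function names are hypothetical; this models only the set semantics of INTERSECT and EXCEPT, not Impala's planner or its distributed exchanges.

```python
def distinct(rows):
    """Drop duplicate rows, keeping first-seen order."""
    return list(dict.fromkeys(rows))


def intersect_distinct_late(left, right):
    """INTERSECT with DISTINCT applied only to the final result."""
    right_rows = set(right)
    matched = [row for row in left if row in right_rows]  # duplicates still flow through
    return distinct(matched)


def intersect_distinct_pushed_down(left, right):
    """INTERSECT with DISTINCT applied to each operand first."""
    left_rows = distinct(left)
    right_rows = set(distinct(right))
    return [row for row in left_rows if row in right_rows]


def except_distinct_late(left, right):
    """EXCEPT with DISTINCT applied only to the final result."""
    right_rows = set(right)
    return distinct([row for row in left if row not in right_rows])


def except_distinct_pushed_down(left, right):
    """EXCEPT with DISTINCT applied to each operand first."""
    right_rows = set(distinct(right))
    return [row for row in distinct(left) if row not in right_rows]
```

Both variants return the same rows, but the pushed-down versions shrink each operand before the set operation runs. In a distributed plan that would mean fewer rows crossing the exchange when the inputs contain many duplicates.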
[jira] [Work started] (IMPALA-9936) Only send invalidations in DDL responses to LocalCatalog coordinators
[ https://issues.apache.org/jira/browse/IMPALA-9936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on IMPALA-9936 started by Quanlong Huang.

> Only send invalidations in DDL responses to LocalCatalog coordinators
>
> Key: IMPALA-9936
> URL: https://issues.apache.org/jira/browse/IMPALA-9936
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Major
>
> Catalogd RPC requests (TDdlExecRequest, TUpdateCatalogRequest and TResetMetadataRequest) should carry information about the client (coordinator), namely whether it is running in LocalCatalog mode. For LocalCatalog coordinators, catalogd only needs to send back invalidations instead of the full catalog objects.