Pranay Singh created IMPALA-7168:
------------------------------------

Summary: DML query may hang if CatalogUpdateCallback() encounters repeated error
Key: IMPALA-7168
URL: https://issues.apache.org/jira/browse/IMPALA-7168
Project: IMPALA
Issue Type: Bug
Components: Catalog
Affects Versions: Impala 2.12.0, Impala 3.0, Impala 2.11.0, Impala 2.10.0, Impala 2.9.0
Reporter: Pranay Singh

DML queries such as INSERT can hang if exec_env_->frontend()->UpdateCatalogCache() in ImpalaServer::CatalogUpdateCallback() repeatedly fails with an error such as ENOMEM. This happens with SYNC_DDL set to 1, while the coordinator node is waiting for its catalog version to become current.

The scenario looks like this. Let's say there are two coordinator nodes, Node A and Node B, and that catalogd and statestored are running on Node C.

a) CREATE TABLE is executed on Node A with SYNC_DDL set to 1. The thread running the query blocks in impala::ImpalaServer::ProcessCatalogUpdateResult(), waiting for its catalog version to become current.

b) Meanwhile, statestored on Node C calls ImpalaServer::CatalogUpdateCallback() on Node B via Thrift RPC to apply a delta topic update. The update is never applied if the callback hits repeated errors, for example when the frontend is low on memory (low JVM heap).

c) In that case Node A waits indefinitely, until Node B is shut down manually. Note that in this case Node B is still reachable: its heartbeat is fine, but it is a bad node.
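
The waiting pattern described in steps a) to c) can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions, not Impala's code: CatalogVersionTracker, ApplyUpdate, WaitForVersion and SyncDdlCompletes are hypothetical names, the two nodes are collapsed into one waiter thread and one updater thread, and the bounded wait is added only so that the example terminates. With an unbounded wait, the failing case below would block forever, which is the hang reported here.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the catalog version a SYNC_DDL query waits on.
class CatalogVersionTracker {
 public:
  // Simulates one topic update. If updating the cache fails, the version is
  // not advanced and no waiter is woken up.
  bool ApplyUpdate(int64_t new_version,
                   const std::function<bool()>& update_cache) {
    assert(static_cast<bool>(update_cache));  // a callable must be supplied
    if (!update_cache()) return false;  // e.g. repeated ENOMEM in the frontend
    {
      std::lock_guard<std::mutex> l(lock_);
      version_ = new_version;
    }
    version_cv_.notify_all();
    return true;
  }

  // Blocks until the version reaches min_version or the timeout expires.
  // Returns true if the version was reached, false on timeout.
  bool WaitForVersion(int64_t min_version, std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> l(lock_);
    return version_cv_.wait_for(
        l, timeout, [&] { return version_ >= min_version; });
  }

 private:
  std::mutex lock_;
  std::condition_variable version_cv_;
  int64_t version_ = 0;
};

// Runs three simulated topic updates while a "query" waits for version 5.
// Returns whether the waiter saw the version it needed before timing out.
bool SyncDdlCompletes(bool update_always_fails) {
  CatalogVersionTracker tracker;
  std::thread updater([&] {
    for (int i = 0; i < 3; ++i) {
      tracker.ApplyUpdate(5, [&] { return !update_always_fails; });
      std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
  });
  bool reached = tracker.WaitForVersion(5, std::chrono::milliseconds(200));
  updater.join();
  return reached;
}
```

Calling SyncDdlCompletes(false) returns true as soon as the first update is applied, while SyncDdlCompletes(true) returns false only because the wait is bounded. A timeout on the wait is shown purely as one way to make the failure visible; the report does not say how the issue should be fixed.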