Mostafa Mokhtar created IMPALA-7172:
---------------------------------------

             Summary: Statestore should verify that all subscribers are running the same version of Impala
                 Key: IMPALA-7172
                 URL: https://issues.apache.org/jira/browse/IMPALA-7172
             Project: IMPALA
          Issue Type: New Feature
          Components: Distributed Exec
    Affects Versions: Impala 2.13.0
            Reporter: Mostafa Mokhtar


While running a metadata test that uses sync_ddl=1, the tests appeared to hang indefinitely.
It turned out that one of the Impala daemons was running an older build, which caused statestore topic updates to fail continuously.

Ideally the statestore (SS) should track the version of each subscriber and blacklist the ones that don't match the SS and CS version.

Logs from SS
{code}
I0614 11:11:04.410529 57312 statestore.cc:259] Preparing initial impala-membership topic update for impa...@vb0204.halxg.cloudera.com:22000. Size = 2.06 KB
I0614 11:11:04.411222 57312 client-cache.cc:82] ReopenClient(): re-creating client for vb0204.halxg.cloudera.com:23000
I0614 11:11:04.411821 57312 client-cache.h:304] RPC Error: Client for vb0204.halxg.cloudera.com:23000 hit an unexpected exception: No more data to read., type: N6apache6thrift9transport19TTransportExceptionE, rpc: N6impala20TUpdateStateResponseE, send: done
I0614 11:11:04.411831 57312 client-cache.cc:174] Broken Connection, destroy client for vb0204.halxg.cloudera.com:23000
I0614 11:11:04.411861 57312 statestore.cc:891] Unable to send priority topic update message to subscriber impa...@vb0204.halxg.cloudera.com:22000, received error: RPC Error: Client for vb0204.halxg.cloudera.com:23000 hit an unexpected exception: No more data to read., type: N6apache6thrift9transport19TTransportExceptionE, rpc: N6impala20TUpdateStateResponseE, send: done
{code}

Logs from Impalad
{code}
I0614 11:03:19.479164 41915 thrift-util.cc:123] TAcceptQueueServer exception: N6apache6thrift8protocol18TProtocolExceptionE: TProtocolException: Invalid data
I0614 11:03:19.680028 41916 thrift-util.cc:123] TAcceptQueueServer exception: N6apache6thrift8protocol18TProtocolExceptionE: TProtocolException: Invalid data
I0614 11:03:19.680776 41917 thrift-util.cc:123] TAcceptQueueServer exception: N6apache6thrift8protocol18TProtocolExceptionE: TProtocolException: Invalid data
I0614 11:03:19.881295 41918 thrift-util.cc:123] TAcceptQueueServer exception: N6apache6thrift8protocol18TProtocolExceptionE: TProtocolException: Invalid data
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
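
The version check requested in this issue could look roughly like the sketch below. It is only a Python illustration of the idea, not Impala code: the class SubscriberVersionTracker, its methods, the subscriber IDs, and the version strings are all hypothetical, and it assumes each subscriber reports a build version string when it registers, which the issue does not confirm.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class SubscriberVersionTracker:
    """Hypothetical helper: records the build version each subscriber reports
    and blacklists subscribers whose version differs from the expected one."""

    expected_version: str                                    # version the statestore itself runs
    versions: Dict[str, str] = field(default_factory=dict)   # subscriber id -> reported version
    blacklisted: Set[str] = field(default_factory=set)       # subscribers with a mismatched version

    def register(self, subscriber_id: str, version: str) -> bool:
        """Record the reported version; return True if it matches."""
        self.versions[subscriber_id] = version
        if version != self.expected_version:
            self.blacklisted.add(subscriber_id)
            return False
        # A subscriber that re-registers with a matching build is allowed again.
        self.blacklisted.discard(subscriber_id)
        return True

    def should_send_update(self, subscriber_id: str) -> bool:
        """Only known, non-blacklisted subscribers receive topic updates."""
        return subscriber_id in self.versions and subscriber_id not in self.blacklisted


# Usage with made-up subscriber IDs and version strings.
tracker = SubscriberVersionTracker(expected_version="build-new")
tracker.register("impalad@host-a:22000", "build-new")
tracker.register("impalad@host-b:22000", "build-old")
for sub_id in sorted(tracker.blacklisted):
    print(f"version mismatch, skipping topic updates for {sub_id}")
```

With a check like this, a mismatched daemon would be reported once at registration time instead of surfacing only as repeated Thrift errors on every topic update.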