Mostafa Mokhtar created IMPALA-7172:
---------------------------------------

             Summary: Statestore should verify that all subscribers are running 
the same version of Impala
                 Key: IMPALA-7172
                 URL: https://issues.apache.org/jira/browse/IMPALA-7172
             Project: IMPALA
          Issue Type: New Feature
          Components: Distributed Exec
    Affects Versions: Impala 2.13.0
            Reporter: Mostafa Mokhtar


While running a metadata test that uses sync_ddl=1, the tests appeared to hang 
indefinitely.
It turned out that one of the Impala daemons was running an older build, which 
caused statestore topic updates to fail continuously.

Ideally the statestore (SS) should track the version of every subscriber and 
blacklist those that do not match the version of the SS and the catalog 
service (CS).
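
A minimal sketch of the proposed check, written in Python for brevity. Everything in it is hypothetical: the class SubscriberRegistry, its methods, the subscriber ids and the version strings are illustrative only and are not part of Impala's statestore code or its Thrift API. It assumes each subscriber reports a build version string when it registers.

```python
from dataclasses import dataclass, field


@dataclass
class SubscriberRegistry:
    """Hypothetical registry; not Impala's actual statestore implementation."""

    # Version shared by the statestore and catalog service (assumed to be a string).
    expected_version: str
    # Subscriber id -> reported version, for subscribers that receive updates.
    active: dict = field(default_factory=dict)
    # Subscriber id -> reported version, for subscribers that were rejected.
    blacklisted: dict = field(default_factory=dict)

    def register(self, subscriber_id: str, version: str) -> bool:
        """Admit the subscriber only if its version matches; otherwise blacklist it."""
        if version == self.expected_version:
            self.blacklisted.pop(subscriber_id, None)
            self.active[subscriber_id] = version
            return True
        self.active.pop(subscriber_id, None)
        self.blacklisted[subscriber_id] = version
        return False

    def update_targets(self) -> list:
        """Return the ids of subscribers that should receive topic updates."""
        return sorted(self.active)


# Illustrative usage with made-up hosts and versions.
registry = SubscriberRegistry(expected_version="2.13.0")
registry.register("impalad@host-a:22000", "2.13.0")  # admitted
registry.register("impalad@host-b:22000", "2.12.0")  # blacklisted
```

With a check like this, a daemon on an older build would be reported as blacklisted at registration time, rather than showing up only as repeated RPC failures during topic updates.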

Logs from the statestore:
{code}
I0614 11:11:04.410529 57312 statestore.cc:259] Preparing initial impala-membership topic update for impa...@vb0204.halxg.cloudera.com:22000. Size = 2.06 KB
I0614 11:11:04.411222 57312 client-cache.cc:82] ReopenClient(): re-creating client for vb0204.halxg.cloudera.com:23000
I0614 11:11:04.411821 57312 client-cache.h:304] RPC Error: Client for vb0204.halxg.cloudera.com:23000 hit an unexpected exception: No more data to read., type: N6apache6thrift9transport19TTransportExceptionE, rpc: N6impala20TUpdateStateResponseE, send: done
I0614 11:11:04.411831 57312 client-cache.cc:174] Broken Connection, destroy client for vb0204.halxg.cloudera.com:23000
I0614 11:11:04.411861 57312 statestore.cc:891] Unable to send priority topic update message to subscriber impa...@vb0204.halxg.cloudera.com:22000, received error: RPC Error: Client for vb0204.halxg.cloudera.com:23000 hit an unexpected exception: No more data to read., type: N6apache6thrift9transport19TTransportExceptionE, rpc: N6impala20TUpdateStateResponseE, send: done
{code}

Logs from impalad:
{code}
I0614 11:03:19.479164 41915 thrift-util.cc:123] TAcceptQueueServer exception: N6apache6thrift8protocol18TProtocolExceptionE: TProtocolException: Invalid data
I0614 11:03:19.680028 41916 thrift-util.cc:123] TAcceptQueueServer exception: N6apache6thrift8protocol18TProtocolExceptionE: TProtocolException: Invalid data
I0614 11:03:19.680776 41917 thrift-util.cc:123] TAcceptQueueServer exception: N6apache6thrift8protocol18TProtocolExceptionE: TProtocolException: Invalid data
I0614 11:03:19.881295 41918 thrift-util.cc:123] TAcceptQueueServer exception: N6apache6thrift8protocol18TProtocolExceptionE: TProtocolException: Invalid data
{code}
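
The type names in the logs above (for example N6apache6thrift8protocol18TProtocolExceptionE) appear to be Itanium C++ ABI mangled names. As a reading aid, here is a small Python sketch that decodes only the simple nested form N<length><identifier>...E seen in these logs. The function name demangle_nested_name is made up for this example, and it is not a general demangler; in practice a tool such as c++filt -t does the same job.

```python
def demangle_nested_name(mangled: str) -> str:
    """Decode 'N<len><id>...E' into 'id::id::...'. Handles only this simple form."""
    if not (mangled.startswith("N") and mangled.endswith("E")):
        raise ValueError(f"not a simple nested name: {mangled!r}")
    body = mangled[1:-1]  # strip the leading 'N' and the trailing 'E'
    parts = []
    pos = 0
    while pos < len(body):
        # Read the decimal length prefix of the next identifier.
        start = pos
        while pos < len(body) and body[pos].isdigit():
            pos += 1
        if pos == start:
            raise ValueError(f"expected a length prefix at offset {start + 1}")
        length = int(body[start:pos])
        if pos + length > len(body):
            raise ValueError("identifier length runs past the end of the name")
        parts.append(body[pos:pos + length])
        pos += length
    return "::".join(parts)


for name in (
    "N6apache6thrift9transport19TTransportExceptionE",
    "N6impala20TUpdateStateResponseE",
    "N6apache6thrift8protocol18TProtocolExceptionE",
):
    print(demangle_nested_name(name))
# prints:
# apache::thrift::transport::TTransportException
# impala::TUpdateStateResponse
# apache::thrift::protocol::TProtocolException
```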



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
