Hello, I have the following scenario, which fails when I try to restart a failed cluster node:
1) start a cluster on two nodes N1 and N2 (172.16.133.123/172.16.133.120)
2) start consumer C for queue Q
3) start producer P for queue Q, sending one text message / sec
4) confirm with tcpdump that N1 is receiving all of the traffic
5) shut down node N1 with 'qpidd --quit'
6) confirm with tcpdump that N2 is receiving all of the traffic (successful failover)
7) restart node N1 with 'qpidd'
8) check qpidd.log, which shows the error "catch-up connection closed prematurely"

Any ideas what's going on?

qpidd.conf:

cluster-mechanism=ANONYMOUS
cluster-name=MYCLUSTER
log-to-file=/home/qpid/qpid.log
daemon=yes
no-data-dir=yes
auth=no

qpidd.log (N1):

2011-09-19 13:58:35 notice Initializing CPG
2011-09-19 13:58:35 notice cluster(172.16.133.123:18918 PRE_INIT) configuration change: 172.16.133.123:18918
2011-09-19 13:58:35 notice cluster(172.16.133.123:18918 PRE_INIT) Members joined: 172.16.133.123:18918
2011-09-19 13:58:35 notice SASL disabled: No Authentication Performed
2011-09-19 13:58:35 notice Listening on TCP port 5672
2011-09-19 13:58:35 notice cluster(172.16.133.123:18918 INIT) cluster-uuid = 7ab02e1b-67dd-4fed-b176-79b567ab699f
2011-09-19 13:58:35 notice cluster(172.16.133.123:18918 READY) joined cluster EMS_CLUSTER
2011-09-19 13:58:35 notice Broker running
2011-09-19 13:58:41 notice cluster(172.16.133.123:18918 READY) configuration change: 172.16.133.120:29504 172.16.133.123:18918
2011-09-19 13:58:41 notice cluster(172.16.133.123:18918 READY) Members joined: 172.16.133.120:29504
2011-09-19 13:58:41 notice cluster(172.16.133.123:18918 UPDATER) sending update to 172.16.133.120:29504 at amqp:tcp:172.16.133.120:5672
2011-09-19 13:58:41 warning Broker closed connection: 200, OK
2011-09-19 13:58:41 notice cluster(172.16.133.123:18918 UPDATER) update sent
2011-09-19 14:00:40 notice Shut down
2011-09-19 14:00:43 notice Initializing CPG
2011-09-19 14:00:43 notice cluster(172.16.133.123:19037 PRE_INIT) configuration change: 172.16.133.120:29504 172.16.133.123:19037
2011-09-19 14:00:43 notice cluster(172.16.133.123:19037 PRE_INIT) Members joined: 172.16.133.123:19037
2011-09-19 14:00:43 notice SASL disabled: No Authentication Performed
2011-09-19 14:00:43 notice Listening on TCP port 5672
2011-09-19 14:00:43 notice cluster(172.16.133.123:19037 INIT) cluster-uuid = 7ab02e1b-67dd-4fed-b176-79b567ab699f
2011-09-19 14:00:43 notice cluster(172.16.133.123:19037 JOINER) joining cluster MYCLUSTER
2011-09-19 14:00:43 notice Broker running
2011-09-19 14:00:43 notice cluster(172.16.133.123:19037 UPDATEE) receiving update from 172.16.133.120:29504
2011-09-19 14:00:43 error deliveryRecord no update message (qpid/cluster/Connection.cpp:537)
2011-09-19 14:00:43 critical cluster(172.16.133.123:19037 UPDATEE) catch-up connection closed prematurely 172.16.133.120:5672-172.16.136.143:53170(172.16.133.123:19037-4 local,catchup)
2011-09-19 14:00:43 notice cluster(172.16.133.123:19037 LEFT) leaving cluster MYCLUSTER
2011-09-19 14:00:43 notice Shut down
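
To make it easier to see where the restart goes wrong, the log can be reduced to the cluster state changes plus the first error. This is only a rough sketch: the regular expression is inferred from the log excerpt in this message rather than from any qpidd documentation, and the names CLUSTER_LINE, state_transitions and first_problem are hypothetical helpers, not part of Qpid.

```python
import re

# Pattern inferred from the log lines in this post (an assumption, not a
# documented qpidd format):
#   <date> <time> <level> cluster(<member> <STATE>) <message>
CLUSTER_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>\w+) "
    r"cluster\((?P<member>\S+) (?P<state>[A-Z_]+)\) "
    r"(?P<msg>.*)$"
)


def state_transitions(lines):
    """Return (timestamp, member, state) each time the logged state changes."""
    transitions = []
    last = None
    for line in lines:
        m = CLUSTER_LINE.match(line.strip())
        if not m:
            continue  # not a cluster(...) line
        key = (m.group("member"), m.group("state"))
        if key != last:
            transitions.append((m.group("ts"), m.group("member"), m.group("state")))
            last = key
    return transitions


def first_problem(lines):
    """Return the first line logged at error or critical level, or None."""
    for line in lines:
        parts = line.strip().split(" ", 3)
        if len(parts) >= 3 and parts[2] in ("error", "critical"):
            return line.strip()
    return None


# The restart portion of the N1 log, copied from above.
SAMPLE = [
    "2011-09-19 14:00:43 notice Initializing CPG",
    "2011-09-19 14:00:43 notice cluster(172.16.133.123:19037 PRE_INIT) configuration change: 172.16.133.120:29504 172.16.133.123:19037",
    "2011-09-19 14:00:43 notice cluster(172.16.133.123:19037 PRE_INIT) Members joined: 172.16.133.123:19037",
    "2011-09-19 14:00:43 notice cluster(172.16.133.123:19037 INIT) cluster-uuid = 7ab02e1b-67dd-4fed-b176-79b567ab699f",
    "2011-09-19 14:00:43 notice cluster(172.16.133.123:19037 JOINER) joining cluster MYCLUSTER",
    "2011-09-19 14:00:43 notice Broker running",
    "2011-09-19 14:00:43 notice cluster(172.16.133.123:19037 UPDATEE) receiving update from 172.16.133.120:29504",
    "2011-09-19 14:00:43 error deliveryRecord no update message (qpid/cluster/Connection.cpp:537)",
    "2011-09-19 14:00:43 critical cluster(172.16.133.123:19037 UPDATEE) catch-up connection closed prematurely",
    "2011-09-19 14:00:43 notice cluster(172.16.133.123:19037 LEFT) leaving cluster MYCLUSTER",
    "2011-09-19 14:00:43 notice Shut down",
]

if __name__ == "__main__":
    for ts, member, state in state_transitions(SAMPLE):
        print(ts, member, state)
    print("first problem:", first_problem(SAMPLE))
```

On the restart portion of the N1 log this gives the sequence PRE_INIT, INIT, JOINER, UPDATEE, LEFT, and the first failure it reports is the "deliveryRecord no update message" error, logged just before the catch-up connection is closed.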
