Hello,

After almost a year of running Kafka on a single node, we are in the process of
migrating to a 3-node cluster.  To test the migration, we followed these
steps:


  *   Stopped our current Kafka instance and copied the entire data directory
and the ZooKeeper data directories to one of the new nodes.
  *   Configured ZooKeeper as a three-node cluster.
     *   Started the node holding the copied-over data first, and then the
others.
     *   Used the ZooKeeper shell to check whether the Kafka topics were listed
on each of the three nodes.
     *   The primary (we assume the first node, which holds all the data, is
the primary because it was started first) and the second node had the topic
data, but the third one did not.
     *   Waited some time, but still no data on the third node.  The ZooKeeper
shell always exited with an exception whenever we tried to execute any command:
org.apache.zookeeper.KeeperException$ConnectionLossException:
     *   Shut down the second ZooKeeper instance, and almost instantaneously
the third node picked up the data.  Restarted the second node, and all three
nodes seemed to be operating fine.
     *   Used the ZooKeeper shell to create a test node on the first/primary
node, and were able to see it on the other two nodes.
  *   Configured Kafka as a three-node cluster.
     *   Started the node holding the copied-over data first, and then the
others.
     *   Created a test Kafka topic with replication factor 3, and saw it
appear in the ZooKeeper topic list on all three nodes.
     *   Used kafka-reassign-partitions.sh to change the replication factor
from 1 to 3 for one of our topics.
     *   Almost immediately saw the new topic directory being created under the
logs directory on the second node.  Nothing on the third node.
     *   kafka-reassign-partitions.sh with the verify option still lists the
partition reassignment as in progress.  We left it like that overnight, and it
is still the same.  The topic has very little data, but there is still nothing
on the third node.
     *   Shut down Kafka on the second node to see whether the earlier
behaviour with ZooKeeper would repeat, but no such luck.
     *   Shut down both Kafka and ZooKeeper on the second node to see if any
data would show up on the third node; again, no go.
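
For reference, a replication-factor change of the kind described above is
usually driven by a JSON plan file handed to kafka-reassign-partitions.sh.
The following is only a minimal sketch of how such a plan could be generated.
The topic name "my-topic", the partition count, and the broker ids 1, 2 and 3
are hypothetical placeholders, not values from the cluster described here, and
the helper build_reassignment is made up for illustration.

```python
import json


def build_reassignment(topic, num_partitions, broker_ids):
    """Build a plan that places every partition of `topic` on all given brokers."""
    partitions = []
    for p in range(num_partitions):
        # Rotate the replica list so the first (preferred) replica
        # differs from partition to partition.
        shift = p % len(broker_ids)
        replicas = broker_ids[shift:] + broker_ids[:shift]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


# Hypothetical values: a two-partition topic spread over brokers 1, 2 and 3.
plan = build_reassignment("my-topic", 2, [1, 2, 3])
print(json.dumps(plan, indent=2))
```

The printed JSON would be saved to a file and passed with
--reassignment-json-file together with --execute, and later with --verify,
alongside the usual connection option for the Kafka version in use.  The
broker ids in "replicas" must match the broker.id values of brokers that are
actually registered, so that is one thing worth comparing against the third
node's configuration.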

Any ideas as to what may be going on?  Should we instead try copying the
ZooKeeper and Kafka data directories to all three nodes and then starting them
up?

Thanks
Rakesh
