Thanks Jay. I looked around, but it wasn't immediately obvious to me which setting to change to reduce the zk ephemeral timeout. Does kafka configure zk itself? If so, am I looking for a kafka setting? I didn't see anything appropriate…

On Fri, Oct 7, 2011 at 9:21 AM, Jay Kreps <[email protected]> wrote:

> It occurs to me that we could do a better job with this error. There are
> really three things that might have happened (1) you restarted kafka within
> the zk timeout, in which case as far as zk is concerned your old broker
> still exists...this is weird but actually correct behavior, (2) you have two
> brokers with the same id, (3) zk has a bug and is not deleting ephemeral
> nodes.
>
> I think if we just improved the error message to explain this we would have
> happier users, as is it requires slightly deep knowledge of zk to understand
> why this happens.
>
> -Jay
>
> On Fri, Oct 7, 2011 at 7:35 AM, Mathias Herberts <[email protected]> wrote:
>
> > If you abort Kafka (killing the JVM for example) and restart it,
> > depending on the zookeeper timeout you've used, it might occur that
> > the ephemeral node created by the broker has not yet been removed by
> > ZK.
> >
> > If this is the case, Kafka will detect that there is a znode conflict
> > and kill itself.
> >
> > This is what your logs seem to imply:
> >
> > [2011-10-03 15:33:22,229] INFO conflict in /brokers/ids/0 data:
> > 10.98.20.109-1317681202194:10.98.20.109:9092 stored data:
> > 10.98.20.109-1317268078266:10.98.20.109:9092 (kafka.utils.ZkUtils$)
> >
> > Try to either wait for more than the ZK timeout prior to restarting
> > Kafka, or lower the ZK timeout so the ephemeral node is indeed gone
> > when you restart Kafka.
> >
> > Mathias.
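
For anyone else who hits this, the conflict line quoted above can be picked apart with a few lines of Python. This is only a rough sketch. It assumes (nothing in this thread confirms it) that each registration string has the form <host>-<epoch millis>:<host>:<port>, and the names parse_registration and describe_conflict are made up for illustration, not anything from kafka.

```python
import re

# Matches the "conflict in <path> data: <new> stored data: <old>" log line.
CONFLICT_RE = re.compile(
    r"conflict in (?P<path>\S+) data: (?P<new>\S+) stored data: (?P<old>\S+)"
)


def parse_registration(value):
    """Split one registration string into its parts.

    Assumed layout (not confirmed by the thread):
    "<host>-<epoch millis>:<host>:<port>"
    """
    creator, host, port = value.rsplit(":", 2)
    _creator_host, millis = creator.rsplit("-", 1)
    return {"host": host, "port": int(port), "created_ms": int(millis)}


def describe_conflict(line):
    """Return the znode path, both registrations, and the gap between them."""
    match = CONFLICT_RE.search(line)
    if match is None:
        return None
    new = parse_registration(match.group("new"))
    old = parse_registration(match.group("old"))
    return {
        "path": match.group("path"),
        "new": new,
        "old": old,
        # How much earlier the stored registration was created, in seconds.
        "gap_seconds": (new["created_ms"] - old["created_ms"]) / 1000.0,
    }


if __name__ == "__main__":
    sample = (
        "[2011-10-03 15:33:22,229] INFO conflict in /brokers/ids/0 data: "
        "10.98.20.109-1317681202194:10.98.20.109:9092 stored data: "
        "10.98.20.109-1317268078266:10.98.20.109:9092 (kafka.utils.ZkUtils$)"
    )
    print(describe_conflict(sample))
```

The gap only says how much earlier the stored registration was created than the new one. It does not by itself tell you which of the three cases Jay lists applies, but it does confirm that both entries come from the same host and port.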
