[jira] [Updated] (KAFKA-1557) ISR reported by TopicMetadataResponse most of the time doesn't match the Zookeeper information (and the truth)

2014-07-25 Thread Oleg Lvovitch (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleg Lvovitch updated KAFKA-1557:
-

Attachment: server2.properties
server1.properties

 ISR reported by TopicMetadataResponse most of the time doesn't match the 
 Zookeeper information (and the truth)
 --

 Key: KAFKA-1557
 URL: https://issues.apache.org/jira/browse/KAFKA-1557
 Project: Kafka
  Issue Type: Bug
  Components: consumer, controller, core, replication
Affects Versions: 0.8.0, 0.8.1
 Environment: OSX 10.9.3, Linux Scientific 6.5
 It actually doesn't seem to matter and appears to be OS-agnostic
Reporter: Oleg Lvovitch
Assignee: Neha Narkhede
 Fix For: 0.8.1.1, 0.8.2

 Attachments: server1.properties, server2.properties


 TL;DR - after a topic is created, and at least one broker in the ISR is 
 restarted, the ISR reported by the TopicMetadataResponse is incorrect.
 Specific steps to repro:
 - Download 0.8.1 Kafka
 - Copy server.properties twice into server1.properties and server2.properties 
 (attached) - basically just ports and log paths changed to allow brokers to 
 co-exist
 - Start ZooKeeper using sh bin/zookeeper-server-start.sh 
 config/zookeeper.properties
 - Start broker1: sh bin/kafka-server-start.sh config/server1.properties
 - Start broker2: sh bin/kafka-server-start.sh config/server2.properties
 - Create a new topic: sh bin/kafka-topics.sh --zookeeper localhost:2181 
 --create --topic test --replication-factor 2 --partitions 3
 - Examine topic state: sh bin/kafka-topics.sh --zookeeper localhost:2181 
 --describe --topic test - note that all ISRs are of length 2
 - Run the attached Scala code that uses TopicMetadataRequest to examine topic 
 state. Observe that all ISRs are of length 2 and match the information 
 output by the kafka-topics.sh script
 - Shut down broker2 (simply hit Ctrl-C in the terminal), wait 5-10 seconds
 - Restart broker2 using the original command
 - Check the status of the topic again. Observe that the leader for all 
 partitions is 0 (as expected), and all ISRs contain both brokers (as expected)
 - Run the attached Scala snippet again. 
 EXPECTED:
 - All ISRs are of length 2
 ACTUAL:
 - ALL ISRs contain just broker 0
 NOTE: depending on how long broker2 was down, sometimes some ISRs will 
 contain the full list, but shutting it down for 15+ secs seems to always 
 yield a consistent repro
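
To make the "all ISRs are of length 2" check in the steps above less manual, the describe output can be reduced to one ISR size per partition. This is only a sketch: the helper name isr_sizes is hypothetical, and it assumes that each partition line of the describe output carries whitespace-separated "Partition: N" and "Isr: a,b" fields, which should be verified against the Kafka version in use.

```shell
#!/bin/sh
# isr_sizes (hypothetical helper): reads `kafka-topics.sh --describe` output
# on stdin and prints "<partition> <isr-size>" for every partition line.
# Assumption: partition lines contain "Partition: N" and "Isr: a,b" fields.
isr_sizes() {
  awk '{
    part = ""; isr = ""
    # Scan the line for the two labels and remember the value after each.
    for (i = 1; i < NF; i++) {
      if ($i == "Partition:") part = $(i + 1)
      if ($i == "Isr:") isr = $(i + 1)
    }
    # Lines without both fields (such as the topic summary line) are skipped.
    if (part != "" && isr != "") print part, split(isr, ids, ",")
  }'
}
```

With the function defined, the describe command from the steps can be piped into it (sh bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic test | isr_sizes) once before shutting down broker2 and once after restarting it, and the two outputs compared.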



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (KAFKA-1557) ISR reported by TopicMetadataResponse most of the time doesn't match the Zookeeper information (and the truth)

2014-07-25 Thread Oleg Lvovitch (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleg Lvovitch updated KAFKA-1557:
-

Attachment: BrokenKafkaLink.scala

 ISR reported by TopicMetadataResponse most of the time doesn't match the 
 Zookeeper information (and the truth)
 --

 Key: KAFKA-1557
 URL: https://issues.apache.org/jira/browse/KAFKA-1557
 Project: Kafka
  Issue Type: Bug
  Components: consumer, controller, core, replication
Affects Versions: 0.8.0, 0.8.1
 Environment: OSX 10.9.3, Linux Scientific 6.5
 It actually doesn't seem to matter and appears to be OS-agnostic
Reporter: Oleg Lvovitch
Assignee: Neha Narkhede
 Fix For: 0.8.1.1, 0.8.2

 Attachments: BrokenKafkaLink.scala, server1.properties, 
 server2.properties







[jira] [Updated] (KAFKA-1557) ISR reported by TopicMetadataResponse most of the time doesn't match the Zookeeper information (and the truth)

2014-07-25 Thread Oleg Lvovitch (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleg Lvovitch updated KAFKA-1557:
-

Description: 
TL;DR - after a topic is created, and at least one broker in the ISR is 
restarted, the ISR reported by the TopicMetadataResponse is incorrect.

Specific steps to repro:
- Download 0.8.1 Kafka
- Copy server.properties twice into server1.properties and server2.properties 
(attached) - basically just ports and log paths changed to allow brokers to 
co-exist
- Start ZooKeeper using sh bin/zookeeper-server-start.sh 
config/zookeeper.properties
- Start broker1: sh bin/kafka-server-start.sh config/server1.properties
- Start broker2: sh bin/kafka-server-start.sh config/server2.properties
- Create a new topic: sh bin/kafka-topics.sh --zookeeper localhost:2181 
--create --topic test --replication-factor 2 --partitions 3
- Examine topic state: sh bin/kafka-topics.sh --zookeeper localhost:2181 
--describe --topic test - note that all ISRs are of length 2
- Run the attached Scala code that uses TopicMetadataRequest to examine topic 
state. Observe that all ISRs are of length 2 and match the information output 
by the kafka-topics.sh script
- Shut down broker2 (simply hit Ctrl-C in the terminal), wait 5-10 seconds
- Restart broker2 using the original command
- Check the status of the topic again. Observe that the leader for all 
partitions is 0 (as expected), and all ISRs contain both brokers (as expected)
- Run the attached Scala snippet again. 

EXPECTED:
- All ISRs are of length 2

ACTUAL:
- ALL ISRs contain just broker 0

NOTE: depending on how long broker2 was down, sometimes some ISRs will contain 
the full list, but shutting it down for 15+ secs seems to always yield a 
consistent repro


Basically it appears that brokers hold incorrect ISR information in their 
metadata cache.
Our production servers exhibit the same problem - after a topic gets created 
everything looks fine, but as brokers get restarted, ISR reported by the 
brokers is wrong, whereas the one in ZK appears to report the truth (it shrinks 
as brokers get shut down and grows back up after they get restarted)
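
For comparing against what ZK reports, the ISR can be pulled out of the partition state znode. The sketch below rests on assumptions: the helper name zk_isr is hypothetical, the state is assumed to live at /brokers/topics/test/partitions/0/state (and likewise for the other partitions), and it is assumed to be a single-line JSON document with an "isr" array of broker ids.

```shell
#!/bin/sh
# zk_isr (hypothetical helper): reads a partition state JSON document on
# stdin and prints the ISR broker ids as a comma-separated list.
# Assumption: the JSON is on one line and contains "isr":[id,id,...].
zk_isr() {
  sed -n 's/.*"isr": *\[\([0-9,]*\)\].*/\1/p'
}
```

The znode contents can be fetched with any ZooKeeper client and piped into zk_isr; the printed list can then be set side by side with the ISR returned by TopicMetadataResponse for the same partition.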

I'm not sure if this has wider impact on the functioning of the cluster - bad 
metadata information is bad - but so far there has been no evidence of that






 ISR reported by TopicMetadataResponse most of the time doesn't match the 
 Zookeeper information (and the truth)
 --

 Key: KAFKA-1557
 URL: https://issues.apache.org/jira/browse/KAFKA-1557
 Project: Kafka
  Issue Type: Bug
  Components: consumer, controller, core, replication
Affects Versions: 0.8.0, 0.8.1
 Environment: OSX 10.9.3, Linux Scientific 6.5
 It actually doesn't seem to matter and appears to be OS-agnostic
Reporter: Oleg Lvovitch
Assignee: Neha Narkhede
 Fix For: 0.8.1.1, 0.8.2

 Attachments: BrokenKafkaLink.scala, server1.properties, 
 server2.properties



[jira] [Updated] (KAFKA-1557) ISR reported by TopicMetadataResponse most of the time doesn't match the Zookeeper information (and the truth)

2014-07-25 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-1557:
--

Component/s: (was: consumer)
 Labels: newbie++  (was: )

 ISR reported by TopicMetadataResponse most of the time doesn't match the 
 Zookeeper information (and the truth)
 --

 Key: KAFKA-1557
 URL: https://issues.apache.org/jira/browse/KAFKA-1557
 Project: Kafka
  Issue Type: Bug
  Components: controller, core, replication
Affects Versions: 0.8.0, 0.8.1
 Environment: OSX 10.9.3, Linux Scientific 6.5
 It actually doesn't seem to matter and appears to be OS-agnostic
Reporter: Oleg Lvovitch
Assignee: Neha Narkhede
  Labels: newbie++
 Fix For: 0.8.1.1, 0.8.2

 Attachments: BrokenKafkaLink.scala, server1.properties, 
 server2.properties




