[
https://issues.apache.org/jira/browse/KAFKA-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588664#comment-13588664
]
Neha Narkhede commented on KAFKA-513:
-------------------------------------
Thanks for patch v3. Overall, its pretty good, a few review comments -
Let's standardize on the format for printing a partition. The reason I
suggested we follow [%s,%d] is because most of the code uses TopicPartition and
that prints partition as [%s,%d]. Saw that you changed that to add a space and
we also had started with that, but after finding that most of us tend to grep
for "[%s,%d]" instead, we decided to remove that space :)
1. ControllerChannelManager
1.1 Let's include correlation id in the state change log messages. This will
make it easier to trace the state change messages through all brokers
1.2 I think we should include another state change log message once the
controller receives a response to the state change request. We can print
correlation id here as well
1.3 In the log message for leader and isr request, I wonder if it's worth
specifying if it is either become-leader or become-follower for that particular
partition. Same can be done on the broker as well.Otherwise, it still remains a
mystery whether the broker was asked to become a leader or a follower. Thoughts
?
2. LeaderAndIsrRequest
It's good to add the controller id, should we also change the toString method
to print it ?
3. Partition
3.1 Can we call the partitionStateChangeLogger just stateChangeLogger for
simplicity ?
3.2 The kind of information you chose to put in the state change log is useful.
Can we standardize it a bit, more like on the lines of what's in
ControllerChannelManager. It will be nice to know which leader and isr request
is being rejected (identified by correlation id). Also which controller with
what epoch had sent it ? For example,
"Broker 1 rejected LeaderAndIsrRequest correlation id 123 from controller 2
epoch 3 for partition [foo,1] as current leader epoch 5 is >= request leader
epoch 4"
3.3 Let's fix the [%s, %d] for printing the partition in makeFollower()
4. KafkaApis
We should probably have a "received request" state change messages on the
brokers. Probably this can go in KafkaApis. Something like -
"Broker 1 received LeaderAndIsrRequest correlation id 123 from controller 2
epoch 3 for partition [foo,1]"
This can be followed by a "handling request" state change message, if there is
one and then "handled request" on the broker. Finally, there will be a
"completed request" on the controller. This will complete the lifecycle of a
state change request.
5. PartitionStateMachine
Can we standardize the log messages here as well ?
"Controller 1 epoch 3 elected leader 4 for partition [foo,1]"
6. ReplicaManager
6.1 Like I mentioned above, let's include "handling request" in the state
change log as well
6.2 Typo -> follwer
6.3 Let's also include the error case when the broker drops the leader and isr
request sent by a stale controller epoch inside becomeLeaderOrFollower()
6.4 Let's standardize on the log message format here as well
7. ReplicaStateMachine
Let's standardize on the log message format here as well
8. PartitionStateChangeLogMerger
8.1 Can we rename this to just StateChangeLogMerger?
8.2 We don't use camel case in command line options for our tools. I don't
think we've quite standardized on one thing, but we use either '-' or '.' to
separate the words in command line options.
8.3 It's unclear from the description of the command line option what the input
to this tool is. Is it a directory that contains state change logs from all
servers or is it a csv that points to a list of state change logs ? I think the
former will be easier to work with since there could be multiple state change
logs per broker (it is usually configured for daily rolling).
8.4 It does seem useful to provide an optional time range to this tool. That
way it doesn't have to unnecessarily merge data from files that don't fit in
the query time range. Sometimes, you know the star
t time, sometimes the end time, sometimes none or both. All of these are useful
while troubleshooting
8.5 It seems like this tool should also take the topic partition as input. Most
times, I see us querying this tool for a particular partition. Here it is
useful to take in topic and partition as separate
options. If you specify only topic but not the partition, the tool should get
data for all partitions for that topic. It's good that the tool can handle all
topic partitions as well.
8.6 I am not sure why the second map in partitionsMergedLog is keyed on Date ?
Also, I have the same suggestion I had earlier to scale this tool. The main
reason is that even if a single state change log has fewer entries, I see us
using this tool across multiday history of state changes on a Kafka cluster.
Typically, when you hit a hard-to-debug issue in production, the last thing you
want is for your awesome tool to run out of memory since that is the best
option you have to troubleshoot the issue at hand. So I suggest we look into
improving the merging process in this tool.
Currently, the merging algorithm requires streaming all the entries from each
broker's state change log in memory. This runs a risk of the tool running out
of memory. One way to mediate this is to basically do a n-way merge from m
input files. The algorithm is something like this -
8.6.1 Read a line (that matches the topic/partition regex and date range, if
there is one) from every input file in a priority queue
8.6.2 Take the line from the file with the earliest date and add it to an
output buffer
8.6.3 Add another line from the file in the earlier step in the priority queue
8.6.4 Flush the output buffer every n entries
Here n can be set to some reasonable number given the memory usage of a typical
line in the state change log (Since we wrote this, we should be able to
estimate this)
9. PartitionStateChangeLogger
What is the reason to defining this in a separate class ? It seems like a
wrapper over the basic getLogger statement.
10. Few other suggestions on the formatted output of the state change log -
Let's remove "[Partition state machine on Controller 0]", "[Replica Manager on
Broker 0]" part of the log statement. Also remove the
(partitionStateChangeLogger) at the end. We also don't need to log if its INFO
or DEBUG.
> Add state change log to Kafka brokers
> -------------------------------------
>
> Key: KAFKA-513
> URL: https://issues.apache.org/jira/browse/KAFKA-513
> Project: Kafka
> Issue Type: Sub-task
> Affects Versions: 0.8
> Reporter: Neha Narkhede
> Assignee: Swapnil Ghike
> Priority: Blocker
> Labels: p1, replication, tools
> Fix For: 0.8
>
> Attachments: kafka-513-v1.patch, kafka-513-v2.patch,
> kafka-513-v3.patch
>
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> Once KAFKA-499 is checked in, every controller to broker communication can be
> modelled as a state change for one or more partitions. Every state change
> request will carry the controller epoch. If there is a problem with the state
> of some partitions, it will be good to have a tool that can create a timeline
> of requested and completed state changes. This will require each broker to
> output a state change log that has entries like
> [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() for
> partition [foo, 0] from controller 2, epoch 1
> [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() for
> partition [foo, 0] from controller 2, epoch 1
> On controller, this will look like -
> [2012-09-10 10:06:17,198] controller 2, epoch 1, initiated state change
> request LeaderAndIsr() for partition [foo, 0]
> We need a tool that can collect the state change log from all brokers and
> create a per-partition timeline of state changes -
> [foo, 0]
> [2012-09-10 10:06:17,198] controller 2, epoch 1 initiated state change
> request LeaderAndIsr()
> [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() from
> controller 2, epoch 1
> [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() from
> controller 2, epoch 1
> This JIRA involves adding the state change log to each broker and adding the
> tool to create the timeline
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira