I have two app instance, input topic has 2 partitions, each instance config
one thread and one replicas.
also, instance1's state-store is /tmp/kafka-streams, instance2's
state-store is /tmp/kafka-streams2.
now I do this experiment to study checkpointin kafka streams (0.10.0.0).

1. start instance1, send two msg(msg1 and msg2) to input topic, no
checkpoint file
2. stop instance1, checkpoint file in /tmp/kafka-streams, both partition's
offset equals 1
3. restart instance1(althrough new UUID mean new instance, but here
consider the same instance as before), no checkpoint file again
4. start instance2, send two msg(msg3 and msg4) to input topic, also no
checkpoint file
5. stop instance1, checkpoint in /tmp/kafka-streams, both partition's
offset equals 2
6. send two msg(msg5 and msg6) to input topic, now this two msg all go to
instance2
7. stop instance2, checkpoint in /tmp/kafka-streams2, both partition's
offset equals 3

after two instance stopped, below is the checkpoint file content

$ strings kafka-streams/*/*/.checkpoint   --> instance1
streams-wc-Counts-changelog 0 2
streams-wc-Counts-changelog 1 2
$ strings kafka-streams2/*/*/.checkpoint  --> instance2
streams-wc-Counts-changelog 0 3
streams-wc-Counts-changelog 1 3

I draw a simple table about the partition and offset of each msg, also the
event happend.

      Partition,Offset  | Partition,Offset    | What Happened
msg1  P0,0
msg2                    | P1,0                | restart instance1
msg3  P0,1                                    | after start instance2
msg4                    | P1,1
msg5  P0,2                                    | after stop instance1
msg6                    | P1,2

Next, use kafka-consumer-offset-checker.sh to check input-topic, and all
six msg(each partition has three msg) were consumed

$ bin/kafka-consumer-offset-checker.sh --zookeeper localhost:2181 --topic
streams-wc-input1 --group streams-wc
Group           Topic                          Pid Offset          logSize
        Lag             Owner
streams-wc      streams-wc-input1              0   3               3
        0               none
streams-wc      streams-wc-input1              1   3               3
        0               none

Now If we restart instance1 again, As only one instance exist, so standby
task will not take effect.
That means, instance1 will create two active StreamTask, and both task will
restoreActiveState from changelog topic.

when restore active task, we have to seek to some position, here as
instance1's state store: /tmp/kafka-streams checkpoint file
has both changelog partition, so restoreConsumer will seek to position 2.

And change to another situation, What about restart instance2 not
instance1?
the restoreConsumer will seek to position 3, because instance2's active
task read checkpoint in /tmp/kafka-streams2.

The different between this two situation is which position beginning to
restore StateStore in StreamTask.

Situation One, only restart instance1, beginning position 2 means, P0 seek
after msg3, P1 seek after msg4
Situation Two, only restart instance12 beginning position 3 means, P0 seek
after msg5, P1 seek after msg6

from Partition view, msg1,msg3,msg5 go to partition 0.

msg     P0's offset  | Instance1 restart  | Instance2 restart
msg1,1  0
msg3,1  1
msg5,1  2              | <- seek at pos 2   |
                                                           | seek at pos 3

msg     P1's offset  | Instance1 restart  | Instance2 restart
msg2,1  0
msg4,1  1
msg6,1  2              | <- seek at pos 2   |
                                                           | <-seek at pos
3

The restore process is poll records from changelog-topic at specific
position.
in situation one, restore msg5 and msg6 to StreamTask's state store. msg5
to task1, msg6 to task0
in situation two, restore nothing to StreamTask's state store???

I have some question about the restore process and checkpoint below:
1. Should we seek to beginning, because restore state store must be
complete view of previous?
2. The two situation described above, What will happen?

Hope someone expain to me, or collect me if I'm understand wrong. Tks
before.

Reply via email to