This is one way I did it.
This is applicable to the simple consumer spout (storm-kafka), not the new
spout, meaning both your old and upgraded topologies use the simple consumer
spout, e.g. the old one runs 0.9.4 and the new one 1.0.1 of the storm-kafka
module.
Assuming you do plain data processing and no stateful processing:

1) Create a new Storm cluster with a new ZK ensemble.

2) KILL the current topology in prod; this is safe as it is backed by Kafka anyway.
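
For example, with the storm CLI (the topology name "pipeline1" below is
hypothetical; -w sets how many seconds Storm waits before killing the workers):

# topology name is hypothetical
storm kill pipeline1 -w 0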

3) Migrate the spout offsets for the given topology from the current prod ZK to
the new prod ZK (takes less than a minute).

One approach could be:

git clone https://github.com/feldoh/guano

mvn clean package

Export from the old ZK to your local dir (e.g. current prod is running with ZK
host x.x.x.160)
======================================

java -jar target/guano-0.1a.jar -s "x.x.x.160" -o
"/<local-dir>/prod1-environment/zkdump_pipeline1" -d "/spout1"

Import this dump into the new ZK (new prod is running with ZK host x.x.x.132).
Pass the path under which the spout directory will be created in ZK, i.e. one
level above the spout root; here the spout root sits directly under the ZK
root, so the last argument is "/".
===========================================
java -jar target/guano-0.1a.jar -s "x.x.x.132" -i
"/<local-dir>/prod1-environment/zkdump_pipeline1" -r "/"

After this step you should see the spout1 directory created under the ZK root /.
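
The same zkCli check against the new ensemble (same sketch, example host as
above) confirms the import; the offset field in each partition node should
match your exported dump:

zkCli.sh -server x.x.x.132:2181
ls /spout1
get /spout1/partition_0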

4) Submit the topology to the new Storm cluster. After this it should start
processing each partition from the offset stored in ZK for that partition.
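
A sketch of the submission step, with a hypothetical jar, main class, and
topology name; the important part is that the topology's kafka SpoutConfig uses
the same zkRoot ("/") and spout id ("spout1") as the migrated path, otherwise
the spout will not find the offsets:

# jar, class, and topology name are hypothetical
storm jar pipeline1-topology.jar com.example.Pipeline1Topology pipeline1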

5) Watch for the offset messages from the spout in the worker logs (on
initialization the Kafka spout prints which topic/partition/offset it starts
consuming from; verify a few sample partitions against your exported copy).
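
One hedged way to spot those lines, assuming a Storm 1.x layout where worker
logs land under <storm-log-dir>/workers-artifacts/<topology-id>/<port>/worker.log
(the log dir and exact log wording vary by version):

# adjust /var/log/storm to your actual storm log dir
grep -ri "offset" /var/log/storm/workers-artifacts/*/*/worker.log | head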

On Fri, Aug 26, 2016 at 8:21 PM, Abhishek Agarwal <abhishc...@gmail.com>
wrote:

> Not the frameworks but applications running on the framework. I am talking
> about the rolling upgrade of a topology (not the entire cluster). Similar
> to blue green deployments of microservices
>
> On Aug 26, 2016 9:35 PM, "Harsha Chintalapani" <st...@harsha.io> wrote:
>
>> Abhishek,
>>         Are you looking at a rolling upgrade of the Kafka cluster or Storm?
>> Harsha
>>
>> On Fri, Aug 26, 2016 at 6:18 AM Abhishek Agarwal <abhishc...@gmail.com>
>> wrote:
>>
>>>
>>> On Aug 26, 2016 2:50 PM, "Abhishek Agarwal" <abhishc...@gmail.com>
>>> wrote:
>>>
>>> >
>>>
>>> > Here is an interesting use case - To upgrade a topology without any
>>> downtime. Let's say, the topology has only Kafka as a source and two
>>> versions of it are running (different topology names of course) in parallel
>>> and sharing the kafka input load.
>>> >
>>> > In old kafka spout, rolling upgrade is not possible, partition
>>> assignment is derived from the number of tasks in the topology.
>>> >
>>> > In new kafka spout, partition assignment is done externally by Kafka
>>> server. If I deploy two different topologies with the same *kafka consumer
>>> group id*, is it fair to assume that load will be automatically
>>> distributed across topologies? Are there any corner cases to consider?
>>> >
>>> > --
>>> > Regards,
>>> > Abhishek Agarwal
>>> >
>>>
>>
