What do you mean by a list of "on" devices in Redis? Are you talking about
http://redis.io/commands#list lists? Or is it just a set of keys that you can
scan to see which devices are on?
How exact do you need this list to be? Is it OK if one of the states is wrong
until a new message for that device is sent? How frequently are messages sent?
The issue is that this type of problem requires the events to be processed in
order and at least once.
Storm does not guarantee the in-order part of this, except in a few specific
situations. So unless you are very careful, it is entirely possible that two
messages sent close to one another in time could be processed out of order by
Storm. If both were for the same device, the state would end up flipped.
What is more, if you are doing at-least-once processing, Storm replays out of
order: it will not roll back and start from the last success, it will just
replay the one message that failed.
So in my opinion, if all you care about is keeping the state up to date, you
probably don't need/want Storm (or really any stream processing, for that
matter).
The first thing you need to do is guarantee, at event ingestion, that all
events associated with a given device go through the same partition. If you
don't do this, Kafka does not guarantee order either, and you are dead in the
water.
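The property you rely on is that equal keys always map to the same partition. A small illustrative sketch (the `partition_for` function is made up for illustration; Kafka's real default partitioner uses murmur2 hashing, but the guarantee is the same):

```python
import hashlib

def partition_for(device_id: str, num_partitions: int) -> int:
    """Illustrative key-based partitioner: hash the device ID and take it
    modulo the partition count, so equal keys always land together."""
    digest = hashlib.md5(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Every event for device "1000" maps to the same partition, so Kafka
# preserves the relative order of that device's messages.
p1 = partition_for("1000", 8)
p2 = partition_for("1000", 8)
assert p1 == p2
```

In practice you don't implement this yourself: sending each event with the device ID as the message key (e.g. kafka-python's `KafkaProducer.send(topic, key=..., value=...)`) gets you key-based routing from the producer.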
Once you have that, write a very simple piece of code that reads messages
from one or more Kafka partitions, parses the data, updates Redis, then tells
Kafka that you are done with the message, and repeats.
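A sketch of that loop, with a plain list standing in for the Kafka partition and a dict standing in for Redis so it runs on its own (in real code the comments mark where a `KafkaConsumer`, redis-py calls, and an offset commit would go; those pieces are assumed, not shown):

```python
def parse(raw: str):
    """Parse a message like '1000 100 on' into (device_id, sensor_id, state)."""
    device_id, sensor_id, state = raw.split()
    return device_id, sensor_id, state

def consume(messages, store):
    """Read messages in order, update the store, then ack each one."""
    acked = 0
    for raw in messages:                    # kafka: poll the consumer
        device_id, _, state = parse(raw)
        if state == "on":
            store[device_id] = raw          # redis: SET device_id raw
        else:
            store.pop(device_id, None)      # redis: DEL device_id
        acked += 1                          # kafka: commit the offset
    return acked

store = {}
consume(["1000 100 on", "2000 150 off", "2000 150 on"], store)
print(store)  # {'1000': '1000 100 on', '2000': '2000 150 on'}
```

Because the ack happens only after the Redis update, a crash mid-message means the message is re-read and re-applied, which is harmless here since the updates are idempotent.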
If you want to make it more efficient, you can read a "batch" of messages
from Kafka, then do a batch write to Redis and batch-ack the messages to
Kafka.
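The batched variant looks like this (again a self-contained sketch with a dict in place of Redis; in real code the flush would be a redis-py `pipeline()` execute followed by one offset commit for the whole batch):

```python
def flush(batch, store):
    """Apply a whole batch of updates in one go: one round trip to Redis
    instead of len(batch) round trips, then one batch ack to Kafka."""
    for raw in batch:
        device_id, _, state = raw.split()
        if state == "on":
            store[device_id] = raw
        else:
            store.pop(device_id, None)
    # kafka: commit the offset of the last message in the batch here

def consume_batch(messages, store, batch_size=100):
    """Buffer messages and flush them together once the batch fills up."""
    batch = []
    for raw in messages:
        batch.append(raw)
        if len(batch) >= batch_size:
            flush(batch, store)
            batch.clear()
    if batch:                 # flush the final partial batch
        flush(batch, store)

store = {}
consume_batch(["1000 100 on", "2000 150 off", "1000 100 off"],
              store, batch_size=2)
print(store)  # {} -- device 1000 turned on, then off again
```

The trade-off is that a crash before the batch ack replays the whole batch, so the per-message updates still need to be idempotent, which they are here.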
- Bobby
On Saturday, May 14, 2016 2:58 AM, Daniela Stoiber
<[email protected]> wrote:
Hello
Does anyone have an idea regarding this topic? Would a tokenizer bolt be
helpful? But with a tokenizer bolt I can only split the string, right? How
can I assign the split string to fields?
Thank you very much in advance.
Regards,
Daniela
From: Daniela Stoiber [mailto:[email protected]]
Sent: Thursday, May 12, 2016 10:51 PM
To: [email protected]
Subject: Kafka, Storm and Redis
Hello
I have a question regarding the combination of Kafka, Storm and Redis.
I have created a Kafka producer, which produces messages like this:
1000 100 on
2000 150 off
The first two values are IDs; the third value is the state of the device,
"on" or "off".
I have also created a Kafka spout in Storm and I already receive these
messages in my Storm topology.
Now I would like to store the "on" messages in Redis and delete the "off"
messages from Redis. There should always be a current list of all "on"
devices in Redis, which I can use afterwards for my analysis.
Unfortunately, I have no idea how to implement this. Do I have to split the
messages, or can I store/delete them as they are? How could this be done?
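For the splitting part specifically, a plain whitespace split is enough to assign the three values to named fields (a minimal Python sketch, independent of Storm, using the sample message from above):

```python
message = "1000 100 on"

# str.split() breaks the message on whitespace into three tokens,
# which can be unpacked straight into named fields.
device_id, sensor_id, state = message.split()

print(device_id, sensor_id, state)  # 1000 100 on
```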
Thank you very much in advance.
Regards,
Daniela