No, unfortunately not. Each program has its own "profile" with a different value for each minute.
 
Sent: Monday, 30 May 2016 at 23:04
From: "Yuri Kostine" <kost...@gmail.com>
To: user@storm.apache.org
Subject: Re: Aw: Re: Re: Re: Pull from Redis
 
Makes sense. Do all programs have the same value at minute 3?

On May 30, 2016, at 3:55 PM, Daniela S <daniela_4...@gmx.at> wrote:
 
I will try to explain in a bit more detail:
 
I receive, for example, a start event for program X. When program X is finished I will receive an end message for it. As long as I have not received an end message for a program, I assume that it is still running and it should be stored in Redis.
Let's assume program X has been started and I have not received an end message yet. So I have to pull it from Redis and calculate how far along the program is at the moment (current time - start time). With this value, let's say it is minute 3, I have to look up which value corresponds to minute 3. That value is the one I need for my sum.
I have to do this for every started program, and I have to recompute the sum every minute, because every program changes its value each minute as long as it has not ended.
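 
Roughly, in code, the pass I have to run every minute would be something like this (only a sketch; the hash name "running_programs" and the profileValueAt lookup are placeholders for things I have not shown here):

import redis.clients.jedis.Jedis;
import java.util.Map;

public class MinuteSum {

    // Placeholder for the per-program "profile": how the per-minute values are
    // looked up is not shown here, so this is hypothetical.
    static long profileValueAt(String programId, long minute) {
        return 0L;
    }

    static long sumOverRunningPrograms(Jedis jedis) {
        long now = System.currentTimeMillis();
        long sum = 0L;
        // Assumed layout: a hash "running_programs" mapping programId -> start time in millis.
        Map<String, String> running = jedis.hgetAll("running_programs");
        for (Map.Entry<String, String> entry : running.entrySet()) {
            long startMillis = Long.parseLong(entry.getValue());
            long minute = (now - startMillis) / 60000L;     // e.g. minute 3
            sum += profileValueAt(entry.getKey(), minute);  // per-program value for that minute
        }
        return sum;
    }
}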
 
Thank you and regards,
Daniela
 
Sent: Monday, 30 May 2016 at 22:34
From: "Yuri Kostine" <kost...@gmail.com>
To: user@storm.apache.org
Subject: Re: Aw: Re: Re: Pull from Redis
 
Is the sum the amount of time all current programs have been running? How does Storm/Redis know when a program is done and needs to be removed? For example, you get a JSON payload with a start time and no end time. You push that into a Redis key or list. One minute lapses (no other events have been written), and you look at that JSON and calculate the elapsed time in seconds, i.e. time now minus start time. Let's say it's 120; then you take 120 and do what with it? And if there are 10 events, each returning 120, will the result simply be 1200, or do you have to calculate each event by itself and then sum the results, because each event gets its own unique multiplier?
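
In code, the two readings of my question would be something like this (the Event type and multiplierFor are just placeholders to make it concrete):

import java.util.List;

public class SumQuestion {

    static class Event { long startMillis; }

    // Hypothetical per-event multiplier, used only to make the second reading concrete.
    static long multiplierFor(Event e) {
        return 1L;
    }

    static long[] bothReadings(List<Event> events, long nowMillis) {
        long plainSum = 0L;
        long weightedSum = 0L;
        for (Event e : events) {
            long elapsedSeconds = (nowMillis - e.startMillis) / 1000L; // e.g. 120
            plainSum += elapsedSeconds;                       // 10 events of 120 -> 1200
            weightedSum += elapsedSeconds * multiplierFor(e); // each event weighted on its own
        }
        return new long[] { plainSum, weightedSum };
    }
}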

On May 30, 2016, at 2:35 PM, Daniela S <daniela_4...@gmx.at> wrote:
 
Thank you for your support! I will try to explain what I would like to do:
 
I am receiving JSON strings from Kafka. These JSON strings contain start and end events of programs. I would like to use Redis as a cache to store all programs which have started but not yet ended. As soon as a program has ended it should be deleted from Redis. I would like to build a sum over all programs stored in Redis, but I need another value to build that sum. To get this value I have to calculate the difference between the current time and the timestamp of each event stored in Redis. With this calculated value I then look up the value I need for the sum. This must be done for each stored entry, and it should be repeated every time a value is added to or removed from Redis, or otherwise every minute.
 
How should such problems be solved within Storm? I was thinking of some kind of cache, like Redis.
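 
For illustration, roughly what I had in mind for keeping that cache up to date (only a sketch, not tested; the tuple field names, the hash key "running_programs" and the Redis location are made up, and the packages are those of Storm 1.x and the Jedis client):

import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Tuple;
import redis.clients.jedis.Jedis;
import java.util.Map;

public class ProgramCacheBolt extends BaseBasicBolt {

    private transient Jedis jedis;

    @Override
    public void prepare(Map stormConf, TopologyContext context) {
        jedis = new Jedis("localhost", 6379); // assumed Redis location
    }

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        String programId = tuple.getStringByField("programId"); // assumed field names
        String eventType = tuple.getStringByField("eventType"); // "start" or "end"

        if ("start".equals(eventType)) {
            long startMillis = tuple.getLongByField("timestamp");
            // Keep only programs that have started but not yet ended.
            jedis.hset("running_programs", programId, Long.toString(startMillis));
        } else if ("end".equals(eventType)) {
            jedis.hdel("running_programs", programId);
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Nothing emitted downstream in this sketch.
    }
}

Maybe the once-a-minute recalculation could then be driven by Storm's tick tuples (Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS) instead of a separate spout or topology?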
 
Thank you in advance.
 
Regards,
Daniela
 
Sent: Monday, 30 May 2016 at 21:16
From: "Yuri Kostine" <kost...@gmail.com>
To: user@storm.apache.org
Subject: Re: Aw: Re: Pull from Redis
 
It depends on your definition of slow and on the data stored, of course; my guess is that a few million keys might take a minute? Pure guess. Redis is a key-value store: you give it a key and you can perform an operation on its value. Iterating over all keys is the slowest operation in Redis, and I think it will also block all other operations while it is executing. I know this is a Storm group and not a Redis group, and I am not sure there is a Storm solution if Redis is your partial data storage. It's not a relational database, so it's not great at joins, aggregations, etc. Just my 2c.
Time-series aggregations in Redis are done with one key per interval, for example. A 2016-06-01 1:30pm event would execute a counter increment on the "2016" key, on "2016-06", on "2016-06-01", and so on down to your smallest interval. Then, to pull the count for a day, you would read only one key, "2016-06-01". This approach is fast because all operations are key-value based, accessing only one key at a time.
Is there no way to pull the data you need before you store that key into Redis? You could also use Redis as your queue and process it once a minute with a topology, then create a new time-based queue key and keep going. You would store your data a bit differently, though: instead of many keys, you would have one key with an array of values. You keep pushing into it based on a timestamp; when the interval lapses, you process it with Storm and pop those values out one at a time. Look up the data you need, keep an aggregate, and keep going until the queue is empty.
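
Roughly, with the Jedis client, those two patterns could look something like this (the key names, the "queue:" prefix, and the smallest interval are just examples):

import redis.clients.jedis.Jedis;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.function.Consumer;

public class TimeBuckets {

    // Pattern 1: one counter key per interval. Every event increments all of
    // its intervals, so any read later touches exactly one key.
    static void countEvent(Jedis jedis, LocalDateTime ts) {
        jedis.incr(ts.format(DateTimeFormatter.ofPattern("yyyy")));               // "2016"
        jedis.incr(ts.format(DateTimeFormatter.ofPattern("yyyy-MM")));            // "2016-06"
        jedis.incr(ts.format(DateTimeFormatter.ofPattern("yyyy-MM-dd")));         // "2016-06-01"
        jedis.incr(ts.format(DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm"))); // smallest interval
    }

    static long countForDay(Jedis jedis, String day) {   // e.g. "2016-06-01"
        String value = jedis.get(day);
        return value == null ? 0L : Long.parseLong(value);
    }

    // Pattern 2: Redis as a time-based queue. Push raw events into a per-minute
    // list, then let the topology drain that list once the minute has lapsed.
    static void enqueue(Jedis jedis, LocalDateTime ts, String eventJson) {
        String minuteKey = "queue:" + ts.format(DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm"));
        jedis.rpush(minuteKey, eventJson);
    }

    static void drain(Jedis jedis, String minuteKey, Consumer<String> process) {
        String item;
        while ((item = jedis.lpop(minuteKey)) != null) {  // pop one value at a time
            process.accept(item);                         // look up what you need, keep the aggregate
        }
    }
}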

On May 30, 2016, at 1:17 PM, Daniela S <daniela_4...@gmx.at> wrote:
 
I have to pull the entries and add a specific value to every entry. This value is stored in another database, and therefore I would like to do the join, based on some conditions, in Storm. I need this value to build the sum, as the entries themselves do not contain any information for the sum.
 
What would count as very few keys?
 
Thank you and regards,
Daniela
 
Sent: Monday, 30 May 2016 at 20:11
From: "Yuri Kostine" <kost...@gmail.com>
To: user@storm.apache.org
Subject: Re: Pull from Redis
 
Do you pull the entries only to sum them up? Why not keep a running total in Redis in a minute-stamped key? Generally speaking, Redis is not great for pulling all keys unless there are very few of them.
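
As a sketch with the Jedis client (the "total:" key naming is just an example):

import redis.clients.jedis.Jedis;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class RunningTotal {

    private static final DateTimeFormatter MINUTE =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm");

    // Add each entry's value to the current minute's total as it is written,
    // so reading the sum for a minute is later a single GET instead of a scan of all keys.
    static void addEntry(Jedis jedis, long value) {
        jedis.incrBy("total:" + LocalDateTime.now().format(MINUTE), value);
    }

    static long totalForMinute(Jedis jedis, LocalDateTime minute) {
        String stored = jedis.get("total:" + minute.format(MINUTE));
        return stored == null ? 0L : Long.parseLong(stored);
    }
}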

On May 30, 2016, at 12:49 PM, Daniela S <daniela_4...@gmx.at> wrote:
 
Hi
 
I have a topology that stores entries in Redis. Now I would like to pull all entries from Redis every minute or as soon as a value has changed. How can I do that? Can I add another bolt to my topology for this task or do I have to use a spout or even a new topology? I would like to build a sum over all entries every minute. Do you have any advice for that?
 
Thank you in advance.
 
Regards,
Daniela
