Re: [jira] [Created] (FLINK-5544) Implement Internal Timer Service in RocksDB

2017-01-17 Thread SHI Xiaogang
Hi Florian,

The memory usage depends on the types of keys and namespaces.

We have not experienced that many concurrently open windows.
But given that each open window needs several bytes for its timer,  2M open
windows may cost up to hundreds of MB.

Regards
Xiaogang

2017-01-18 14:45 GMT+08:00 Florian König :

> Hi,
>
> that sounds very useful. We are using quite a lot of timers in custom
> windows. Does anybody have experience with the memory requirements of,
> let’s say, 2 million concurrently open windows and the associated timers?
>
> Thanks
> Florian
>
> > Am 18.01.2017 um 04:40 schrieb Xiaogang Shi (JIRA) :
> >
> > Xiaogang Shi created FLINK-5544:
> > ---
> >
> > Summary: Implement Internal Timer Service in RocksDB
> > Key: FLINK-5544
> > URL: https://issues.apache.org/jira/browse/FLINK-5544
> > Project: Flink
> >  Issue Type: Bug
> >  Components: Streaming
> >Reporter: Xiaogang Shi
> >
> >
> > Now the only implementation of internal timer service is
> HeapInternalTimerService which stores all timers in memory. In the cases
> where the number of keys is very large, the timer service will cost too
> much memory. A implementation which stores timers in RocksDB seems good to
> deal with these cases.
> >
> > It might be a little challenging to implement a RocksDB timer service
> because the timers are accessed in different ways. When timers are
> triggered, we need to access timers in the order of timestamp. But when
> performing checkpoints, we must have a method to obtain all timers of a
> given key group.
> >
> > A good implementation, as suggested by [~StephanEwen], follows the idea
> of merge sorting. We can store timers in RocksDB with the format
> {{KEY_GROUP#TIMER#KEY}}. In this way, the timers under a key group are put
> together and are sorted.
> >
> > Then we can deploy an in-memory heap which keeps the first timer of each
> key group to get the next timer to trigger. When a key group's first timer
> is updated, we can efficiently update the heap.
> >
> >
> >
> > --
> > This message was sent by Atlassian JIRA
> > (v6.3.4#6332)
>
>
>


Re: [jira] [Created] (FLINK-5544) Implement Internal Timer Service in RocksDB

2017-01-17 Thread Florian König
Hi,

that sounds very useful. We are using quite a lot of timers in custom windows. 
Does anybody have experience with the memory requirements of, let’s say, 2 
million concurrently open windows and the associated timers?

Thanks
Florian

> Am 18.01.2017 um 04:40 schrieb Xiaogang Shi (JIRA) :
> 
> Xiaogang Shi created FLINK-5544:
> ---
> 
> Summary: Implement Internal Timer Service in RocksDB
> Key: FLINK-5544
> URL: https://issues.apache.org/jira/browse/FLINK-5544
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Reporter: Xiaogang Shi
> 
> 
> Now the only implementation of internal timer service is 
> HeapInternalTimerService which stores all timers in memory. In the cases 
> where the number of keys is very large, the timer service will cost too much 
> memory. A implementation which stores timers in RocksDB seems good to deal 
> with these cases.
> 
> It might be a little challenging to implement a RocksDB timer service because 
> the timers are accessed in different ways. When timers are triggered, we need 
> to access timers in the order of timestamp. But when performing checkpoints, 
> we must have a method to obtain all timers of a given key group.
> 
> A good implementation, as suggested by [~StephanEwen], follows the idea of 
> merge sorting. We can store timers in RocksDB with the format 
> {{KEY_GROUP#TIMER#KEY}}. In this way, the timers under a key group are put 
> together and are sorted. 
> 
> Then we can deploy an in-memory heap which keeps the first timer of each key 
> group to get the next timer to trigger. When a key group's first timer is 
> updated, we can efficiently update the heap.
> 
> 
> 
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)




[jira] [Created] (FLINK-5544) Implement Internal Timer Service in RocksDB

2017-01-17 Thread Xiaogang Shi (JIRA)
Xiaogang Shi created FLINK-5544:
---

 Summary: Implement Internal Timer Service in RocksDB
 Key: FLINK-5544
 URL: https://issues.apache.org/jira/browse/FLINK-5544
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Reporter: Xiaogang Shi


Now the only implementation of internal timer service is 
HeapInternalTimerService which stores all timers in memory. In the cases where 
the number of keys is very large, the timer service will cost too much memory. 
A implementation which stores timers in RocksDB seems good to deal with these 
cases.

It might be a little challenging to implement a RocksDB timer service because 
the timers are accessed in different ways. When timers are triggered, we need 
to access timers in the order of timestamp. But when performing checkpoints, we 
must have a method to obtain all timers of a given key group.

A good implementation, as suggested by [~StephanEwen], follows the idea of 
merge sorting. We can store timers in RocksDB with the format 
{{KEY_GROUP#TIMER#KEY}}. In this way, the timers under a key group are put 
together and are sorted. 

Then we can deploy an in-memory heap which keeps the first timer of each key 
group to get the next timer to trigger. When a key group's first timer is 
updated, we can efficiently update the heap.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)