+1. I personally found it a little confusing when I discovered I had to
configure this after already choosing RocksDB as a backend. Also very
strongly in favour of "safe and scalable" as the default.

Best,

Aaron Levin

On Fri, Jan 17, 2020 at 4:41 AM Piotr Nowojski <pi...@ververica.com> wrote:

> +1 for making it consistent. When using X state backend, timers should be
> stored in X by default.
>
> Also I think any configuration option controlling that needs to be well
> documented in some performance tuning section of the docs.
>
> Piotrek
>
> On 17 Jan 2020, at 09:16, Congxian Qiu <qcx978132...@gmail.com> wrote:
>
> +1 to storing timers in RocksDB by default.
>
> Storing timers on the heap can cause OOM problems and make checkpoints
> much slower; storing timers in RocksDB gets rid of both.
>
> Best,
> Congxian
>
>
> On Fri, Jan 17, 2020 at 3:10 PM Biao Liu <mmyy1...@gmail.com> wrote:
>
>> +1
>>
>> I think that's how it should be. Timers should align with other regular
>> state.
>>
>> If a user wants better performance and memory is not a concern, the memory
>> or FS state backend might be considered. Or maybe we could optimize
>> performance by introducing a dedicated column family for timers, which
>> could have its own tuned options.
>>
>> Thanks,
>> Biao /'bɪ.aʊ/
>>
>>
>>
>> On Fri, 17 Jan 2020 at 10:11, Jingsong Li <jingsongl...@gmail.com> wrote:
>>
>>> Hi Stephan,
>>>
>>> Thanks for starting this discussion.
>>> +1 for storing timers in RocksDB by default.
>>> In the past, when Flink didn't store timers in RocksDB, I had a
>>> headache. I always adjusted parameters carefully to ensure that there was
>>> no risk of running out of memory.
>>>
>>> Just curious: how much does keeping timers on the heap versus in RocksDB
>>> affect performance?
>>> - If there is no order-of-magnitude difference between heap and RocksDB,
>>> there is no problem in using RocksDB.
>>> - If there is, maybe we should improve our documentation to let users
>>> know about this option. (It looks like a lot of users didn't know about it.)
>>>
>>> Best,
>>> Jingsong Lee
>>>
>>> On Fri, Jan 17, 2020 at 3:18 AM Yun Tang <myas...@live.com> wrote:
>>>
>>>> Hi Stephan,
>>>>
>>>> I am +1 for the change which stores timers in RocksDB by default.
>>>>
>>>> Some users want checkpoints to complete as fast as possible, which also
>>>> requires timers to be stored in RocksDB so that they do not affect the
>>>> synchronous part of the checkpoint.
>>>>
>>>> Best
>>>> Yun Tang
>>>> ------------------------------
>>>> *From:* Andrey Zagrebin <azagre...@apache.org>
>>>> *Sent:* Friday, January 17, 2020 0:07
>>>> *To:* Stephan Ewen <se...@apache.org>
>>>> *Cc:* dev <d...@flink.apache.org>; user <user@flink.apache.org>
>>>> *Subject:* Re: [DISCUSS] Change default for RocksDB timers: Java Heap
>>>> => in RocksDB
>>>>
>>>> Hi Stephan,
>>>>
>>>> Thanks for starting this discussion. I am +1 for this change.
>>>> In general, the number of timer state keys can be of the same order as
>>>> the number of main state keys.
>>>> So if RocksDB is used for main state for scalability, it makes sense to
>>>> have timers there as well,
>>>> unless timers are used for only a very limited subset of keys that fits
>>>> into memory.
>>>>
>>>> Best,
>>>> Andrey
>>>>
>>>> On Thu, Jan 16, 2020 at 4:27 PM Stephan Ewen <se...@apache.org> wrote:
>>>>
>>>> Hi all!
>>>>
>>>> I would suggest a change of the current default for timers. A bit of
>>>> background:
>>>>
>>>>   - Timers (for windows, process functions, etc.) are state that is
>>>> managed and checkpointed as well.
>>>>   - When using the MemoryStateBackend and the FsStateBackend, timers
>>>> are kept on the JVM heap, like regular state.
>>>>   - When using the RocksDBStateBackend, timers can be kept in RocksDB
>>>> (like other state) or on the JVM heap. The JVM heap is the default though!
>>>>
>>>> I find this a bit unintuitive and would propose changing this so that
>>>> the RocksDBStateBackend stores all state in RocksDB by default.
>>>> The rationale is that if there is a tradeoff (like here), safe and
>>>> scalable should be the default and unsafe performance should be an
>>>> explicit choice.
>>>>
>>>> This sentiment seems to be shared by various users as well, see
>>>> https://twitter.com/StephanEwen/status/1214590846168903680 and
>>>> https://twitter.com/StephanEwen/status/1214594273565388801
>>>> We would of course keep the switch and mention in the performance
>>>> tuning section that this is an option.
>>>>
>>>> # RocksDB State Backend Timers on Heap
>>>>   - Pro: faster
>>>>   - Con: not memory safe, GC overhead, longer synchronous checkpoint
>>>> time, no incremental checkpoints
>>>>
>>>> # RocksDB State Backend Timers in RocksDB
>>>>   - Pro: safe and scalable, asynchronously and
>>>> incrementally checkpointed
>>>>   - Con: performance overhead.
>>>>
>>>> Please chime in and let me know what you think.
>>>>
>>>> Best,
>>>> Stephan
>>>>
>>>>
>>>
>>> --
>>> Best, Jingsong Lee
>>>
>>
>
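
For reference, the switch discussed in this thread can be set in flink-conf.yaml. The fragment below is a minimal sketch, assuming the state.backend.rocksdb.timer-service.factory key with the values HEAP and ROCKSDB as documented for Flink's RocksDB state backend; the key and its default may differ between releases, so check the documentation for your Flink version.

```yaml
# Use RocksDB as the state backend.
state.backend: rocksdb

# Where the RocksDB backend keeps timers (assumed key; verify for your version):
#   ROCKSDB - safe and scalable, checkpointed asynchronously and incrementally
#   HEAP    - faster, but not memory safe, with a longer synchronous checkpoint phase
state.backend.rocksdb.timer-service.factory: ROCKSDB
```

Setting the value to HEAP instead makes the faster but unsafe behaviour an explicit choice, as the proposal suggests.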
