Re[2]: [DISCUSS] Page replacement improvement

Zhenya Stanilovsky Mon, 23 Nov 2020 00:12:38 -0800

Nikolay, I hope Ignite users have already observed such a case :)
I suggest the following: put more data than fits in the available memory, run
a full scan, then read only a subset that fits in memory in a loop, and check
the PageReplacement metric to see how many iterations it takes to bring the
metric to zero.
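
The procedure above can be tried out on a toy model before running it against
a real cluster. The following is only a sketch under stated assumptions: the
RandomLruMemory class, the page counts and the replacements counter (a
stand-in for the PageReplacement metric) are all made up for illustration and
are not Ignite code. The eviction rule is the one discussed in this thread:
look at 5 random pages and evict the least recently touched of them.

```python
import random

SAMPLE_SIZE = 5  # number of random candidates examined per eviction


class RandomLruMemory:
    """Toy page memory with a fixed number of slots and Random-LRU eviction."""

    def __init__(self, capacity, rng):
        self.capacity = capacity
        self.rng = rng
        self.slots = []        # slot index -> page id
        self.slot_of = {}      # resident page id -> slot index
        self.last_touch = {}   # resident page id -> logical time of last access
        self.clock = 0
        self.replacements = 0  # hypothetical stand-in for the PageReplacement metric

    def touch(self, page):
        self.clock += 1
        if page in self.slot_of:             # hit: only refresh the timestamp
            self.last_touch[page] = self.clock
            return
        if len(self.slots) < self.capacity:  # free slot: nothing to replace
            self.slot_of[page] = len(self.slots)
            self.slots.append(page)
        else:                                # miss on full memory: replace a page
            candidates = [self.rng.randrange(self.capacity)
                          for _ in range(SAMPLE_SIZE)]
            victim_slot = min(candidates,
                              key=lambda s: self.last_touch[self.slots[s]])
            victim = self.slots[victim_slot]
            del self.last_touch[victim]
            del self.slot_of[victim]
            self.slots[victim_slot] = page
            self.slot_of[page] = victim_slot
            self.replacements += 1
        self.last_touch[page] = self.clock


def iterations_until_quiet(total_pages=1000, capacity=200, hot_pages=100,
                           max_iterations=50, seed=42):
    """Return replacements per hot-set pass, stopping after the first quiet pass."""
    mem = RandomLruMemory(capacity, random.Random(seed))
    for page in range(total_pages):      # more data than fits, touched in one scan
        mem.touch(page)
    per_pass = []
    for _ in range(max_iterations):
        before = mem.replacements
        for page in range(hot_pages):    # read only a subset that fits in memory
            mem.touch(page)
        per_pass.append(mem.replacements - before)
        if per_pass[-1] == 0:
            break
    return per_pass


if __name__ == "__main__":
    print(iterations_until_quiet())
```

The length of the returned list is the number of iterations needed to bring
the counter to zero in this model; real numbers have to come from a real
cluster.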
 
>Hello, Alex.
>
>> Perhaps we can implement a special benchmark for this case with manually 
>> triggered "batch page replacement" using yardstick (but I'm not sure if 
>> ducktape can help here).
>
>I think we should carefully describe the issues with the current approach, and 
>then we can choose the right tool to benchmark improvements.
>You said:
>
>> we use Random-LRU algorithm … it has many disadvantages and affects 
>> performance very much when replacement is started
>
>Can you list disadvantages of the Random-LRU?
>
>AFAIU the first benchmark should be:
>
>1. Start a cluster with persistence and put more data into it than fits in RAM.
>2. Measure the performance of queries that select one entry from the cache.
>3. Run some queries that iterate through all the data - `SELECT SUM(x) FROM t` 
>or something similar.
>4. Repeat step 2. This time the performance of random queries should be lower 
>due to page replacement.
>
>Is this scenario correct?
>
>> On Nov 23, 2020, at 09:12, Alex Plehanov < plehanov.a...@gmail.com > 
>> wrote:
>>
>> Nikolay, Zhenya,
>>
>> The benchmark from IGNITE-13034 is fully synthetic: it performs uniformly
>> random puts. It can be used to compare different page replacement algorithms
>> (random-LRU vs segmented-LRU), but it is not applicable at all to
>> benchmarking batch page replacement.
>> Perhaps we can implement a special benchmark for this case with manually
>> triggered "batch page replacement" using yardstick (but I'm not sure
>> if ducktape can help here).
>>
>>> I understand the case you described, but who will pull the switch? A human,
>>> artificial intelligence?
>> As I described before, we can implement both manual and automatic "batch
>> page replacement" triggering. For automatic triggering, there is no
>> artificial intelligence needed, just several conditions with reasonable
>> thresholds. Automatic triggering also can be disabled by default.
>>
>> Fri, Nov 20, 2020 at 13:32, Zhenya Stanilovsky < arzamas...@mail.ru.invalid >:
>>
>>>
>>>> Zhenya,
>>>>
>>>>> Alexey, we already have changes that partially fix this issue [1]
>>>> IGNITE-13086 is a minor improvement. We still have major problems with
>>>> our page replacement algorithm (slow page selection and a non-optimal
>>>> page-fault rate). I think changing from 5 random pages to 7 will make
>>>> things even worse (it's better for the page-fault rate, but page
>>>> selection will be slower).
>>> All of the statements above need to be proven, I believe. +1 to Nikolay: we
>>> need a proper reproducer or some graphs from version 2.9.
>>>
>>>>
>>>>> This approach is still not applicable to real life
>>>> Why do you think batch replacement is not applicable to real life? It can
>>>> be applied to workloads where a large amount of data is used periodically,
>>>> but not very often. For example, when an OLAP request over historical data
>>>> loads pages into page-memory, and after that request the data is not needed
>>>> for a long time. Or when OLTP transactions mostly add new data and process
>>>> recent data but rarely touch historical data. In these cases, with the
>>>> current approach, we will enter "page replacement mode" after some period
>>>> of time and never leave it. With batch page replacement there is a chance
>>>> to prevent random-LRU page replacement or postpone it.
>>> I understand the case you described, but who will pull the switch? A human,
>>> artificial intelligence?
>>> Your approach assumes some kind of internal triggering, and I don't like it.
>>>
>>>>
>>>>> But I ask once more: do you really observe such problems with version 2.9?
>>>>> Any graphs, maybe?
>>>> I don't have production usage feedback after IGNITE-13086, but I doubt
>>>> anything changed significantly.
>>>
>>> Let's wait? :) In any case (Nikolay, Alex), IGNITE-13086 includes a yardstick
>>> benchmark that was used to validate the PR; we can use it once more.
>>>
>>> Thanks !
>>>>
>>>>
>>>> Thu, Nov 19, 2020 at 09:18, Zhenya Stanilovsky < arzamas...@mail.ru.invalid >:
>>>>
>>>>>
>>>>> Alexey, we already have changes that partially fix this issue [1]
>>>>> The easy way:
>>>>> It looks like page replacement already converges.
>>>>> If we change the number of pages touched by the random-LRU algorithm from
>>>>> 5 to, for example, 7, we will get a quick improvement right away.
>>>>>
>>>>> » Batch page replacement
>>>>> This approach is still not applicable to real life if you don't want to see
>>>>> unhappy people for the whole threshold (e.g. 12 h) interval. And, of course,
>>>>> you understand that dramatically reducing this interval gives nothing?
>>>>>
>>>>> » Change the page replacement algorithm.
>>>>> That's the way I vote for :) But I ask once more: do you really observe
>>>>> such problems with version 2.9? Any graphs, maybe?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> [1]  https://issues.apache.org/jira/browse/IGNITE-13086
>>>>>> Hello, Igniters!
>>>>>>
>>>>>> Currently, for page replacement (page rotation between page-memory and
>>>>>> disk) we use the Random-LRU algorithm. It has a low maintenance cost and
>>>>>> a relatively simple implementation, but it has many disadvantages and
>>>>>> affects performance very much once replacement starts. We even have
>>>>>> warnings in the log when page replacement starts and a special event for
>>>>>> this. I know Ignite deployments where administrators are forced to
>>>>>> restart cluster nodes periodically to avoid page replacement.
>>>>>>
>>>>>> I have a couple of proposals to improve page replacement in Ignite:
>>>>>>
>>>>>> *Batch page replacement.*
>>>>>>
>>>>>> Main idea: in some cases, start a background task to evict cold pages from
>>>>>> page-memory (for example, pages last touched more than 12 hours ago).
>>>>>>
>>>>>> The task can be started:
>>>>>> - Automatically, triggered by some events, for example, when we expect
>>>>>> Random-LRU page replacement to start soon (more than 90% of page-memory
>>>>>> is allocated) + we have a sufficient number of cold pages (we need some
>>>>>> metric to calculate the number of cold pages) + some time has passed
>>>>>> since the last batch page replacement (to avoid too much resource
>>>>>> consumption by background batch replacement).
>>>>>> - Manually (JMX or control.sh), if an administrator wants to control the
>>>>>> time of batch replacement more precisely (for example, to avoid starting
>>>>>> this task during peak time).
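
The automatic trigger conditions listed above could be combined roughly as in
the sketch below. This is only an illustration under assumptions: the function
name, its parameters and the threshold class are hypothetical, and only the
90% allocation figure comes from the proposal itself. The cold-page and
cool-down thresholds are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchReplacementThresholds:
    # 0.90 mirrors the "more than 90% of page-memory" example from the proposal;
    # the other two values are placeholders chosen only for illustration.
    min_allocated_fraction: float = 0.90
    min_cold_fraction: float = 0.10
    min_seconds_between_runs: float = 3600.0


def should_start_batch_replacement(allocated_pages, total_pages, cold_pages,
                                   seconds_since_last_run,
                                   thresholds=BatchReplacementThresholds(),
                                   auto_trigger_enabled=True):
    """Return True only when every automatic-trigger condition holds."""
    if not auto_trigger_enabled or total_pages <= 0:
        return False
    # Random-LRU replacement is expected to start soon.
    nearly_full = allocated_pages / total_pages > thresholds.min_allocated_fraction
    # There is enough cold data to make a background eviction pass worthwhile.
    enough_cold = cold_pages / total_pages >= thresholds.min_cold_fraction
    # Enough time has passed since the previous batch run.
    rested = seconds_since_last_run >= thresholds.min_seconds_between_runs
    return nearly_full and enough_cold and rested
```

For example, should_start_batch_replacement(950, 1000, 200, 7200) is true with
the placeholder thresholds, while lowering allocated_pages to 800 makes it
false.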
>>>>>>
>>>>>> Batch page replacement will be helpful in some workloads (when some data
>>>>>> is much colder than other data). It can prevent Random-LRU page
>>>>>> replacement from starting, or, if Random-LRU has already started, it can
>>>>>> provide the conditions to stop it.
>>>>>>
>>>>>> *Change the page replacement algorithm.*
>>>>>>
>>>>>> A good page replacement algorithm should satisfy the following requirements:
>>>>>> - low page-fault rates for a typical workload
>>>>>> - low maintenance cost (low resource consumption to maintain additional
>>>>>> structures required for page replacement)
>>>>>> - fast search for the next page to replace
>>>>>> - sequential scan resistance (one sequential scan should not evict all
>>>>>> relatively hot pages from page-memory)
>>>>>>
>>>>>> Our Random-LRU has a low maintenance cost and is sequential scan
>>>>>> resistant, but to find the next page for replacement we scan 5 pages in
>>>>>> the best case, and in the worst case we can scan the whole data region
>>>>>> segment. Also, due to its random nature, it's not very effective at
>>>>>> predicting the right page for replacement to minimize the page-fault
>>>>>> rate. And it takes a long time to completely evict old cold data.
>>>>>>
>>>>>> Usually, database management systems and operating systems use
>>>>>> modifications of LRU algorithms. These algorithms have higher maintenance
>>>>>> costs (the page list must be modified on each page access), but they are
>>>>>> often effective from a "page-fault rate" point of view and have O(1)
>>>>>> complexity for finding a page to replace. Simple LRU is not sequential
>>>>>> scan resistant, but modifications that take page access frequency into
>>>>>> account are resistant to sequential scans.
>>>>>>
>>>>>> We can try one of the modifications of LRU as well (for example,
>>>>>> "segmented LRU" seems suitable for Ignite).
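
To make the properties above concrete, here is a minimal sketch of a textbook
segmented LRU. It is not Ignite's implementation and not a proposed design:
the SegmentedLru class, the segment sizes and the page names are assumptions
made for illustration. New pages enter a probationary segment, a second
access promotes a page to a protected segment, and the victim is always the
oldest probationary page, so victim selection is O(1).

```python
from collections import OrderedDict


class SegmentedLru:
    """Toy segmented LRU with a probationary and a protected segment."""

    def __init__(self, probation_size, protected_size):
        self.probation_size = probation_size
        self.protected_size = protected_size
        self.probation = OrderedDict()  # page id -> True, oldest first
        self.protected = OrderedDict()  # page id -> True, oldest first

    def __contains__(self, page):
        return page in self.probation or page in self.protected

    def access(self, page):
        """Touch a page and return the evicted page id, or None."""
        if page in self.protected:
            # Maintenance cost: the ordering is updated on every access.
            self.protected.move_to_end(page)
            return None
        if page in self.probation:
            # Second access: promote to the protected segment.
            del self.probation[page]
            self.protected[page] = True
            if len(self.protected) > self.protected_size:
                # Demote the coldest protected page instead of dropping it.
                demoted, _ = self.protected.popitem(last=False)
                self.probation[demoted] = True
            return None
        # First access: the page starts in the probationary segment.
        self.probation[page] = True
        if len(self.probation) > self.probation_size:
            victim, _ = self.probation.popitem(last=False)  # O(1) selection
            return victim
        return None


if __name__ == "__main__":
    cache = SegmentedLru(probation_size=4, protected_size=4)
    hot = ["h1", "h2", "h3"]
    for _ in range(2):                 # two accesses promote the hot pages
        for page in hot:
            cache.access(page)
    for i in range(100):               # one long sequential scan
        cache.access(f"scan{i}")
    print(all(page in cache for page in hot))
```

In this model a single sequential scan only churns the probationary segment,
which is the scan resistance mentioned above. The price is that every access
updates an ordered structure.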
>>>>>>
>>>>>> Ignite is a memory-centric product, so "low maintenance cost" is
>>>>>> critical. And there is a risk that a page replacement algorithm can affect
>>>>>> workloads where page replacement is not used (enough RAM to store all
>>>>>> data). Of course, any page replacement solution should be carefully
>>>>>> benchmarked.
>>>>>>
>>>>>>
>>>>>> Igniters, WDYT? If any of these proposals looks reasonable to you, I will
>>>>>> create an IEP and start implementation.
>>>>>>
>>>>>> Also, I have a draft implementation of a system view to determine how hot
>>>>>> the pages in page-memory are [1]. I think it will be useful for any of
>>>>>> these approaches (and even if we decide to leave page replacement as is).
>>>>>>
>>>>>> [1]:  https://issues.apache.org/jira/browse/IGNITE-13726
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>>
>