Re: Page replacement policy improvements (when persistent is enabled)
Hi Igniters,

I feel that monitoring will give us the knowledge we need to decide what to change.

I have no idea how region separation can help with performance. It could, however, be an additional challenge for users to configure separate regions correctly.

Sincerely,
Dmitriy Pavlov

On Thu, Aug 16, 2018 at 12:45, Dmitriy Setrakyan wrote:

> Aerospike keeps index pages in memory, so there is at least one database that does that. I am sure that if we research around, there will be more.
>
> I remember one test case with a potential user where we had to change our eviction algorithm to avoid evicting index pages, and because of that we improved performance by about 10x.
>
> I agree. The difference is that we need a solution in the meantime. I am suggesting a very straightforward approach that can be added fairly quickly and will solve the majority of performance problems associated with index page eviction. Once we have that, we can take our time and investigate further.
Re: Page replacement policy improvements (when persistent is enabled)
On Thu, Aug 16, 2018 at 2:27 AM, Vladimir Ozerov wrote:

> Dima,
>
> No database I know of uses separate regions for index pages, for the reason I expressed above. Instead, they split all pages into two groups, hot and cold, with certain rules on how to move pages inside and between these groups. None of these algorithms is especially complex. In fact, they are pretty straightforward and battle-tested. When implemented properly, it doesn't matter whether a page is an index page or a data page. The only thing that matters is how often it is accessed. This is the critical piece that we lack in the product: our policy is called "random *LRU*", while in reality it is not LRU at all.

Aerospike keeps index pages in memory, so there is at least one database that does that. I am sure that if we research around, there will be more.

> As for index page replacement, we do not know whether it is a problem at all. We heard some complaints that it might be a problem, but we didn't see any proof (thanks to the lack of monitoring), and even if it is a problem, we do not understand how severe it is. Maybe it adds 1% overhead and can be ignored for years; maybe it adds 1000% overhead and must be fixed immediately.

I remember one test case with a potential user where we had to change our eviction algorithm to avoid evicting index pages, and because of that we improved performance by about 10x.

> This is a sensitive piece of the product. Let's use objective data, not assumptions.

I agree. The difference is that we need a solution in the meantime. I am suggesting a very straightforward approach that can be added fairly quickly and will solve the majority of performance problems associated with index page eviction. Once we have that, we can take our time and investigate further.
Re: Page replacement policy improvements (when persistent is enabled)
Dima,

No database I know of uses separate regions for index pages, for the reason I expressed above. Instead, they split all pages into two groups, hot and cold, with certain rules on how to move pages inside and between these groups. None of these algorithms is especially complex. In fact, they are pretty straightforward and battle-tested. When implemented properly, it doesn't matter whether a page is an index page or a data page. The only thing that matters is how often it is accessed. This is the critical piece that we lack in the product: our policy is called "random *LRU*", while in reality it is not LRU at all.

As for index page replacement, we do not know whether it is a problem at all. We heard some complaints that it might be a problem, but we didn't see any proof (thanks to the lack of monitoring), and even if it is a problem, we do not understand how severe it is. Maybe it adds 1% overhead and can be ignored for years; maybe it adds 1000% overhead and must be fixed immediately.

This is a sensitive piece of the product. Let's use objective data, not assumptions.

On Thu, Aug 16, 2018 at 12:08 PM Dmitriy Setrakyan wrote:

> Never seen a use case where the data page was more important than the index page. I think we are getting into hypothetical land. Most Ignite users are having the reverse problem: index pages get evicted in the same way as data pages.
>
> Currently, we are solving it in a most awkward way by trying to give index pages a different eviction policy. The right solution would be to stick them into a different region and control the eviction policy for the index region separately from the data region.
>
> D.
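To make the hot/cold split from Vladimir's message concrete, here is a minimal sketch in plain Java. It is not Ignite code and not the algorithm of any particular database: the class `HotColdPageCache`, its two capacities, and the promote-on-second-access rule are all assumptions made for illustration, roughly in the spirit of segmented LRU. Note that page type is never consulted; only how often a page is accessed decides which group it lives in.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical two-group page cache: pages start cold, a repeated access promotes them to hot. */
class HotColdPageCache {
    private final int hotCapacity;
    private final int coldCapacity;
    private final Deque<Long> hot = new ArrayDeque<>();   // head = most recently used hot page
    private final Deque<Long> cold = new ArrayDeque<>();  // head = most recently inserted cold page

    HotColdPageCache(int hotCapacity, int coldCapacity) {
        this.hotCapacity = hotCapacity;
        this.coldCapacity = coldCapacity;
    }

    /** Records an access and returns the evicted page id, or null if nothing was evicted. */
    Long access(long pageId) {
        Long id = pageId;

        if (hot.remove(id)) {              // already hot: just refresh its recency
            hot.addFirst(id);
            return null;
        }

        if (cold.remove(id)) {             // accessed again while cold: promote to hot
            hot.addFirst(id);
            if (hot.size() > hotCapacity)
                return insertCold(hot.removeLast());  // demote the least recently used hot page
            return null;
        }

        return insertCold(id);             // first access: the page starts in the cold group
    }

    /** Adds a page to the cold group and evicts the oldest cold page on overflow. */
    private Long insertCold(Long id) {
        cold.addFirst(id);
        return cold.size() > coldCapacity ? cold.removeLast() : null;
    }

    boolean contains(long pageId) {
        Long id = pageId;
        return hot.contains(id) || cold.contains(id);
    }
}
```

In this sketch a one-pass scan only churns the cold group, so pages that were promoted earlier stay resident. That is the scan resistance mentioned elsewhere in this thread, and it falls out of the grouping rule rather than from knowing whether a page is an index page.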
Re: Page replacement policy improvements (when persistent is enabled)
On Thu, Aug 16, 2018 at 2:01 AM, Vladimir Ozerov wrote:

> Hi Dima,
>
> Putting index pages in a separate region is the wrong approach, because data pages may be equally important on certain workloads, especially in heap-organized databases such as Ignite.

Never seen a use case where the data page was more important than the index page. I think we are getting into hypothetical land. Most Ignite users are having the reverse problem: index pages get evicted in the same way as data pages.

Currently, we are solving it in a most awkward way by trying to give index pages a different eviction policy. The right solution would be to stick them into a different region and control the eviction policy for the index region separately from the data region.

D.
Re: Page replacement policy improvements (when persistent is enabled)
Hi Dima,

Putting index pages in a separate region is the wrong approach, because data pages may be equally important on certain workloads, especially in heap-organized databases such as Ignite. At the moment we'd better focus on monitoring to better understand usage patterns. This would give us solid ground for further decisions.

Vladimir.

On Sat, Aug 4, 2018 at 12:06 AM Dmitriy Setrakyan wrote:

> Vladimir,
>
> Are we only counting the timestamp of the last access? In that case, it would create a problem. We should also count the number of times a page has been touched within a certain time frame, e.g. the last hour or so. In this case, index pages would not be evicted, as they get touched the most.
>
> I would also consider putting index pages into a separate memory region. This way you can apply a different eviction policy to the index pages or decide not to evict them at all. This will also be a much simpler and less error-prone approach than introducing new eviction policies.
>
> D.
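Dmitriy's suggestion of counting touches within a time frame, rather than looking only at the last access timestamp, could look roughly like the sketch below. Everything here is hypothetical: `WindowedTouchCounter`, the window length, and the rule of evicting the candidate with the fewest recent touches are illustration only, and keeping one timestamp per touch is far too expensive for a real page cache (a compact per-page counter would be needed instead).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-page touch counter over a sliding time window. */
class WindowedTouchCounter {
    private final long windowMillis;
    private final Map<Long, Deque<Long>> touches = new HashMap<>();

    WindowedTouchCounter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Records one touch of the page at the given time. */
    void touch(long pageId, long nowMillis) {
        touches.computeIfAbsent(pageId, id -> new ArrayDeque<>()).addLast(nowMillis);
    }

    /** Returns how many touches of the page fall inside the window ending at nowMillis. */
    int touchesInWindow(long pageId, long nowMillis) {
        Deque<Long> ts = touches.get(pageId);
        if (ts == null)
            return 0;

        // Drop timestamps that have fallen out of the window.
        while (!ts.isEmpty() && ts.peekFirst() <= nowMillis - windowMillis)
            ts.removeFirst();

        return ts.size();
    }

    /** Picks the candidate with the fewest touches in the window; ties go to the earliest candidate. */
    long pickVictim(long[] candidates, long nowMillis) {
        long victim = candidates[0];
        int min = Integer.MAX_VALUE;

        for (long id : candidates) {
            int n = touchesInWindow(id, nowMillis);
            if (n < min) {
                min = n;
                victim = id;
            }
        }

        return victim;
    }
}
```

With a pure last-access timestamp, a page touched three times early in the window loses to a page touched once just now. With the windowed count it wins, which is the behavior argued for above regarding frequently touched index pages.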
Re: Page replacement policy improvements (when persistent is enabled)
Vladimir,

Are we only counting the timestamp of the last access? In that case, it would create a problem. We should also count the number of times a page has been touched within a certain time frame, e.g. the last hour or so. In this case, index pages would not be evicted, as they get touched the most.

I would also consider putting index pages into a separate memory region. This way you can apply a different eviction policy to the index pages or decide not to evict them at all. This will also be a much simpler and less error-prone approach than introducing new eviction policies.

D.

On Fri, Aug 3, 2018 at 12:19 AM, Vladimir Ozerov wrote:

> Igniters,
>
> I heard some complaints about our page replacement algorithm: index pages could be evicted from memory too often. I reviewed our current implementation, and it looks like we have chosen a very simple approach that evicts random pages without taking into account their nature (data vs. index) or typical usage patterns (such as scans).
>
> With our heap-based architecture, a typical SQL query is executed as follows:
> 1) Read non-leaf index pages, then in a loop:
> 2.1) Read 1 leaf index page
> 2.2) Read several hundred data pages
>
> This way index pages on average have a smaller timestamp than data pages and have a good probability of being evicted.
>
> Another major problem is scan resistance, which doesn't seem to be covered at all.
>
> My question is: what was the reason for choosing the random-pseudo-LRU algorithm instead of a commonly used variation of *real* LRU (such as LRU-K, 2Q, etc.)? Did we perform any evaluation of its effectiveness?
>
> I am thinking of creating a new IEP to evaluate and possibly improve our page replacement as follows:
> 1) Implement metrics to count page cache hits/misses by page type [1]
> 2) Implement a *heat map* which can optionally be enabled to track page hits/misses per page or per specific object (cache, index)
> 3) Run the heat map on typical workloads (lookups, scans, joins, etc.) to get a baseline
> 4) Prototype several LRU-based implementations and see if they give any benefit. It makes sense to start with minor improvements to the current algorithm (e.g. favor index pages over data pages, play with sample size, replace timestamps with read counters, etc.).
>
> In any case, the first two action items would be a good addition to product monitoring.
>
> What do you think?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-8580
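The first action item in the quoted proposal, counting page cache hits and misses by page type, can be sketched in a few lines. This is only an illustration and not what IGNITE-8580 implements: the class `PageCacheMetrics`, the two-value `PageType` enum, and the `onAccess` hook are assumed names, and a real page memory has more page types than index and data.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical hit/miss counters kept separately for each page type. */
class PageCacheMetrics {
    /** Simplified page types, for illustration only. */
    enum PageType { INDEX, DATA }

    private final Map<PageType, LongAdder> hits = new EnumMap<>(PageType.class);
    private final Map<PageType, LongAdder> misses = new EnumMap<>(PageType.class);

    PageCacheMetrics() {
        // Maps are filled once here and only read afterwards; LongAdder handles concurrent updates.
        for (PageType t : PageType.values()) {
            hits.put(t, new LongAdder());
            misses.put(t, new LongAdder());
        }
    }

    /** Called on every page access; inMemory is false when the page had to be read from disk. */
    void onAccess(PageType type, boolean inMemory) {
        (inMemory ? hits : misses).get(type).increment();
    }

    /** Returns hits / (hits + misses) for the page type, or 0.0 if it was never accessed. */
    double hitRatio(PageType type) {
        long h = hits.get(type).sum();
        long m = misses.get(type).sum();
        return h + m == 0 ? 0.0 : (double) h / (h + m);
    }
}
```

Comparing `hitRatio(PageType.INDEX)` with `hitRatio(PageType.DATA)` on a real workload is the kind of objective data the thread keeps asking for: it would show whether index pages are in fact evicted more often than they should be.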
Re: Page replacement policy improvements (when persistent is enabled)
Hi Vladimir,

I really feel that the page replacement approach can be improved.

Currently I don't think that page nature will give us much, because usage frequency can be independent of page type.

I also noticed that a couple of tickets more or less related to page replacement improvements were done by Ilya Kasnacheev and Eugeniy Stanilovskly. I hope the guys will step in.

Could we consider somehow taking the index page level in the B+ Tree into account? This could be helpful. The tree root should never be replaced.

I totally agree that some metrics to monitor and understand how page replacement works in the wild would benefit us a lot.

Sincerely,
Dmitriy Pavlov

On Fri, Aug 3, 2018 at 10:19, Vladimir Ozerov wrote:

> Igniters,
>
> I heard some complaints about our page replacement algorithm: index pages could be evicted from memory too often. I reviewed our current implementation, and it looks like we have chosen a very simple approach that evicts random pages without taking into account their nature (data vs. index) or typical usage patterns (such as scans).
>
> With our heap-based architecture, a typical SQL query is executed as follows:
> 1) Read non-leaf index pages, then in a loop:
> 2.1) Read 1 leaf index page
> 2.2) Read several hundred data pages
>
> This way index pages on average have a smaller timestamp than data pages and have a good probability of being evicted.
>
> Another major problem is scan resistance, which doesn't seem to be covered at all.
>
> My question is: what was the reason for choosing the random-pseudo-LRU algorithm instead of a commonly used variation of *real* LRU (such as LRU-K, 2Q, etc.)? Did we perform any evaluation of its effectiveness?
>
> I am thinking of creating a new IEP to evaluate and possibly improve our page replacement as follows:
> 1) Implement metrics to count page cache hits/misses by page type [1]
> 2) Implement a *heat map* which can optionally be enabled to track page hits/misses per page or per specific object (cache, index)
> 3) Run the heat map on typical workloads (lookups, scans, joins, etc.) to get a baseline
> 4) Prototype several LRU-based implementations and see if they give any benefit. It makes sense to start with minor improvements to the current algorithm (e.g. favor index pages over data pages, play with sample size, replace timestamps with read counters, etc.).
>
> In any case, the first two action items would be a good addition to product monitoring.
>
> What do you think?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-8580
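One possible reading of the B+ Tree level idea from the reply above: when a sample of eviction candidates is examined, never pick a tree root, and among the rest prefer pages at the lowest tree level, breaking ties by the oldest access timestamp. The sketch below assumes exactly that; `LevelAwareVictimChooser`, the `Candidate` descriptor, and its fields are hypothetical and do not mirror Ignite's page memory structures.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical victim selection that takes the B+ tree level of a page into account. */
class LevelAwareVictimChooser {
    /** Illustrative page descriptor; the fields are assumptions made for this sketch. */
    static final class Candidate {
        final long pageId;
        final int treeLevel;    // 0 = leaf or data page, larger = closer to the root
        final boolean root;     // true for a tree root, which must stay in memory
        final long lastAccess;  // last access timestamp, smaller = older

        Candidate(long pageId, int treeLevel, boolean root, long lastAccess) {
            this.pageId = pageId;
            this.treeLevel = treeLevel;
            this.root = root;
            this.lastAccess = lastAccess;
        }
    }

    /** Returns the page to evict from the sample, or empty if the sample holds only root pages. */
    static Optional<Candidate> choose(List<Candidate> sample) {
        return sample.stream()
            .filter(c -> !c.root)                                          // the root is never replaced
            .min(Comparator.comparingInt((Candidate c) -> c.treeLevel)     // lowest level first
                .thenComparingLong(c -> c.lastAccess));                    // then the oldest access
    }
}
```

Whether strict level ordering is a good policy is an open question. A rarely used inner page would outlive a hot leaf under this rule, so it would need to be validated against the hit/miss metrics discussed in this thread before being adopted.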