Pinning is even worse, because you lose control over how data is moved within a single region. Instead, I would suggest using partition warmup plus a separate data region to achieve "pinning" semantics.
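A minimal sketch of that suggestion, assuming Ignite with native persistence. The region/cache names and sizes are illustrative, and preloadPartition(partId) is the method proposed later in this thread, not yet a released API:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class HotRegionWarmup {
    public static void main(String[] args) {
        // Dedicated persistent region for the "pinned" data, sized so that
        // the whole cache fits in RAM and its pages are never replaced.
        DataRegionConfiguration hotRegion = new DataRegionConfiguration()
            .setName("hotRegion")
            .setPersistenceEnabled(true)
            .setMaxSize(4L * 1024 * 1024 * 1024); // 4 GB, illustrative

        IgniteConfiguration cfg = new IgniteConfiguration()
            .setDataStorageConfiguration(new DataStorageConfiguration()
                .setDataRegionConfigurations(hotRegion));

        try (Ignite ignite = Ignition.start(cfg)) {
            ignite.cluster().active(true); // persistence requires activation

            IgniteCache<Integer, String> cache = ignite.getOrCreateCache(
                new CacheConfiguration<Integer, String>("hotCache")
                    .setDataRegionName("hotRegion"));

            // "Pinning" semantics after a restart: preload every partition
            // of the cache into its dedicated region.
            int parts = ignite.affinity("hotCache").partitions();
            for (int p = 0; p < parts; p++)
                cache.preloadPartition(p);
        }
    }
}
```

Since the region holds only this cache, the preload cannot evict anyone else's hot pages, which is the advantage over pinning inside a shared region.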
On Wed, Sep 19, 2018 at 8:34 AM Zhenya Stanilovsky <arzamas...@mail.ru.invalid> wrote:

> hi, but how to deal with page replacements, which Dmitriy Pavlov mentioned?
> This approach would be efficient if all data fits into memory. Maybe it
> would be better to have a method to pin some critical caches?
>
> Wed, Sep 19, 2018, 0:26 +03:00, Dmitriy Pavlov <dpavlov....@gmail.com>:
>
>> Even better: if RAM is exhausted, the page replacement process will be
>> started.
>> https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Durable+Memory+-+under+the+hood#IgniteDurableMemory-underthehood-Pagereplacement(rotationwithdisk)
>>
>> The effect of the preloading will still be noticeable, but not as
>> pronounced as with data fully fitting into RAM. Later I can review or
>> improve the javadocs if necessary.
>>
>> Wed, Sep 19, 2018 at 0:18, Denis Magda <dma...@apache.org>:
>>
>>> Agree, it's just a matter of documentation. If a user stores 100% of the
>>> data both in RAM and on disk, and just wants to warm RAM up after a
>>> restart, then he knows everything will fit there. If during the
>>> preloading we detect that RAM is exhausted, we can halt it and print out
>>> a warning.
>>>
>>> --
>>> Denis
>>>
>>> On Tue, Sep 18, 2018 at 2:10 PM Dmitriy Pavlov <dpavlov....@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I totally support the idea of cache preload.
>>>>
>>>> IMO it can be expanded. We can iterate over the local partitions of the
>>>> cache group and preload each one.
>>>>
>>>> But the methods should be clearly documented, so that a user can be
>>>> aware of when they are beneficial (e.g. when the RAM region is big
>>>> enough, etc.).
>>>>
>>>> Sincerely,
>>>> Dmitriy Pavlov
>>>>
>>>> Tue, Sep 18, 2018 at 21:36, Denis Magda <dma...@apache.org>:
>>>>
>>>>> Folks,
>>>>>
>>>>> Since we're adding a method that would preload a certain partition,
>>>>> can we add one which will preload the whole cache?
>>>>> Ignite persistence users I've been working with look puzzled once
>>>>> they realize there is no way to warm up RAM after a restart. There are
>>>>> use cases that require this.
>>>>>
>>>>> Can the current optimizations be expanded to the cache preloading use
>>>>> case?
>>>>>
>>>>> --
>>>>> Denis
>>>>>
>>>>> On Tue, Sep 18, 2018 at 3:58 AM Alexei Scherbakov
>>>>> <alexey.scherbak...@gmail.com> wrote:
>>>>>
>>>>>> Summing up, I suggest adding a new public method
>>>>>> IgniteCache.preloadPartition(partId).
>>>>>>
>>>>>> I will start preparing a PR for IGNITE-8873
>>>>>> <https://issues.apache.org/jira/browse/IGNITE-8873> if no more
>>>>>> objections follow.
>>>>>>
>>>>>> Tue, Sep 18, 2018 at 10:50, Alexey Goncharuk <alexey.goncha...@gmail.com>:
>>>>>>
>>>>>>> Dmitriy,
>>>>>>>
>>>>>>> In my understanding, the proper fix for the scan query looks like a
>>>>>>> big change, and it is unlikely that we include it in Ignite 2.7. On
>>>>>>> the other hand, the method suggested by Alexei is quite simple and
>>>>>>> definitely fits Ignite 2.7, which will provide a better user
>>>>>>> experience. Even with a proper scan query implemented, this method
>>>>>>> can still be useful in some specific scenarios, so we will not have
>>>>>>> to deprecate it.
>>>>>>>
>>>>>>> --AG
>>>>>>>
>>>>>>> Mon, Sep 17, 2018 at 19:15, Dmitriy Pavlov <dpavlov....@gmail.com>:
>>>>>>>
>>>>>>>> As I understand, it is not a hack; it is an advanced feature for
>>>>>>>> warming up a partition. We can build a warm-up of the overall
>>>>>>>> cache by calling its partitions' warm-up. Users often ask about
>>>>>>>> this feature and are not comfortable with our lazy loading.
>>>>>>>>
>>>>>>>> Please correct me if I misunderstood the idea.
>>>>>>>>
>>>>>>>> Mon, Sep 17, 2018 at 18:37, Dmitriy Setrakyan <dsetrak...@apache.org>:
>>>>>>>>
>>>>>>>>> I would rather fix the scan than hack the scan. Is there any
>>>>>>>>> technical reason for hacking it now instead of fixing it
>>>>>>>>> properly? Can some of the experts in this thread provide an
>>>>>>>>> estimate of the complexity and the difference in work that would
>>>>>>>>> be required for each approach?
>>>>>>>>>
>>>>>>>>> D.
>>>>>>>>>
>>>>>>>>> On Mon, Sep 17, 2018 at 4:42 PM Alexey Goncharuk
>>>>>>>>> <alexey.goncha...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I think it would be beneficial for some Ignite users if we added
>>>>>>>>>> such a partition warmup method to the public API. The method
>>>>>>>>>> should be well documented and state that it may invalidate the
>>>>>>>>>> existing page cache. It will be a very effective instrument
>>>>>>>>>> until we add the proper scan ability that Vladimir was referring
>>>>>>>>>> to.
>>>>>>>>>>
>>>>>>>>>> Mon, Sep 17, 2018 at 13:05, Maxim Muzafarov <maxmu...@gmail.com>:
>>>>>>>>>>
>>>>>>>>>>> Folks,
>>>>>>>>>>>
>>>>>>>>>>> Such a warmup can be an effective technique for performing
>>>>>>>>>>> calculations that require large cache data reads, but I think
>>>>>>>>>>> it's a single narrow use case among all the ways the Ignite
>>>>>>>>>>> store is used. Like all other powerful techniques, we should
>>>>>>>>>>> use it wisely.
>>>>>>>>>>> In the general case, I think we should consider the other
>>>>>>>>>>> techniques mentioned by Vladimir, and perhaps create something
>>>>>>>>>>> like `global statistics of cache data usage` to choose the best
>>>>>>>>>>> technique in each case.
>>>>>>>>>>>
>>>>>>>>>>> For instance, it's not obvious what would take longer: one
>>>>>>>>>>> multi-block read or 50 single-block reads issued sequentially.
>>>>>>>>>>> It strongly depends on the hardware under the hood and might
>>>>>>>>>>> also depend on workload system resources (CPU-intensive
>>>>>>>>>>> calculations and I/O access). But such `statistics` would help
>>>>>>>>>>> us choose the right way.
>>>>>>>>>>>
>>>>>>>>>>> On Sun, 16 Sep 2018 at 23:59 Dmitriy Pavlov <dpavlov....@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Alexei,
>>>>>>>>>>>>
>>>>>>>>>>>> I did not find any PRs associated with the ticket, to check
>>>>>>>>>>>> the code changes behind this idea. Are there any PRs?
>>>>>>>>>>>>
>>>>>>>>>>>> If we create some forward scan of pages, it should be a very
>>>>>>>>>>>> intelligent algorithm taking a lot of parameters into account
>>>>>>>>>>>> (how much RAM is free, how probable it is that we will need
>>>>>>>>>>>> the next page, etc.). We had a private talk about such an idea
>>>>>>>>>>>> some time ago.
>>>>>>>>>>>>
>>>>>>>>>>>> In my experience, Linux systems already do such forward
>>>>>>>>>>>> reading of file data (for file descriptors flagged for
>>>>>>>>>>>> sequential access), but some prefetching of data at the
>>>>>>>>>>>> application level may be useful for O_DIRECT file descriptors.
>>>>>>>>>>>>
>>>>>>>>>>>> One more concern from me is about selecting the right place in
>>>>>>>>>>>> the system to do such prefetching.
>>>>>>>>>>>>
>>>>>>>>>>>> Sincerely,
>>>>>>>>>>>> Dmitriy Pavlov
>>>>>>>>>>>>
>>>>>>>>>>>> Sun, Sep 16, 2018 at 19:54, Vladimir Ozerov <voze...@gridgain.com>:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Alex,
>>>>>>>>>>>>>
>>>>>>>>>>>>> It is good that you observed a speedup. But I do not think
>>>>>>>>>>>>> this solution works for the product in the general case. The
>>>>>>>>>>>>> amount of RAM is limited, and even a single partition may
>>>>>>>>>>>>> need more space than the RAM available. Moving a lot of pages
>>>>>>>>>>>>> into page memory for a scan means that you evict a lot of
>>>>>>>>>>>>> other pages, which will ultimately lead to bad performance of
>>>>>>>>>>>>> subsequent queries and defeat the LRU algorithms, which are
>>>>>>>>>>>>> of great importance for good database performance.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Database vendors choose another approach: skip the B-trees,
>>>>>>>>>>>>> iterate directly over data pages, read them in a multi-block
>>>>>>>>>>>>> fashion, and use a separate scan buffer to avoid excessive
>>>>>>>>>>>>> evictions of other hot pages. A corresponding ticket exists
>>>>>>>>>>>>> for SQL [1], but the idea is common to all parts of the
>>>>>>>>>>>>> system that require scans.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As far as the proposed solution goes, it might be a good idea
>>>>>>>>>>>>> to add a special API to "warm up" a partition, with a clear
>>>>>>>>>>>>> explanation of the pros (fast scan after warmup) and cons
>>>>>>>>>>>>> (slowdown of any other operations). But I think we should not
>>>>>>>>>>>>> make this approach part of normal scans.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Vladimir.
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/IGNITE-6057
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Sep 16, 2018 at 6:44 PM Alexei Scherbakov
>>>>>>>>>>>>> <alexey.scherbak...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Igniters,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> My use case involves a scenario where it's necessary to
>>>>>>>>>>>>>> iterate over a large (many TBs) persistent cache, doing some
>>>>>>>>>>>>>> calculation on the data read.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The basic solution is to iterate over the cache using
>>>>>>>>>>>>>> ScanQuery.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This turns out to be slow, because iteration over the cache
>>>>>>>>>>>>>> involves a lot of random disk access to read the data pages
>>>>>>>>>>>>>> referenced by links from the leaf pages.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is especially true when data is stored on disks with
>>>>>>>>>>>>>> slow random access, such as SAS disks. In my case, on a
>>>>>>>>>>>>>> modern SAS disk array the reading speed was several MB/sec,
>>>>>>>>>>>>>> while the sequential read speed in a perf test was about a
>>>>>>>>>>>>>> GB/sec.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I was able to fix the issue by using ScanQuery with an
>>>>>>>>>>>>>> explicit partition set and running simple warmup code before
>>>>>>>>>>>>>> each partition scan.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The code pins cold pages in memory in sequential order, thus
>>>>>>>>>>>>>> eliminating random disk access. The speedup was roughly
>>>>>>>>>>>>>> 100x.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I suggest adding this improvement to the product's core by
>>>>>>>>>>>>>> always sequentially preloading pages for all internal
>>>>>>>>>>>>>> partition iterations (cache iterators, scan queries, SQL
>>>>>>>>>>>>>> queries with a scan plan) if the partition is cold (a low
>>>>>>>>>>>>>> number of pinned pages).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This should also speed up rebalancing from cold partitions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ignite JIRA ticket [1]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thoughts?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/IGNITE-8873
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>> Alexei Scherbakov
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Maxim Muzafarov
>>>>>>
>>>>>> --
>>>>>> Best regards,
>>>>>> Alexei Scherbakov
>
> --
> Zhenya Stanilovsky
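For reference, the "warmup then partition-scoped scan" pattern Alexei describes could look roughly like this. This is a sketch, not his actual code: the cache name is illustrative, process-per-entry is replaced by a byte counter, and preloadPartition(partId) is the method being proposed in this thread:

```java
import javax.cache.Cache;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.query.QueryCursor;
import org.apache.ignite.cache.query.ScanQuery;

public class WarmScan {
    static long scanWithWarmup(Ignite ignite, String cacheName) {
        IgniteCache<Integer, byte[]> cache = ignite.cache(cacheName);
        long bytes = 0;

        // Only partitions this node is primary for; warming up remote
        // partitions from here would be pointless.
        int[] localParts = ignite.affinity(cacheName)
            .primaryPartitions(ignite.cluster().localNode());

        for (int part : localParts) {
            // Sequential preload: pulls the partition's pages into page
            // memory in file order, so the scan below hits RAM instead of
            // issuing random disk reads.
            cache.preloadPartition(part);

            ScanQuery<Integer, byte[]> qry = new ScanQuery<>();
            qry.setPartition(part);
            qry.setLocal(true);

            try (QueryCursor<Cache.Entry<Integer, byte[]>> cur = cache.query(qry)) {
                for (Cache.Entry<Integer, byte[]> e : cur)
                    bytes += e.getValue().length; // placeholder calculation
            }
        }
        return bytes;
    }
}
```

Warming one partition at a time bounds the page-memory footprint to a single partition, which is what keeps the approach workable when the whole cache does not fit in RAM.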