Great, I hadn't thought about that.

>Wednesday, 19 September 2018, 9:40 +03:00 from Vladimir Ozerov <voze...@gridgain.com>:
>
>Pinning is an even worse thing, because you lose control over how data is moved
>within a single region. Instead, I would suggest using partition warmup +
>a separate data region to achieve "pinning" semantics.
>
>On Wed, Sep 19, 2018 at 8:34 AM Zhenya Stanilovsky
><arzamas...@mail.ru.invalid> wrote:
>
>> Hi, but how do we deal with page replacement, which Dmitriy Pavlov mentioned?
>> This approach would be efficient if all data fits into memory; maybe it would be
>> better to have a method to pin some critical caches?
>>
>> >Wednesday, 19 September 2018, 0:26 +03:00 from Dmitriy Pavlov <dpavlov....@gmail.com>:
>> >
>> >Even better, if RAM is exhausted the page replacement process will be started.
>> >
>> >https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Durable+Memory+-+under+the+hood#IgniteDurableMemory-underthehood-Pagereplacement(rotationwithdisk)
>> >
>> >The effect of the preloading will still be noticeable, but not as great as
>> >when everything fits into RAM. Later I can review or improve the javadocs if
>> >necessary.
>> >
>> >Wed, 19 Sep 2018 at 0:18, Denis Magda <dma...@apache.org>:
>> >
>> >> Agree, it's just a matter of documentation. If a user stores 100% in
>> >> RAM and on disk, and just wants to warm RAM up after a restart, then he
>> >> knows everything will fit there. If during the preloading we detect that
>> >> RAM is exhausted, we can halt it and print out a warning.
>> >>
>> >> --
>> >> Denis
>> >>
>> >> On Tue, Sep 18, 2018 at 2:10 PM Dmitriy Pavlov <dpavlov....@gmail.com> wrote:
>> >>
>> >> > Hi,
>> >> >
>> >> > I totally support the idea of cache preload.
>> >> >
>> >> > IMO it can be expanded. We can iterate over the local partitions of the cache group and preload each.
>> >> >
>> >> > But these should be really clearly documented methods, so a user can be aware of when such a method is beneficial (e.g. if the RAM region is big enough, etc).
>> >> >
>> >> > Sincerely,
>> >> > Dmitriy Pavlov
>> >> >
>> >> > Tue, 18 Sep 2018 at 21:36, Denis Magda <dma...@apache.org>:
>> >> >
>> >> > > Folks,
>> >> > >
>> >> > > Since we're adding a method that would preload a certain partition, can we add one which will preload the whole cache? Ignite persistence users I've been working with look puzzled once they realize there is no way to warm up RAM after the restart. There are use cases that require this.
>> >> > >
>> >> > > Can the current optimizations be expanded to the cache preloading use case?
>> >> > >
>> >> > > --
>> >> > > Denis
>> >> > >
>> >> > > On Tue, Sep 18, 2018 at 3:58 AM Alexei Scherbakov <alexey.scherbak...@gmail.com> wrote:
>> >> > >
>> >> > > > Summing up, I suggest adding a new public method IgniteCache.preloadPartition(partId).
>> >> > > >
>> >> > > > I will start preparing a PR for IGNITE-8873 <https://issues.apache.org/jira/browse/IGNITE-8873> if no more objections follow.
>> >> > > >
>> >> > > > Tue, 18 Sep 2018 at 10:50, Alexey Goncharuk <alexey.goncha...@gmail.com>:
>> >> > > >
>> >> > > > > Dmitriy,
>> >> > > > >
>> >> > > > > In my understanding, the proper fix for the scan query looks like a big change and it is unlikely that we will include it in Ignite 2.7. On the other hand, the method suggested by Alexei is quite simple and it definitely fits Ignite 2.7, which will provide a better user experience. Even with a proper scan query implemented, this method can be useful in some specific scenarios, so we will not have to deprecate it.
>> >> > > > >
>> >> > > > > --AG
>> >> > > > >
>> >> > > > > Mon, 17 Sep 2018 at 19:15, Dmitriy Pavlov <dpavlov....@gmail.com>:
>> >> > > > >
>> >> > > > > > As I understand it, it is not a hack, it is an advanced feature for warming up the partition. We can build warm-up of the overall cache by calling warm-up on its partitions. Users often ask about this feature and are not confident in our lazy upload.
>> >> > > > > >
>> >> > > > > > Please correct me if I misunderstood the idea.
>> >> > > > > >
>> >> > > > > > Mon, 17 Sep 2018 at 18:37, Dmitriy Setrakyan <dsetrak...@apache.org>:
>> >> > > > > >
>> >> > > > > > > I would rather fix the scan than hack the scan. Is there any technical reason for hacking it now instead of fixing it properly? Can some of the experts in this thread provide an estimate of the complexity and the difference in work that would be required for each approach?
>> >> > > > > > >
>> >> > > > > > > D.
>> >> > > > > > >
>> >> > > > > > > On Mon, Sep 17, 2018 at 4:42 PM Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
>> >> > > > > > >
>> >> > > > > > > > I think it would be beneficial for some Ignite users if we added such a partition warmup method to the public API. The method should be well-documented and state that it may invalidate the existing page cache. It will be a very effective instrument until we add the proper scan ability that Vladimir was referring to.
>> >> > > > > > > >
>> >> > > > > > > > Mon, 17 Sep 2018 at 13:05, Maxim Muzafarov <maxmu...@gmail.com>:
>> >> > > > > > > >
>> >> > > > > > > > > Folks,
>> >> > > > > > > > >
>> >> > > > > > > > > Such warming up can be an effective technique for performing calculations which require large cache data reads, but I think it's a single narrow use case among all Ignite store usages. Like all other powerful techniques, we should use it wisely. In the general case, I think we should consider the other techniques mentioned by Vladimir and may create something like `global statistics of cache data usage` to choose the best technique in each case.
>> >> > > > > > > > >
>> >> > > > > > > > > For instance, it's not obvious what would take longer: multi-block reads or 50 single-block reads issued sequentially. It strongly depends on the hardware used under the hood and might depend on the workload's use of system resources (CPU-intensive calculations and I/O access) as well. But `statistics` will help us to choose the right way.
>> >> > > > > > > > >
>> >> > > > > > > > > On Sun, 16 Sep 2018 at 23:59 Dmitriy Pavlov <dpavlov....@gmail.com> wrote:
>> >> > > > > > > > >
>> >> > > > > > > > > > Hi Alexei,
>> >> > > > > > > > > >
>> >> > > > > > > > > > I did not find any PRs associated with the ticket to check the code changes behind this idea. Are there any PRs?
>> >> > > > > > > > > >
>> >> > > > > > > > > > If we create some forward scan of pages, it should be a very intelligent algorithm including a lot of parameters (how much RAM is free, how likely we are to need the next page, etc). We had a private talk about such an idea some time ago.
>> >> > > > > > > > > >
>> >> > > > > > > > > > In my experience, Linux systems already do such forward reading of file data (for file descriptors flagged as sequential), but some prefetching of data at the application level may be useful for O_DIRECT file descriptors.
>> >> > > > > > > > > >
>> >> > > > > > > > > > And one more concern from me is about selecting the right place in the system to do such prefetch.
>> >> > > > > > > > > >
>> >> > > > > > > > > > Sincerely,
>> >> > > > > > > > > > Dmitriy Pavlov
>> >> > > > > > > > > >
>> >> > > > > > > > > > Sun, 16 Sep 2018 at 19:54, Vladimir Ozerov <voze...@gridgain.com>:
>> >> > > > > > > > > >
>> >> > > > > > > > > > > Hi Alex,
>> >> > > > > > > > > > >
>> >> > > > > > > > > > > It is good that you observed a speedup. But I do not think this solution works for the product in the general case. The amount of RAM is limited, and even a single partition may need more space than the RAM available. Moving a lot of pages to page memory for a scan means that you evict a lot of other pages, which will ultimately lead to bad performance of subsequent queries and defeat LRU algorithms, which are of great importance for good database performance.
>> >> > > > > > > > > > >
>> >> > > > > > > > > > > Database vendors choose another approach - skip BTrees, iterate directly over data pages, read them in multi-block fashion, use a separate scan buffer to avoid excessive evictions of other hot pages. A corresponding ticket for SQL exists [1], but the idea is common to all parts of the system that require scans.
>> >> > > > > > > > > > >
>> >> > > > > > > > > > > As for the proposed solution, it might be a good idea to add a special API to "warmup" a partition with a clear explanation of pros (fast scan after warmup) and cons (slowdown of any other operations). But I think we should not make this approach part of normal scans.
>> >> > > > > > > > > > >
>> >> > > > > > > > > > > Vladimir.
>> >> > > > > > > > > > >
>> >> > > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-6057
>> >> > > > > > > > > > >
>> >> > > > > > > > > > > On Sun, Sep 16, 2018 at 6:44 PM Alexei Scherbakov <alexey.scherbak...@gmail.com> wrote:
>> >> > > > > > > > > > >
>> >> > > > > > > > > > > > Igniters,
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > > > My use case involves a scenario where it's necessary to iterate over a large (many TBs) persistent cache, doing some calculation on the data read.
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > > > The basic solution is to iterate over the cache using ScanQuery.
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > > > This turns out to be slow because iteration over the cache involves a lot of random disk access for reading data pages referenced from leaf pages by links.
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > > > This is especially true when data is stored on disks with slow random access, like SAS disks. In my case, on a modern SAS disk array, reading speed was only several MB/sec, while sequential read speed in a perf test was about a GB/sec.
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > > > I was able to fix the issue by using ScanQuery with an explicit partition set and running simple warmup code before each partition scan.
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > > > The code pins cold pages in memory in sequential order, thus eliminating random disk access. The speedup was on the order of 100x.
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > > > I suggest adding the improvement to the product's core by always sequentially preloading pages for all internal partition iterations (cache iterators, scan queries, SQL queries with a scan plan) if the partition is cold (low number of pinned pages).
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > > > This should also speed up rebalancing from cold partitions.
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > > > Ignite JIRA ticket [1]
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > > > Thoughts?
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-8873
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > > > --
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > > > Best regards,
>> >> > > > > > > > > > > > Alexei Scherbakov
>> >> > > > > > > > >
>> >> > > > > > > > > --
>> >> > > > > > > > > Maxim Muzafarov
>> >> > > >
>> >> > > > --
>> >> > > >
>> >> > > > Best regards,
>> >> > > > Alexei Scherbakov
>>
>> --
>> Zhenya Stanilovsky

--
Zhenya Stanilovsky
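
For readers following the thread: the whole-cache warmup discussed above (iterate over a cache's partitions, preload each, and halt with a warning once RAM would be exhausted) might look roughly like the sketch below. It is plain Python, not Ignite code. `preload_partition` is a hypothetical stand-in for the proposed IgniteCache.preloadPartition(partId), and the per-partition page counts and `capacity_pages` are made-up inputs assumed to be known up front.

```python
from typing import Callable, Dict, List


def warm_up_cache(
    partition_pages: Dict[int, int],
    preload_partition: Callable[[int], None],
    capacity_pages: int,
) -> List[int]:
    """Preload partitions in order until the next one would not fit in RAM.

    partition_pages maps a partition id to its size in pages (assumed known).
    preload_partition is a hypothetical stand-in for the proposed
    IgniteCache.preloadPartition(partId); here it is just a callback.
    Returns the ids of the partitions that were preloaded.
    """
    loaded: List[int] = []
    used_pages = 0
    for part_id, pages in partition_pages.items():
        if used_pages + pages > capacity_pages:
            # Halt instead of triggering page replacement, and tell the user.
            print(f"WARNING: warmup halted before partition {part_id}: "
                  f"{used_pages} of {capacity_pages} pages already used")
            break
        preload_partition(part_id)
        used_pages += pages
        loaded.append(part_id)
    return loaded


# Usage: three 400-page partitions, room for 1000 pages in the data region.
touched: List[int] = []
loaded = warm_up_cache({0: 400, 1: 400, 2: 400}, touched.append, capacity_pages=1000)
```

With these inputs the first two partitions are preloaded and the third is skipped with a warning. Stopping early rather than letting page replacement kick in is only one possible policy; the thread also notes that preloading stays useful, just less so, when not everything fits.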