Summing up, I suggest adding a new public method IgniteCache.preloadPartition(partId).
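To illustrate why such a preload helps before a partition scan, here is a standalone Java sketch (plain Java, not Ignite code; the page-id model and `countSeeks` are simplified stand-ins for real disk behavior). It contrasts visiting data pages in B+ tree link order, which is effectively random, with one ascending pass over the same pages:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/**
 * Standalone sketch of the idea behind preloadPartition(partId):
 * touching a partition's data pages in ascending on-disk order turns
 * many random reads into one sequential pass. Page ids here are
 * hypothetical stand-ins, not Ignite internals.
 */
public class PreloadSketch {
    /** Counts "seeks": reads whose page is not adjacent to the previous one. */
    static int countSeeks(List<Integer> pageOrder) {
        int seeks = 0;
        int prev = Integer.MIN_VALUE;
        for (int pageId : pageOrder) {
            if (pageId != prev + 1)
                seeks++; // non-adjacent page: the disk must do a random access
            prev = pageId;
        }
        return seeks;
    }

    public static void main(String[] args) {
        int pages = 10_000;

        // Page visit order as seen from leaf-page links: effectively random.
        List<Integer> linkOrder = new ArrayList<>();
        for (int i = 0; i < pages; i++)
            linkOrder.add(i);
        Collections.shuffle(linkOrder, new Random(42));

        // Warmup pass: preload pages in ascending offset order, then scan from RAM.
        List<Integer> preloadOrder = new ArrayList<>(linkOrder);
        Collections.sort(preloadOrder);

        System.out.println("seeks, link order:    " + countSeeks(linkOrder));
        System.out.println("seeks, preload order: " + countSeeks(preloadOrder));
    }
}
```

The sorted pass incurs a single seek regardless of partition size, which is the effect behind the x100 speedup reported below on slow-random-access disks.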
I will start preparing a PR for IGNITE-8873
<https://issues.apache.org/jira/browse/IGNITE-8873> if no more objections
follow.

Tue, Sep 18, 2018 at 10:50, Alexey Goncharuk <alexey.goncha...@gmail.com>:

> Dmitriy,
>
> In my understanding, the proper fix for the scan query looks like a big
> change and it is unlikely that we will include it in Ignite 2.7. On the
> other hand, the method suggested by Alexei is quite simple and it
> definitely fits Ignite 2.7, which will provide a better user experience.
> Even with a proper scan query implemented, this method can be useful in
> some specific scenarios, so we will not have to deprecate it.
>
> --AG
>
> Mon, Sep 17, 2018 at 19:15, Dmitriy Pavlov <dpavlov....@gmail.com>:
>
> > As I understand it, this is not a hack, it is an advanced feature for
> > warming up a partition. We can build warm-up of the overall cache by
> > calling warm-up on its partitions. Users often ask about this feature
> > and are not comfortable with our lazy loading.
> >
> > Please correct me if I misunderstood the idea.
> >
> > Mon, Sep 17, 2018 at 18:37, Dmitriy Setrakyan <dsetrak...@apache.org>:
> >
> > > I would rather fix the scan than hack the scan. Is there any
> > > technical reason for hacking it now instead of fixing it properly?
> > > Can some of the experts in this thread provide an estimate of the
> > > complexity and the difference in work that would be required for
> > > each approach?
> > >
> > > D.
> > >
> > > On Mon, Sep 17, 2018 at 4:42 PM Alexey Goncharuk <
> > > alexey.goncha...@gmail.com> wrote:
> > >
> > > > I think it would be beneficial for some Ignite users if we added
> > > > such a partition warmup method to the public API. The method
> > > > should be well documented and state that it may invalidate the
> > > > existing page cache. It will be a very effective instrument until
> > > > we add the proper scan ability that Vladimir was referring to.
> > > >
> > > > Mon, Sep 17, 2018 at 13:05, Maxim Muzafarov <maxmu...@gmail.com>:
> > > >
> > > > > Folks,
> > > > >
> > > > > Such warming up can be an effective technique for performing
> > > > > calculations which require large cache data reads, but I think
> > > > > it is a single narrow use case among all Ignite store usages.
> > > > > Like all other powerful techniques, we should use it wisely. In
> > > > > the general case, I think we should consider the other
> > > > > techniques mentioned by Vladimir and perhaps create something
> > > > > like `global statistics of cache data usage` to choose the best
> > > > > technique in each case.
> > > > >
> > > > > For instance, it is not obvious what would take longer: one
> > > > > multi-block read or 50 single-block reads issued sequentially.
> > > > > It strongly depends on the hardware under the hood and might
> > > > > also depend on the workload's system resources (CPU-intensive
> > > > > calculations and I/O access). Such `statistics` would help us
> > > > > choose the right way.
> > > > >
> > > > > On Sun, 16 Sep 2018 at 23:59 Dmitriy Pavlov
> > > > > <dpavlov....@gmail.com> wrote:
> > > > >
> > > > > > Hi Alexei,
> > > > > >
> > > > > > I did not find any PRs associated with the ticket to check the
> > > > > > code changes behind this idea. Are there any PRs?
> > > > > >
> > > > > > If we create some forward scan of pages, it should be a very
> > > > > > intelligent algorithm including a lot of parameters (how much
> > > > > > RAM is free, how probable it is that we will need the next
> > > > > > page, etc.). We had a private talk about such an idea some
> > > > > > time ago.
> > > > > > In my experience, Linux systems already do such forward
> > > > > > reading of file data (for file descriptors flagged for
> > > > > > sequential access), but some prefetching of data at the
> > > > > > application level may be useful for O_DIRECT file descriptors.
> > > > > >
> > > > > > One more concern from me is about selecting the right place in
> > > > > > the system to do such a prefetch.
> > > > > >
> > > > > > Sincerely,
> > > > > > Dmitriy Pavlov
> > > > > >
> > > > > > Sun, Sep 16, 2018 at 19:54, Vladimir Ozerov
> > > > > > <voze...@gridgain.com>:
> > > > > >
> > > > > > > Hi Alex,
> > > > > > >
> > > > > > > It is good that you observed a speedup, but I do not think
> > > > > > > this solution works for the product in the general case. The
> > > > > > > amount of RAM is limited, and even a single partition may
> > > > > > > need more space than the RAM available. Moving a lot of
> > > > > > > pages into page memory for a scan means that you evict a lot
> > > > > > > of other pages, which will ultimately lead to bad
> > > > > > > performance of subsequent queries and defeat the LRU
> > > > > > > algorithms, which are of great importance for good database
> > > > > > > performance.
> > > > > > >
> > > > > > > Database vendors choose another approach: skip B+ trees,
> > > > > > > iterate directly over data pages, read them in multi-block
> > > > > > > fashion, and use a separate scan buffer to avoid excessive
> > > > > > > eviction of other hot pages. A corresponding ticket exists
> > > > > > > for SQL [1], but the idea is common to all parts of the
> > > > > > > system that require scans.
> > > > > > > As far as the proposed solution goes, it might be a good
> > > > > > > idea to add a special API to "warm up" a partition, with a
> > > > > > > clear explanation of the pros (fast scan after warmup) and
> > > > > > > cons (slowdown of any other operations). But I think we
> > > > > > > should not make this approach part of normal scans.
> > > > > > >
> > > > > > > Vladimir.
> > > > > > >
> > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-6057
> > > > > > >
> > > > > > > On Sun, Sep 16, 2018 at 6:44 PM Alexei Scherbakov <
> > > > > > > alexey.scherbak...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Igniters,
> > > > > > > >
> > > > > > > > My use case involves a scenario where it is necessary to
> > > > > > > > iterate over a large (many TBs) persistent cache, doing
> > > > > > > > some calculation on the data read.
> > > > > > > >
> > > > > > > > The basic solution is to iterate over the cache using
> > > > > > > > ScanQuery.
> > > > > > > >
> > > > > > > > This turns out to be slow because iteration over the
> > > > > > > > cache involves a lot of random disk access to read the
> > > > > > > > data pages referenced from leaf pages by links.
> > > > > > > >
> > > > > > > > This is especially true when data is stored on disks with
> > > > > > > > slow random access, like SAS disks. In my case, on a
> > > > > > > > modern SAS disk array the reading speed was several
> > > > > > > > MB/sec, while the sequential read speed in a perf test was
> > > > > > > > about a GB/sec.
> > > > > > > >
> > > > > > > > I was able to fix the issue by using ScanQuery with an
> > > > > > > > explicit partition set and running simple warmup code
> > > > > > > > before each partition scan.
> > > > > > > >
> > > > > > > > The code pins cold pages in memory in sequential order,
> > > > > > > > thus eliminating random disk access. The speedup was
> > > > > > > > around x100.
> > > > > > > > I suggest adding this improvement to the product's core
> > > > > > > > by always sequentially preloading pages for all internal
> > > > > > > > partition iterations (cache iterators, scan queries, SQL
> > > > > > > > queries with a scan plan) if the partition is cold (a low
> > > > > > > > number of pinned pages).
> > > > > > > >
> > > > > > > > This should also speed up rebalancing from cold
> > > > > > > > partitions.
> > > > > > > >
> > > > > > > > Ignite JIRA ticket: [1]
> > > > > > > >
> > > > > > > > Thoughts?
> > > > > > > >
> > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-8873
> > > > > > > >
> > > > > > > > --
> > > > > > > > Best regards,
> > > > > > > > Alexei Scherbakov
> > > > >
> > > > > --
> > > > > Maxim Muzafarov

--
Best regards,
Alexei Scherbakov