Hi Alexei,

I did not find any PRs associated with the ticket, so I could not check the
code changes behind this idea. Are there any PRs?

If we implement some kind of forward scan of pages, it would have to be a
fairly sophisticated algorithm with a number of parameters (how much RAM is
free, how likely the next page is to be needed, etc.). We had a private
discussion about such an idea some time ago.
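
Just to illustrate the kind of decision such an algorithm would have to make,
here is a minimal sketch (the PrefetchPolicy class, its inputs, and the
thresholds are purely hypothetical and do not correspond to anything that
exists in the code today):

/**
 * Hypothetical prefetch policy: decide whether to read the next data page
 * ahead of time, based on free page memory and recent access history.
 * All names and thresholds are illustrative assumptions only.
 */
public class PrefetchPolicy {
    private final long minFreeBytes;         // do not prefetch below this free-memory floor
    private final double minHitProbability;  // estimated chance the next page is actually needed

    public PrefetchPolicy(long minFreeBytes, double minHitProbability) {
        this.minFreeBytes = minFreeBytes;
        this.minHitProbability = minHitProbability;
    }

    /**
     * @param freeBytes      Currently free page memory, in bytes.
     * @param sequentialHits How many of the recently read pages were sequential.
     * @param totalHits      Total number of recent page reads observed.
     */
    public boolean shouldPrefetchNextPage(long freeBytes, int sequentialHits, int totalHits) {
        if (freeBytes < minFreeBytes)
            return false; // prefetching here would only evict hot pages

        double probability = totalHits == 0 ? 0.0 : (double)sequentialHits / totalHits;

        return probability >= minHitProbability;
    }
}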

In my experience, Linux already does such forward reading of file data on its
own (for file descriptors flagged as sequential), but since O_DIRECT bypasses
the page cache and its read-ahead, some prefetching at the application level
may be useful for O_DIRECT file descriptors.
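
For O_DIRECT the application would have to do the read-ahead itself. A very
rough sketch of what that could look like (plain FileChannel reads, an assumed
4 KB page size, and a fixed read-ahead window are used purely for illustration;
this is not how the Ignite page store actually works):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustrative application-level read-ahead for a partition file. */
public class SequentialPrefetcher {
    private static final int PAGE_SIZE = 4096;  // assumed page size
    private static final int PAGES_AHEAD = 64;  // assumed read-ahead window

    /** Reads the next window of pages in sequential order so that later lookups hit memory. */
    public static void prefetch(Path partitionFile, long fromPage) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(PAGE_SIZE);

        try (FileChannel ch = FileChannel.open(partitionFile, StandardOpenOption.READ)) {
            long totalPages = ch.size() / PAGE_SIZE;
            long toPage = Math.min(totalPages, fromPage + PAGES_AHEAD);

            for (long page = fromPage; page < toPage; page++) {
                buf.clear();
                ch.read(buf, page * PAGE_SIZE); // sequential offsets -> sequential disk access
            }
        }
    }
}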

One more concern from my side is choosing the right place in the system to do
such a prefetch.

Sincerely,
Dmitriy Pavlov

On Sun, Sep 16, 2018 at 7:54 PM Vladimir Ozerov <voze...@gridgain.com> wrote:

> Hi Alex,
>
> It is good that you observed a speedup. But I do not think this solution
> works for the product in the general case. The amount of RAM is limited, and
> even a single partition may need more space than the RAM available. Moving a
> lot of pages into page memory for a scan means evicting a lot of other pages,
> which will ultimately lead to bad performance of subsequent queries and
> defeat the LRU algorithms, which are of great importance for good database
> performance.
>
> Database vendors choose another approach: skip the B-trees, iterate directly
> over data pages, read them in a multi-block fashion, and use a separate scan
> buffer to avoid excessive eviction of other hot pages. A corresponding ticket
> exists for SQL [1], but the idea is common to all parts of the system that
> require scans.
>
> As for the proposed solution, it might be a good idea to add a special API to
> "warm up" a partition, with a clear explanation of the pros (fast scans after
> warmup) and cons (slowdown of any other operations). But I think we should not
> make this approach part of normal scans.
>
> Vladimir.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-6057
>
>
> On Sun, Sep 16, 2018 at 6:44 PM Alexei Scherbakov <
> alexey.scherbak...@gmail.com> wrote:
>
> > Igniters,
> >
> > My use case involves a scenario where it is necessary to iterate over a
> > large (many TBs) persistent cache, doing some calculation on the data read.
> >
> > The basic solution is to iterate over the cache using a ScanQuery.
> >
> > This turns out to be slow, because iteration over the cache involves a lot
> > of random disk access for reading the data pages referenced by links from
> > the leaf pages.
> >
> > This is especially true when data is stored on disks with slow random
> > access, such as SAS disks. In my case, on a modern SAS disk array, the read
> > speed was several MB/sec, while the sequential read speed in a perf test was
> > about a GB/sec.
> >
> > I was able to fix the issue by using a ScanQuery with an explicit partition
> > set and by running simple warmup code before each partition scan.
> >
> > The code pins cold pages in memory in sequential order, thus eliminating
> > random disk access. The speedup was roughly 100x.
> >
> > I suggest adding this improvement to the product's core by always
> > sequentially preloading pages for all internal partition iterations (cache
> > iterators, scan queries, SQL queries with a scan plan) if the partition is
> > cold (has a low number of pinned pages).
> >
> > This should also speed up rebalancing from cold partitions.
> >
> > Ignite JIRA ticket [1]
> >
> > Thoughts?
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-8873
> >
> > --
> >
> > Best regards,
> > Alexei Scherbakov
> >
>
