Re: Cache scan efficiency

2018-09-18 Thread Dmitriy Pavlov
Even better: if RAM is exhausted, the page replacement process will be started.
https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Durable+Memory+-+under+the+hood#IgniteDurableMemory-underthehood-Pagereplacement(rotationwithdisk)

The effect of preloading will still be noticeable, but not as pronounced as
when everything fits fully into RAM. Later I can review or improve the
Javadocs if necessary.
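The rotation-with-disk behavior referenced above can be illustrated with a toy model: a fixed-capacity in-memory page table that evicts the least-recently-used page back to "disk" once RAM is exhausted. This is only a hedged sketch — Ignite's durable memory uses its own replacement policies (see the wiki link), and the class, capacity, and page size below are invented for illustration. It shows why a warm-up loses its edge once the data set outgrows the data region:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Toy sketch of page replacement (rotation with disk). Not Ignite code:
 * an access-ordered LinkedHashMap stands in for the in-memory page table,
 * and removeEldestEntry() plays the role of the replacement policy.
 */
public class PageReplacementSketch {
    private final int capacityPages;
    private final LinkedHashMap<Long, byte[]> ram;
    private int evictions;

    public PageReplacementSketch(int capacityPages) {
        this.capacityPages = capacityPages;
        // accessOrder=true gives LRU ordering; the eldest entry is the LRU page.
        this.ram = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                boolean evict = size() > PageReplacementSketch.this.capacityPages;
                if (evict)
                    evictions++; // page is "rotated" out to disk
                return evict;
            }
        };
    }

    /** Touch a page; a miss simulates reading it from the page store. */
    public byte[] acquirePage(long pageId) {
        return ram.computeIfAbsent(pageId, id -> new byte[4096]);
    }

    public int evictions() { return evictions; }

    public int pagesInRam() { return ram.size(); }

    public static void main(String[] args) {
        PageReplacementSketch mem = new PageReplacementSketch(100);
        // "Preload" 150 pages into 100 slots: the last 50 inserts each evict one page.
        for (long id = 0; id < 150; id++)
            mem.acquirePage(id);
        System.out.println("in RAM: " + mem.pagesInRam() + ", evicted: " + mem.evictions());
    }
}
```

Once the preload itself starts evicting pages it loaded moments earlier, further preloading only churns the page table, which is the effect described above.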


Re: Cache scan efficiency

2018-09-18 Thread Denis Magda
Agreed, it's just a matter of documentation. If a user stores 100% of the data
both in RAM and on disk, and just wants to warm RAM up after a restart, then he
knows everything will fit there. If during the preloading we detect that
RAM is exhausted, we can halt it and print out a warning.

--
Denis


Re: Cache scan efficiency

2018-09-18 Thread Dmitriy Pavlov
Hi,

I totally support the idea of cache preload.

IMO it can be expanded. We can iterate over local partitions of the cache
group and preload each.

But the methods should be clearly documented so that a user is aware of
the benefits of such a method (e.g. when the RAM region is big enough, etc.).

Sincerely,
Dmitriy Pavlov
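The expansion suggested here — warm up a whole cache by iterating its local partitions and preloading each, halting with a warning when the data region runs out of headroom — can be sketched as follows. The `PartitionPreloader` interface is a hypothetical stand-in for the proposed `IgniteCache.preloadPartition(partId)` method; all names and signatures are illustrative, not the real Ignite API:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;
import java.util.stream.IntStream;

/** Illustrative sketch of a full-cache warm-up driven by per-partition preloads. */
public class CacheWarmupSketch {
    /** Hypothetical preload hook; in Ignite this would be the cache itself. */
    interface PartitionPreloader {
        void preloadPartition(int partId);
    }

    /**
     * Preload every local partition, stopping early if the data region runs
     * out of headroom (halt and warn rather than thrash the page cache).
     * Returns the number of partitions actually preloaded.
     */
    static int warmUp(PartitionPreloader cache, int[] localPartitions,
                      BooleanSupplier regionHasHeadroom) {
        int preloaded = 0;
        for (int partId : localPartitions) {
            if (!regionHasHeadroom.getAsBoolean()) {
                System.err.println("warm-up halted: data region nearly full after "
                    + preloaded + " partitions");
                break;
            }
            cache.preloadPartition(partId);
            preloaded++;
        }
        return preloaded;
    }

    public static void main(String[] args) {
        int[] parts = IntStream.range(0, 8).toArray();
        // Pretend the region has room for only 5 partitions.
        AtomicInteger budget = new AtomicInteger(5);
        int n = warmUp(p -> {}, parts, () -> budget.getAndDecrement() > 0);
        System.out.println("preloaded " + n + " of " + parts.length + " partitions");
    }
}
```

The headroom check is deliberately a pluggable supplier, since how "RAM is exhausted" is detected is exactly the open question in this thread.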


Re: Cache scan efficiency

2018-09-18 Thread Denis Magda
Folks,

Since we're adding a method that would preload a certain partition, can we
add one that will preload the whole cache? Ignite persistence users
I've been working with look puzzled once they realize there is no way to
warm up RAM after a restart. There are use cases that require this.

Can the current optimizations be expanded to the cache preloading use case?

--
Denis


Re: Cache scan efficiency

2018-09-18 Thread Dmitriy Setrakyan

Alexey, let's make sure we document this feature very well in Javadoc, as
well as in public readme.io documentation. Also, all cache iterator methods
and SCAN queries should be documented, suggesting when the partitions
should be preloaded to achieve better performance.

D.


Re: Cache scan efficiency

2018-09-18 Thread Alexei Scherbakov
Summing up, I suggest adding a new public
method, IgniteCache.preloadPartition(partId).

I will start preparing a PR for IGNITE-8873 if no more objections
follow.




Re: Cache scan efficiency

2018-09-18 Thread Alexey Goncharuk
Dmitriy,

In my understanding, the proper fix for the scan query looks like a big
change, and it is unlikely that we can include it in Ignite 2.7. On the
other hand, the method suggested by Alexei is quite simple and definitely
fits Ignite 2.7, which will provide a better user experience. Even with a
proper scan query implemented, this method can be useful in some specific
scenarios, so we will not have to deprecate it.

--AG

> > > > > Sun, Sep 16, 2018 at 19:54, Vladimir Ozerov <voze...@gridgain.com>:
> > > > >
> > > > > > Hi Alex,
> > > > > >
> > > > > > It is good that you observed a speedup. But I do not think this
> > > > > > solution works for the product in the general case. The amount of
> > > > > > RAM is limited, and even a single partition may need more space
> > > > > > than the RAM available. Moving a lot of pages into page memory for
> > > > > > a scan means that you evict a lot of other pages, which will
> > > > > > ultimately lead to bad performance of subsequent queries and defeat
> > > > > > the LRU algorithms, which are of great importance for good database
> > > > > > performance.
> > > > > >
> > > > > > Database vendors choose another approach - skip BTrees, iterate
> > > > > > directly over data pages, read them in a multi-block fashion, and
> > > > > > use a separate scan buffer to avoid excessive evictions of other
> > > > > > hot pages. A corresponding ticket for SQL exists [1], but the idea
> > > > > > is common for all parts of the system requiring scans.
> > > > > >
> > > > > > As far as the proposed solution goes, it might be a good idea to
> > > > > > add a special API to "warm up" a partition with clear 

Re: Cache scan efficiency

2018-09-17 Thread Dmitriy Pavlov
As I understood it, this is not a hack; it is an advanced feature for warming
up a partition. We can build a warm-up of the whole cache by calling the
warm-up for each of its partitions. Users often ask about this feature and
are not comfortable with our lazy loading.

Please correct me if I misunderstood the idea.

Mon, Sep 17, 2018 at 18:37, Dmitriy Setrakyan :

> I would rather fix the scan than hack the scan. Is there any technical
> reason for hacking it now instead of fixing it properly? Can some of the
> experts in this thread provide an estimate of complexity and difference in
> work that would be required for each approach?
>
> D.
>
> On Mon, Sep 17, 2018 at 4:42 PM Alexey Goncharuk <
> alexey.goncha...@gmail.com>
> wrote:
>
> > I think it would be beneficial for some Ignite users if we added such a
> > partition warmup method to the public API. The method should be
> > well-documented and state that it may invalidate existing page cache. It
> > will be a very effective instrument until we add the proper scan ability
> > that Vladimir was referring to.
> >
> > Mon, Sep 17, 2018 at 13:05, Maxim Muzafarov :
> >
> > > Folks,
> > >
> > > Such warming up can be an effective technique for performing
> calculations
> > > which required large cache
> > > data reads, but I think it's the single narrow use case of all over
> > Ignite
> > > store usages. Like all other
> > > powerful techniques, we should use it wisely. In the general case, I
> > think
> > > we should consider other
> > > techniques mentioned by Vladimir and may create something like `global
> > > statistics of cache data usage`
> > > to choose the best technique in each case.
> > >
> > > For instance, it's not obvious what would take longer: multi-block
> reads
> > or
> > > 50 single-block reads issued
> > > sequentially. It strongly depends on used hardware under the hood and
> > might
> > > depend on workload system
> > > resources (CPU-intensive calculations and I/O access) as well. But
> > > `statistics` will help us to choose
> > > the right way.
> > >
> > >
> > > On Sun, 16 Sep 2018 at 23:59 Dmitriy Pavlov 
> > wrote:
> > >
> > > > Hi Alexei,
> > > >
> > > > I did not find any PRs associated with the ticket for check code
> > changes
> > > > behind this idea. Are there any PRs?
> > > >
> > > > If we create some forward scan of pages, it should be a very
> > > intelligent
> > > > algorithm including a lot of parameters (how much RAM is free, how
> > > probably
> > > > we will need next page, etc). We had the private talk about such idea
> > > some
> > > > time ago.
> > > >
> > > > In my experience, Linux systems already do such forward reading of
> file
> > > > data (for corresponding sequential flagged file descriptors), but
> some
> > > > prefetching of data at the level of application may be useful for
> > > O_DIRECT
> > > > file descriptors.
> > > >
> > > > And one more concern from me is about selecting the right place in the
> > > system
> > > > to do such prefetch.
> > > >
> > > > Sincerely,
> > > > Dmitriy Pavlov
> > > >
> > > > вс, 16 сент. 2018 г. в 19:54, Vladimir Ozerov  >:
> > > >
> > > > > Hi Alex,
> > > > >
> > > > > This is good that you observed speedup. But I do not think this
> > > solution
> > > > > works for the product in general case. Amount of RAM is limited,
> and
> > > > even a
> > > > > single partition may need more space than RAM available. Moving a
> lot
> > > of
> > > > > pages to page memory for scan means that you evict a lot of other
> > > pages,
> > > > > what will ultimately lead to bad performance of subsequent queries
> > and
> > > > > defeat LRU algorithms, which are of great importance for good
> > database
> > > > > performance.
> > > > >
> > > > > Database vendors choose another approach - skip BTrees, iterate
> > > directly
> > > > > over data pages, read them in multi-block fashion, use separate
> scan
> > > > buffer
> > > > > to avoid excessive evictions of other hot pages. Corresponding
> ticket
> > > for
> > > > > SQL exists [1], but idea is common for all parts of the system,
> > > requiring
> > > > > scans.
> > > > >
> > > > > As for the proposed solution, it might be a good idea to add a special
> API
> > > to
> > > > > "warmup" partition with clear explanation of pros (fast scan after
> > > > warmup)
> > > > > and cons (slowdown of any other operations). But I think we should
> > not
> > > > make
> > > > > this approach part of normal scans.
> > > > >
> > > > > Vladimir.
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/IGNITE-6057
> > > > >
> > > > >
> > > > > On Sun, Sep 16, 2018 at 6:44 PM Alexei Scherbakov <
> > > > > alexey.scherbak...@gmail.com> wrote:
> > > > >
> > > > > > Igniters,
> > > > > >
> > > > > > My use case involves scenario where it's necessary to iterate
> over
> > > > > > large (many TBs) persistent cache doing some calculation on read
> > data.
> > > > > >
> > > > > > The basic solution is to iterate cache using ScanQuery.
> > > > > >
> > > > > > This turns out to be slow 

Re: Cache scan efficiency

2018-09-17 Thread Dmitriy Setrakyan
I would rather fix the scan than hack the scan. Is there any technical
reason for hacking it now instead of fixing it properly? Can some of the
experts in this thread provide an estimate of complexity and difference in
work that would be required for each approach?

D.

On Mon, Sep 17, 2018 at 4:42 PM Alexey Goncharuk 
wrote:

> I think it would be beneficial for some Ignite users if we added such a
> partition warmup method to the public API. The method should be
> well-documented and state that it may invalidate existing page cache. It
> will be a very effective instrument until we add the proper scan ability
> that Vladimir was referring to.
>
> пн, 17 сент. 2018 г. в 13:05, Maxim Muzafarov :
>
> > Folks,
> >
> > Such warming up can be an effective technique for performing calculations
> > which require large cache
> > data reads, but I think it's the single narrow use case of all over
> Ignite
> > store usages. Like all other
> > powerful techniques, we should use it wisely. In the general case, I
> think
> > we should consider other
> > techniques mentioned by Vladimir and may create something like `global
> > statistics of cache data usage`
> > to choose the best technique in each case.
> >
> > For instance, it's not obvious what would take longer: multi-block reads
> or
> > 50 single-block reads issued
> > sequentially. It strongly depends on used hardware under the hood and
> might
> > depend on workload system
> > resources (CPU-intensive calculations and I/O access) as well. But
> > `statistics` will help us to choose
> > the right way.
> >
> >
> > On Sun, 16 Sep 2018 at 23:59 Dmitriy Pavlov 
> wrote:
> >
> > > Hi Alexei,
> > >
> > > I did not find any PRs associated with the ticket for check code
> changes
> > > behind this idea. Are there any PRs?
> > >
> > > If we create some forward scan of pages, it should be a very
> > intelligent
> > > algorithm including a lot of parameters (how much RAM is free, how
> > probably
> > > we will need next page, etc). We had the private talk about such idea
> > some
> > > time ago.
> > >
> > > In my experience, Linux systems already do such forward reading of file
> > > data (for corresponding sequential flagged file descriptors), but some
> > > prefetching of data at the level of application may be useful for
> > O_DIRECT
> > > file descriptors.
> > >
> > > And one more concern from me is about selecting the right place in the
> > system
> > > to do such prefetch.
> > >
> > > Sincerely,
> > > Dmitriy Pavlov
> > >
> > > вс, 16 сент. 2018 г. в 19:54, Vladimir Ozerov :
> > >
> > > > Hi Alex,
> > > >
> > > > This is good that you observed speedup. But I do not think this
> > solution
> > > > works for the product in general case. Amount of RAM is limited, and
> > > even a
> > > > single partition may need more space than RAM available. Moving a lot
> > of
> > > > pages to page memory for scan means that you evict a lot of other
> > pages,
> > > > what will ultimately lead to bad performance of subsequent queries
> and
> > > > defeat LRU algorithms, which are of great importance for good
> database
> > > > performance.
> > > >
> > > > Database vendors choose another approach - skip BTrees, iterate
> > directly
> > > > over data pages, read them in multi-block fashion, use separate scan
> > > buffer
> > > > to avoid excessive evictions of other hot pages. Corresponding ticket
> > for
> > > > SQL exists [1], but idea is common for all parts of the system,
> > requiring
> > > > scans.
> > > >
> > > > As for the proposed solution, it might be a good idea to add a special API
> > to
> > > > "warmup" partition with clear explanation of pros (fast scan after
> > > warmup)
> > > > and cons (slowdown of any other operations). But I think we should
> not
> > > make
> > > > this approach part of normal scans.
> > > >
> > > > Vladimir.
> > > >
> > > > [1] https://issues.apache.org/jira/browse/IGNITE-6057
> > > >
> > > >
> > > > On Sun, Sep 16, 2018 at 6:44 PM Alexei Scherbakov <
> > > > alexey.scherbak...@gmail.com> wrote:
> > > >
> > > > > Igniters,
> > > > >
> > > > > My use case involves scenario where it's necessary to iterate over
> > > > > large (many TBs) persistent cache doing some calculation on read
> data.
> > > > >
> > > > > The basic solution is to iterate cache using ScanQuery.
> > > > >
> > > > > This turns out to be slow because iteration over cache involves a
> lot
> > > of
> > > > > random disk access for reading data pages referenced from leaf
> pages
> > by
> > > > > links.
> > > > >
> > > > > This is especially true when data is stored on disks with slow
> random
> > > > > access, like SAS disks. In my case on modern SAS disks array
> reading
> > > > speed
> > > > > was like several MB/sec while sequential read speed in perf test
> was
> > > > about
> > > > > GB/sec.
> > > > >
> > > > > I was able to fix the issue by using ScanQuery with explicit
> > partition
> > > > set
> > > > > and running simple warmup code before each partition scan.
> > > > >
> > > 

Re: Cache scan efficiency

2018-09-17 Thread Alexey Goncharuk
I think it would be beneficial for some Ignite users if we added such a
partition warmup method to the public API. The method should be
well-documented and state that it may invalidate existing page cache. It
will be a very effective instrument until we add the proper scan ability
that Vladimir was referring to.
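In code terms, such a warmup method would let a user warm each partition up first and scan it second. A schematic sketch, where `warmupPartition` and `scanPartition` are stand-in callbacks for the real warmup code and a per-partition scan (e.g. a ScanQuery restricted to one partition):

```java
import java.util.function.IntConsumer;

/** Sketch: scan a cache partition by partition, warming each partition up first. */
public class PartitionScan {
    /**
     * For every partition: run warmup (sequential page touch), then the actual scan.
     * Both callbacks are placeholders for warmup code and a partition-scoped query.
     */
    public static void scanWithWarmup(int partitions,
                                      IntConsumer warmupPartition,
                                      IntConsumer scanPartition) {
        for (int part = 0; part < partitions; part++) {
            warmupPartition.accept(part); // sequential preload: pins cold pages in order
            scanPartition.accept(part);   // the scan now mostly hits warm pages
        }
    }
}
```

The point of the per-partition ordering is that only one partition's worth of pages needs to be resident at a time.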

пн, 17 сент. 2018 г. в 13:05, Maxim Muzafarov :

> Folks,
>
> Such warming up can be an effective technique for performing calculations
> which require large cache
> data reads, but I think it's the single narrow use case of all over Ignite
> store usages. Like all other
> powerful techniques, we should use it wisely. In the general case, I think
> we should consider other
> techniques mentioned by Vladimir and may create something like `global
> statistics of cache data usage`
> to choose the best technique in each case.
>
> For instance, it's not obvious what would take longer: multi-block reads or
> 50 single-block reads issued
> sequentially. It strongly depends on used hardware under the hood and might
> depend on workload system
> resources (CPU-intensive calculations and I/O access) as well. But
> `statistics` will help us to choose
> the right way.
>
>
> On Sun, 16 Sep 2018 at 23:59 Dmitriy Pavlov  wrote:
>
> > Hi Alexei,
> >
> > I did not find any PRs associated with the ticket for check code changes
> > behind this idea. Are there any PRs?
> >
> > If we create some forward scan of pages, it should be a very
> intelligent
> > algorithm including a lot of parameters (how much RAM is free, how
> probably
> > we will need next page, etc). We had the private talk about such idea
> some
> > time ago.
> >
> > In my experience, Linux systems already do such forward reading of file
> > data (for corresponding sequential flagged file descriptors), but some
> > prefetching of data at the level of application may be useful for
> O_DIRECT
> > file descriptors.
> >
> > And one more concern from me is about selecting the right place in the
> system
> > to do such prefetch.
> >
> > Sincerely,
> > Dmitriy Pavlov
> >
> > вс, 16 сент. 2018 г. в 19:54, Vladimir Ozerov :
> >
> > > Hi Alex,
> > >
> > > This is good that you observed speedup. But I do not think this
> solution
> > > works for the product in general case. Amount of RAM is limited, and
> > even a
> > > single partition may need more space than RAM available. Moving a lot
> of
> > > pages to page memory for scan means that you evict a lot of other
> pages,
> > > what will ultimately lead to bad performance of subsequent queries and
> > > defeat LRU algorithms, which are of great importance for good database
> > > performance.
> > >
> > > Database vendors choose another approach - skip BTrees, iterate
> directly
> > > over data pages, read them in multi-block fashion, use separate scan
> > buffer
> > > to avoid excessive evictions of other hot pages. Corresponding ticket
> for
> > > SQL exists [1], but idea is common for all parts of the system,
> requiring
> > > scans.
> > >
> > > As for the proposed solution, it might be a good idea to add a special API
> to
> > > "warmup" partition with clear explanation of pros (fast scan after
> > warmup)
> > > and cons (slowdown of any other operations). But I think we should not
> > make
> > > this approach part of normal scans.
> > >
> > > Vladimir.
> > >
> > > [1] https://issues.apache.org/jira/browse/IGNITE-6057
> > >
> > >
> > > On Sun, Sep 16, 2018 at 6:44 PM Alexei Scherbakov <
> > > alexey.scherbak...@gmail.com> wrote:
> > >
> > > > Igniters,
> > > >
> > > > My use case involves scenario where it's necessary to iterate over
> > > > large (many TBs) persistent cache doing some calculation on read data.
> > > >
> > > > The basic solution is to iterate cache using ScanQuery.
> > > >
> > > > This turns out to be slow because iteration over cache involves a lot
> > of
> > > > random disk access for reading data pages referenced from leaf pages
> by
> > > > links.
> > > >
> > > > This is especially true when data is stored on disks with slow random
> > > > access, like SAS disks. In my case on modern SAS disks array reading
> > > speed
> > > > was like several MB/sec while sequential read speed in perf test was
> > > about
> > > > GB/sec.
> > > >
> > > > I was able to fix the issue by using ScanQuery with explicit
> partition
> > > set
> > > > and running simple warmup code before each partition scan.
> > > >
> > > > The code pins cold pages in memory in sequential order thus
> eliminating
> > > > random disk access. The speedup was about 100x.
> > > >
> > > > I suggest adding the improvement to the product's core by always
> > > > sequentially preloading pages for all internal partition iterations
> > > (cache
> > > > iterators, scan queries, SQL queries with scan plan) if partition is
> > cold
> > > > (low number of pinned pages).
> > > >
> > > > This also should speed up rebalancing from cold partitions.
> > > >
> > > > Ignite JIRA ticket [1]
> > > >
> > > > Thoughts ?
> > > >
> > > > [1] 

Re: Cache scan efficiency

2018-09-17 Thread Maxim Muzafarov
Folks,

Such warming up can be an effective technique for performing calculations
which require large cache
data reads, but I think it's the single narrow use case of all over Ignite
store usages. Like all other
powerful techniques, we should use it wisely. In the general case, I think
we should consider other
techniques mentioned by Vladimir and may create something like `global
statistics of cache data usage`
to choose the best technique in each case.

For instance, it's not obvious what would take longer: multi-block reads or
50 single-block reads issued
sequentially. It strongly depends on used hardware under the hood and might
depend on workload system
resources (CPU-intensive calculations and I/O access) as well. But
`statistics` will help us to choose
the right way.
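To make the trade-off concrete, the sketch below (an illustration, not Ignite code) reads the same 50 blocks of a temporary file twice: once as a single sequential multi-block read, and once as 50 individual positioned reads. On rotational media the second pattern pays a seek per block; the block size and count here are arbitrary:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Sketch: one multi-block read vs. many single-block reads of the same data. */
public class ReadPatterns {
    static final int BLOCK = 4096, BLOCKS = 50;

    /** Reads all blocks with one large sequential read; returns bytes read. */
    public static int multiBlockRead(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(BLOCK * BLOCKS);
            int n = 0, r;
            while (buf.hasRemaining() && (r = ch.read(buf)) > 0)
                n += r;
            return n;
        }
    }

    /** Reads the same data as 50 positioned reads (a seek per block on disk). */
    public static int singleBlockReads(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            int n = 0;
            for (int i = 0; i < BLOCKS; i++) {
                ByteBuffer buf = ByteBuffer.allocate(BLOCK);
                n += ch.read(buf, (long) i * BLOCK);
            }
            return n;
        }
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("blocks", ".bin");
        Files.write(f, new byte[BLOCK * BLOCKS]);
        System.out.println(multiBlockRead(f) + " " + singleBlockReads(f));
        Files.delete(f);
    }
}
```

Timing both patterns on the actual hardware is exactly the kind of measurement the proposed `statistics` could feed on.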


On Sun, 16 Sep 2018 at 23:59 Dmitriy Pavlov  wrote:

> Hi Alexei,
>
> I did not find any PRs associated with the ticket for check code changes
> behind this idea. Are there any PRs?
>
> If we create some forward scan of pages, it should be a very intelligent
> algorithm including a lot of parameters (how much RAM is free, how probably
> we will need next page, etc). We had the private talk about such idea some
> time ago.
>
> In my experience, Linux systems already do such forward reading of file
> data (for corresponding sequential flagged file descriptors), but some
> prefetching of data at the level of application may be useful for O_DIRECT
> file descriptors.
>
> And one more concern from me is about selecting the right place in the system
> to do such prefetch.
>
> Sincerely,
> Dmitriy Pavlov
>
> вс, 16 сент. 2018 г. в 19:54, Vladimir Ozerov :
>
> > Hi Alex,
> >
> > This is good that you observed speedup. But I do not think this solution
> > works for the product in general case. Amount of RAM is limited, and
> even a
> > single partition may need more space than RAM available. Moving a lot of
> > pages to page memory for scan means that you evict a lot of other pages,
> > what will ultimately lead to bad performance of subsequent queries and
> > defeat LRU algorithms, which are of great importance for good database
> > performance.
> >
> > Database vendors choose another approach - skip BTrees, iterate directly
> > over data pages, read them in multi-block fashion, use separate scan
> buffer
> > to avoid excessive evictions of other hot pages. Corresponding ticket for
> > SQL exists [1], but idea is common for all parts of the system, requiring
> > scans.
> >
> > As for the proposed solution, it might be a good idea to add a special API to
> > "warmup" partition with clear explanation of pros (fast scan after
> warmup)
> > and cons (slowdown of any other operations). But I think we should not
> make
> > this approach part of normal scans.
> >
> > Vladimir.
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-6057
> >
> >
> > On Sun, Sep 16, 2018 at 6:44 PM Alexei Scherbakov <
> > alexey.scherbak...@gmail.com> wrote:
> >
> > > Igniters,
> > >
> > > My use case involves scenario where it's necessary to iterate over
> > > large (many TBs) persistent cache doing some calculation on read data.
> > >
> > > The basic solution is to iterate cache using ScanQuery.
> > >
> > > This turns out to be slow because iteration over cache involves a lot
> of
> > > random disk access for reading data pages referenced from leaf pages by
> > > links.
> > >
> > > This is especially true when data is stored on disks with slow random
> > > access, like SAS disks. In my case on modern SAS disks array reading
> > speed
> > > was like several MB/sec while sequential read speed in perf test was
> > about
> > > GB/sec.
> > >
> > > I was able to fix the issue by using ScanQuery with explicit partition
> > set
> > > and running simple warmup code before each partition scan.
> > >
> > > The code pins cold pages in memory in sequential order thus eliminating
> > > random disk access. The speedup was about 100x.
> > >
> > > I suggest adding the improvement to the product's core by always
> > > sequentially preloading pages for all internal partition iterations
> > (cache
> > > iterators, scan queries, SQL queries with scan plan) if partition is
> cold
> > > (low number of pinned pages).
> > >
> > > This also should speed up rebalancing from cold partitions.
> > >
> > > Ignite JIRA ticket [1]
> > >
> > > Thoughts ?
> > >
> > > [1] https://issues.apache.org/jira/browse/IGNITE-8873
> > >
> > > --
> > >
> > > Best regards,
> > > Alexei Scherbakov
> > >
> >
>
-- 
--
Maxim Muzafarov


Re: Cache scan efficiency

2018-09-16 Thread Dmitriy Pavlov
Hi Alexei,

I did not find any PRs associated with the ticket for check code changes
behind this idea. Are there any PRs?

If we create some forward scan of pages, it should be a very intelligent
algorithm including a lot of parameters (how much RAM is free, how probably
we will need next page, etc). We had the private talk about such idea some
time ago.

In my experience, Linux systems already do such forward reading of file
data (for corresponding sequential flagged file descriptors), but some
prefetching of data at the level of application may be useful for O_DIRECT
file descriptors.
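For the non-O_DIRECT case, such application-level prefetch can be as simple as one sequential streaming pass over a partition file before random access begins, which populates the OS page cache. An illustrative sketch (not Ignite code; Ignite's page memory would do this at the page level rather than through the OS cache):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Sketch: warm the OS page cache by streaming a file once, sequentially. */
public class SequentialPrefetch {
    /** Reads the whole file in 1 MiB chunks, discarding the data; returns bytes touched. */
    public static long prefetch(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(1 << 20);
            long touched = 0;
            int r;
            while ((r = ch.read(buf)) > 0) {
                touched += r;
                buf.clear(); // reuse the buffer; only the read's side effect matters
            }
            return touched;
        }
    }
}
```

Subsequent random reads of the same file then mostly hit memory instead of disk.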

And one more concern from me is about selecting the right place in the system
to do such prefetch.

Sincerely,
Dmitriy Pavlov

вс, 16 сент. 2018 г. в 19:54, Vladimir Ozerov :

> Hi Alex,
>
> This is good that you observed speedup. But I do not think this solution
> works for the product in general case. Amount of RAM is limited, and even a
> single partition may need more space than RAM available. Moving a lot of
> pages to page memory for scan means that you evict a lot of other pages,
> what will ultimately lead to bad performance of subsequent queries and
> defeat LRU algorithms, which are of great importance for good database
> performance.
>
> Database vendors choose another approach - skip BTrees, iterate directly
> over data pages, read them in multi-block fashion, use separate scan buffer
> to avoid excessive evictions of other hot pages. Corresponding ticket for
> SQL exists [1], but idea is common for all parts of the system, requiring
> scans.
>
> As for the proposed solution, it might be a good idea to add a special API to
> "warmup" partition with clear explanation of pros (fast scan after warmup)
> and cons (slowdown of any other operations). But I think we should not make
> this approach part of normal scans.
>
> Vladimir.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-6057
>
>
> On Sun, Sep 16, 2018 at 6:44 PM Alexei Scherbakov <
> alexey.scherbak...@gmail.com> wrote:
>
> > Igniters,
> >
> > My use case involves scenario where it's necessary to iterate over
> > large (many TBs) persistent cache doing some calculation on read data.
> >
> > The basic solution is to iterate cache using ScanQuery.
> >
> > This turns out to be slow because iteration over cache involves a lot of
> > random disk access for reading data pages referenced from leaf pages by
> > links.
> >
> > This is especially true when data is stored on disks with slow random
> > access, like SAS disks. In my case on modern SAS disks array reading
> speed
> > was like several MB/sec while sequential read speed in perf test was
> about
> > GB/sec.
> >
> > I was able to fix the issue by using ScanQuery with explicit partition
> set
> > and running simple warmup code before each partition scan.
> >
> > The code pins cold pages in memory in sequential order thus eliminating
> > random disk access. The speedup was about 100x.
> >
> > I suggest adding the improvement to the product's core by always
> > sequentially preloading pages for all internal partition iterations
> (cache
> > iterators, scan queries, SQL queries with scan plan) if partition is cold
> > (low number of pinned pages).
> >
> > This also should speed up rebalancing from cold partitions.
> >
> > Ignite JIRA ticket [1]
> >
> > Thoughts ?
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-8873
> >
> > --
> >
> > Best regards,
> > Alexei Scherbakov
> >
>


Re: Cache scan efficiency

2018-09-16 Thread Vladimir Ozerov
Hi Alex,

This is good that you observed speedup. But I do not think this solution
works for the product in general case. Amount of RAM is limited, and even a
single partition may need more space than RAM available. Moving a lot of
pages to page memory for scan means that you evict a lot of other pages,
what will ultimately lead to bad performance of subsequent queries and
defeat LRU algorithms, which are of great importance for good
performance.

Database vendors choose another approach - skip BTrees, iterate directly
over data pages, read them in multi-block fashion, use separate scan buffer
to avoid excessive evictions of other hot pages. Corresponding ticket for
SQL exists [1], but idea is common for all parts of the system, requiring
scans.
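Such a scan buffer can be sketched as a small private pool of page slots that the scan recycles round-robin, so a full-partition scan never evicts hot pages from the shared page memory. A minimal illustration (names and sizes are arbitrary, not Ignite internals):

```java
/** Sketch: a scan reuses a small private ring of page slots instead of the shared cache. */
public class ScanBuffer {
    private final byte[][] slots;
    private int next; // next slot to recycle, round-robin

    public ScanBuffer(int slotCount, int pageSize) {
        slots = new byte[slotCount][pageSize];
    }

    /** Returns a slot for the next page read; old content is simply overwritten. */
    public byte[] acquireSlot() {
        byte[] slot = slots[next];
        next = (next + 1) % slots.length;
        return slot;
    }

    public int slotCount() {
        return slots.length;
    }
}
```

A scan loop would then do: acquire a slot, read the next data page into it, process it - so the whole scan cycles through the same few slots and leaves the rest of page memory untouched.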

As for the proposed solution, it might be a good idea to add a special API to
"warmup" partition with clear explanation of pros (fast scan after warmup)
and cons (slowdown of any other operations). But I think we should not make
this approach part of normal scans.

Vladimir.

[1] https://issues.apache.org/jira/browse/IGNITE-6057


On Sun, Sep 16, 2018 at 6:44 PM Alexei Scherbakov <
alexey.scherbak...@gmail.com> wrote:

> Igniters,
>
> My use case involves scenario where it's necessary to iterate over
> large (many TBs) persistent cache doing some calculation on read data.
>
> The basic solution is to iterate cache using ScanQuery.
>
> This turns out to be slow because iteration over cache involves a lot of
> random disk access for reading data pages referenced from leaf pages by
> links.
>
> This is especially true when data is stored on disks with slow random
> access, like SAS disks. In my case on modern SAS disks array reading speed
> was like several MB/sec while sequential read speed in perf test was about
> GB/sec.
>
> I was able to fix the issue by using ScanQuery with explicit partition set
> and running simple warmup code before each partition scan.
>
> The code pins cold pages in memory in sequential order thus eliminating
> random disk access. The speedup was about 100x.
>
> I suggest adding the improvement to the product's core by always
> sequentially preloading pages for all internal partition iterations (cache
> iterators, scan queries, SQL queries with scan plan) if partition is cold
> (low number of pinned pages).
>
> This also should speed up rebalancing from cold partitions.
>
> Ignite JIRA ticket [1]
>
> Thoughts ?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-8873
>
> --
>
> Best regards,
> Alexei Scherbakov
>


Re: Cache scan efficiency

2018-09-16 Thread Dmitriy Setrakyan
Alexey, this is a great feature. Can you explain what you meant by
"warm-up" when iterating through pages? Do you have this feature already
implemented?

D.

On Sun, Sep 16, 2018 at 6:44 PM Alexei Scherbakov <
alexey.scherbak...@gmail.com> wrote:

> Igniters,
>
> My use case involves scenario where it's necessary to iterate over
> large (many TBs) persistent cache doing some calculation on read data.
>
> The basic solution is to iterate cache using ScanQuery.
>
> This turns out to be slow because iteration over cache involves a lot of
> random disk access for reading data pages referenced from leaf pages by
> links.
>
> This is especially true when data is stored on disks with slow random
> access, like SAS disks. In my case on modern SAS disks array reading speed
> was like several MB/sec while sequential read speed in perf test was about
> GB/sec.
>
> I was able to fix the issue by using ScanQuery with explicit partition set
> and running simple warmup code before each partition scan.
>
> The code pins cold pages in memory in sequential order thus eliminating
> random disk access. The speedup was about 100x.
>
> I suggest adding the improvement to the product's core by always
> sequentially preloading pages for all internal partition iterations (cache
> iterators, scan queries, SQL queries with scan plan) if partition is cold
> (low number of pinned pages).
>
> This also should speed up rebalancing from cold partitions.
>
> Ignite JIRA ticket [1]
>
> Thoughts ?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-8873
>
> --
>
> Best regards,
> Alexei Scherbakov
>