Pavel,

Quite frankly, I think we used to lean into performance too much. We
generally preferred it over data consistency, project modularity and code
readability. Performance, of course, plays a very important rule in Ignite,
but it's possible to overdo anything.

There are certainly parts of the project that can benefit from features
like Stream API, without significant concern over performance. CLI is an
obvious example, but I'm sure it's not the only one.

That said, I don't think that banning something is productive. At the same
time, we should make sure we pay more attention to performance during
reviews. Maybe we should have a checklist of major things to look for? Not
as a set of strict rules, but more as a guideline for contributors and
committers.

We might also want to start benchmarking the code and tracking the progress.

-Val

On Wed, Sep 8, 2021 at 12:09 PM Pavel Tupitsyn <ptupit...@apache.org> wrote:

> Alexander, Ivan,
>
> > not very productive to assume that 100% of your code is
> > on the hot path
>
> That is exactly what we should be doing.
> When I joined Ignite community 6 years ago, this was the prevalent mindset.
>
> I'm not sure which part of Ignite can be considered "not on a hot path".
> Create/alter table (mentioned above) should perform well too.
>
> > measured first and only then optimized
> > https://wiki.c2.com/?OptimizeLater
>
> Extra allocations are a "death by a thousand cuts".
> They add up here and there, and then there are GC pauses.
> This would be hard to "optimize later".
>
> It is common for perf-oriented projects to avoid some techniques.
> For example, LINQ (streams analog from C# with similar perf issues) is
> avoided in libraries and compilers [1].
>
> [1] https://github.com/dotnet/roslyn/blob/main/CONTRIBUTING.md
>
> On Wed, Sep 8, 2021 at 9:49 PM Valentin Kulichenko <
> valentin.kuliche...@gmail.com> wrote:
>
> > I don't think we should ban anything. Streams API is just a tool in the
> > toolbox - it should be used appropriately. It's up to the contributor and
> > reviewer(s) to identify whether a particular usage might cause
> performance
> > issues.
> >
> > -Val
> >
> > On Wed, Sep 8, 2021 at 8:01 AM Alexander Polovtcev <
> > alexpolovt...@gmail.com>
> > wrote:
> >
> > > -1
> > > I think that it is not very productive to assume that 100% of your code
> > is
> > > on the hot path, it would be impossible to write and maintain. Humans
> are
> > > not very good at guessing where the performance bottlenecks are, so the
> > > performance of the possible hot paths should be measured first and only
> > > then optimized and documented.
> > >
> > > On Wed, Sep 8, 2021 at 5:53 PM Ivan Pavlukhin <vololo...@gmail.com>
> > wrote:
> > >
> > > > Does not this trivial strategy work for us?
> > > > https://wiki.c2.com/?OptimizeLater
> > > >
> > > > 2021-09-08 13:52 GMT+03:00, Andrey Gura <ag...@apache.org>:
> > > > > Agree that any additional object creation on a hot path could be a
> > > > > problem. So hot path should not contain stream API and any other
> > > > > potentially problem code (e.g. iterator instead of for by index).
> > > > >
> > > > > On Wed, Sep 8, 2021 at 1:45 PM Pavel Tupitsyn <
> ptupit...@apache.org>
> > > > wrote:
> > > > >>
> > > > >> Ok, maybe a total ban is overkill, but right now streams are used
> > even
> > > > on
> > > > >> some hot paths like getAllAsync [1].
> > > > >>
> > > > >> Another issue is that Collectors.toList() and other variants don't
> > > > accept
> > > > >> capacity, and we end up with unnecessary reallocations of
> underlying
> > > > >> arrays.
> > > > >>
> > > > >> [1]
> > > > >>
> > > >
> > >
> >
> https://github.com/apache/ignite-3/blob/1d7d703ff2b18234b15a9a7b03104fbb73388edf/modules/table/src/main/java/org/apache/ignite/internal/table/KVBinaryViewImpl.java#L83
> > > > >>
> > > > >> On Wed, Sep 8, 2021 at 1:06 PM Konstantin Orlov <
> > kor...@gridgain.com>
> > > > >> wrote:
> > > > >>
> > > > >> > Hi!
> > > > >> >
> > > > >> > Agree with Ivan that it’s an overkill. Code readability and
> > > > >> > maintainability should have
> > > > >> > the same priority as the performance (with some exceptions of
> > > course).
> > > > >> >
> > > > >> > BTW the result of your benchmark looks quite strange. The
> > > performance
> > > > >> > penalty on
> > > > >> > my laptop (Core i7 9750H, 6 cores up to 4.50 GHz) is 25%, not 8
> > > times:
> > > > >> >
> > > > >> > Benchmark                         Mode  Cnt      Score     Error
> > > > >> > Units
> > > > >> > JmhIncrementBenchmark.loopSum    thrpt   10  32347.819 ± 676.548
> > > > >> > ops/ms
> > > > >> > JmhIncrementBenchmark.streamSum  thrpt   10  24459.196 ± 610.152
> > > > >> > ops/ms
> > > > >> >
> > > > >> >
> > > > >> > --
> > > > >> > Regards,
> > > > >> > Konstantin Orlov
> > > > >> >
> > > > >> >
> > > > >> > > On 8 Sep 2021, at 12:23, Ivan Bessonov <bessonov...@gmail.com
> >
> > > > wrote:
> > > > >> > >
> > > > >> > > Hello Igniters,
> > > > >> > >
> > > > >> > > I object, banning streams is an overkill. I would argue that
> > most
> > > of
> > > > >> > > the
> > > > >> > > code
> > > > >> > > is not on hot paths and that allocations in TLAB don't create
> > much
> > > > >> > pressure
> > > > >> > > on GC.
> > > > >> > >
> > > > >> > > Streams must be used cautiously, developers should know
> whether
> > > they
> > > > >> > > write hot methods or not. And if methods are not hot, code
> > > > simplicity
> > > > >> > must
> > > > >> > > be
> > > > >> > > the first priority. I don't want Ignite 3 code to look like
> > > Ignite 2
> > > > >> > code,
> > > > >> > > where
> > > > >> > > people would iterate over Lists using explicit access by
> > indexes,
> > > > >> > because it
> > > > >> > > saves them a single Iterator allocation. That's absurd.
> > > > >> > >
> > > > >> > > ср, 8 сент. 2021 г. в 11:43, Pavel Tupitsyn <
> > ptupit...@apache.org
> > > >:
> > > > >> > >
> > > > >> > >> Igniters,
> > > > >> > >>
> > > > >> > >> Java streams are known to be slower and cause more GC
> pressure
> > > than
> > > > >> > >> an
> > > > >> > >> equivalent loop.
> > > > >> > >> Below is a simple filter/map/reduce scenario (code [1]):
> > > > >> > >>
> > > > >> > >> * Benchmark
> > > >  Mode
> > > > >> > Cnt
> > > > >> > >>    Score     Error   Units
> > > > >> > >>
> > > > >> > >> * StreamVsLoopBenchmark.loopSum
> > > > >> > >> thrpt
> > > > >> >   3
> > > > >> > >> 7987.016 ± 293.013  ops/ms
> > > > >> > >> * StreamVsLoopBenchmark.loopSum:·gc.alloc.rate
> > > > >> > >> thrpt
> > > > >> >   3
> > > > >> > >>   ≈ 10⁻⁴            MB/sec
> > > > >> > >> * StreamVsLoopBenchmark.loopSum:·gc.count
> > > > >> > >> thrpt
> > > > >> >   3
> > > > >> > >>      ≈ 0            counts
> > > > >> > >>
> > > > >> > >> * StreamVsLoopBenchmark.streamSum
> > > > >> > >> thrpt
> > > > >> >   3
> > > > >> > >> 1060.244 ±  36.485  ops/ms
> > > > >> > >> * StreamVsLoopBenchmark.streamSum:·gc.alloc.rate
> > > > >> > >> thrpt
> > > > >> >   3
> > > > >> > >>  315.819 ±  10.844  MB/sec
> > > > >> > >> * StreamVsLoopBenchmark.streamSum:·gc.count
> > > > >> > >> thrpt
> > > > >> >   3
> > > > >> > >>   55.000            counts
> > > > >> > >>
> > > > >> > >> Loop is several times faster and does not allocate at all.
> > > > >> > >>
> > > > >> > >> 1. Performance is one of the most important features of our
> > > > product.
> > > > >> > >> 2. Most of our APIs will be on the hot path.
> > > > >> > >>
> > > > >> > >> One can argue about performance differences in real-world
> > > > scenarios,
> > > > >> > >> but increasing GC pressure just to make the code a little bit
> > > nicer
> > > > >> > >> is
> > > > >> > >> unacceptable.
> > > > >> > >>
> > > > >> > >> I propose to ban streams usage in the codebase (except for
> the
> > > > >> > >> tests).
> > > > >> > >>
> > > > >> > >> Thoughts, objections?
> > > > >> > >>
> > > > >> > >> [1]
> > > > >> > >>
> > > https://gist.github.com/ptupitsyn/5934bbbf8f92ac4937e534af9386da97
> > > > >> > >>
> > > > >> > >
> > > > >> > >
> > > > >> > > --
> > > > >> > > Sincerely yours,
> > > > >> > > Ivan Bessonov
> > > > >> >
> > > > >> >
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Best regards,
> > > > Ivan Pavlukhin
> > > >
> > >
> > >
> > > --
> > > With regards,
> > > Aleksandr Polovtcev
> > >
> >
>

Reply via email to