To summarize:
We don't ban streams but use them with caution, avoid on hot paths. The
decision is up to the author and reviewer.

Thanks for the discussion!

On Mon, Sep 13, 2021 at 10:25 PM Pavel Tupitsyn <ptupit...@apache.org>
wrote:

> Konstantin,
>
> > The performance penalty ... is 25%, not 8 times
> I'm sure you are using a different JDK, I'm on OpenJDK 11 as listed in the
> gist.
>
> Ivan,
>
> > rewrite it using BlackHole
> I did, updated the gist, the results are the same.
>
>
>
> On Thu, Sep 9, 2021 at 11:12 AM Ilya Kasnacheev <ilya.kasnach...@gmail.com>
> wrote:
>
>> Hello!
>>
>> -1 Let's not ban Java Streams, for the reasons already listed.
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> чт, 9 сент. 2021 г. в 10:59, Ivan Daschinsky <ivanda...@gmail.com>:
>>
>> > Few words about the topic.
>> > There is a disease, quite common in the java community -- it is called
>> the
>> > streamosis.
>> > But, to be honest, afear of streams is also not good.
>> >
>> > As for me, sometimes rewriting code completely with simple loops often
>> > makes it more readable, understandable and usually faster.
>> >
>> > So I am against a complete ban of streams, but I am for using this tool
>> > with caution. Often streams make code ugly and non-readable at all.
>> >
>> >
>> > чт, 9 сент. 2021 г. в 10:50, Ivan Daschinsky <ivanda...@gmail.com>:
>> >
>> > > To be honest, Pavel, your benchmark is not quite correct. Please,
>> rewrite
>> > > it using BlackHole
>> > >
>> > > чт, 9 сент. 2021 г. в 10:28, Nikolay Izhikov <nizhi...@apache.org>:
>> > >
>> > >> +1 to ban Streams usage.
>> > >>
>> > >>
>> > >>
>> > >> > 9 сент. 2021 г., в 02:59, Valentin Kulichenko <
>> > >> valentin.kuliche...@gmail.com> написал(а):
>> > >> >
>> > >> > Pavel,
>> > >> >
>> > >> > Quite frankly, I think we used to lean into performance too much.
>> We
>> > >> > generally preferred it over data consistency, project modularity
>> and
>> > >> code
>> > >> > readability. Performance, of course, plays a very important rule in
>> > >> Ignite,
>> > >> > but it's possible to overdo anything.
>> > >> >
>> > >> > There are certainly parts of the project that can benefit from
>> > features
>> > >> > like Stream API, without significant concern over performance. CLI
>> is
>> > an
>> > >> > obvious example, but I'm sure it's not the only one.
>> > >> >
>> > >> > That said, I don't think that banning something is productive. At
>> the
>> > >> same
>> > >> > time, we should make sure we pay more attention to performance
>> during
>> > >> > reviews. Maybe we should have a checklist of major things to look
>> for?
>> > >> Not
>> > >> > as a set of strict rules, but more as a guideline for contributors
>> and
>> > >> > committers.
>> > >> >
>> > >> > We might also want to start benchmarking the code and tracking the
>> > >> progress.
>> > >> >
>> > >> > -Val
>> > >> >
>> > >> > On Wed, Sep 8, 2021 at 12:09 PM Pavel Tupitsyn <
>> ptupit...@apache.org>
>> > >> wrote:
>> > >> >
>> > >> >> Alexander, Ivan,
>> > >> >>
>> > >> >>> not very productive to assume that 100% of your code is
>> > >> >>> on the hot path
>> > >> >>
>> > >> >> That is exactly what we should be doing.
>> > >> >> When I joined Ignite community 6 years ago, this was the prevalent
>> > >> mindset.
>> > >> >>
>> > >> >> I'm not sure which part of Ignite can be considered "not on a hot
>> > >> path".
>> > >> >> Create/alter table (mentioned above) should perform well too.
>> > >> >>
>> > >> >>> measured first and only then optimized
>> > >> >>> https://wiki.c2.com/?OptimizeLater
>> > >> >>
>> > >> >> Extra allocations are a "death by a thousand cuts".
>> > >> >> They add up here and there, and then there are GC pauses.
>> > >> >> This would be hard to "optimize later".
>> > >> >>
>> > >> >> It is common for perf-oriented projects to avoid some techniques.
>> > >> >> For example, LINQ (streams analog from C# with similar perf
>> issues)
>> > is
>> > >> >> avoided in libraries and compilers [1].
>> > >> >>
>> > >> >> [1] https://github.com/dotnet/roslyn/blob/main/CONTRIBUTING.md
>> > >> >>
>> > >> >> On Wed, Sep 8, 2021 at 9:49 PM Valentin Kulichenko <
>> > >> >> valentin.kuliche...@gmail.com> wrote:
>> > >> >>
>> > >> >>> I don't think we should ban anything. Streams API is just a tool
>> in
>> > >> the
>> > >> >>> toolbox - it should be used appropriately. It's up to the
>> > contributor
>> > >> and
>> > >> >>> reviewer(s) to identify whether a particular usage might cause
>> > >> >> performance
>> > >> >>> issues.
>> > >> >>>
>> > >> >>> -Val
>> > >> >>>
>> > >> >>> On Wed, Sep 8, 2021 at 8:01 AM Alexander Polovtcev <
>> > >> >>> alexpolovt...@gmail.com>
>> > >> >>> wrote:
>> > >> >>>
>> > >> >>>> -1
>> > >> >>>> I think that it is not very productive to assume that 100% of
>> your
>> > >> code
>> > >> >>> is
>> > >> >>>> on the hot path, it would be impossible to write and maintain.
>> > Humans
>> > >> >> are
>> > >> >>>> not very good at guessing where the performance bottlenecks
>> are, so
>> > >> the
>> > >> >>>> performance of the possible hot paths should be measured first
>> and
>> > >> only
>> > >> >>>> then optimized and documented.
>> > >> >>>>
>> > >> >>>> On Wed, Sep 8, 2021 at 5:53 PM Ivan Pavlukhin <
>> vololo...@gmail.com
>> > >
>> > >> >>> wrote:
>> > >> >>>>
>> > >> >>>>> Does not this trivial strategy work for us?
>> > >> >>>>> https://wiki.c2.com/?OptimizeLater
>> > >> >>>>>
>> > >> >>>>> 2021-09-08 13:52 GMT+03:00, Andrey Gura <ag...@apache.org>:
>> > >> >>>>>> Agree that any additional object creation on a hot path could
>> be
>> > a
>> > >> >>>>>> problem. So hot path should not contain stream API and any
>> other
>> > >> >>>>>> potentially problem code (e.g. iterator instead of for by
>> index).
>> > >> >>>>>>
>> > >> >>>>>> On Wed, Sep 8, 2021 at 1:45 PM Pavel Tupitsyn <
>> > >> >> ptupit...@apache.org>
>> > >> >>>>> wrote:
>> > >> >>>>>>>
>> > >> >>>>>>> Ok, maybe a total ban is overkill, but right now streams are
>> > used
>> > >> >>> even
>> > >> >>>>> on
>> > >> >>>>>>> some hot paths like getAllAsync [1].
>> > >> >>>>>>>
>> > >> >>>>>>> Another issue is that Collectors.toList() and other variants
>> > don't
>> > >> >>>>> accept
>> > >> >>>>>>> capacity, and we end up with unnecessary reallocations of
>> > >> >> underlying
>> > >> >>>>>>> arrays.
>> > >> >>>>>>>
>> > >> >>>>>>> [1]
>> > >> >>>>>>>
>> > >> >>>>>
>> > >> >>>>
>> > >> >>>
>> > >> >>
>> > >>
>> >
>> https://github.com/apache/ignite-3/blob/1d7d703ff2b18234b15a9a7b03104fbb73388edf/modules/table/src/main/java/org/apache/ignite/internal/table/KVBinaryViewImpl.java#L83
>> > >> >>>>>>>
>> > >> >>>>>>> On Wed, Sep 8, 2021 at 1:06 PM Konstantin Orlov <
>> > >> >>> kor...@gridgain.com>
>> > >> >>>>>>> wrote:
>> > >> >>>>>>>
>> > >> >>>>>>>> Hi!
>> > >> >>>>>>>>
>> > >> >>>>>>>> Agree with Ivan that it’s an overkill. Code readability and
>> > >> >>>>>>>> maintainability should have
>> > >> >>>>>>>> the same priority as the performance (with some exceptions
>> of
>> > >> >>>> course).
>> > >> >>>>>>>>
>> > >> >>>>>>>> BTW the result of your benchmark looks quite strange. The
>> > >> >>>> performance
>> > >> >>>>>>>> penalty on
>> > >> >>>>>>>> my laptop (Core i7 9750H, 6 cores up to 4.50 GHz) is 25%,
>> not 8
>> > >> >>>> times:
>> > >> >>>>>>>>
>> > >> >>>>>>>> Benchmark                         Mode  Cnt      Score
>> >  Error
>> > >> >>>>>>>> Units
>> > >> >>>>>>>> JmhIncrementBenchmark.loopSum    thrpt   10  32347.819 ±
>> > 676.548
>> > >> >>>>>>>> ops/ms
>> > >> >>>>>>>> JmhIncrementBenchmark.streamSum  thrpt   10  24459.196 ±
>> > 610.152
>> > >> >>>>>>>> ops/ms
>> > >> >>>>>>>>
>> > >> >>>>>>>>
>> > >> >>>>>>>> --
>> > >> >>>>>>>> Regards,
>> > >> >>>>>>>> Konstantin Orlov
>> > >> >>>>>>>>
>> > >> >>>>>>>>
>> > >> >>>>>>>>> On 8 Sep 2021, at 12:23, Ivan Bessonov <
>> bessonov...@gmail.com
>> > >> >>>
>> > >> >>>>> wrote:
>> > >> >>>>>>>>>
>> > >> >>>>>>>>> Hello Igniters,
>> > >> >>>>>>>>>
>> > >> >>>>>>>>> I object, banning streams is an overkill. I would argue
>> that
>> > >> >>> most
>> > >> >>>> of
>> > >> >>>>>>>>> the
>> > >> >>>>>>>>> code
>> > >> >>>>>>>>> is not on hot paths and that allocations in TLAB don't
>> create
>> > >> >>> much
>> > >> >>>>>>>> pressure
>> > >> >>>>>>>>> on GC.
>> > >> >>>>>>>>>
>> > >> >>>>>>>>> Streams must be used cautiously, developers should know
>> > >> >> whether
>> > >> >>>> they
>> > >> >>>>>>>>> write hot methods or not. And if methods are not hot, code
>> > >> >>>>> simplicity
>> > >> >>>>>>>> must
>> > >> >>>>>>>>> be
>> > >> >>>>>>>>> the first priority. I don't want Ignite 3 code to look like
>> > >> >>>> Ignite 2
>> > >> >>>>>>>> code,
>> > >> >>>>>>>>> where
>> > >> >>>>>>>>> people would iterate over Lists using explicit access by
>> > >> >>> indexes,
>> > >> >>>>>>>> because it
>> > >> >>>>>>>>> saves them a single Iterator allocation. That's absurd.
>> > >> >>>>>>>>>
>> > >> >>>>>>>>> ср, 8 сент. 2021 г. в 11:43, Pavel Tupitsyn <
>> > >> >>> ptupit...@apache.org
>> > >> >>>>> :
>> > >> >>>>>>>>>
>> > >> >>>>>>>>>> Igniters,
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> Java streams are known to be slower and cause more GC
>> > >> >> pressure
>> > >> >>>> than
>> > >> >>>>>>>>>> an
>> > >> >>>>>>>>>> equivalent loop.
>> > >> >>>>>>>>>> Below is a simple filter/map/reduce scenario (code [1]):
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> * Benchmark
>> > >> >>>>> Mode
>> > >> >>>>>>>> Cnt
>> > >> >>>>>>>>>>   Score     Error   Units
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> * StreamVsLoopBenchmark.loopSum
>> > >> >>>>>>>>>> thrpt
>> > >> >>>>>>>>  3
>> > >> >>>>>>>>>> 7987.016 ± 293.013  ops/ms
>> > >> >>>>>>>>>> * StreamVsLoopBenchmark.loopSum:·gc.alloc.rate
>> > >> >>>>>>>>>> thrpt
>> > >> >>>>>>>>  3
>> > >> >>>>>>>>>>  ≈ 10⁻⁴            MB/sec
>> > >> >>>>>>>>>> * StreamVsLoopBenchmark.loopSum:·gc.count
>> > >> >>>>>>>>>> thrpt
>> > >> >>>>>>>>  3
>> > >> >>>>>>>>>>     ≈ 0            counts
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> * StreamVsLoopBenchmark.streamSum
>> > >> >>>>>>>>>> thrpt
>> > >> >>>>>>>>  3
>> > >> >>>>>>>>>> 1060.244 ±  36.485  ops/ms
>> > >> >>>>>>>>>> * StreamVsLoopBenchmark.streamSum:·gc.alloc.rate
>> > >> >>>>>>>>>> thrpt
>> > >> >>>>>>>>  3
>> > >> >>>>>>>>>> 315.819 ±  10.844  MB/sec
>> > >> >>>>>>>>>> * StreamVsLoopBenchmark.streamSum:·gc.count
>> > >> >>>>>>>>>> thrpt
>> > >> >>>>>>>>  3
>> > >> >>>>>>>>>>  55.000            counts
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> Loop is several times faster and does not allocate at all.
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> 1. Performance is one of the most important features of
>> our
>> > >> >>>>> product.
>> > >> >>>>>>>>>> 2. Most of our APIs will be on the hot path.
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> One can argue about performance differences in real-world
>> > >> >>>>> scenarios,
>> > >> >>>>>>>>>> but increasing GC pressure just to make the code a little
>> bit
>> > >> >>>> nicer
>> > >> >>>>>>>>>> is
>> > >> >>>>>>>>>> unacceptable.
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> I propose to ban streams usage in the codebase (except for
>> > >> >> the
>> > >> >>>>>>>>>> tests).
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> Thoughts, objections?
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>> [1]
>> > >> >>>>>>>>>>
>> > >> >>>>
>> https://gist.github.com/ptupitsyn/5934bbbf8f92ac4937e534af9386da97
>> > >> >>>>>>>>>>
>> > >> >>>>>>>>>
>> > >> >>>>>>>>>
>> > >> >>>>>>>>> --
>> > >> >>>>>>>>> Sincerely yours,
>> > >> >>>>>>>>> Ivan Bessonov
>> > >> >>>>>>>>
>> > >> >>>>>>>>
>> > >> >>>>>>
>> > >> >>>>>
>> > >> >>>>>
>> > >> >>>>> --
>> > >> >>>>>
>> > >> >>>>> Best regards,
>> > >> >>>>> Ivan Pavlukhin
>> > >> >>>>>
>> > >> >>>>
>> > >> >>>>
>> > >> >>>> --
>> > >> >>>> With regards,
>> > >> >>>> Aleksandr Polovtcev
>> > >> >>>>
>> > >> >>>
>> > >> >>
>> > >>
>> > >>
>> > >
>> > > --
>> > > Sincerely yours, Ivan Daschinskiy
>> > >
>> >
>> >
>> > --
>> > Sincerely yours, Ivan Daschinskiy
>> >
>>
>

Reply via email to