Few words about the topic.
There is a disease, quite common in the java community -- it is called the
streamosis.
But, to be honest, afear of streams is also not good.

As for me, sometimes rewriting code completely with simple loops often
makes it more readable, understandable and usually faster.

So I am against a complete ban of streams, but I am for using this tool
with caution. Often streams make code ugly and non-readable at all.


чт, 9 сент. 2021 г. в 10:50, Ivan Daschinsky <ivanda...@gmail.com>:

> To be honest, Pavel, your benchmark is not quite correct. Please, rewrite
> it using BlackHole
>
> чт, 9 сент. 2021 г. в 10:28, Nikolay Izhikov <nizhi...@apache.org>:
>
>> +1 to ban Streams usage.
>>
>>
>>
>> > 9 сент. 2021 г., в 02:59, Valentin Kulichenko <
>> valentin.kuliche...@gmail.com> написал(а):
>> >
>> > Pavel,
>> >
>> > Quite frankly, I think we used to lean into performance too much. We
>> > generally preferred it over data consistency, project modularity and
>> code
>> > readability. Performance, of course, plays a very important rule in
>> Ignite,
>> > but it's possible to overdo anything.
>> >
>> > There are certainly parts of the project that can benefit from features
>> > like Stream API, without significant concern over performance. CLI is an
>> > obvious example, but I'm sure it's not the only one.
>> >
>> > That said, I don't think that banning something is productive. At the
>> same
>> > time, we should make sure we pay more attention to performance during
>> > reviews. Maybe we should have a checklist of major things to look for?
>> Not
>> > as a set of strict rules, but more as a guideline for contributors and
>> > committers.
>> >
>> > We might also want to start benchmarking the code and tracking the
>> progress.
>> >
>> > -Val
>> >
>> > On Wed, Sep 8, 2021 at 12:09 PM Pavel Tupitsyn <ptupit...@apache.org>
>> wrote:
>> >
>> >> Alexander, Ivan,
>> >>
>> >>> not very productive to assume that 100% of your code is
>> >>> on the hot path
>> >>
>> >> That is exactly what we should be doing.
>> >> When I joined Ignite community 6 years ago, this was the prevalent
>> mindset.
>> >>
>> >> I'm not sure which part of Ignite can be considered "not on a hot
>> path".
>> >> Create/alter table (mentioned above) should perform well too.
>> >>
>> >>> measured first and only then optimized
>> >>> https://wiki.c2.com/?OptimizeLater
>> >>
>> >> Extra allocations are a "death by a thousand cuts".
>> >> They add up here and there, and then there are GC pauses.
>> >> This would be hard to "optimize later".
>> >>
>> >> It is common for perf-oriented projects to avoid some techniques.
>> >> For example, LINQ (streams analog from C# with similar perf issues) is
>> >> avoided in libraries and compilers [1].
>> >>
>> >> [1] https://github.com/dotnet/roslyn/blob/main/CONTRIBUTING.md
>> >>
>> >> On Wed, Sep 8, 2021 at 9:49 PM Valentin Kulichenko <
>> >> valentin.kuliche...@gmail.com> wrote:
>> >>
>> >>> I don't think we should ban anything. Streams API is just a tool in
>> the
>> >>> toolbox - it should be used appropriately. It's up to the contributor
>> and
>> >>> reviewer(s) to identify whether a particular usage might cause
>> >> performance
>> >>> issues.
>> >>>
>> >>> -Val
>> >>>
>> >>> On Wed, Sep 8, 2021 at 8:01 AM Alexander Polovtcev <
>> >>> alexpolovt...@gmail.com>
>> >>> wrote:
>> >>>
>> >>>> -1
>> >>>> I think that it is not very productive to assume that 100% of your
>> code
>> >>> is
>> >>>> on the hot path, it would be impossible to write and maintain. Humans
>> >> are
>> >>>> not very good at guessing where the performance bottlenecks are, so
>> the
>> >>>> performance of the possible hot paths should be measured first and
>> only
>> >>>> then optimized and documented.
>> >>>>
>> >>>> On Wed, Sep 8, 2021 at 5:53 PM Ivan Pavlukhin <vololo...@gmail.com>
>> >>> wrote:
>> >>>>
>> >>>>> Does not this trivial strategy work for us?
>> >>>>> https://wiki.c2.com/?OptimizeLater
>> >>>>>
>> >>>>> 2021-09-08 13:52 GMT+03:00, Andrey Gura <ag...@apache.org>:
>> >>>>>> Agree that any additional object creation on a hot path could be a
>> >>>>>> problem. So hot path should not contain stream API and any other
>> >>>>>> potentially problem code (e.g. iterator instead of for by index).
>> >>>>>>
>> >>>>>> On Wed, Sep 8, 2021 at 1:45 PM Pavel Tupitsyn <
>> >> ptupit...@apache.org>
>> >>>>> wrote:
>> >>>>>>>
>> >>>>>>> Ok, maybe a total ban is overkill, but right now streams are used
>> >>> even
>> >>>>> on
>> >>>>>>> some hot paths like getAllAsync [1].
>> >>>>>>>
>> >>>>>>> Another issue is that Collectors.toList() and other variants don't
>> >>>>> accept
>> >>>>>>> capacity, and we end up with unnecessary reallocations of
>> >> underlying
>> >>>>>>> arrays.
>> >>>>>>>
>> >>>>>>> [1]
>> >>>>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>
>> https://github.com/apache/ignite-3/blob/1d7d703ff2b18234b15a9a7b03104fbb73388edf/modules/table/src/main/java/org/apache/ignite/internal/table/KVBinaryViewImpl.java#L83
>> >>>>>>>
>> >>>>>>> On Wed, Sep 8, 2021 at 1:06 PM Konstantin Orlov <
>> >>> kor...@gridgain.com>
>> >>>>>>> wrote:
>> >>>>>>>
>> >>>>>>>> Hi!
>> >>>>>>>>
>> >>>>>>>> Agree with Ivan that it’s an overkill. Code readability and
>> >>>>>>>> maintainability should have
>> >>>>>>>> the same priority as the performance (with some exceptions of
>> >>>> course).
>> >>>>>>>>
>> >>>>>>>> BTW the result of your benchmark looks quite strange. The
>> >>>> performance
>> >>>>>>>> penalty on
>> >>>>>>>> my laptop (Core i7 9750H, 6 cores up to 4.50 GHz) is 25%, not 8
>> >>>> times:
>> >>>>>>>>
>> >>>>>>>> Benchmark                         Mode  Cnt      Score     Error
>> >>>>>>>> Units
>> >>>>>>>> JmhIncrementBenchmark.loopSum    thrpt   10  32347.819 ± 676.548
>> >>>>>>>> ops/ms
>> >>>>>>>> JmhIncrementBenchmark.streamSum  thrpt   10  24459.196 ± 610.152
>> >>>>>>>> ops/ms
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> Regards,
>> >>>>>>>> Konstantin Orlov
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>> On 8 Sep 2021, at 12:23, Ivan Bessonov <bessonov...@gmail.com
>> >>>
>> >>>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>> Hello Igniters,
>> >>>>>>>>>
>> >>>>>>>>> I object, banning streams is an overkill. I would argue that
>> >>> most
>> >>>> of
>> >>>>>>>>> the
>> >>>>>>>>> code
>> >>>>>>>>> is not on hot paths and that allocations in TLAB don't create
>> >>> much
>> >>>>>>>> pressure
>> >>>>>>>>> on GC.
>> >>>>>>>>>
>> >>>>>>>>> Streams must be used cautiously, developers should know
>> >> whether
>> >>>> they
>> >>>>>>>>> write hot methods or not. And if methods are not hot, code
>> >>>>> simplicity
>> >>>>>>>> must
>> >>>>>>>>> be
>> >>>>>>>>> the first priority. I don't want Ignite 3 code to look like
>> >>>> Ignite 2
>> >>>>>>>> code,
>> >>>>>>>>> where
>> >>>>>>>>> people would iterate over Lists using explicit access by
>> >>> indexes,
>> >>>>>>>> because it
>> >>>>>>>>> saves them a single Iterator allocation. That's absurd.
>> >>>>>>>>>
>> >>>>>>>>> ср, 8 сент. 2021 г. в 11:43, Pavel Tupitsyn <
>> >>> ptupit...@apache.org
>> >>>>> :
>> >>>>>>>>>
>> >>>>>>>>>> Igniters,
>> >>>>>>>>>>
>> >>>>>>>>>> Java streams are known to be slower and cause more GC
>> >> pressure
>> >>>> than
>> >>>>>>>>>> an
>> >>>>>>>>>> equivalent loop.
>> >>>>>>>>>> Below is a simple filter/map/reduce scenario (code [1]):
>> >>>>>>>>>>
>> >>>>>>>>>> * Benchmark
>> >>>>> Mode
>> >>>>>>>> Cnt
>> >>>>>>>>>>   Score     Error   Units
>> >>>>>>>>>>
>> >>>>>>>>>> * StreamVsLoopBenchmark.loopSum
>> >>>>>>>>>> thrpt
>> >>>>>>>>  3
>> >>>>>>>>>> 7987.016 ± 293.013  ops/ms
>> >>>>>>>>>> * StreamVsLoopBenchmark.loopSum:·gc.alloc.rate
>> >>>>>>>>>> thrpt
>> >>>>>>>>  3
>> >>>>>>>>>>  ≈ 10⁻⁴            MB/sec
>> >>>>>>>>>> * StreamVsLoopBenchmark.loopSum:·gc.count
>> >>>>>>>>>> thrpt
>> >>>>>>>>  3
>> >>>>>>>>>>     ≈ 0            counts
>> >>>>>>>>>>
>> >>>>>>>>>> * StreamVsLoopBenchmark.streamSum
>> >>>>>>>>>> thrpt
>> >>>>>>>>  3
>> >>>>>>>>>> 1060.244 ±  36.485  ops/ms
>> >>>>>>>>>> * StreamVsLoopBenchmark.streamSum:·gc.alloc.rate
>> >>>>>>>>>> thrpt
>> >>>>>>>>  3
>> >>>>>>>>>> 315.819 ±  10.844  MB/sec
>> >>>>>>>>>> * StreamVsLoopBenchmark.streamSum:·gc.count
>> >>>>>>>>>> thrpt
>> >>>>>>>>  3
>> >>>>>>>>>>  55.000            counts
>> >>>>>>>>>>
>> >>>>>>>>>> Loop is several times faster and does not allocate at all.
>> >>>>>>>>>>
>> >>>>>>>>>> 1. Performance is one of the most important features of our
>> >>>>> product.
>> >>>>>>>>>> 2. Most of our APIs will be on the hot path.
>> >>>>>>>>>>
>> >>>>>>>>>> One can argue about performance differences in real-world
>> >>>>> scenarios,
>> >>>>>>>>>> but increasing GC pressure just to make the code a little bit
>> >>>> nicer
>> >>>>>>>>>> is
>> >>>>>>>>>> unacceptable.
>> >>>>>>>>>>
>> >>>>>>>>>> I propose to ban streams usage in the codebase (except for
>> >> the
>> >>>>>>>>>> tests).
>> >>>>>>>>>>
>> >>>>>>>>>> Thoughts, objections?
>> >>>>>>>>>>
>> >>>>>>>>>> [1]
>> >>>>>>>>>>
>> >>>> https://gist.github.com/ptupitsyn/5934bbbf8f92ac4937e534af9386da97
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> --
>> >>>>>>>>> Sincerely yours,
>> >>>>>>>>> Ivan Bessonov
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>>
>> >>>>> --
>> >>>>>
>> >>>>> Best regards,
>> >>>>> Ivan Pavlukhin
>> >>>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> With regards,
>> >>>> Aleksandr Polovtcev
>> >>>>
>> >>>
>> >>
>>
>>
>
> --
> Sincerely yours, Ivan Daschinskiy
>


-- 
Sincerely yours, Ivan Daschinskiy

Reply via email to