Those are all fine points. Asm can sometimes make a bigger difference than
people conditioned not to question compiler output expect (and there are many
such people). This is especially true with vector units. A few years back, I wrote
an AVX2 vectorized "minimum" function that ran 24x (twenty four times) faster.
We are in agreement if I understand you correctly.
I don't care whether Nim code runs 0.5% or 3% slower than C code. In fact, I
think the whole benchmarking exercise is irrelevant except as a rough overview
("Nim is within x% of C's speed").
Reason (and C/C++/D developers might want to read this
Also, I guess a TL;DR part C): I was never arguing against the tautology that
a faster algorithm is faster. That is kind of a weird straw-man position. No
idea how quoting 600 microseconds for it left that impression, but it maybe bears
correcting the record. (I didn't assess correctness, though.)
Oh, I got your point and tried to emphasize that. I wasn't arguing against you
anywhere that I know of. I totally agree with your a) and b); I suspect we don't
disagree on anything real at all, but are just discussing different aspects. I
tried to express praise for your approach (if you happened to
My understanding of the original article was that it was about elegant
abstractions and their costs. IMO Nim really shines here. This thread shows how
low-cost iterators really are in Nim; they are far cheaper than I thought they
were. Kudos to the Nim core team!
@moerm also shows a
@moerm: Your algorithm uses Euclid's formula, which (1) does not exhaustively
enumerate all non-primitive Pythagorean triples (for example, it will skip
9^2+12^2=15^2) and (2) does not enumerate them in the same order as the
original algorithm. To get that right, you have to jump through a few hoops.
I fully agree on Nim indeed _being_ a good language. My point, though, wasn't "I
can do faster code than ...".
My point was that one should a) _think_ about optimization, starting from
"what's actually the point, and what's the bottleneck or the most promising
approach?" (in this case it was "use
Every language has nested loops. My view is that the original article about
C++20 ranges conceived this test/benchmark to be about the cost, if any, of
abstractions, not exactly the performance of "nested loops however you write
it" as suggested by Timothee's code or "the fastest algorithm for
`{.inline.}` and `uint`?
For what it's worth: I ran simple.cpp through c2nim and slightly adapted it to have a
`limit` parameter that (using `i`) limits the number of computed triples. Compile
time on my Ryzen box, using gcc as the backend, was around 1.6 s the first
time and about 0.25 s for subsequent runs (according to Nim).
Create a doWhile template?
`markAndSweep`?
My first thought would be to make a simple infinite iterator function for z.
The boilerplate of that iterator code aside, I think this solves the scope
problem and the cognitive-load problem and makes it look more elegant, but I
know iterators are expensive. I wonder how expensive for this
Nim + Intel's proprietary C/C++ compiler == easy benchmark wins over languages
married to LLVM.
original article:
[https://atilanevesoncode.wordpress.com/2018/12/31/comparing-pythagorean-triples-in-c-d-and-rust/](https://atilanevesoncode.wordpress.com/2018/12/31/comparing-pythagorean-triples-in-c-d-and-rust/)
here's the Nim version I proposed: see