On Saturday, 20 January 2024 at 01:27:50 UTC, H. S. Teoh wrote:
> On Sat, Jan 20, 2024 at 01:35:44AM +0100, Daniel Kozak via Digitalmars-d-learn wrote:
> [...]
>>> Try addressing the points I wrote above and see if it makes a difference.
>>
>> I have tried it (all of it) even before you wrote it here, because I had exactly the same ideas, but to be fair it has almost zero effect on speed.
>>
>> Here is my version (it still uses OOP, but I have tried it with Printer and Counter as structs and it has no effect at all): https://paste.ofcode.org/38vKWLS8DHRazpv6MTidRJY
>>
>> The only difference in speed in the end is caused by the hash implementation of dlang associative arrays versus Rust's HashMap; in fact, if you modify the Rust version to not use ahash, it has almost the same speed as D.
> [...]
>
> I'm confused by the chained hashing of the digits. Why is that necessary? I would have thought it'd be faster to hash the entire key instead of computing the hash of each digit and chaining them together.
Hash which key? The problem with the naive hashtable-based algorithm is that we don't know the length of the key. So all lengths are tried, starting from 1 and going up to the remaining part of the phone number. An additional optimization would be to limit this search and terminate it early: for example, stop after we exceed the length of the longest word in the dictionary, or stop the search upon encountering certain "leaf" words. But the latter effectively morphs the algorithm into something that partially resembles a trie, and evolving it further we may end up with a full-fledged trie.
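To make this concrete, here is a minimal illustrative sketch in Python (not the author's D code; the toy dictionary, the hash base, and all names are invented for the example) of the naive hashtable-based search: every key length is tried starting from 1, capped by the length of the longest dictionary word, and the key's hash is "chained", i.e. extended one digit at a time, so the already-processed prefix is never rehashed from scratch.

```python
# Toy dictionary (hypothetical): digit string -> words that encode to it.
WORDS_BY_DIGITS = {
    "562": ["mir", "Mix"],
    "4824": ["Tor"],
    "10": ["an"],
}
MAX_WORD_LEN = max(len(k) for k in WORDS_BY_DIGITS)
HASH_BASE = 31  # arbitrary base for the illustrative chained hash

# Precompute the chained hash of every dictionary key.
TABLE = {}
for key, words in WORDS_BY_DIGITS.items():
    h = 0
    for d in key:
        h = h * HASH_BASE + ord(d)
    TABLE.setdefault(h, []).append((key, words))

def encodings(digits, pos=0):
    """Yield every way to split digits[pos:] into dictionary words."""
    if pos == len(digits):
        yield []
        return
    h = 0
    # Try all key lengths, but stop once no dictionary word can be that long.
    for end in range(pos + 1, min(len(digits), pos + MAX_WORD_LEN) + 1):
        # Chained hashing: extend the hash by one digit, so the prefix
        # digits[pos:end-1] is never rehashed from scratch.
        h = h * HASH_BASE + ord(digits[end - 1])
        for key, words in TABLE.get(h, ()):
            if key == digits[pos:end]:  # guard against hash collisions
                for word in words:
                    for rest in encodings(digits, end):
                        yield [word] + rest

print(sorted(map(tuple, encodings("56210"))))  # [('Mix', 'an'), ('mir', 'an')]
```

Capping `end - pos` at the longest word's length is the early-termination idea mentioned above; pushing it further, by bailing out as soon as no dictionary key starts with the current digit sequence, is exactly where the trie-like structure begins to emerge.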
The only practical value of this algorithm is that it's easy to implement and has a low LOC count. My initial non-optimized naive version was similar: https://gist.github.com/ssvb/abe821b3cdba7fcb7f3c3cecde864153 (66 lines, including blank lines and some comments). This is somewhat comparable to the Lisp results at https://flownet.com/ron/papers/lisp-java/raw-results.html or to Peter Norvig's solution at http://www.norvig.com/java-lisp.html
The purpose of the initial "prototype" implementation is just to have some sort of starting point. If runtime performance doesn't matter, then we are already done. But if runtime performance does matter, then a lot of things can be rapidly improved by spending a bit of time on it. Eventually one runs out of ideas and further optimization efforts yield only diminishing returns. That's a good time to clean up the code, add comments, etc., and label it the "final" version.
I think that a good study intending to compare different programming languages against each other should take all of these factors into consideration: time to implement the "prototype", runtime performance of the prototype, time to implement the "final" version, runtime performance of the final version, and the number of lines of code along with its readability/maintainability.