On Saturday, 20 January 2024 at 01:27:50 UTC, H. S. Teoh wrote:
On Sat, Jan 20, 2024 at 01:35:44AM +0100, Daniel Kozak via Digitalmars-d-learn wrote: [...]
> Try addressing the points I wrote above and see if it makes a
   > difference.

I have tried it (all of it) even before you wrote it here, because I have completely the same ideas, but to be fair it has almost zero
   effect on speed.
   There is my version (It still use OOP, but I have try it wit
   Printer and Counter to be structs and it has no effect at
   all) [2]https://paste.ofcode.org/38vKWLS8DHRazpv6MTidRJY
   The only difference in speed in the end is caused by hash
   implementation of dlang associative arrays and rust HashMap,
actually if you modify rust to not used ahash it has almost same
   speed as D
[...]

I'm confused by the chained hashing of the digits. Why is that necessary? I would have thought it'd be faster to hash the entire key instead of computing the hash of each digit and chaining them together.

Hash which key? The problem of the naive hashtable-based algorithm is that we don't know the length of the key. So all lengths are tried starting from 1 and up to the remaining part of the phone number. An additional optimization would be to limit this search and terminate it early. For example, stop after we exceed the length of the longest word from the dictionary. Or stop the search upon encountering certain "leaf" words, but this effectively morphs the algorithm into something that partially resembles Trie. And further evolving it we may end up with a full-fledged Trie.

The only practical value of this algorithm is that it's easy to implement and has low LOC count. My initial non-optimized naive version was also similar https://gist.github.com/ssvb/abe821b3cdba7fcb7f3c3cecde864153 (66 lines including blank lines and some comments). This is somewhat comparable to the Lisp results https://flownet.com/ron/papers/lisp-java/raw-results.html or to the Peter Norvig's solution http://www.norvig.com/java-lisp.html

The purpose of the initial implementation "prototype" is just to have some sort of a starting point. And if the runtime performance doesn't matter, then we are already done. But if the runtime performance does matter, then a lot of things can be rapidly improved by spending a bit of time on it. Eventually one runs out of ideas and further optimization efforts yield only diminishing returns. That's when it's a good time to clean up the code, add comments, etc. and label it as the "final" version.

I think that a good study, that intends to compare different programming languages against each other, could take all these factors into consideration: time to implement the "prototype", runtime performance of the prototype, time to implement the "final" version, runtime performance of the final version, the number of lines of code and its readability/maintainability.

Reply via email to