On Monday, 13 February 2017 at 00:56:37 UTC, Nestor wrote:
On Sunday, 12 February 2017 at 05:54:34 UTC, Era Scarecrow wrote:
Ran some more tests.

Wow!
Thanks for the interest and effort.

Certainly. But the bulk of the answer comes down to this: the 2-level version I've already provided is probably the fastest you're going to get. We could certainly test using shorts or bytes for the tables instead, but it's likely the performance will only go down.
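
For anyone reading along without the earlier code, here is a rough sketch of the idea in Python rather than the original code. I'm assuming "2 levels" means a second table that consumes two digits per lookup; the names `TABLE`, `PAIRS`, `damm` and `damm_pairs` are made up for this illustration, and only the 10x10 table itself is the standard published Damm table.

```python
# Standard order-10 Damm quasigroup table: TABLE[interim][digit].
TABLE = (
    (0, 3, 1, 7, 5, 9, 8, 6, 4, 2),
    (7, 0, 9, 2, 1, 5, 4, 8, 6, 3),
    (4, 2, 0, 6, 8, 7, 1, 3, 5, 9),
    (1, 7, 5, 0, 9, 8, 3, 4, 2, 6),
    (6, 1, 2, 3, 0, 4, 5, 9, 7, 8),
    (3, 6, 7, 4, 2, 0, 9, 5, 8, 1),
    (5, 8, 6, 9, 7, 2, 0, 1, 3, 4),
    (8, 9, 4, 5, 3, 6, 2, 0, 1, 7),
    (9, 4, 3, 8, 6, 1, 7, 2, 0, 5),
    (2, 5, 8, 1, 4, 3, 6, 7, 9, 0),
)


def damm(digits: str) -> int:
    """Plain version: one table lookup per digit."""
    interim = 0
    for ch in digits:
        interim = TABLE[interim][int(ch)]
    return interim


# Hypothetical second level: PAIRS[interim][10 * a + b] gives the interim
# digit after feeding digits a then b, so one lookup covers two digits.
PAIRS = tuple(
    tuple(TABLE[TABLE[i][a]][b] for a in range(10) for b in range(10))
    for i in range(10)
)


def damm_pairs(digits: str) -> int:
    """Two digits per lookup, with a single-digit step for an odd tail."""
    interim = 0
    even_len = len(digits) - len(digits) % 2
    for k in range(0, even_len, 2):
        interim = PAIRS[interim][int(digits[k:k + 2])]
    if even_len < len(digits):
        interim = TABLE[interim][int(digits[-1])]
    return interim
```

Both functions return the check digit for a string of decimal digits, and a number with its check digit appended should come back as 0. Whether the pair table is actually faster depends on the language and the machine, so treat this as a picture of the structure only.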

To note, my tests are strictly on my x86 system. It would be better to also test this on other architectures such as PPC and ARM, and on other operating systems such as Linux, to see how they perform, and possibly tweak the code as appropriate.

Still, we did find out that some optimization can be done successfully for the Damm algorithm; it just isn't going to be a lot.

Hmmm... A thought does come to mind: parallelizing the code. However, that would probably require 11 instances to get a 2x speedup: one calculating the first half, and ten calculating the second half, one for each of the 10 possible carry-over values, then choosing which of the 10 results to use based on the first half's output. That only really works if you have a ton of cores and the input is REALLY REALLY large, like a meg or something. The Damm code is more useful for adding a check digit to the end of a code like a UPC or other barcode for error detection, and expecting inputs longer than 32 digits in real applications is unlikely.
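
To make the splitting idea concrete, here is a minimal sketch in Python, not something I've benchmarked. The names `damm_from` and `damm_split` are invented for this example, and the thread pool is only there to show the shape of the 11 tasks; CPython threads will not give a real speedup for this kind of work, so real gains would need processes or a language with true parallel threads.

```python
from concurrent.futures import ThreadPoolExecutor

# Standard order-10 Damm quasigroup table: TABLE[interim][digit].
TABLE = (
    (0, 3, 1, 7, 5, 9, 8, 6, 4, 2),
    (7, 0, 9, 2, 1, 5, 4, 8, 6, 3),
    (4, 2, 0, 6, 8, 7, 1, 3, 5, 9),
    (1, 7, 5, 0, 9, 8, 3, 4, 2, 6),
    (6, 1, 2, 3, 0, 4, 5, 9, 7, 8),
    (3, 6, 7, 4, 2, 0, 9, 5, 8, 1),
    (5, 8, 6, 9, 7, 2, 0, 1, 3, 4),
    (8, 9, 4, 5, 3, 6, 2, 0, 1, 7),
    (9, 4, 3, 8, 6, 1, 7, 2, 0, 5),
    (2, 5, 8, 1, 4, 3, 6, 7, 9, 0),
)


def damm_from(start: int, digits: str) -> int:
    """Run the Damm loop over digits, beginning from a given interim digit."""
    interim = start
    for ch in digits:
        interim = TABLE[interim][int(ch)]
    return interim


def damm_split(digits: str, workers: int = 11) -> int:
    """Split the input in two and evaluate both halves as independent tasks."""
    mid = len(digits) // 2
    first, second = digits[:mid], digits[mid:]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One task for the first half, starting from interim digit 0.
        head = pool.submit(damm_from, 0, first)
        # Ten tasks for the second half, one per possible carry-over digit.
        tails = [pool.submit(damm_from, carry, second) for carry in range(10)]
        # The first half's result selects which second-half result is valid.
        return tails[head.result()].result()
```

The ten second-half tasks do ten times the work of the plain loop over that half, which is why the best case is only about 2x even with 11 cores.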

But at this point I'm rambling.
