Maybe it is time to look at the python implementation and see why it is faster.
It isn't faster:
$ time python3 test.py
real 0m14.217s
user 0m14.209s
sys 0m0.004s

$ gdmd -O -inline -release -noboundscheck test
$ time ./test
real 0m5.323s
user 0m5.312s
sys 0m0.008s

D code here uses the same string as the python code, not the one in cvk012c's D code.