Wow, that's a very long email. While it's possible to fuzz test bash, there are a few problems. First, you have to find a way to generate strings that have a good chance of being a genuine command, or a command that triggers a bug. This is very expensive.
Second, once you generate a command, how will your test program know that it found a bug? It's easy when bash segfaults, but shellshock wasn't a crash: bash was executing code where it shouldn't. Not even humans noticed the problem for more than 20 years. I'm sure Stephane wasn't the first one to notice the behavior, but he was the first one to realize how huge a problem it represented.
Also, you have to isolate your machine/OS from the testing, because bash has access to the file system and other things, and a command gone wrong could destroy the operating system. (I suggest looking at 'shbot' for this particular issue.)
Other than that, I guess, good luck. I also played with this idea in my mind for some time, but didn't bother implementing it, mainly because of the second point. I guess you could feed your command generator a large corpus of scripts and use a Markov model to generate sequences of commands; that could work better than plain combinatorics.
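To make the Markov model idea a bit more concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions: the corpus is a handful of made-up lines, tokens are split on whitespace (a real generator would want a proper shell tokenizer), the chain is first order, and the names build_model and generate are hypothetical. It only produces candidate strings. It does not run them, and it does nothing about the second point (knowing whether a result is a bug).

```python
import random
from collections import defaultdict

# Sentinel tokens marking the start and end of a command line (assumed names).
START, END = "<s>", "</s>"

def build_model(lines):
    """Map each token to the list of tokens seen right after it in the corpus."""
    model = defaultdict(list)
    for line in lines:
        tokens = [START] + line.split() + [END]
        for cur, nxt in zip(tokens, tokens[1:]):
            model[cur].append(nxt)
    return model

def generate(model, rng, max_tokens=20):
    """Walk the chain from START until END or until max_tokens is reached."""
    out = []
    token = START
    while len(out) < max_tokens:
        # Duplicates in the successor list make frequent transitions more likely.
        token = rng.choice(model[token])
        if token == END:
            break
        out.append(token)
    return " ".join(out)

# Toy corpus, invented for this example; a real one would be many scripts.
corpus = [
    "echo hello | tr a-z A-Z",
    "for f in *.txt; do echo $f; done",
    "x=1; echo $x",
    "echo $(( 1 + 2 ))",
]

model = build_model(corpus)
rng = random.Random(0)  # fixed seed so a run can be reproduced
for _ in range(5):
    print(generate(model, rng))
```

Every adjacent pair of tokens in the output has been seen in the corpus, so the candidates look locally like shell, but nothing guarantees they are valid as a whole. Anything generated this way should only ever be executed inside the isolated environment mentioned above.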