<snip>
Well, perhaps that is a more generic spam indicator: German text but not a
single umlaut. I must think about that.
</snip>
You'd want a length qualifier on that test. An email of simply "Danke" would
contain a very small number of umlauts<g>.
Perhaps such a rule should look for frequently used German words with
umlauts, like "für" (meaning "for") or "Möglichkeit" (meaning "opportunity").
If the words contain the correct umlaut or the common ASCII
transliteration (ae, oe, ue), that is no spam indicator. If the umlaut
is written as ä->a, ö->o, ü->u, the common misspelling of someone who
doesn't know what umlauts are and whose keyboard has no keys for
them, then it is a slight spam indicator. And if the umlauts are
replaced by totally wrong vowels or dropped altogether, as in many of
our stock spams, that is a stronger spam indicator.
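
To make the three tiers concrete, here is a rough sketch in Python. It is
not an existing rule or plugin; the word list, the function names and the
labels "ok" / "stripped" / "mangled" are all made up for illustration. It
assumes the token is already decoded text with precomposed umlauts, and
that the word list was picked so the mangled forms do not collide with
real German words (a pattern for "über" would match "aber", for example).

```python
import re

# Hypothetical word list; a real rule would need a larger, vetted one.
UMLAUT_WORDS = ["für", "möglichkeit", "natürlich", "zurück"]

ASCII_DIGRAPH = {"ä": "ae", "ö": "oe", "ü": "ue"}  # accepted transliteration
BARE_VOWEL = {"ä": "a", "ö": "o", "ü": "u"}        # umlaut dots simply dropped


def substitute(word, table):
    """Replace every umlaut in word according to table."""
    return "".join(table.get(ch, ch) for ch in word)


def mangled_pattern(word):
    """Regex matching word with each umlaut replaced by any one ASCII
    letter or removed entirely."""
    parts = ["[a-z]?" if ch in BARE_VOWEL else re.escape(ch) for ch in word]
    return re.compile("".join(parts))


def classify(token):
    """Return 'ok', 'stripped', 'mangled' or 'unknown' for one token."""
    t = token.lower()
    if any(t in (w, substitute(w, ASCII_DIGRAPH)) for w in UMLAUT_WORDS):
        return "ok"        # real umlaut or ae/oe/ue: no spam indicator
    if any(t == substitute(w, BARE_VOWEL) for w in UMLAUT_WORDS):
        return "stripped"  # ä->a, ö->o, ü->u: slight spam indicator
    if any(mangled_pattern(w).fullmatch(t) for w in UMLAUT_WORDS):
        return "mangled"   # wrong vowel or none: stronger spam indicator
    return "unknown"       # not one of the watched words
```

The checks run in order of severity, so "fur" is reported as stripped
before the looser mangled pattern gets a chance to match it.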
It could be combined with a check of whether the mail contains at
least one word with a real umlaut or no umlauts at all. At least one
real umlaut would be an indicator for ham in such a rule.
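
A whole-mail check along those lines might look like the sketch below.
Again this is only an illustration: the function name, the return labels
and the 20-word minimum are assumptions, the latter standing in for the
length qualifier mentioned earlier in the thread. It also assumes the body
is already decoded to text and known to be German.

```python
import re
import unicodedata

REAL_UMLAUT = re.compile(r"[äöüÄÖÜ]")
MIN_WORDS = 20  # assumed threshold; short mails say nothing either way


def umlaut_hint(body):
    """Return 'ham_hint', 'too_short' or 'no_umlauts' for a mail body."""
    # Fold decomposed forms (u + combining diaeresis) into single characters.
    body = unicodedata.normalize("NFC", body)
    if REAL_UMLAUT.search(body):
        return "ham_hint"    # at least one real umlaut: indicator for ham
    words = re.findall(r"[^\W\d_]+", body)  # runs of letters only
    if len(words) < MIN_WORDS:
        return "too_short"   # e.g. a mail that only says "Danke"
    return "no_umlauts"      # long German text without any umlaut
```

A mail of just "Danke" comes back as too_short rather than no_umlauts, so
it would not be scored at all.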
Alex