Matt Kettler wrote:
mouss wrote:
I also understand that US guys may get fewer encoded subjects, but at least in .fr, we have that all the time (because of our accented letters, and because many companies still use software that predates MIME). and if I find a legitimate IP in a DNSBL used by SA, then I just remove that DNSBL.

Sounds like we need more non-US-based corpus contributors. After all, the SA
devs can only work with what they get.

Also, bear in mind that SpamAssassin's creator, Justin Mason, isn't based in the
US. Last I checked he was in Ireland. Unfortunately this doesn't help with the
encoding issue, as they still use ordinary English characters over there for
most things. (I don't think Gaelic is very common in email.)

So bear in mind that SA isn't just "developed in the US by US citizens for US
markets".

oh, I never meant that.


However, it is true that the vast majority of the corpus currently comes from
folks who speak English (King's or Yankee) as a primary language, and that's a
bit of a problem as it creates considerable bias in the rules.

And even us US folks do have encoding issues. After all, English is not our
official language here in the US,

what do you mean here? what would be your official language?

 and I've got plenty of users who speak
multiple languages, not all of which use plain ASCII.


I guess so. now I'm not sure our situation isn't worse, because people tried to find non-standard solutions that are still in use. I still remember the days when some customers were asking us to "fix" our software because "it broke their accents"... hopefully those times are gone, but I still see "broken" mail (much more than I should). actually, I also see mail that doesn't get rendered correctly in Thunderbird. so I'll admit that the issue isn't really about accented chars...
