On Wed, 18 Jan 2006, [EMAIL PROTECTED] yowled:
> Robert Menschel <[EMAIL PROTECTED]> wrote on 17.01.2006 03:41:39:
>> Bigger problem: bayes can only learn what it's taught.  If you have
>> ham that really should be trained, and because of privacy issues it
>> should not be kept after training, then you really should develop a
>> system which will enable you to train without retaining.  Bayes works
>> best when properly and fully trained, not just trained on "those
>> unimportant non-private emails are ham".
> 
> Yes, I might forfeit the storage of ham mails in a Notes DB for that,
> BUT... I really doubt that the management would even give permission to
> send those messages into SA.

Well, if you don't train SA with them, you run the risk that Bayes will
misclassify them as spam.

Which is considered more important?

> When I say "confidential" it is really one of those few times where
> it means "confidential" ;) Our customers are mostly big banks, big
> insurance companies and the German government. Even the slightest
> risk of leaking _any_ kind of information could get us into problems
> no one even wants to imagine here...

Have a look at `sa-learn --dump' one of these days. It's changed. The
tokens are *hashes*, which means the only way an attacker who stole your
Bayes DB could learn what you'd been corresponding about would be to
guess words and try to look up their hashes, and even then they couldn't
tell what emails they'd been in.
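
To make the guessing attack concrete, here is a rough sketch. It assumes
(this is my assumption, not something shown above) that each token is stored
as the last 5 bytes of its SHA-1 digest; the names token_key and
guess_tokens and the word lists are made up for illustration. The point is
only that an attacker can confirm a guessed word is present, not recover
text or tell which mail it came from.

```python
import hashlib


def token_key(token: str) -> bytes:
    """Assumed storage key: the last 5 bytes of the token's SHA-1 digest."""
    return hashlib.sha1(token.encode("utf-8")).digest()[-5:]


def guess_tokens(stored_keys, candidate_words):
    """Return the candidate words whose hashed key appears in the DB."""
    return [w for w in candidate_words if token_key(w) in stored_keys]


# Hypothetical "stolen" database: hashed keys only, no plaintext.
stored = {token_key(w) for w in ["merger", "invoice"]}

# The attacker has to bring their own word list and test each guess.
hits = guess_tokens(stored, ["hello", "merger", "loan", "invoice"])
print(hits)
```

Note that nothing in the stored keys links a token back to a particular
message, which is why the guesser learns vocabulary at most.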

(Possibly more risky is that the message IDs are recorded in the
bayes_seen DB, which means that if someone stole that and sorted it by
frequency, they could determine the source systems of your
major correspondents. But if they could do that, they could probably
also spy on your email in flight...)
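
The frequency sort I mean is nothing cleverer than the following sketch. It
assumes the message IDs are available as plain strings of the usual
local-part@host form; the function name and the sample IDs are invented.

```python
from collections import Counter


def domain_frequencies(message_ids):
    """Count how often each host part (after the last '@') occurs."""
    counts = Counter()
    for mid in message_ids:
        mid = mid.strip().strip("<>")
        _, sep, domain = mid.rpartition("@")
        if sep:  # skip IDs with no '@' at all
            counts[domain.lower()] += 1
    return counts.most_common()


# Invented sample IDs, for illustration only.
sample_ids = [
    "<1234.abcd@mail.example.com>",
    "<5678.efgh@mail.example.com>",
    "<42@relay.example.org>",
    "no-at-sign-here",
]
ranked = domain_frequencies(sample_ids)
print(ranked)
```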

One final problem is that, as I understand it, Notes damages the headers
of mail in its DB :( but this could be wrong, as I have no actual
experience of Notes. At the very least, you'll probably want to list any
headers Notes adds using bayes_ignore_header.

-- 
`Logic and human nature don't seem to mix very well,
 unfortunately.' --- Velvet Wood
