Re: Claims manager / LOTTO_AGENT

2012-11-08 Thread Alexandre Boyer
Hello there,

Well, if you feel uncomfortable running mass-check and sending data
(not the emails themselves, just the rules they hit, as Darxus is
pointing out), you may want to override the score for those rules in
your local.cf.

You may even write your own rules to compensate for those false positives.
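As a rough illustration of the score override mentioned above, here is a
minimal Python sketch that renders "score" lines suitable for pasting into
local.cf. The helper name render_score_overrides and the value 1.0 are
hypothetical, chosen only for illustration; the right score depends on your
own mail.

```python
# Hypothetical helper: render "score RULE value" lines for local.cf.
def render_score_overrides(overrides):
    """Return one 'score RULE value' line per entry, sorted by rule name."""
    return "\n".join(
        f"score {rule} {value:.1f}" for rule, value in sorted(overrides.items())
    )

# Illustrative value only; not a recommended score.
overrides = {"LOTTO_AGENT": 1.0}
print(render_score_overrides(overrides))  # score LOTTO_AGENT 1.0
```

The printed line lowers the rule's score locally without touching the rules
shipped via sa-update.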

If you can't contribute to SA by giving feedback via the mass-check,
then do what you need to do on your side. Everybody here will be glad to
help ;)

Alex, from prypiat.
Yes, I recycle.


On 12-11-07 11:02 PM, Michael Orlitzky wrote:
 On 11/07/2012 10:36 PM, dar...@chaosreigns.com wrote:
 On 11/07, Michael Orlitzky wrote:
 Sorry, I was a little rude. But saying that she shouldn't put her job
 title anywhere in an email, ever, is ridiculous. 
 Certainly.

 The inputs (spam, ham)
 to the classifier are assumed god-given; and the classification needs to
 reflect the data, not the other way around.
 If the classifier is spamassassin, and "The inputs" are the spam
 and ham data provided via masscheck, then... the scores provided via
 sa-update *do* reflect the data.  So I'm not sure what you mean.

 The ideal rule scores are chosen to cause one false positive (ham flagged
 as spam) in every 2,500 hams, while maximizing the number of spams
 correctly flagged as spams.  With so few hams hitting this rule in the
 masscheck corpora, we're way below that threshold based on the data we
 have.

 I wrote that before I saw your clarification, sorry again for coming off
 as a jerk. Ignore it.


 This is my fault, of course, but I'm not allowed to mass-check this
 stuff. It's ongoing legal correspondence.
 Er, what?  You're not allowed to provide a list of which rules hit each
 of your emails?  Or you're not allowed to run a program on your emails
 that isn't spamassassin?  Or did I just not put "This does not require
 sending us your email" in bold enough times on the masscheck page?

 This is a client of ours (a law firm) and not the company that I work
 for. *I* know there's probably nothing sensitive in there, but just to
 cover my ass I'd need to get permission to send the results off-site.
 From their perspective, it's just simpler to say no: it's not worth the
 time or effort to even think about if there's a minute chance of it
 coming back to bite them legally.





Re: Claims manager / LOTTO_AGENT

2012-11-08 Thread John Hardin

On Wed, 7 Nov 2012, Michael Orlitzky wrote:

 On 11/07/2012 10:36 PM, dar...@chaosreigns.com wrote:
  On 11/07, Michael Orlitzky wrote:
   This is my fault, of course, but I'm not allowed to mass-check this
   stuff. It's ongoing legal correspondence.

  Er, what?  You're not allowed to provide a list of which rules hit each
  of your emails?  Or you're not allowed to run a program on your emails
  that isn't spamassassin?  Or did I just not put "This does not require
  sending us your email" in bold enough times on the masscheck page?

 This is a client of ours (a law firm) and not the company that I work
 for. *I* know there's probably nothing sensitive in there, but just to
 cover my ass I'd need to get permission to send the results off-site.

Only the list of rules which hit is publicly visible; the actual content
of the message is not. Any leakage of confidential information is very
unlikely.

 From their perspective, it's just simpler to say no: it's not worth the
 time or effort to even think about if there's a minute chance of it
 coming back to bite them legally.

I will take a look at "claims manager" in the 419 rules.

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  ...the good of having the government prohibited from doing harm
  far outweighs the harm of having it obstructed from doing good.
   -- Mike@mike-istan
---
 3 days until Veterans Day


Re: Claims manager / LOTTO_AGENT

2012-11-08 Thread Michael Orlitzky
On 11/08/2012 10:44 AM, John Hardin wrote:

 This is a client of ours (a law firm) and not the company that I work
 for. *I* know there's probably nothing sensitive in there, but just to
 cover my ass I'd need to get permission to send the results off-site.
 
 Only the list of rules which hit is publicly visible, the actual content 
 of the message is not. Any leakage of confidential information is very 
 unlikely.

I know, but the chance isn't zero. For example, I wouldn't want to
mass-check a corpus of emails to my girlfriend, and have it report that
they hit LOTS_OF_VIAGRA.

Likewise, things like LOTTO_AGENT can reveal that someone communicated
with a claims manager. I've explained both sides, and as long as it's a
non-zero chance, they aren't having it. It isn't even that there's a
risk of leaking anything -- the fact that anything at all is sent could
be used as justification for a pain-in-the-ass investigation that nobody
wants.


 From their perspective, it's just simpler to say no: it's not worth the
 time or effort to even think about if there's a minute chance of it
 coming back to bite them legally.
 
 I will take a look at "claims manager" in the 419 rules.
 

I appreciate it, thanks.


Re: Claims manager / LOTTO_AGENT

2012-11-08 Thread John Hardin

On Thu, 8 Nov 2012, Michael Orlitzky wrote:

 On 11/08/2012 10:44 AM, John Hardin wrote:
  I will take a look at "claims manager" in the 419 rules.

 I appreciate it, thanks.


Okay, I've committed some tuning for that rule. It will probably take a
couple of days before it shows up in a rules update.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  When I say I don't want the government to do X, do not
  automatically assume that means I don't want X to happen.
---
 3 days until Veterans Day


Claims manager / LOTTO_AGENT

2012-11-07 Thread Michael Orlitzky
So, LOTTO_AGENT will hit the string "Claims Manager" for 3.5 points.
This is bad news for,

  Barbara R. Krieg, Claims Manager
  Foodliner, Inc. / Quest Liner / Truck Country P.O. Box 1565 Dubuque,IA

who has a signature at the bottom of her messages.

This is compounded by the fact that

  ADVANCE_FEE_2_NEW_MONEY = __ADVANCE_FEE_2_NEW_MONEY && ...
  __ADVANCE_FEE_2_NEW_MONEY = LOTS_OF_MONEY && __ADVANCE_FEE_2_NEW
  __ADVANCE_FEE_2_NEW  = (__AFRICAN_STATE + ... + LOTTO_AGENT + ... > 1)

for a total score of around 7.8. Believe it or not, claims managers talk
about LOTS_OF_MONEY =)
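
To make the chain of meta rules above easier to follow, here is a minimal
Python sketch of the logic. It rests on two assumptions: that
__ADVANCE_FEE_2_NEW_MONEY is a logical AND of LOTS_OF_MONEY and
__ADVANCE_FEE_2_NEW, and that the trailing 1 in the last definition is a
"more than 1" threshold on the number of sub-rules hit. Only the two
sub-rules named in the post are modelled; the elided ones are left out, and
the function names are hypothetical.

```python
# Sub-rules named in the post; the elided ones ("...") are deliberately omitted.
KNOWN_SUBRULES = ("__AFRICAN_STATE", "LOTTO_AGENT")

def advance_fee_2_new(hits, threshold=1):
    """True when more than `threshold` known sub-rules hit (assumed '> 1')."""
    return sum(rule in hits for rule in KNOWN_SUBRULES) > threshold

def advance_fee_2_new_money(hits):
    """LOTS_OF_MONEY combined with the counting rule (assumed logical AND)."""
    return "LOTS_OF_MONEY" in hits and advance_fee_2_new(hits)

# LOTTO_AGENT alone is not enough in this two-sub-rule sketch...
print(advance_fee_2_new_money({"LOTTO_AGENT", "LOTS_OF_MONEY"}))  # False
# ...but one more sub-rule hit tips the counting rule over the threshold.
print(advance_fee_2_new_money(
    {"LOTTO_AGENT", "__AFRICAN_STATE", "LOTS_OF_MONEY"}))  # True
```

In the real rule set any of the elided sub-rules could supply the second hit,
which is how a job title in a signature can end up contributing to two scores
at once.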

Can one of these be made a little more strict? Sorry to be a pain and
submit these one at a time, but most of the ones that give me trouble
are confidential.


Re: Claims manager / LOTTO_AGENT

2012-11-07 Thread darxus
Just in case nobody has pointed you toward it before:
https://wiki.apache.org/spamassassin/NightlyMassCheck

Stats we currently have on that rule:
http://ruleqa.spamassassin.org/?daterev=20121103&rule=LOTTO_AGENT

  MSECS    SPAM%     HAM%     S/O    RANK   SCORE  NAME         WHO/AGE
      0   0.5022   0.0011   0.998    0.74    3.50  LOTTO_AGENT

It hits 2 of the 180,272 non-spams we have for use in optimal score
generation.  
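
The figures quoted above can be cross-checked with a few lines of Python.
This is a sketch, assuming that S/O is computed as SPAM% / (SPAM% + HAM%),
which is consistent with the 0.998 shown in the stats; the function names are
hypothetical. The SPAM% value 0.5022 is taken from the stats line, and the
ham counts from the sentence above.

```python
def percent(hits, total):
    """Share of messages that hit the rule, as a percentage."""
    return 100.0 * hits / total

def s_o_ratio(spam_pct, ham_pct):
    """Spam share of all hits (assumed definition of the S/O column)."""
    return spam_pct / (spam_pct + ham_pct)

ham_pct = percent(2, 180_272)                # 2 ham hits out of 180,272 hams
print(round(ham_pct, 4))                     # 0.0011
print(round(s_o_ratio(0.5022, ham_pct), 3))  # 0.998
```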


On 11/07, Michael Orlitzky wrote:
 So, LOTTO_AGENT will hit the string "Claims Manager" for 3.5 points.
 This is bad news for,
 
   Barbara R. Krieg, Claims...

When you put a string in an email that hits a spamassassin rule... your
email then hits that spamassassin rule.  You should generally try to avoid
that.

-- 
It's never too late to panic.
http://www.ChaosReigns.com


Re: Claims manager / LOTTO_AGENT

2012-11-07 Thread Michael Orlitzky
On 11/07/2012 09:49 PM, dar...@chaosreigns.com wrote:
 On 11/07, Michael Orlitzky wrote:
 So, LOTTO_AGENT will hit the string "Claims Manager" for 3.5 points.
 This is bad news for,

   Barbara R. Krieg, Claims...
 
 When you put a string in an email that hits a spamassassin rule... your
 email then hits that spamassassin rule.  You should generally try to avoid
 that.
 

Yeah, well it's her job title, so...? You misunderstand statistics. The
data aren't wrong.


Re: Claims manager / LOTTO_AGENT

2012-11-07 Thread darxus
On 11/07, Michael Orlitzky wrote:
 Yeah, well it's her job title, so...? You misunderstand statistics. The
 data aren't wrong.

Do I?  I think it's more likely that you misunderstand what is expected of
spamassassin rules.

Somebody really should put up a page in the wiki explaining that rules all
have false positives, and that's the entire reason we don't flag an email
as spam for any one rule, etc.


But if you provide us with more masscheck data, we can do a better job of
automatically calculating ideal scores.

-- 
Of course there's strength in numbers. But there's strength in sharp
weaponry too. Ironically, this lead to what we call 'civilization'.
- spore
http://www.ChaosReigns.com


Re: Claims manager / LOTTO_AGENT

2012-11-07 Thread darxus
On 11/07, Michael Orlitzky wrote:
 On 11/07/2012 09:49 PM, dar...@chaosreigns.com wrote:
  On 11/07, Michael Orlitzky wrote:
  So, LOTTO_AGENT will hit the string "Claims Manager" for 3.5 points.
  This is bad news for,
 
Barbara R. Krieg, Claims...
  
  When you put a string in an email that hits a spamassassin rule... your
  email then hits that spamassassin rule.  You should generally try to avoid
  that.
 
 Yeah, well it's her job title, so...? You misunderstand statistics. The
 data aren't wrong.

After re-reading, I think you may have misunderstood my suggestion to avoid
putting stuff in emails that is known to hit spam rules.  I wasn't
suggesting that Barbara R. Krieg change her signature, I was suggesting
that you not include it intact when posting to this mailing list about it.

-- 
You shall know the truth, and it shall make you odd.
-- Flannery O'Connor
http://www.ChaosReigns.com


Re: Claims manager / LOTTO_AGENT

2012-11-07 Thread Michael Orlitzky
On 11/07/2012 10:12 PM, dar...@chaosreigns.com wrote:
 On 11/07, Michael Orlitzky wrote:
 Yeah, well it's her job title, so...? You misunderstand statistics. The
 data aren't wrong.
 
 Do I?  I think it's more likely that you misunderstand what is expected of
 spamassassin rules.
 

Sorry, I was a little rude. But saying that she shouldn't put her job
title anywhere in an email, ever, is ridiculous. The inputs (spam, ham)
to the classifier are assumed god-given; and the classification needs to
reflect the data, not the other way around.


 Somebody really should put up a page in the wiki explaining that rules all
 have false positives, and that's the entire reason we don't flag an email
 as spam for any one rule, etc.

Sure, that's why I pointed out that LOTTO_AGENT also helps trigger
ADVANCE_FEE_2_NEW_MONEY, and combined they score 7.8.


 But if you provide us with more masscheck data, we can do a better job of
 automatically calculating ideal scores.

This is my fault, of course, but I'm not allowed to mass-check this
stuff. It's ongoing legal correspondence.


Re: Claims manager / LOTTO_AGENT

2012-11-07 Thread Michael Orlitzky
On 11/07/2012 10:21 PM, dar...@chaosreigns.com wrote:
 On 11/07, Michael Orlitzky wrote:
 On 11/07/2012 09:49 PM, dar...@chaosreigns.com wrote:
 On 11/07, Michael Orlitzky wrote:
 So, LOTTO_AGENT will hit the string "Claims Manager" for 3.5 points.
 This is bad news for,

   Barbara R. Krieg, Claims...

 When you put a string in an email that hits a spamassassin rule... your
 email then hits that spamassassin rule.  You should generally try to avoid
 that.

 Yeah, well it's her job title, so...? You misunderstand statistics. The
 data aren't wrong.
 
 After re-reading, I think you may have misunderstood my suggestion to avoid
 putting stuff in emails that is known to hit spam rules.  I wasn't
 suggesting that Barbara R. Krieg change her signature, I was suggesting
 that you not include it intact when posting to this mailing list about it.
 

I see. My apologies. Disregard the first half of that last message.


Re: Claims manager / LOTTO_AGENT

2012-11-07 Thread darxus
On 11/07, Michael Orlitzky wrote:
 Sorry, I was a little rude. But saying that she shouldn't put her job
 title anywhere in an email, ever, is ridiculous. 

Certainly.

 The inputs (spam, ham)
 to the classifier are assumed god-given; and the classification needs to
 reflect the data, not the other way around.

If the classifier is spamassassin, and "The inputs" are the spam
and ham data provided via masscheck, then... the scores provided via
sa-update *do* reflect the data.  So I'm not sure what you mean.

The ideal rule scores are chosen to cause one false positive (ham flagged
as spam) in every 2,500 hams, while maximizing the number of spams
correctly flagged as spams.  With so few hams hitting this rule in the
masscheck corpora, we're way below that threshold based on the data we
have.
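
As a rough sanity check on the "way below that threshold" remark, the
following Python sketch compares the target rate of one false positive per
2,500 hams with the 2 hits in 180,272 hams reported earlier in the thread.
Note the simplifying assumption: the target applies to messages finally
flagged as spam, while the second figure is only how often this one rule hits
ham, so this is an order-of-magnitude comparison rather than the actual score
optimisation.

```python
TARGET_FP_RATE = 1 / 2500      # one false positive in every 2,500 hams
rule_ham_rate = 2 / 180_272    # ham hits for LOTTO_AGENT in the corpora

print(f"{TARGET_FP_RATE:.4%}")                   # 0.0400%
print(f"{rule_ham_rate:.4%}")                    # 0.0011%
print(round(TARGET_FP_RATE / rule_ham_rate, 1))  # 36.1
```

On these numbers the rule hits ham roughly 36 times less often than the
target rate, which is why the automatic scoring sees no reason to lower it.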

 This is my fault, of course, but I'm not allowed to mass-check this
 stuff. It's ongoing legal correspondence.

Er, what?  You're not allowed to provide a list of which rules hit each
of your emails?  Or you're not allowed to run a program on your emails
that isn't spamassassin?  Or did I just not put "This does not require
sending us your email" in bold enough times on the masscheck page?

-- 
It's never too late to panic.
http://www.ChaosReigns.com


Re: Claims manager / LOTTO_AGENT

2012-11-07 Thread Michael Orlitzky
On 11/07/2012 10:36 PM, dar...@chaosreigns.com wrote:
 On 11/07, Michael Orlitzky wrote:
 Sorry, I was a little rude. But saying that she shouldn't put her job
 title anywhere in an email, ever, is ridiculous. 
 
 Certainly.
 
 The inputs (spam, ham)
 to the classifier are assumed god-given; and the classification needs to
 reflect the data, not the other way around.
 
 If the classifier is spamassassin, and "The inputs" are the spam
 and ham data provided via masscheck, then... the scores provided via
 sa-update *do* reflect the data.  So I'm not sure what you mean.
 
 The ideal rule scores are chosen to cause one false positive (ham flagged
 as spam) in every 2,500 hams, while maximizing the number of spams
 correctly flagged as spams.  With so few hams hitting this rule in the
 masscheck corpora, we're way below that threshold based on the data we
 have.
 

I wrote that before I saw your clarification, sorry again for coming off
as a jerk. Ignore it.


 This is my fault, of course, but I'm not allowed to mass-check this
 stuff. It's ongoing legal correspondence.
 
 Er, what?  You're not allowed to provide a list of which rules hit each
 of your emails?  Or you're not allowed to run a program on your emails
 that isn't spamassassin?  Or did I just not put "This does not require
 sending us your email" in bold enough times on the masscheck page?
 

This is a client of ours (a law firm) and not the company that I work
for. *I* know there's probably nothing sensitive in there, but just to
cover my ass I'd need to get permission to send the results off-site.
From their perspective, it's just simpler to say no: it's not worth the
time or effort to even think about if there's a minute chance of it
coming back to bite them legally.