About reporting

2009-09-13 Thread João Eiras

Hi everyone.

I'm getting a lot of spam on my work account.
After reading http://wiki.apache.org/spamassassin/ReportingSpam a few
doubts remain.

Should the file message.txt in the example contain the full e-mail, with
headers, attachments and everything?
Does the reporting tool remove all information about the recipient for
privacy's sake?

Thank you.



Re: About reporting

2009-09-13 Thread Theo Van Dinter
On Sun, Sep 13, 2009 at 5:08 PM, João Eiras  wrote:
> Should the file message.txt in the example contain the full e-mail with
> headers, attachments and everything ?

Yes.  It should be the original and complete message.

> Does the reporting tool remove all information about the receiver for
> privacy sake ?

No, nothing is removed from the message.


Re: About reporting

2009-09-13 Thread João Eiras

On , Theo Van Dinter  wrote:
> On Sun, Sep 13, 2009 at 5:08 PM, João Eiras  wrote:
>> Should the file message.txt in the example contain the full e-mail with
>> headers, attachments and everything ?
>
> Yes.  It should be the original and complete message.
>
>> Does the reporting tool remove all information about the receiver for
>> privacy sake ?
>
> No, nothing is removed from the message.


Thank you.
Another thing: can I just report one big mbs file with all the e-mails inside?
It's easier for me to export the spam folder to a single file.



Re: About reporting

2009-09-19 Thread João Eiras

On , Theo Van Dinter  wrote:
> On Sun, Sep 13, 2009 at 5:08 PM, João Eiras  wrote:
>> Should the file message.txt in the example contain the full e-mail with
>> headers, attachments and everything ?
>
> Yes.  It should be the original and complete message.
>
>> Does the reporting tool remove all information about the receiver for
>> privacy sake ?
>
> No, nothing is removed from the message.


Hi again.

I still haven't got an answer to my last question, so here it goes again:
Can I report a full mbs file with many mails in one go? Or should I split each
mail into its own file?

Thank you.


Re: About reporting

2009-09-21 Thread Matus UHLAR - fantomas
> On , Theo Van Dinter  wrote:
>
>> On Sun, Sep 13, 2009 at 5:08 PM, João Eiras  wrote:
>>> Should the file message.txt in the example contain the full -mail with
>>> headers, attachments and everything ?
>>
>> Yes.  It should be the original and complete message.
>>
>>> Does the reporting tool remove all information about the receiver for
>>> privacy sake ?
>>
>> No, nothing is removed from the message.

On 19.09.09 22:45, João Eiras wrote:
> I still haven't got the answer to my last question, so here it goes again:
> Can I report a full mbs file with many mails in one go ? Or should I split 
> each mail on it's own file ?

sa-learn can accept mail in mbox format. It's even in its manual page.
Is this what you meant?
-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Eagles may soar, but weasels don't get sucked into jet engines. 


Re: About reporting

2009-09-21 Thread João Eiras

On , Matus UHLAR - fantomas  wrote:
>> On , Theo Van Dinter  wrote:
>>> On Sun, Sep 13, 2009 at 5:08 PM, João Eiras  wrote:
>>>> Should the file message.txt in the example contain the full e-mail with
>>>> headers, attachments and everything ?
>>>
>>> Yes.  It should be the original and complete message.
>>>
>>>> Does the reporting tool remove all information about the receiver for
>>>> privacy sake ?
>>>
>>> No, nothing is removed from the message.
>
> On 19.09.09 22:45, João Eiras wrote:
>> I still haven't got the answer to my last question, so here it goes again:
>> Can I report a full mbs file with many mails in one go ? Or should I split
>> each mail on its own file ?
>
> sa-learn can accept mail in mbox format. It's even in its manual page.
> Is this what you have meant?


Might be. I'm not familiar with SpamAssassin's internals; I'm just a normal
e-mail user wanting to report some spam.
A quick "man sa-learn" shows

   --mbox
   sa-learn will read in the file(s) containing the emails to be
   learned, and will process them in mbox format (one or more emails per file).

So, I have my answer.
The page at http://wiki.apache.org/spamassassin/ReportingSpam is ambiguous in
this regard. It only mentions a message.txt.
I hope you can make something out of my uploaded mbox file :)

Thank you





Re: About reporting

2009-09-21 Thread João Eiras



>> On 19.09.09 22:45, João Eiras wrote:
>>> I still haven't got the answer to my last question, so here it goes again:
>>> Can I report a full mbs file with many mails in one go ? Or should I split
>>> each mail on its own file ?
>>
>> sa-learn can accept mail in mbox format. It's even in its manual page.
>> Is this what you have meant?
>
> Might be. I'm not familiar with spam assassin's internals, just a normal
> e-mail user wanting to report some spam.
> A quick man sa-learn shows
>
>    --mbox
>    sa-learn will read in the file(s) containing the emails to be
>    learned, and will process them in mbox format (one or more emails per file).
>
> So, I have my answer.
> The page at http://wiki.apache.org/spamassassin/ReportingSpam is ambiguous in
> this regard. Just mentions a message.txt.
> I hope you can make something out of my uploaded mbox file :)
>
> Thank you


After further testing: while sa-learn might support multiple e-mails in a single
mbox file, "spamassassin -r" does not, so I cooked up a Perl script to split the
mbox file into individual mails; a simple loop is then enough to report everything.
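
The split-and-loop approach described here can be sketched as follows. This is
a minimal Python sketch, not the Perl script mentioned in the message, which
was never posted. The function names `split_mbox` and `report_all` and the
output file naming are hypothetical, and the code assumes that
`spamassassin -r` reads a single message on standard input.

```python
import mailbox
import subprocess
from pathlib import Path


def split_mbox(mbox_path, out_dir):
    """Write each message of an mbox file to its own file; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    box = mailbox.mbox(str(mbox_path), create=False)
    paths = []
    try:
        for index, key in enumerate(box.keys(), start=1):
            # Hypothetical naming scheme: message-00001.txt, message-00002.txt, ...
            target = out / f"message-{index:05d}.txt"
            # get_bytes() returns the stored message (headers and body)
            # without the mbox "From " separator line.
            target.write_bytes(box.get_bytes(key))
            paths.append(target)
    finally:
        box.close()
    return paths


def report_all(paths, command=("spamassassin", "-r")):
    """Feed each message file to the reporting command on standard input."""
    for path in paths:
        with open(path, "rb") as handle:
            # Assumption: the command reads exactly one message from stdin.
            subprocess.run(list(command), stdin=handle, check=True)
```

Calling `report_all(split_mbox("spam.mbs", "split"))` would write one file per
message under `split/` and then report each one in turn.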

Goodbye.


Re: About reporting

2009-09-21 Thread Benny Pedersen

On tir 22 sep 2009 03:05:30 CEST, João Eiras wrote

Goodbye.


Don't want to be here anymore?

--
xpoint



Re: About reporting

2009-09-22 Thread Matus UHLAR - fantomas
Hello,

please configure your mailer to wrap lines below 80 characters per line.
72 to 75 is usually OK.

Thank you.

>>> On 19.09.09 22:45, João Eiras wrote:
>>>> I still haven't got the answer to my last question, so here it goes
>>>> again: Can I report a full mbs file with many mails in one go ? Or
>>>> should I split each mail on its own file ?
>>>
>>> sa-learn can accept mail in mbox format. It's even in its manual page.
>>> Is this what you have meant?
>>
>> Might be. I'm not familiar with spam assassin's internals, just a normal
>> e-mail user wanting to report some spam. A quick man sa-learn shows
>>
>> --mbox
>> sa-learn will read in the file(s) containing the emails to be
>> learned, and will process them in mbox format (one or more
>> emails per file).
>>
>> So, I have my answer.
>> The page at http://wiki.apache.org/spamassassin/ReportingSpam is
>> ambiguous in this regard. Just mentions a message.txt. I hope you can
>> make something out of my uploaded mbox file :)

On 22.09.09 03:05, João Eiras wrote:
> After further testing, while sa-learn might support multiple emails in a
> single mbox file, "spamassassin -r" does not, so I cooked a perl script to
> split the mbox file into each individual mails, and then a simple loop is
> enough to report everything.

sa-learn only does Bayes training.

spamassassin -r and spamassassin -k do other things: they report to network
services like razor/pyzor/dcc and SpamCop.
Reporting is of little use for older messages, and most of the mailbox will
probably be old unless it only contains mail received in the last few days.

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
I don't have lysdexia. The Dog wouldn't allow that.


Re: About reporting

2009-09-22 Thread João Eiras




> spamassassin -r and spamassassin -k do other things - report to network
> services like razor/pyzor/dcc and SpamCop.

Hum, then how do the default spam filters that come with a clean SpamAssassin
installation know what's spam and what's not?
Is there a service we can report spam to?


Re: About reporting

2009-09-29 Thread Matus UHLAR - fantomas
>> spamassassin -r and spamassassin -k do other things - report to network
>> services like razor/pyzor/dcc and SpamCop.

On 22.09.09 22:11, João Eiras wrote:
> Hum, then how do the default spam filters that come with a clean spam
> assassin installation know what's spam and what's not ? Is there service
> we can report spam to ?

SA contains many rules that score the mail. BAYES is something you can use
for better scoring if any of those rules misfire.
-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
The only substitute for good manners is fast reflexes. 


Question about reporting

2007-05-18 Thread Giampaolo Tomassoni
Dear all,

What's wrong with automatically SA-reporting messages that score above a given
threshold (say, 10-12)?

Would it be regarded as *BAD* by DCC, Pyzor, Razor, and/or SC?

I often see that high-scoring messages are reported as spam by some of the
above-mentioned engines, but seldom by all of them.

Please contribute.

Thanks,

Giampaolo



Re: Question about reporting

2007-05-18 Thread Duncan Hill
On Fri, May 18, 2007 14:50, Giampaolo Tomassoni wrote:

> what's wrong with automatically SA-report messages scoring above a given
> threshold (say, 10-12)?
>
> Would it be regarded as *BAD* by DCC, Pyzor, Razor, and/or SC?
>
>
> I often see that high-scoring messages are reported as spam by some of
> the above-mentioned engine, but seldom by all of them.

Off the top of my head:

1) Your server can be used in a DoS against the distributed services

2) False positives get broadcast to the distributed services




R: Question about reporting

2007-05-18 Thread Giampaolo Tomassoni
> -----Original Message-----
> From: Duncan Hill [mailto:[EMAIL PROTECTED]
>
> On Fri, May 18, 2007 14:50, Giampaolo Tomassoni wrote:
>
> > what's wrong with automatically SA-report messages scoring above a given
> > threshold (say, 10-12)?
> >
> > Would it be regarded as *BAD* by DCC, Pyzor, Razor, and/or SC?
> >
> > I often see that high-scoring messages are reported as spam by some of
> > the above-mentioned engine, but seldom by all of them.
>
> Off the top of my head:
>
> 1) Your server can be used in a DoS against the distributed services

You mean, by sending valid content in high-scoring mails? Hmm. Yes, this
may create trouble for DCC, Pyzor and Razor. Not for SC, since they are mostly
interested in the Received: header lines. Ok.


> 2) False positives get broadcast to the distributed services

I browsed my quarantine to check for FPs. I didn't find any at
score 8... I don't think this would really be an issue, unless "something
goes wrong" and my systems start quarantining good mail, which is a
legitimate concern.

Ok. I think I could always report to SC. What about reporting to DCC, Pyzor
and Razor only if the message has already been reported by one of these
engines? That would probably defeat the DoS case.
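
The proposal in this message can be written down as a small decision function.
This is only a sketch of the idea as stated here, not of any SpamAssassin
option: the name `report_targets`, its parameters, and the service labels are
hypothetical, and other replies in this thread note that Razor discourages
automatic reports altogether.

```python
# Services that match on message checksums, as named in the thread.
CHECKSUM_SERVICES = ("DCC", "Pyzor", "Razor")


def report_targets(score, threshold, listed_by):
    """Return the set of services to report to under the proposed policy.

    score      -- the message's spam score
    threshold  -- minimum score for automatic reporting (e.g. 10)
    listed_by  -- names of checksum services that already list the message
    """
    if score < threshold:
        return set()
    # Always report to SpamCop (SC) once the threshold is reached.
    targets = {"SpamCop"}
    # Report to the checksum services only if one of them already knows it.
    if any(service in listed_by for service in CHECKSUM_SERVICES):
        targets.update(CHECKSUM_SERVICES)
    return targets
```

With a threshold of 10, a message scoring 12 that Pyzor already lists would go
to all four services, while an unlisted one would go to SpamCop only.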

Giampaolo



Re: Question about reporting

2007-05-18 Thread Kelson

Giampaolo Tomassoni wrote:
> what's wrong with automatically SA-report messages scoring above a given
> threshold (say, 10-12)?
>
> Would it be regarded as *BAD* by DCC, Pyzor, Razor, and/or SC?


Razor discourages automatic reporting because they're concerned about 
false positives.  They prefer verified reports.


--
Kelson Vibber
SpeedGate Communications 


R: Question about reporting

2007-05-18 Thread Giampaolo Tomassoni
> -----Original Message-----
> From: Kelson [mailto:[EMAIL PROTECTED]
> Sent: Friday, 18 May 2007 18:19
> To: 'Spamassassin'
> Subject: Re: Question about reporting
>
> Giampaolo Tomassoni wrote:
> > what's wrong with automatically SA-report messages scoring above a given
> > threshold (say, 10-12)?
> >
> > Would it be regarded as *BAD* by DCC, Pyzor, Razor, and/or SC?
>
> Razor discourages automatic reporting because they're concerned about
> false positives.  They prefer verified reports.

Just to be sure: "prefer" means that, for example, spamtraps may report to
Razor, right?

Giampaolo


> 
> --
> Kelson Vibber
> SpeedGate Communications 



Re: Question about reporting

2007-05-18 Thread Matt Kettler
Giampaolo Tomassoni wrote:
> Dears,
>
> what's wrong with automatically SA-report messages scoring above a given
> threshold (say, 10-12)?
>
> Would it be regarded as *BAD* by DCC, Pyzor, Razor, and/or SC?
>   
Razor, definitely. The Razor FAQ explicitly prohibits this. They want
hand-verified mail, or well-groomed spamtraps only.

DCC, doesn't matter. By default *EVERY* message that isn't in your dcc
whitelist gets reported when you scan it. (DCC is a bulk-measurement
tool, not a spam measurement tool).

I can't really speak to SC or Pyzor's policies.

> I often see that high-scoring messages are reported as spam by some of the
> above-mentioned engine, but seldom by all of them.

And this would be important because?..

 If all the engines always matched the exact same profile of messages as
SA, they'd be redundant. That's primarily the reason why razor doesn't
want auto-reporting. They don't want razor to become an exact mirror of
what SA can detect, with all the same false positives to go with it...






R: Question about reporting

2007-05-19 Thread Giampaolo Tomassoni
> -----Original Message-----
> From: Matt Kettler [mailto:[EMAIL PROTECTED]
>
> Giampaolo Tomassoni wrote:
> > Dears,
> >
> > what's wrong with automatically SA-report messages scoring above a given
> > threshold (say, 10-12)?
> >
> > Would it be regarded as *BAD* by DCC, Pyzor, Razor, and/or SC?
> >
> Razor, definitely. The Razor FAQ explicitly prohibits this. They want
> hand verified mail, or well groomed spamtraps only.

Ok. No Razor.


> DCC, doesn't matter. By default *EVERY* message that isn't in your dcc
> whitelist gets reported when you scan it. (DCC is a bulk-measurement
> tool, not a spam measurement tool).

Oh, really?

So why do I get a DCC_CHECK only after manually reporting (with spamassassin
-r) a spam?


> I can't really speak to SC or Pyzor's policies.
> > I often see that high-scoring messages are reported as spam by some of
> > the above-mentioned engine, but seldom by all of them.
>
> And this would be important because?..
>
>  If all the engines always matched the exact same profile of messages as
> SA, they'd be redundant. That's primarily the reason why razor doesn't
> want auto-reporting. They don't want razor to become an exact mirror of
> what SA can detect, with all the same false positives to go with it...

Ok. Pyzor probably has the same policy, then.

So reporting high-scoring spam would help SC (which is more
interested in spam traffic than in spam content) and, possibly, DCC,
since it would help establish the "bulky nature" of a given mail.

Right Matt?

Giampaolo



Re: R: Question about reporting

2007-05-18 Thread Duncan Hill
On Fri, May 18, 2007 15:29, Giampaolo Tomassoni wrote:

>> 1) Your server can be used in a DoS against the distributed services
>>
>
> You mean, by sending valid content on high-scoring mails? Mmmh. Yes, this
>  may create trouble to DCC, Pyzor and Razor. Not to SC, since they are
> mostly interested to Received: header lines. Ok.

Not that per se.  More that if you automatically report anything over x
points, and you get hit with a spam storm that scores over x, you can end
up DoSing yourself and contributing to a DoS on the distributed servers. 
Some defensive programming could reduce some of the risk, but not all of
it.




R: R: Question about reporting

2007-05-18 Thread Giampaolo Tomassoni
> -----Original Message-----
> From: Duncan Hill [mailto:[EMAIL PROTECTED]
> Sent: Friday, 18 May 2007 16:34
> To: users@spamassassin.apache.org
> Subject: Re: R: Question about reporting
>
> On Fri, May 18, 2007 15:29, Giampaolo Tomassoni wrote:
>
> >> 1) Your server can be used in a DoS against the distributed services
> >>
> >
> > You mean, by sending valid content on high-scoring mails? Mmmh. Yes, this
> > may create trouble to DCC, Pyzor and Razor. Not to SC, since they are
> > mostly interested to Received: header lines. Ok.
>
> Not that per se.  More that if you automatically report anything over x
> points, and you get hit with a spam storm that scores over x, you can end
> up DoSing yourself and contributing to a DoS on the distributed servers.
> Some defensive programming could reduce some of the risk, but not all of
> it.

Hmm. Well, no. I could implement a kind of self-tarpit algorithm: I collect
the spam in my quarantine folder and deliver it to SC, and its hashes to
DCC, Pyzor and/or Razor, at a fixed rate. Whenever I'm flooded, some of the
yet-to-be-reported spam would become too old to be worth reporting (say,
when older than 2 hours?) and I would skip it, until "they" decide to leave me
alone. Then the auto-reporter would smoothly get back in sync with the new spam.

Wouldn't that work?
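
A minimal sketch of such a self-tarpit queue might look like the following.
It is an illustration of the idea only: the class name `TarpitReporter`, its
methods, and the choice to pass the current time in explicitly are all
assumptions, and the actual delivery to SC or the checksum services is left
to the caller.

```python
from collections import deque


class TarpitReporter:
    """Release queued messages at a fixed rate and drop the ones that got stale."""

    def __init__(self, interval_seconds, max_age_seconds=2 * 60 * 60):
        self.interval = interval_seconds  # minimum gap between two reports
        self.max_age = max_age_seconds    # e.g. the 2 hours suggested above
        self.queue = deque()              # (arrival_time, message_id) pairs
        self.next_slot = 0.0              # earliest time the next report may go out

    def enqueue(self, message_id, now):
        """Remember a quarantined message that arrived at time `now`."""
        self.queue.append((now, message_id))

    def poll(self, now):
        """Return one message to report at time `now`, or None."""
        # Skip anything that has waited longer than max_age.
        while self.queue and now - self.queue[0][0] > self.max_age:
            self.queue.popleft()
        # Nothing left to report, or the fixed-rate slot is not open yet.
        if not self.queue or now < self.next_slot:
            return None
        _, message_id = self.queue.popleft()
        self.next_slot = now + self.interval
        return message_id
```

During a flood the queue grows faster than `poll` drains it, so the oldest
entries age out and are skipped; once the flood stops, the queue empties and
new spam is reported with little delay again.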

Nice thread. Thanks Duncan.

Giampaolo



Re: R: Question about reporting

2007-05-19 Thread Matt Kettler
Giampaolo Tomassoni wrote:
>
> Oh, really?
>
> So why I get a DCC_CHECK only after manually reporting (with spamassassin
> -r) a spam?
>   
I have no idea.

Are you running a check *immediately* before calling spamassassin -r?
If not, the difference is probably due to passage of time more than
anything else.

Certainly your one report is nowhere near enough to cause listing in DCC. By
default it takes approximately 1 million reports to get listed.





R: R: Question about reporting

2007-05-19 Thread Giampaolo Tomassoni
> -----Original Message-----
> From: Matt Kettler [mailto:[EMAIL PROTECTED]
>
> Giampaolo Tomassoni wrote:
> >
> > Oh, really?
> >
> > So why I get a DCC_CHECK only after manually reporting (with spamassassin
> > -r) a spam?
> >
> I have no idea.
>
> Are you running a check *immediately* before calling spamassassin -r?
> If not, the difference is probably due to passage of time more than
> anything else.
>
> Certainly your one report is nowhere near enough to cause listing in DCC. By
> default it takes approximately 1 million reports to get listed.

Ok, I had a look at the DCC plugin in my SA 3.1.8.

When reporting, DCC allows the user to specify how many copies of a message
are being reported. If you say "MANY", it is as if you said "I've seen
999,999 copies of this message", which is the "million reports" you spoke
about.

When performing a lookup, the DCC plugin triggers the DCC_CHECK tag when DCC
says that at least 999,999 copies of that message have been reported. This
means that DCC_CHECK in SA is triggered on bulk mail.

BUT, when SA reports a message, the DCC plugin uses "MANY" (i.e. 999,999)
as the report count, which amounts to "this is spam, so assume it is
bulk too". Please note that the DCC site says: "Spam traps involving
dccproc -tMANY are useful to DCC reputation servers"
(http://www.rhyolite.com/anti-spam/dcc/reputations.html), which sounds like
"if you see something which is spam, report it as if you had seen 999,999
copies of it".

By the way, SA doesn't report each and every message it sees: it only
"looks it up". I don't know whether, when doing a DCC lookup, the DCC engine
also counts the lookup as a report of a single message, but from browsing the
DCC site it seems to me that something more active than a simple lookup is
needed in order to report a message.

In summary, DCC_CHECK is triggered at the 999,999th copy of a message.
Reporting a message to DCC is like saying you saw 999,999 copies of that
message.
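
The counting behaviour summarised above can be illustrated with a toy model.
It only mirrors the description given in this message; it is not the DCC
protocol or the SpamAssassin DCC plugin. The names `ToyBulkCounter`,
`dcc_check_fires` and `DCC_MANY` are hypothetical, and the saturating counter
is an assumption made to keep the example small.

```python
DCC_MANY = 999_999  # the count this message associates with "MANY"


class ToyBulkCounter:
    """Toy stand-in for a checksum server: one saturating counter per checksum."""

    def __init__(self):
        self.counts = {}

    def report(self, checksum, copies=1):
        # Add the claimed number of copies, saturating at DCC_MANY.
        total = min(self.counts.get(checksum, 0) + copies, DCC_MANY)
        self.counts[checksum] = total
        return total

    def lookup(self, checksum):
        # In this toy model a pure lookup does not change the count.
        return self.counts.get(checksum, 0)


def dcc_check_fires(count, threshold=DCC_MANY):
    """The rule fires once the reported count reaches the threshold."""
    return count >= threshold
```

In this model a single ordinary report leaves the count far below the
threshold, while one report made with `copies=DCC_MANY` is enough on its own
to make `dcc_check_fires` return true for that checksum.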

That said, my tool is not a spamtrap, so it could probably report one
copy of each message scoring above the given threshold. However, the
DCC plugin has no setting to configure the number of copies to report for
a message (the word "many" is hard-coded in the report code). Since
my tool is going to use SA to do its reporting, I will probably have to
avoid reporting to DCC at all. Otherwise, since only DCC and SC are going to
be possible destinations of the reports, I should probably not rely on SA
for reporting, but instead build my own reporting code.

Thanks Matt for "triggering" my curiosity about the DCC reporting system.

Giampaolo