Re: Increase in Image Spam

2014-02-21 Thread Kevin A. McGrail
On 2/20/2014 10:35 PM, Amir Caspi wrote: On Feb 20, 2014, at 8:07 PM, Kevin A. McGrail wrote: No need to run through 3.3.2. The emails are well over the 256KB limit hard coded in sa-learn with 3.3.2. Understood, and thanks for checking on this. Now that I know this is the problem, I've ma

Re: Increase in Image Spam

2014-02-20 Thread Amir Caspi
On Feb 20, 2014, at 8:07 PM, Kevin A. McGrail wrote: > No need to run through 3.3.2. The emails are well over the 256KB limit hard > coded in sa-learn with 3.3.2. Understood, and thanks for checking on this. Now that I know this is the problem, I've manually edited Mail::SpamAssassin::Archiv

Re: Increase in Image Spam

2014-02-20 Thread Kevin A. McGrail
On 2/20/2014 7:18 PM, Amir 'CG' Caspi wrote: If you have a chance, please run it through both 3.3.2 and 3.4.0, to see if there's a difference... clearly, it's not working on _MY_ 3.3.2 for some reason! I sent the exact commands that I used in a prior email a couple of hours ago. Thanks. =) ---

Re: Increase in Image Spam

2014-02-20 Thread John Hardin
On Thu, 20 Feb 2014, Ian Zimmerman wrote: On Thu, 20 Feb 2014 11:57:17 -0800 (PST) John Hardin wrote: Amir> When I run sa-learn on this mailbox, it says: Amir> Learned tokens from 0 message(s) (0 message(s) examined) John> "0 messages examined" generally means either the format isn't what Jo

Re: Increase in Image Spam

2014-02-20 Thread Amir Caspi
On Feb 20, 2014, at 7:07 PM, Ian Zimmerman wrote: > In my case it usually means the message has been learned already and SA > just refuses to do so for the 2nd time :-) When I run sa-learn on already-learned messages, it says 0 tokens learned, but it still says N messages examined (where N > 0)

Re: Increase in Image Spam

2014-02-20 Thread Ian Zimmerman
On Thu, 20 Feb 2014 11:57:17 -0800 (PST) John Hardin wrote: Amir> When I run sa-learn on this mailbox, it says: Amir> Learned tokens from 0 message(s) (0 message(s) examined) John> "0 messages examined" generally means either the format isn't what John> sa-learn expected, or the message is larg

Re: Increase in Image Spam

2014-02-20 Thread Amir 'CG' Caspi
On Thu, February 20, 2014 5:13 pm, Kevin A. McGrail wrote: > Resend the mbox.link and I will likely have a cycle to throw it through. https://www.dropbox.com/s/m4fuv670wnvwa16/SA_testspam.mbox To be deleted in 24-48 hours (don't want spammers harvesting it). If you have a chance, please run it t

Re: Increase in Image Spam

2014-02-20 Thread Kevin A. McGrail
Resend the mbox.link and I will likely have a cycle to throw it through. Regards, KAM Amir 'CG' Caspi wrote: >On Thu, February 20, 2014 4:08 pm, Kevin A. McGrail wrote: >> Probably best if you install 3.4.0 (or even trunk) on a test system >and >> throw the offending email onto that server and r

Re: Increase in Image Spam

2014-02-20 Thread Amir 'CG' Caspi
On Thu, February 20, 2014 4:08 pm, Kevin A. McGrail wrote: > Probably best if you install 3.4.0 (or even trunk) on a test system and > throw the offending email onto that server and run sa-learn on that box > with -D. In the meantime, anyone want to do it on my behalf? =) I provided the mbox link

Re: Increase in Image Spam

2014-02-20 Thread Kevin A. McGrail
On 2/20/2014 6:01 PM, Amir 'CG' Caspi wrote: On Thu, February 20, 2014 3:52 pm, Kevin A. McGrail wrote: Questions that will be answered by "that is solved in 3.4.0" aren't really going to get much support from me... Understood, though it'll be a while before I can upgrade to 3.4 due to the RPM

Re: Increase in Image Spam

2014-02-20 Thread Amir 'CG' Caspi
On Thu, February 20, 2014 3:52 pm, Kevin A. McGrail wrote: > Questions that will be answered by "that is solved in 3.4.0" aren't > really going to get much support from me... Understood, though it'll be a while before I can upgrade to 3.4 due to the RPM issue that I've mentioned previously. Howev

Re: Increase in Image Spam

2014-02-20 Thread Kevin A. McGrail
On 2/20/2014 5:48 PM, Martin Gregorie wrote: On Thu, 2014-02-20 at 17:29 -0500, Kevin A. McGrail wrote: More to the point, spamc would have to process all config files first which would slow it down. The point of spamc is to be a VERY lightweight connection to spamd. That's why I suggested th

Re: Increase in Image Spam

2014-02-20 Thread Kevin A. McGrail
On 2/20/2014 5:38 PM, Amir 'CG' Caspi wrote: On Thu, February 20, 2014 3:29 pm, Kevin A. McGrail wrote: Unifying wouldn't be something I would want to see. Well, no one is arguing to _force_ unification, but to provide an option for it. That is, max-size could be set in local.cf and would beco

Re: Increase in Image Spam

2014-02-20 Thread Benny Pedersen
On 2014-02-20 23:16, Kevin A. McGrail wrote: Are you using 3.4.0? I believe the size was hard-coded until then when the max-size option was added to sa-learn. SpamAssassin 3.4.0 (2014-02-07) yes i do ebuilds for gentoo self 3.4 is not in gentoo yet Kevin: do i need to be reply private here

Re: Increase in Image Spam

2014-02-20 Thread Martin Gregorie
On Thu, 2014-02-20 at 17:29 -0500, Kevin A. McGrail wrote: > More to the point, spamc would have to process all config files first > which would slow it down. The point of spamc is to be a VERY > lightweight connection to spamd. > That's why I suggested that spamc could be handed that value by

Re: Increase in Image Spam

2014-02-20 Thread Amir 'CG' Caspi
On Thu, February 20, 2014 3:29 pm, Kevin A. McGrail wrote: > Unifying wouldn't be something I would want to see. Well, no one is arguing to _force_ unification, but to provide an option for it. That is, max-size could be set in local.cf and would become a global parameter, but could still be over

Re: Increase in Image Spam

2014-02-20 Thread Kevin A. McGrail
I think you were just on the email chain on list so my reply to another person went to you. On 2/20/2014 5:21 PM, Benny Pedersen wrote: On 2014-02-20 23:16, Kevin A. McGrail wrote: Are you using 3.4.0?  I believe the size was hard-c

Re: Increase in Image Spam

2014-02-20 Thread Amir 'CG' Caspi
On Thu, February 20, 2014 3:16 pm, Kevin A. McGrail wrote: > Are you using 3.4.0? I believe the size was hard-coded until then when > the max-size option was added to sa-learn. No, as mentioned previously in this flurry of emails, I'm using 3.3.2. However, note that using spamassassin directly (

Re: Increase in Image Spam

2014-02-20 Thread Kevin A. McGrail
On 2/20/2014 5:16 PM, Martin Gregorie wrote: On Thu, 2014-02-20 at 16:39 -0500, Kevin A. McGrail wrote: On 2/20/2014 4:35 PM, Amir 'CG' Caspi wrote: If it's a size issue, how can I increase the size limit for sa-learn? But, I don't think it's a size issue since these messages are under 512k eac

Re: Increase in Image Spam

2014-02-20 Thread Martin Gregorie
On Thu, 2014-02-20 at 16:39 -0500, Kevin A. McGrail wrote: > On 2/20/2014 4:35 PM, Amir 'CG' Caspi wrote: > > If it's a size issue, how can I increase the size limit for sa-learn? > > But, I don't think it's a size issue since these messages are under 512k > > each. > --max-size= I believe. Defaul

Re: Increase in Image Spam

2014-02-20 Thread Kevin A. McGrail
On 2/20/2014 5:07 PM, Amir 'CG' Caspi wrote: On Thu, February 20, 2014 2:49 pm, Benny Pedersen wrote: On 2014-02-20 22:39, Kevin A. McGrail wrote: --max-size= I believe. Default is 256K. sa-learn barfs, that flag is not accepted. That flag works for spamc, but not for sa-learn. sa-learn man

Re: Increase in Image Spam

2014-02-20 Thread Amir 'CG' Caspi
On Thu, February 20, 2014 2:49 pm, Benny Pedersen wrote: > On 2014-02-20 22:39, Kevin A. McGrail wrote: >> --max-size= I believe. Default is 256K. sa-learn barfs, that flag is not accepted. That flag works for spamc, but not for sa-learn. sa-learn man page and CLI help don't have any mention of

Re: Increase in Image Spam

2014-02-20 Thread Benny Pedersen
On 2014-02-20 22:56, Amir 'CG' Caspi wrote: I run a virtual-hosting server where the individual site RPMs are copied from server-level RPMs. Basically all software has to be installed as RPMs in order to propagate to the individual virtual hosts. google on dist2rpm, you basically just use sour

Re: Increase in Image Spam

2014-02-20 Thread Amir 'CG' Caspi
On Thu, February 20, 2014 2:39 pm, Axb wrote: > what's wrong with installing from source? I run a virtual-hosting server where the individual site RPMs are copied from server-level RPMs. Basically all software has to be installed as RPMs in order to propagate to the individual virtual hosts. ---

Re: Increase in Image Spam

2014-02-20 Thread Benny Pedersen
On 2014-02-20 22:39, Kevin A. McGrail wrote: On 2/20/2014 4:35 PM, Amir 'CG' Caspi wrote: If it's a size issue, how can I increase the size limit for sa-learn? But, I don't think it's a size issue since these messages are under 512k each. --max-size= I believe. Default is 256K. and small m

Re: Increase in Image Spam

2014-02-20 Thread Benny Pedersen
On 2014-02-20 22:39, Axb wrote: noticed? (I can't install 3.4 since it hasn't been RPM'd for CentOS 5.x yet.) what's wrong with installing from source? (NOT Cpan install) http://searchcode.com/codesearch/view/21483839 the hardest part is to know how to :=)

Re: Increase in Image Spam

2014-02-20 Thread Kevin A. McGrail
On 2/20/2014 4:39 PM, Axb wrote: On 02/20/2014 10:35 PM, Amir 'CG' Caspi wrote: Note that I have some other spams for which this is now an issue but which I think worked fine in the past (with SA 3.3.1 for sure); is it possible something got borked in sa-learn between 3.3.1 and 3.3.2 and nobody

Re: Increase in Image Spam

2014-02-20 Thread Kevin A. McGrail
On 2/20/2014 4:35 PM, Amir 'CG' Caspi wrote: If it's a size issue, how can I increase the size limit for sa-learn? But, I don't think it's a size issue since these messages are under 512k each. --max-size= I believe. Default is 256K.

Re: Increase in Image Spam

2014-02-20 Thread Axb
On 02/20/2014 10:35 PM, Amir 'CG' Caspi wrote: Note that I have some other spams for which this is now an issue but which I think worked fine in the past (with SA 3.3.1 for sure); is it possible something got borked in sa-learn between 3.3.1 and 3.3.2 and nobody noticed? (I can't install 3.4 sin

Re: Increase in Image Spam

2014-02-20 Thread Amir 'CG' Caspi
On Thu, February 20, 2014 12:57 pm, John Hardin wrote: > "0 messages examined" generally means either the format isn't what > sa-learn expected, or the message is larger than the size limit. The file format is most certainly MBOX... it was created by my MUA, and running "file" on it tells me that

Re: Increase in Image Spam

2014-02-20 Thread Benny Pedersen
On 2014-02-20 21:43, Axb wrote: Redis DB in RAM - do the math :) got results as 781250 now its time to see how much power so many pi' is using :=) has anyone thought about running mysql in memory, if it's slow? engine=memory in the spamd init script, and engine=myisam on shutdown yes

Re: Increase in Image Spam

2014-02-20 Thread Axb
On 02/20/2014 07:46 PM, Benny Pedersen wrote: On 2014-02-20 19:34, Axb wrote: well, not huge...let me brag :) sa-learn --dump magic 0.000 0 3 0 non-token data: bayes db version 0.000 0 17663091 0 non-token data: nspam 0.000 0 6768342

Re: Increase in Image Spam

2014-02-20 Thread John Hardin
On Thu, 20 Feb 2014, Amir Caspi wrote: When I run sa-learn on this mailbox, it says: Learned tokens from 0 message(s) (0 message(s) examined) "0 messages examined" generally means either the format isn't what sa-learn expected, or the message is larger than the size limit. -- John Hardin
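One rough way to check the size explanation before digging further is to measure each message in the mbox against the limit. The following is a minimal sketch, not part of SpamAssassin: the function name `oversized_messages` is hypothetical, the 256 KB default is taken from the figure quoted in this thread, and the byte count Python reports may differ slightly from what sa-learn itself measures.

```python
import mailbox

# Assumption: 256 KB, the sa-learn default limit quoted in this thread.
DEFAULT_LIMIT_BYTES = 256 * 1024


def oversized_messages(mbox_path, limit_bytes=DEFAULT_LIMIT_BYTES):
    """Return (index, size_in_bytes, subject) for each message over the limit.

    Hypothetical helper for illustration; sizes exclude the mbox "From " line.
    """
    results = []
    # create=False: fail loudly if the path is wrong instead of creating a file.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for index, key in enumerate(box.iterkeys()):
            size = len(box.get_bytes(key))
            if size > limit_bytes:
                subject = str(box.get_message(key).get("Subject", ""))
                results.append((index, size, subject))
    finally:
        box.close()
    return results
```

If the function returns every message in the file, the size limit is the likely cause of "0 message(s) examined". If it returns nothing and Python also finds zero messages, the file format is the more likely culprit.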

Re: Increase in Image Spam

2014-02-20 Thread Amir Caspi
On Feb 20, 2014, at 11:21 AM, Kris Deugau wrote: > Have you tried learning one specific FN, then reprocessing that message > to see what Bayes score it gets? IME it will usually shift from > BAYES_00 to at least BAYES_40 in most cases, even with a large sitewide > DB with far more tokens than th

Re: Increase in Image Spam

2014-02-20 Thread Benny Pedersen
On 2014-02-20 19:34, Axb wrote: well, not huge...let me brag :) sa-learn --dump magic 0.000 0 3 0 non-token data: bayes db version 0.000 0 17663091 0 non-token data: nspam 0.000 0 6768342 0 non-token data: nham how many r

Re: Increase in Image Spam

2014-02-20 Thread Axb
On 02/20/2014 06:44 PM, Amir Caspi wrote: On Feb 20, 2014, at 10:34 AM, Axb wrote: I hope you're running SA 3.4 so: I am still on 3.3.2 because nobody has yet packaged 3.4 for CentOS 5.x, from what I can tell. I have the package from the rpmforge-extras repo, and 3.3.2 is still the most cur

Re: Increase in Image Spam

2014-02-20 Thread Kris Deugau
Amir Caspi wrote: > Bayes is set to autolearn, and I manually run sa-learn about once a week on > my spam folder (to learn the FNs, plus lower-scoring spam that was not > autolearned). Try setting up a cron job to run this daily or even as often as hourly. The faster you get feedback into the s

Re: Increase in Image Spam

2014-02-20 Thread Benny Pedersen
On 2014-02-20 18:06, Amir Caspi wrote: for whatever reason, many of the FNs I've been getting lately are passing because they hit BAYES_00, even though they are matching AC_SPAMMY_URI_PATTERNS. I need to enable bayes tokens in the headers so I can see why these are considered so hammy when I kn

Re: Increase in Image Spam

2014-02-20 Thread Amir Caspi
On Feb 20, 2014, at 10:34 AM, Axb wrote: > I hope you're running SA 3.4 so: I am still on 3.3.2 because nobody has yet packaged 3.4 for CentOS 5.x, from what I can tell. I have the package from the rpmforge-extras repo, and 3.3.2 is still the most current version there (and on Atomic and AtRP

Re: Increase in Image Spam

2014-02-20 Thread Axb
On 02/20/2014 06:22 PM, Amir Caspi wrote: On Feb 20, 2014, at 10:15 AM, Axb wrote: What kind of traffic are you dealing with? personal, corporate? ISPish? How many domains/users/msgs/day? This is mostly personal email with a little bit of corporate. In this instance, it is for a single doma

Re: Increase in Image Spam

2014-02-20 Thread Amir Caspi
On Feb 20, 2014, at 10:15 AM, Axb wrote: > What kind of traffic are you dealing with? personal, corporate? ISPish? > How many domains/users/msgs/day? This is mostly personal email with a little bit of corporate. In this instance, it is for a single domain with 3 users and approximately 50-100

Re: Increase in Image Spam

2014-02-20 Thread Axb
On 02/20/2014 06:06 PM, Amir Caspi wrote: Hi all, Following some off-list discussions with Kevin, John, et al., I had a question that was suggested I bring up on-list, so here it is: For whatever reason, many of the FNs I've been getting lately are passing because they hit BAY

Re: Increase in Image Spam

2014-02-20 Thread Amir Caspi
Hi all, Following some off-list discussions with Kevin, John, et al., I had a question that was suggested I bring up on-list, so here it is: For whatever reason, many of the FNs I've been getting lately are passing because they hit BAYES_00, even though they are matching AC_SPA

Re: Increase in Image Spam

2014-02-11 Thread Benny Pedersen
On 2014-02-11 20:59, RW wrote: Actually I find BAYES_99 to be so reliable that I'd be happy to score it above 5.0. Others have made similar comments too. there is a number of ways to punish spf pass domains for spamming :) blacklist_from *@foo.example.org and for the bayes one could make anoth

Re: Increase in Image Spam

2014-02-11 Thread RW
On Tue, 11 Feb 2014 20:22:00 +0100 Benny Pedersen wrote: > On 2014-02-11 18:25, Andy Jezierski wrote: > > > They don't really hit on any rules > > > > X-Spam-Status: No, score=3.5 required=5.0 > > tests=BAYES_99,HTML_MESSAGE, > > > > SPF_HELO_PASS,SPF_PASS autolearn=no autolearn_force=no >

Re: Increase in Image Spam

2014-02-11 Thread Benny Pedersen
On 2014-02-11 18:25, Andy Jezierski wrote: They don't really hit on any rules X-Spam-Status: No, score=3.5 required=5.0 tests=BAYES_99,HTML_MESSAGE, SPF_HELO_PASS,SPF_PASS autolearn=no autolearn_force=no version=3.4.0-rc5 bayes is seeing it as spam, so it might be in vain :) well if ba

Re: Increase in Image Spam

2014-02-11 Thread Kevin A. McGrail
On 2/11/2014 2:02 PM, John Hardin wrote: On Tue, 11 Feb 2014, Amir Caspi wrote: I could release the rules publicly but that may end up backfiring, per above. John, Kevin, what do you guys think? Spammers can install SpamAssassin as easily as anyone else, that's a known risk. Any rules we pr

Re: Increase in Image Spam

2014-02-11 Thread John Hardin
On Tue, 11 Feb 2014, Amir Caspi wrote: I could release the rules publicly but that may end up backfiring, per above. John, Kevin, what do you guys think? Spammers can install SpamAssassin as easily as anyone else, that's a known risk. Any rules we provide they can potentially test against th

Re: Increase in Image Spam

2014-02-11 Thread Amir Caspi
On Feb 11, 2014, at 10:25 AM, Andy Jezierski wrote: > They don't really hit on any rules A number of image spams have certain template formats and I've written custom rules to catch many... however, I've been hesitant to release those rules publicly since spammers could just change their t

Increase in Image Spam

2014-02-11 Thread Andy Jezierski
I've been seeing a pretty big increase in image spam over the last month or so. I remember using FuzzyOCR years ago when image spam was a much bigger problem. Since FuzzyOCR hasn't been maintained in several years, is there an alternative that would work? Or is there another way

Re: Increase in image spam

2007-02-16 Thread LuKreme
On 6-Feb-2007, at 09:30, Sujit Choudhury wrote: Lately there has been an increase in image spam. We are using imageinfo.cf with ImageInfo plugin. However, this is not making a lot of difference. We are also using virtually all the SARE rules plus using sa-update and restarting spamd everyday