Re: spam from venamail

2012-11-10 Thread C. Bensend

 I just grepped a gigabyte collection of spam for venamail.com and found
 nothing.

And I just queried my database of about 1.67 million spams of mine,
and didn't find a single one.

Benny


-- 
Unless you're a lawyer, you don't understand Oracle licensing.
That applies equally to Oracle employees as well as customers.
  -- Me, 2012-05-10




Re: Question for experts....

2011-11-28 Thread C. Bensend

 Why bug such people unless their product IS vulnerable? Note that this
 seems to be an email trying to get people who have a vulnerable browser
 to click a specific link. I'd expect that link to be loaded with a zero
 day or the likes that the browser exhibits.

 I figured people here with their basic interest in security might know of
 vulnerable browsers to make progressing to the next logical steps easy. I
 am somewhat surprised NOBODY here seems to know.

 {^_^}

I guess I'm confused why you think this is a vulnerability...  It's
simply another way to represent an IP address that browsers grok.
Is it obfuscation?  Sure.  But hell, for the average internet user,
a NON-obfuscated IP address is cryptic enough.  ;)  This is just
another way to do it...

Benny

PS:  My Firefox (8.0) and my IE (8.0.whatever.build) both retrieved
an HTML document, or at least presented an empty one with only
a header.
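
For anyone wondering what "another way to represent an IP address" can
look like: one common alternate form is the whole IPv4 address written as
a single 32-bit number. A minimal Python sketch, assuming that is the kind
of obfuscation in question (the thread does not show the actual link).
192.0.2.1 is a documentation-only address picked for illustration, and
the function names are made up here:

```python
import ipaddress


def dotted_to_int(dotted: str) -> int:
    """Return the 32-bit integer form of a dotted-quad IPv4 address."""
    return int(ipaddress.IPv4Address(dotted))


def int_to_dotted(value: int) -> str:
    """Return the dotted-quad form of a 32-bit integer IPv4 address."""
    return str(ipaddress.IPv4Address(value))


if __name__ == "__main__":
    number = dotted_to_int("192.0.2.1")
    # Show the decimal form, the hex form, and the round trip back.
    print(number, hex(number), int_to_dotted(number))
```

Both forms name the same host, which is why this reads as obfuscation
rather than as an exploit in its own right.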


-- 
Cats land on their feet. Toast lands peanut butter side down. A cat
with toast strapped to its back will hover above the ground in a state
of quantum indecision.   -- Unknown



Re: Question for experts....

2011-11-28 Thread C. Bensend

 I guess I'm confused why you think this is a vulnerability...  It's
 simply another way to represent an IP address that browsers grok.
 Is it obfuscation?  Sure.  But hell, for the average internet user,
 a NON-obfuscated IP address is cryptic enough.  ;)  This is just
 another way to do it...

 Might I suggest reading the specification for URLs. I believe that
 only DNS addresses and decimal dotted quads are legal. The other
 misrepresentations are not permitted so responding to them is a bug
 for a browser or other URL based tool. If I'm wrong I'd like to know
 with the appropriate URL RFC cited.

 {^_^}

I didn't say legal.  :)  Browsers have a long and rich history of
bending/breaking the rules in order to make the browsing experience
faster/better/insert-buzzword-here.

HTML content (web pages, rich email, blah blah blah) is horrifying
nowadays.  Standards?  Nope, standards get in the way.  I wouldn't
be surprised if a vast majority of the HTML clients out there (web
browsers, email clients, etc) exhibit this behavior.

There's a difference between "vulnerability" and "it works anyway".
Honest question - do you believe this is a *vulnerability*, or are
you just irritated because it's happening?  :)

Not intending to come across as snarky...  I just don't think this
is a bug or vulnerability, but probably considered a feature.

Benny


-- 
Cats land on their feet. Toast lands peanut butter side down. A cat
with toast strapped to its back will hover above the ground in a state
of quantum indecision.   -- Unknown



Re: two SA folders and sa-updates

2010-08-19 Thread C. Bensend

 better - *don't even think of using them* - they are not being updated
 and never will.

 Anything worthy has already been migrated to SA mainstream and the few
 SARE survivors are also SA commiters so they'll commit to SA instead of
 SARE.

 Anybody hammering the rulesemporium with lwp/wget on a regular basis is
 advised to stop unless in need of surprises when the files are zeroed out.

I'm changing my SpamAssassin config to remove the SARE rules due to
all this advice, and I just want to make sure I'm doing the correct
thing here...

I have an /etc/mail/spamassassin/sa-update-channels.txt file that
lists the additional SARE channels I was updating via Daryl's
site.  Only SARE channels are in it.

Given this cronjob that runs once a day:

/usr/local/bin/sa-update --channelfile
/etc/mail/spamassassin/sa-update-channels.txt --gpgkey 856AA88A --gpgkey
6C6191E3 && /usr/local/bin/spamassassin --lint && pkill -SIGHUP spamd

I should just be able to rip out the sa-update-channels.txt and the
second GPG key, and I'll still get the stock ruleset updates, but
won't be buggin' Daryl or futzing with the SARE rules any longer,
correct?  I will of course remove them from the rules directories
and restart SpamAssassin.  :)

Just want to make sure I'll still get the regular updates...

Thanks!

Benny


-- 
Something's going on in this house - last night, I saw a face!
Did it have a nose?
Yes!
That sounds like a face all right.
  -- Scary Movie 4




Re: two SA folders and sa-updates

2010-08-19 Thread C. Bensend

 Then you haven't been getting the regular updates.  If you don't have
 updates.spamassassin.org in your --channelfile, it won't check it...

No, I stand corrected, sorry for the misinformation.  At the very
top of the file (they had scrolled out of my term), I have:

updates.spamassassin.org
sought.rules.yerp.org

So, given this *accurate* information, I should be OK just removing
the SARE channels from said channel file, and removing the GPG key.
Right?

:)
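
Pruning the channel file amounts to dropping the SARE lines and leaving
everything else alone. A small Python sketch, assuming one channel name
per line; `drop_channels` and the `.sare.example.invalid` suffix are
placeholders for illustration, not the real channel names:

```python
def drop_channels(lines, unwanted_suffix):
    """Return the channel-file lines whose channel does not end in unwanted_suffix.

    Blank lines are passed through untouched so the file keeps its layout.
    """
    kept = []
    for line in lines:
        channel = line.strip()
        if channel and channel.endswith(unwanted_suffix):
            continue  # this is one of the channels being retired
        kept.append(line)
    return kept


if __name__ == "__main__":
    # Hypothetical file contents; only the first two names come from the thread.
    current = [
        "updates.spamassassin.org\n",
        "sought.rules.yerp.org\n",
        "70_demo_rules.cf.sare.example.invalid\n",
    ]
    print("".join(drop_channels(current, ".sare.example.invalid")), end="")
```

Running 'spamassassin --lint' after any change like this is still the
safe habit before signalling spamd.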

Benny


-- 
Something's going on in this house - last night, I saw a face!
Did it have a nose?
Yes!
That sounds like a face all right.
  -- Scary Movie 4




A few basic questions about custom rules

2010-03-26 Thread C. Bensend

Hey folks,

   I'm playing around with writing some rules, something I've
never done before.  I'm not looking to do anything fancy, I just
wanted to learn how to do it.

   So, I picked one of my spams at total random, and wanted to
match on e-shop.gr in any of the headers.  The rule, I believe,
should look like this:

describe CJB_ESHOP_GR Contains reference to e-shop.gr
header CJB_ESHOP_GR ALL =~ /e-shop\.gr/i
score CJB_ESHOP_GR 0.01

   The efficiency and wisdom of matching against any headers aside,
is this correct so far?  I'm just thinking simple, I just want to
match the e-shop.gr string.
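
The pattern itself can be sanity-checked outside SpamAssassin. A rough
Python sketch, assuming Python's `re` treats this simple pattern the same
way Perl does; it checks each header value separately, which only
approximates what the ALL pseudo-header does, and the sample message and
`headers_matching` name are invented for illustration:

```python
import re
from email import message_from_string

# Python spelling of the rule's pattern /e-shop\.gr/i
PATTERN = re.compile(r"e-shop\.gr", re.IGNORECASE)


def headers_matching(raw_message: str):
    """Return the names of headers whose value contains e-shop.gr."""
    msg = message_from_string(raw_message)
    return [name for name, value in msg.items() if PATTERN.search(value)]


# Invented sample message, not one of the real spams.
SAMPLE = (
    "Received: from mail.E-Shop.gr (mail.e-shop.gr [192.0.2.10])\n"
    "\tby mx.example.org; Fri, 26 Mar 2010 10:00:00 +0000\n"
    "From: Offers <news@e-shop.gr>\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

if __name__ == "__main__":
    print(headers_matching(SAMPLE))
```

The backslash matters: with a bare dot the pattern would also accept any
single character between "e-shop" and "gr".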

   Assuming that's OK, I moved on to creating the rule.  Now, I store
my user preferences in a PostgreSQL database, so this is where I'm
slightly unsure...  Can I simply add my rule above to my userprefs
table, one line per row?  Like so (pardon the line wrap):

db=> SELECT prefid, username, preference, value FROM userpref WHERE
username = 'me' AND preference LIKE '%CJB%' ORDER BY preference;
 prefid | username |      preference       |              value
--------+----------+-----------------------+---------------------------------
    450 | me       | describe CJB_ESHOP_GR | Contains reference to e-shop.gr
    452 | me       | header CJB_ESHOP_GR   | ALL =~ /e-shop\.gr/i
    453 | me       | score CJB_ESHOP_GR    | 0.01
(3 rows)

   Is this the correct way to do it?  Or do I *have* to add them to
local.cf or another such local file?

   I'm also unsure if I need to restart spamd to pick up on this new
rule, if indeed it can be stored in SQL - my other userprefs do not
need a restart, but rules I'm very unsure about.  I tried cat'ing a
sample spam and piping it to spamc (cat sample.txt | spamc), but
it did not hit my test rule.  And yes, e-shop.gr shows up in the
Received and From headers.  :)

   This is SpamAssassin 3.2.5 FYI.  Thanks, folks!

Benny


-- 
Whoever said that hell hath no fury like a women scorned never
owned a cat.
  -- bash.org



Re: A few basic questions about custom rules

2010-03-26 Thread C. Bensend

I received answers to my questions off-list, so I'll reply to myself
and share them with the rest of everyone.  :)

 describe CJB_ESHOP_GR Contains reference to e-shop.gr
 header CJB_ESHOP_GR ALL =~ /e-shop\.gr/i
 score CJB_ESHOP_GR 0.01

 The efficiency and wisdom of matching against any headers aside,
 is this correct so far?  I'm just thinking simple, I just want to
 match the e-shop.gr string.

While not the best way to catch one of these emails, this will
indeed do what I had intended - match the string 'e-shop.gr' in
any of the email headers.

 Assuming that's OK, I moved on to creating the rule.  Now, I store
 my user preferences in a PostgreSQL database, so this is where I'm
 slightly unsure...  Can I simply add my rule above to my userprefs
 table, one line per row?  Like so (pardon the line wrap):

 Is this the correct way to do it?  Or do I *have* to add them to
 local.cf or another such local file?

Nope, this isn't possible - rules cannot be stored in a database,
nor will most installations allow user-defined rules.  I went
ahead and created my own .cf file in /etc/mail/spamassassin,
and that works great.

 I'm also unsure if I need to restart spamd to pick up on this new
 rule, if indeed it can be stored in SQL - my other userprefs do not
 need a restart, but rules I'm very unsure about.  I tried cat'ing a
 sample spam and piping it to spamc (cat sample.txt | spamc), but
 it did not hit my test rule.  And yes, e-shop.gr shows up in the
 Received and From headers.  :)

Yes, a restart is necessary.  Always do a 'spamassassin --lint'
first, too.

Thank you!

Benny


-- 
Whoever said that hell hath no fury like a women scorned never
owned a cat.
  -- bash.org



Re: Using SpamAssassin to parse Received headers

2007-08-16 Thread C. Bensend

 Look in man perldoc Mail::SpamAssassin::Plugin for the definition of a
 new plugin (for example MyFilter)

 You could do a lot of interesting things inside the plugin
 with the Mail::SpamAssassin::PerMsgStatus element (again perldoc ...)

Hi Leonardo,

   I think this example might be more suited towards creating a new
SA plugin, right?  That's not at all what I'm trying to do here -
this isn't part of the mail stream, this is a standalone perl program.
I just want to take advantage of SA's excellent Received header
parsing to get the IP addresses for the relays.

   Basically, here's what I'm trying to do:

1) Connect to DB
2) Grab the full text of an email into a variable
3) Use SA's code to parse the headers
4) Grab the IP addresses from the Received headers

   #3 and #4 are where I'm having issues, because I don't understand
the code (and I suck pretty bad at Perl).
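
Steps 2 through 4 can be roughed out without SpamAssassin at all, just to
show the shape of the data flow. A Python sketch using only the standard
library; this is NOT SA's Received parser (which copes with far more
header formats), it only picks up bracketed IPv4 literals, and the sample
message and `relay_ips` name are invented for illustration:

```python
import ipaddress
import re
from email import message_from_string

# Bracketed IPv4 literal such as "[192.0.2.25]".
BRACKETED_IP = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def relay_ips(raw_message: str):
    """Return IPv4 addresses found in Received headers, topmost header first."""
    msg = message_from_string(raw_message)
    found = []
    for header in msg.get_all("Received", []):
        for candidate in BRACKETED_IP.findall(header):
            try:
                found.append(str(ipaddress.IPv4Address(candidate)))
            except ValueError:
                pass  # looked like an address but is not valid, e.g. 999.1.1.1
    return found


# Invented sample; addresses come from the documentation-only ranges.
SAMPLE = (
    "Received: from edge.example.net (edge.example.net [203.0.113.7])\n"
    "\tby mx.example.org with ESMTP; Tue, 14 Aug 2007 12:00:00 -0500\n"
    "Received: from localhost (unknown [198.51.100.4])\n"
    "\tby edge.example.net with SMTP; Tue, 14 Aug 2007 11:59:58 -0500\n"
    "Subject: hello\n"
    "\n"
    "body text mentioning [192.0.2.99] is ignored\n"
)

if __name__ == "__main__":
    ips = relay_ips(SAMPLE)
    print("all relays:", ips)
    print("handoff:", ips[0] if ips else None)
```

Treating the first address as the handoff assumes the topmost Received
header was added by your own server and that there is only one internal
hop.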

Thanks, and hope that clears up my problem,

Benny


-- 
This officer's men seem to follow him merely out of idle curiosity.
   -- Sandhurst officer cadet evaluation



Using SpamAssassin to parse Received headers

2007-08-14 Thread C. Bensend

Hey folks,

   This is a question about using SpamAssassin's perl interface, not
about filtering mail.

   I'm using 3.2.2 (soon to be 3.2.3) on OpenBSD, built from source.
In addition to using SA to filter my email, I'd also like to take
advantage of SA's ability to parse Received headers for my own
project.

   I store the entire spam in a database.  What I want to do is to
be able to parse out the Received headers' IP addresses from the
full text of each email.  I only really need the IP that hands off
to my own servers, but it would be useful to get an array of all of
them.

   I am not at all a perl guru - I've written quite a bit of it,
but more complex stuff than my simple scratchings makes my brain
swell and hurt.  If someone could give me a quick leg up in going
from a variable containing the entire message to an array of IPs
(or just the handoff IP, that's fine), I'd really appreciate it.

Thanks a bunch!

Benny


-- 
This officer's men seem to follow him merely out of idle curiosity.
   -- Sandhurst officer cadet evaluation



Getting the breakdown of the rules and scores in every email

2007-03-02 Thread C. Bensend

Hey folks,

   I'm running 3.1.8 on OpenBSD 4.0-STABLE, with sa-update snagging
updates and a subset of the SARE rules.  YAY sa-update, what a
snazzy addition to SpamAssassin.  :)

   I have report_safe set to 0 via SQL userprefs, so message bodies
are not modified, only headers.

   Now, on spam messages, I get the X-Spam-Report header like so:


X-Spam-Report:
 * 0.2 RCVD_ILLEGAL_IP Received: contains illegal IP address
 * 1.7 SARE_MLB_Stock1 BODY: SARE_MLB_Stock1
 * 1.8 TVD_FUZZY_SYMBOL BODY: TVD_FUZZY_SYMBOL
 * 1.7 SARE_LWTARGETP BODY: SARE_LWTARGETP
 * 0.0 HTML_MESSAGE BODY: HTML included in message
 * 0.3 HTML_FONT_BIG BODY: HTML tag for a big font size
 * 0.0 UPPERCASE_25_50 message body is 25-50% uppercase


   I would very much like to add that to _all_ messages, and I
know how to do that (add_report all something blah), but the part
I'm missing is the something and the blah.

   'add_header all FullReport _SUMMARY_' is almost there:


X-Spam-FullReport: 0.1 FORGED_RCVD_HELO Received: contains a forged HELO
 -2.6 BAYES_00 BODY: Bayesian spam probability is 0 to 1%
 [score: 0.]


   I think I'm on a roll, when I find this posting from Matt Kettler:

http://marc.theaimsgroup.com/?l=spamassassin-users&m=117008065926512&w=2

stating that _SUMMARY_ is only intended for in-body reports, ie, not
what I'm doing.  But I SEE what I want in the headers of messages
tagged as spam!  So...  It has to work somehow, right?  What
incantation of 'add_header all something blah' brings forth the same
header display?  I'm making an assumption that it _is_ an 'add_header
spam something blah' sort of thing, I guess...

   It would be helpful if there were some list of all the macros that
can be used, but I didn't find that either...  Is there such a thing?
I've been through the wiki and Mail::SpamAssassin::Conf and such...

Thanks much,

Benny


-- 
During the Armageddon, only two things will survive - cockroaches and
Cher.  -- What's her bra size online game










Re: Getting the breakdown of the rules and scores in every email

2007-03-02 Thread C. Bensend

 perldoc Mail::SpamAssassin::Conf

 Search for 'TEMPLATE TAGS'

Geh.  Ever stared at something so long, that you completely
miss the obvious?  Yeah, me too.

Thanks a bunch, Dave.

Benny


-- 
During the Armageddon, only two things will survive - cockroaches
and Cher.  -- What's her bra size online game



Re: Getting the breakdown of the rules and scores in every email

2007-03-02 Thread C. Bensend

 I would very much like to add that to _all_ messages, and I
 know how to do that (add_report all something blah), but the part
 I'm missing is the something and the blah.

 'add_header all FullReport _SUMMARY_' is almost there:

 This works for me:

 add_header  ham Report _REPORT_

That is _precisely_ what I was looking for.

Thanks very much, and sorry I missed those macros.

Benny


-- 
During the Armageddon, only two things will survive - cockroaches
and Cher.  -- What's her bra size online game



Re: Google Summer of Code 2007 ...

2007-02-21 Thread C. Bensend

 Perhaps this is trivial, or not desired by anyone else but myself,
 but I'd _love_ to be able to strip SpamAssassin tags via spamc and
 spamd, instead of having to fire up the full-blown spamassassin
 for each message.  :)

 formail ?

That would work in most cases, yes.  Unfortunately, not in mine.
Thanks for the pointer, though.  :)

Benny


-- 
During the armageddon, only two things will survive - cockroaches
and Cher.  -- What's her bra size online game



Re: Google Summer of Code 2007 ...

2007-02-16 Thread C. Bensend

 Also, any suggestions from outside the dev team?  Anyone got good ideas
 for new SpamAssassin features that would be good to pay someone to work on
 for 3 months?

Perhaps this is trivial, or not desired by anyone else but myself,
but I'd _love_ to be able to strip SpamAssassin tags via spamc and
spamd, instead of having to fire up the full-blown spamassassin
for each message.  :)

Benny


-- 
Very funny, Scotty. Now beam down my clothes.  -- James. T. Kirk




user_bayes_sql_custom_query ?

2006-12-08 Thread C. Bensend

Hey folks,

   So, I've been giving this some thought in the last week, as I'm
running into the old "either site bayes or per-user bayes, nothing
in between" issue.  I'm using simscan, which passes the first email
address to spamc, so for me it's a per-email-address limitation.

   For a majority of my users, that's fine - they only have _one_
email address.  For me, it's a problem, as I have dozens of email
addresses that are delivered to me, and sorted via maildrop.  Many
of these secondary addresses get tons of spam, but because they're
delivered to aliases, SA never applies bayes scoring, because the
user doesn't match the user my bayes database uses (using SQL,
of course).

   I would _love_ to have a bayes equivalent of
user_scores_sql_custom_query, where spamd would query a table
consisting of something like so:

email_alias  CHAR(64)
email_user   CHAR(64)

or something similar.  That way, I could populate it with data like:

[EMAIL PROTECTED]   [EMAIL PROTECTED]
[EMAIL PROTECTED]  [EMAIL PROTECTED]
[EMAIL PROTECTED]  [EMAIL PROTECTED]
[EMAIL PROTECTED] [EMAIL PROTECTED]
[EMAIL PROTECTED]  [EMAIL PROTECTED]

etc...

   So, in this scenario, an email comes in destined to one of the
many secondary email addresses.  spamd makes a query (SELECT
email_user FROM aliases WHERE email_alias = '$user').  If spamd
gets a hit, great, try to initialize the bayes database for that
user.  If not, skip bayes and go on with life.
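
The lookup described above can be sketched in a few lines. This is a
Python illustration only, with sqlite3 standing in for the real database;
the `aliases` table and its two columns follow the layout proposed above,
while `bayes_user_for`, `demo_connection`, and the addresses are made up:

```python
import sqlite3


def bayes_user_for(conn, address):
    """Return the mailbox owner for an alias, or None when there is no alias row."""
    row = conn.execute(
        "SELECT email_user FROM aliases WHERE email_alias = ?", (address,)
    ).fetchone()
    return row[0] if row else None


def demo_connection():
    """Build an in-memory aliases table with invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE aliases (email_alias CHAR(64) PRIMARY KEY, email_user CHAR(64))"
    )
    conn.executemany(
        "INSERT INTO aliases VALUES (?, ?)",
        [
            ("sales@example.org", "me@example.org"),
            ("lists@example.org", "me@example.org"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    for address in ("sales@example.org", "stranger@example.org"):
        owner = bayes_user_for(conn, address)
        print(address, "->", owner if owner else "no alias, skip bayes")
```

The query uses a bound parameter rather than pasting the address into the
SQL string, since the address comes straight from incoming mail.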

   Just a thought.  It would certainly help me in my situation,
but perhaps I'm just spending a little too much quality time with
the crackpipe.

Good idea?  Bad idea?  Dumb idea?

Benny


-- 
The faster you finish the fight, the less shot you will get.
-- Marine Corps Rules for
   Gunfighting




Re: user_bayes_sql_custom_query ?

2006-12-08 Thread C. Bensend

 Why not modify simscan to do this kind of lookup for you, and pass the
 correct username to SA?

Yes, absolutely, that would be another solution to the issue.  :)

The reason I ask here is because SA already does almost exactly
this sort of lookup for userpref.  Maybe some of the code could be
reused, but maybe not...  I'm not a developer, and you'd weep
yourself to sleep for weeks on end if I tried to come up with a
patch.  ;)

If there's no interest/resources, no problem.  It would be nice to
have, though.  :)

Benny


-- 
The faster you finish the fight, the less shot you will get.
-- Marine Corps Rules for
   Gunfighting




Re: SpamAssassin 3.1.7 and Openbsd 4.0 installation fail

2006-12-05 Thread C. Bensend

 Anybody can guide me how to proceed. I am installing SpamAssassin on
 OpenBSD 4.0 and it failed during the test phase. I have attached the
 output. Perl version is v5.8.8 built for i386-openbsd 4.0

You didn't build your own perl or anything, did you?

I have installed 3.1.7 on OpenBSD 4.0-STABLE dozens of times since
4.0 was released, and have not had any problems.

Have you updated to -STABLE?

Benny


-- 
The faster you finish the fight, the less shot you will get.
-- Marine Corps Rules for
   Gunfighting




SA 3.1.7 not picking up SQL-based Bayes

2006-12-03 Thread C. Bensend

Hey folks,

   I'm finishing up a mailserver upgrade this weekend, and I notice
that my new SQL-based install isn't picking up on user-based Bayes
data.  This is on a new, squeaky-clean OpenBSD 4.0-STABLE machine
running on AMD64, using SpamAssassin 3.1.7 with perl 5.8.8.

As per spamd -D info:

2006-12-03 22:41:53.760956500 [12889] dbg: config: retrieving prefs for
[EMAIL PROTECTED] from SQL server

OK, yay, spamd is picking up on the SQL userprefs.

2006-12-03 22:41:53.772480500 [12889] dbg: info: user has changed

Not sure what this means?

2006-12-03 22:41:53.774209500 [12889] dbg: bayes: using username:
[EMAIL PROTECTED]
2006-12-03 22:41:53.781308500 [12889] dbg: bayes: database connection
established
2006-12-03 22:41:53.786485500 [12889] dbg: bayes: found bayes db version 3
2006-12-03 22:41:53.789654500 [12889] dbg: bayes: unable to initialize
database for [EMAIL PROTECTED] user, aborting!
2006-12-03 22:41:54.117388500 [12889] dbg: bayes: not scoring message,
returning undef
2006-12-03 22:41:54.118260500 [12889] dbg: bayes: opportunistic call
attempt failed, DB not readable

Uh.  What does "unable to initialize database" mean?  Spamd has already
successfully connected to the PostgreSQL database above, right?  So what
does "initializing database" mean?

My user_scores_sql_custom_query is as follows, if that makes a
difference (not sure if that's consulted for Bayes data):


user_scores_sql_custom_query SELECT preference, value FROM userpref
WHERE username = _MAILBOX_ OR username = _USERNAME_ OR username =
'$GLOBAL' ORDER BY username ASC;


To add insult to injury, learning spam and ham work just fine.
It's just the Bayes scoring that seems to have issues.

So.  I'm at a loss at the moment...  My SA install is doing well,
but not as well as it should, if it's ignoring Bayes.  What info
can I pass along to help diagnose this problem?

Thanks much!

Benny


-- 
If stupidity were a handicap, you'd have the best parking spot.
--Bill Paul




Re: SA 3.1.7 not picking up SQL-based Bayes

2006-12-03 Thread C. Bensend

 I think its just a slightly confusing message.  If you run:
 sa-learn -u [EMAIL PROTECTED]

 Does it show that you have 200 ham and 200 spam in the database?  If so
 then there is a problem, if not you just need to train it some more.

 What the WARNING is telling you is that hey this database isn't ready
 for scoring so I'm not gonna use it.  This is why learning works just
 fine.  Finish training up the DB and see if it then starts working for
 you.

 Michael

 PS Possibly we should get the warning text changed a bit, feel free to
 open up a bug so we can track the work, thanks.

Hi Michael,

Well, I have the following in the script that runs every now and
again, to execute sa-learn:

[EMAIL PROTECTED] ~]$ sa-learn --dump magic | grep "non-token data: nham" |
awk '{ print $3 }'
257526
[EMAIL PROTECTED] ~]$ sa-learn --dump magic | grep "non-token data: nspam" |
awk '{ print $3 }'
470150
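
The 200 ham / 200 spam check mentioned above can be scripted against the
same dump. A rough Python sketch; the sample text is an assumed layout
(count in the third whitespace-separated column, which is what the awk
commands rely on) filled in with the counts shown here, and
`magic_counts` / `ready_for_scoring` are names made up for illustration:

```python
def magic_counts(dump_text):
    """Pull the nspam and nham counts out of 'sa-learn --dump magic' style text."""
    counts = {}
    for line in dump_text.splitlines():
        if "non-token data:" not in line:
            continue
        fields = line.split()
        name = fields[-1]
        if name in ("nspam", "nham"):
            counts[name] = int(fields[2])  # third column holds the count
    return counts


def ready_for_scoring(counts, minimum=200):
    """True when both the spam and ham counts reach the minimum."""
    return counts.get("nspam", 0) >= minimum and counts.get("nham", 0) >= minimum


# Assumed layout, not a verbatim capture.
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0     470150          0  non-token data: nspam
0.000          0     257526          0  non-token data: nham
"""

if __name__ == "__main__":
    counts = magic_counts(SAMPLE)
    print(counts, "ready" if ready_for_scoring(counts) else "needs more training")
```

Note that a dump run this way reports on whichever Bayes user the command
line resolves to, which may not be the user spamd is scoring for.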

I'm fairly sure I have enough ham and spam.  :)  Also, I'm watching
the PostgreSQL logfile when I do that, and it _is_ querying the
database.

Just for argument's sake, I checked for *BAYES* in the spamd logfile,
and I don't get a single hit.  So, Bayes is definitely not working
for _any_ of the accounts, not just mine.  :(

Thanks for any insight,

Benny


-- 
If stupidity were a handicap, you'd have the best parking spot.
--Bill Paul




Re: SA 3.1.7 not picking up SQL-based Bayes

2006-12-03 Thread C. Bensend

 Ahh but you didn't run the command I asked you to run.  You are passing
 the user: [EMAIL PROTECTED] to SpamAssassin so it will use that as
 the key for the database, running the command from the command line that
 way is going to use your unix id as the key.  I'm guessing you changed
 something in your mail setup to start passing in @domain in addition to
 the regular unix username.

Actually, yes, I did, but I don't think it turned out like we
were expecting (hence I didn't include it, I'm sorry):


[EMAIL PROTECTED] ~]$ sa-learn -u [EMAIL PROTECTED]   
 SpamAssassin version 3.1.7
Please select either --spam, --ham, --folders, --forget, --sync, --import,
--dump, --clear, --backup or --restore
Usage:
sa-learn [options] [file]...

sa-learn [options] --dump [ all | data | magic ]

Options:

 --ham                     Learn messages as ham (non-spam)
 --spam                    Learn messages as spam
 --forget                  Forget a message
 --use-ignores             Use bayes_ignore_from and bayes_ignore_to
 --sync                    Syncronize the database and the journal if needed
 --force-expire            Force a database sync and expiry run
 --dbpath path             Allows commandline override (in bayes_path form)
                           for where to read the Bayes DB from
 --dump [all|data|magic]   Display the contents of the Bayes database
                           Takes optional argument for what to display
  --regexp re              For dump only, specifies which tokens to
                           dump based on a regular expression.
 -f file, --folders=file   Read list of files/directories from file
 --dir                     Ignored; historical compatability
 --file                    Ignored; historical compatability
 --mbox                    Input sources are in mbox format
 --mbx                     Input sources are in mbx format
 --showdots                Show progress using dots
 --no-sync                 Skip syncronizing the database and journal
                           after learning
 -L, --local               Operate locally, no network accesses
 --import                  Migrate data from older version/non DB_File
                           based databases
 --clear                   Wipe out existing database
 --backup                  Backup, to STDOUT, existing database
 --restore filename        Restore a database from filename

 -u username, --username=username
                           Override username taken from the runtime
                           environment
 -C path, --configpath=path, --config-file=path
                           Path to standard configuration dir
 -p prefs, --prefspath=file, --prefs-file=file
                           Set user preferences file
 --siteconfigpath=path     Path for site configs (def:
                           /etc/mail/spamassassin)
 -D, --debug-level         Print debugging messages
 -V, --version             Print version
 -h, --help                Print usage message


But regardless - won't the user_scores_sql_custom_query I posted
handle that possibility?  I am _so_ not an SQL guru, but it looks
correct to me?  I'm never afraid to admit a mistake, so if I'm
smoking crack here, please step up and say so.  :)

Benny


-- 
If stupidity were a handicap, you'd have the best parking spot.
--Bill Paul




Re: SA 3.1.7 not picking up SQL-based Bayes

2006-12-03 Thread C. Bensend

 add the rest of your --dump magic command to that.

Right.  Duh me.  Heh.  The following was captured via -D:

[20507] dbg: bayes: using username: [EMAIL PROTECTED]
[20507] dbg: bayes: database connection established
[20507] dbg: bayes: found bayes db version 3
[20507] dbg: bayes: unable to initialize database for
[EMAIL PROTECTED] user, aborting!
[20507] dbg: config: score set 0 chosen.
[20507] dbg: bayes: database connection established
[20507] dbg: bayes: found bayes db version 3
[20507] dbg: bayes: unable to initialize database for
[EMAIL PROTECTED] user, aborting!
ERROR: Bayes dump returned an error, please re-run with -D for more
information

 That custom query has nothing to do with bayes or awl sql stuffs.

Gotcha.  Thanks.

Thanks for taking a look at this, Michael,

Benny


-- 
If stupidity were a handicap, you'd have the best parking spot.
--Bill Paul




Re: sa-learn --backup and --restore issue: duplicate key violations

2006-03-25 Thread C. Bensend

 After a number of these, it dies with:

 bayes: encountered too many errors (20) while parsing seen lines,
 reverting to empty database and exiting

 ERROR: Bayes restore returned an error, please re-run with -D for more
 information

 .. which makes me sad.  So, my question - is there a way to
 fix this?  Or will I have to end up dumping my Bayes and starting
 over?  I really hope I don't have to do that, because my Bayes
 database is huge and really quite accurate.

   I hadn't seen any responses to my question as of yet, so I
decided to do some more experimenting.

   I ran the backup file through sort and uniq, moved the version
line back to the top, and ran it through sa-learn again.  This
time, it completed successfully:

[17678] dbg: bayes: parsed 522507 lines
[17678] dbg: bayes: created database with 117864 tokens based on 249654
spam messages and 155005 ham messages

   So, is this an OK thing to have done?  Due to the lack of
a single error, I'm guessing that changing the order of the
backup file (other than the version line) doesn't hurt anything.
Is this correct?
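
The same cleanup can be done without reordering anything. A minimal
Python sketch, assuming the version line is the first line of the backup
and that only exact duplicate lines need to go; unlike sort and uniq it
keeps the original order, `dedupe_backup` is a made-up name, and the
sample lines are an invented layout, not real backup data:

```python
def dedupe_backup(lines):
    """Drop exact duplicate lines while keeping the first (version) line on top."""
    if not lines:
        return []
    header, rest = lines[0], lines[1:]
    seen = {header}
    result = [header]
    for line in rest:
        if line not in seen:  # keep only the first copy of each line
            seen.add(line)
            result.append(line)
    return result


if __name__ == "__main__":
    # Hypothetical lines for illustration only.
    sample = [
        "version-line\n",
        "seen msgid-a\n",
        "seen msgid-b\n",
        "seen msgid-a\n",
    ]
    print("".join(dedupe_backup(sample)), end="")
```

Two entries that differ in any field still count as different lines here,
exactly as they would with uniq.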

   Also, any ideas how my Bayes database got duplicate tokens
in the first place?

Thanks,

Benny


-- 
A computer lets you make more mistakes faster than any invention
in human history, with the possible exceptions of handguns and
tequila.  -- Found on usenet



sa-learn --backup and --restore issue: duplicate key violations

2006-03-24 Thread C. Bensend

Hey folks,

   I'm going to be upgrading my mailserver in a month or two,
so I'm running through some different configurations for
SpamAssassin, IMAP, and anti-virus.  I'm working on testing the
SQL stuff for user configs and Bayes right now.

   So, here are the stats:


Old mailserver                  New mailserver
==============                  ==============
OpenBSD 3.6 on AMD64            OpenBSD 3.9 on AMD64
SpamAssassin 3.0.4 using files  SpamAssassin 3.1.1 using SQL


   To begin testing, I did a 'sa-learn --backup > outfile' on
the existing mailserver, and 'sa-learn --restore outfile' on
a POS test box I have installed with a recent snapshot of
OpenBSD 3.9.  I used the native SpamAssassin's version of
sa-learn (ie, I used 3.0.4's sa-learn on the old box, and
3.1.1's version of sa-learn on the new).

   The dump is significant - over a half a million lines and
around 28MB.  I should mention that I believe I have
SpamAssassin properly configured to talk to the database on
the POS testing box, everything seems fine there.

   The restore starts fine, and runs and runs.  I see the
tokens being stuffed into the bytea columns, and finally when
it comes to the bayes_seen stuff, I see the INSERTs flying
past.  Yay!

   But after a while (I know it got over 270,000 rows INSERTed,
but I don't know how many more after that), it starts throwing
unique constraint violations:

[21458] dbg: bayes: error inserting msgid in seen table for line:
[EMAIL PROTECTED]
[21458] dbg: bayes: seen_put: SQL error: ERROR:  duplicate key violates
unique constraint bayes_seen_pkey
[21458] dbg: bayes: error inserting msgid in seen table for line:
[EMAIL PROTECTED]
[21458] dbg: bayes: seen_put: SQL error: ERROR:  duplicate key violates
unique constraint bayes_seen_pkey

   After a number of these, it dies with:

bayes: encountered too many errors (20) while parsing seen lines,
reverting to empty database and exiting

ERROR: Bayes restore returned an error, please re-run with -D for more
information

   .. which makes me sad.  So, my question - is there a way to
fix this?  Or will I have to end up dumping my Bayes and starting
over?  I really hope I don't have to do that, because my Bayes
database is huge and really quite accurate.
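
What the log is showing is a primary-key collision: the same message ID
being inserted twice. A toy Python reproduction, using sqlite3 as a
stand-in for PostgreSQL and a made-up one-column table `seen_demo` (the
real bayes_seen schema is not shown in this thread); `make_demo_db` and
`load_seen` are illustrative names only:

```python
import sqlite3


def make_demo_db():
    """Create an in-memory table whose only column is a primary key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE seen_demo (msgid TEXT PRIMARY KEY)")
    return conn


def load_seen(conn, msgids):
    """Insert msgids one at a time and return the ones rejected as duplicates."""
    rejected = []
    for msgid in msgids:
        try:
            conn.execute("INSERT INTO seen_demo (msgid) VALUES (?)", (msgid,))
        except sqlite3.IntegrityError:
            rejected.append(msgid)  # key already present in the table
    return rejected


if __name__ == "__main__":
    conn = make_demo_db()
    print(load_seen(conn, ["id-1@example.org", "id-2@example.org", "id-1@example.org"]))
```

Every repeated ID in the input turns into one rejected insert, so a dump
with enough repeats will trip an error limit like the one reported above.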

Thanks, folks!

Benny


-- 
A computer lets you make more mistakes faster than any invention
in human history, with the possible exceptions of handguns and
tequila.  -- Found on usenet



Re: SA-LEARN HANGING when database over 2000 SPAM messages

2006-02-27 Thread C. Bensend

 I can't offer much assistance with your problem, but on the db size, I can
 say that we were running it with around 25k spams and 25k hams learned,
 with sa-learn running on shared imap folders every hour adding more.

This is from this morning's sa-learn run:

Total number of HAM messages : 144335
Total number of SPAM messages: 232633

I doubt if the OP's problem has to do with the number of emails.  :)


-- 
A computer lets you make more mistakes faster than any invention
in human history, with the possible exceptions of handguns and
tequila.  -- Dave Pooser



RE: Simple question TRUE or FALSE (More data to answer this question)

2005-05-19 Thread C. Bensend

 Please don't take this as me doubting you - but how in the world are you
 able to scan a message in 2-3 seconds?  I assume you're running some of

Personally, I rarely have any processing times over 1 second.  Most
of mine are between 0.3 and 0.9 seconds per message.

I do not run any network tests, however.  Stock SpamAssassin rules,
the only modifications I've made have been some scoring adjustments.

This is on an AMD64 3000+ with 1GB of DDR400 RAM, running
OpenBSD 3.6-STABLE, spamd/spamc, and qmail.

Benny


-- 
You come from a long line of scary women. -- Ranger, Three To
   Get Deadly



RE: Amazon is killing me....

2005-02-28 Thread C. Bensend

 Doesn't look like spamassassin. I think maybe you should look at your
 Clamv virus attachment config http://www.clamav.net/ or call the guy you
 laid off who setup the clam av filter. :-)

I believe that is qmail-scanner that's complaining about the MIME
stuff.  The OP needs to dive into qmail-scanner's config and/or
check with their mailing list.

Benny


-- 
So scary, Steven King shiat his pants.
-- Photoshop contest, Fark



sa-learn fails on certain spams with out of memory error

2005-02-02 Thread C. Bensend

Hey folks,

   I'm running SpamAssassin 3.0.2 on an OpenBSD 3.6-STABLE machine,
on an AMD64 3000+ with 2GB of RAM.  I have yet to see this machine
even touch the second gig of RAM, and it's never been into swap.

   This is a new server, so I'm trying to train Bayes using a corpus
I've been saving for a while.  Unfortunately, I seem to have found
an issue with sa-learn (or perhaps sa-learn and OpenBSD):

[EMAIL PROTECTED] ~]$ sa-learn --showdots --spam --dir
/home/benny/Maildir/.SPAM.corpus.2004.archive15/cur/
...
...
..Out of memory!

   This failure is reproducible every time, on the exact same message.
When sa-learn fails in this manner, it also fails to clean up its
lock file (although, I suppose that's to be expected if it's the OS
that's killing it), presenting a minor DoS situation for future
sa-learn runs.

   I have increased my limits to the same level as system daemons and
root, to no effect.  As a test, I tried running the exact same command
as root, and got the same "Out of memory!" error.  I went through the
SpamAssassin source, and I don't find this error, so I'm thinking it's
OpenBSD clamping down on sa-learn for some reason.
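
For what it's worth, "Out of memory!" is the message perl itself prints
when an allocation fails, which would fit with the string not being in
the SpamAssassin source. To see which per-process limits are actually in
effect for a given login, here is a quick Python sketch; the `limits`
helper is made up for illustration, and whether the data-size limit is
the one being hit here is only a guess:

```python
import resource


def limits(names=("RLIMIT_DATA", "RLIMIT_STACK", "RLIMIT_AS")):
    """Return {name: (soft, hard)} for each named limit this platform defines."""

    def show(value):
        return "unlimited" if value == resource.RLIM_INFINITY else value

    result = {}
    for name in names:
        const = getattr(resource, name, None)
        if const is None:
            continue  # limit not defined on this platform
        soft, hard = resource.getrlimit(const)
        result[name] = (show(soft), show(hard))
    return result


if __name__ == "__main__":
    for name, (soft, hard) in limits().items():
        print(f"{name}: soft={soft} hard={hard}")
```

Running it as both the mail user and root shows whether the two logins
really end up with the same ceilings.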

   I have gone through several thousand mails, and randomly picked
twenty that cause sa-learn to fail every time.  They can be found at:

http://www.bennyvision.com/temp/sa/

   I have included twenty emails that cause it to fail (broken*),
the same twenty emails with the SpamAssassin 2.64 markup removed
(stripped/broken*.stripped), as well as the output of 'perl -V'
and the output of 'cat broken1 | spamassassin -D' to show the debug
output and actual error.

   I asked a related question over on the OpenBSD misc list, asking
what limits I might adjust to get around this, but I haven't found a
solution yet.  What is sa-learn doing that's even being limited as
_root_?!?

   If someone could help me out with this, it would be GREATLY
appreciated.

Thanks much!

Benny


-- 
I'm on the Zoloft to keep from killing y'all.
  -- Mike Tyson