Re: SA 32x build not finding openssl headers on FreeBSD?

2007-12-31 Thread snowcrash+sa
cd /usr/local/build/spamassassin/spamc
perl version.h.pl
./configure --enable-ssl
cd ../

 those should not be necessary.  Makefile.PL will call the spamc
 configure script with the appropriate
 args (including a few you're missing)...

admittedly, those are fairly-old-problem holdovers ... and i stand corrected.

with 'just',

 setenv LDFLAGS   -L/usr/local/lib
 setenv CPPFLAGS  -I/usr/local/include
 setenv CFLAGS    $CPPFLAGS

 perl Makefile.PL \
  PREFIX=/usr/local \
  DATADIR=/usr/local/etc/SA/Dist \
  CONFDIR=/usr/local/etc/SA/Local \
  LOCALSTATEDIR=/usr/local/etc/SA/Updates \
  ENABLE_SSL=yes
 make install

i get,

  ls -al /usr/local/bin/spamc
-r-xr-xr-x  1 root  wheel  54578 Dec 31 07:03 /usr/local/bin/spamc*

  ldd /usr/local/bin/spamc
/usr/local/bin/spamc:
libssl.so.5 => /usr/local/lib/libssl.so.5 (0x80063b000)
libcrypto.so.5 => /usr/local/lib/libcrypto.so.5 (0x800784000)
libz.so.3 => /lib/libz.so.3 (0x800a1a000)
libc.so.6 => /lib/libc.so.6 (0x800b2e000)

i.e., for now, for me, just the CFLAGS == CPPFLAGS is req'd.
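
for sh-style shells, the same three settings can be generated from the openssl prefix. a minimal sketch only: `ssl_build_env` is a hypothetical helper, not part of SpamAssassin, and it simply repeats the -I path in CFLAGS because that is what made the build work above.

```shell
# Hypothetical helper: emit sh-style assignments for a given OpenSSL prefix.
# CFLAGS repeats the -I path because CPPFLAGS alone was not enough here.
ssl_build_env() {
    prefix=${1:-/usr/local}
    printf 'LDFLAGS=-L%s/lib\n' "$prefix"
    printf 'CPPFLAGS=-I%s/include\n' "$prefix"
    printf 'CFLAGS=-I%s/include\n' "$prefix"
}

# Example (not run here):
#   env $(ssl_build_env /usr/local) perl Makefile.PL ENABLE_SSL=yes
```

with no argument the helper falls back to /usr/local, which is where the ports openssl lives in this setup.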

thanks.


Re: sa 32x-branch 'make test' fails @ t/spamc_optL.t (among others ...) on freebsd

2007-12-31 Thread snowcrash+sa
hi justin,

 first of all, try setting the env var SPAMD_HOST to the IP address the jail 
 can
 use for localhost.

ok.

tried that.  didn't help :-/

although, take a look at the test details @
http://issues.apache.org/SpamAssassin/attachment.cgi?id=4222&action=edit

despite setting SPAMD_HOST, there's still a lot of 127.0.0.1 refs ...
and none to the IP I set.  the ENV var isn't picking up -- did i bork
that as well?
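
one way to check whether the setting took effect is to count how many log lines mention each address. a minimal sketch, assuming a POSIX grep; `count_addr_lines` is a hypothetical helper and the log path in the example is just the one the test harness prints.

```shell
# Hypothetical helper: count the lines in a log that mention an address.
# -F treats the address as a fixed string, so the dots are not wildcards.
count_addr_lines() {
    grep -c -F -- "$1" "$2"
}

# Example (not run here):
#   count_addr_lines 127.0.0.1 log/d.spamc_optL/spamd.err.1
```

a count of 0 for the jail's IP alongside a non-zero count for 127.0.0.1 would match what the attachment shows.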

 if that doesn't work open a bug

done.  http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5761

 but bear in mind that it will probably only get attention from other jail
 users

heh. understood. and, expected.

alas, i know it's wasted breath to argue that the prevalence of SA-(
everything else, for that matter)-in-jails/VMs is only going to
increase, and that this will not be an atypical use-case ... but, for
now, NIH-syndrome, i s'pose ;-)

thanks!

cheers.


Re: sa 32x-branch 'make test' fails @ t/spamc_optL.t (among others ...) on freebsd

2007-12-31 Thread snowcrash+sa
 Not wasted breath as long as you'll accept:

 Patches Welcome!

 as a response :)

heh!  you have that response on auto dial, doncha?  come on, now --
fess up ;-) (p.s., i wasn't referring to those -- such as yourself --
already *on* the 'right' side of the argument)

yes. patches.  once a problem is understood as actually *being* a
problem. or just plain understood. which, in this case, it isn't.
works on OSX, doesn't on FreeBSD/JAIL.  no clue -- yet -- as to why.

and, might i suggest, soliciting & accepting such patches from a
first-timer (namely, atm, 'me'), is a questionable venture ... but
i'll happily 'spew-n-share' if/when/how i do!

cheers.


Re: sa 32x-branch 'make test' fails @ t/spamc_optL.t (among others ...) on freebsd

2007-12-31 Thread snowcrash+sa
 I should point out -- half of the attention from jail users comment has
 to do with another issue -- only people with jails can effectively test
 any potential fix.  That poses a big problem for developers testing.

i think syndey's seeing it in/on non-jail osx, as well; cref: the bug.

cheers.


sa 32x-branch 'make test' fails @ t/spamc_optL.t (among others ...) on freebsd

2007-12-30 Thread snowcrash+sa
hi,

bldg SA 32x-branch manually, from src in a freebsd jail.

SA builds/installs/runs without noticeable issue.

but, 'make test' has scads of failures ... similar to those mentioned here:

  http://www.nabble.com/Re:-Problem-with-3.2.2-p14236931.html

picking one, t/spamc_optL.t, fails as:

% sudo -u spamd make test TEST_VERBOSE=1 TEST_FILES=t/spamc_optL.t
make -f spamc/Makefile spamc/spamc
`spamc/spamc' is up to date.
/usr/local/bin/perl build/mkrules --exit_on_no_src --src rulesrc --out
rules --manifest MANIFEST --manifestskip MANIFEST.SKIP
mkrules: no rules updated
/usr/local/bin/perl build/preprocessor  -Mvars  -DVERSION=3.002004
-DPREFIX=/usr/local  -DDEF_RULES_DIR=/usr/local/etc/SA/Dist
-DLOCAL_RULES_DIR=/usr/local/etc/SA/Local
-DLOCAL_STATE_DIR=/usr/local/etc/SA/Updates
-DINSTALLSITELIB=/usr/local/lib/perl5/site_perl/5.8.8
-DCONTACT_ADDRESS= -Msharpbang  -Mconditional
-DPERL_BIN=/usr/local/bin/perl  -DPERL_WARN=  -DPERL_TAINT=
-m755 -isa-update.raw -osa-update
cp sa-update blib/script/sa-update
/usr/local/bin/perl -MExtUtils::MY -e MY->fixin(shift) blib/script/sa-update
PERL_DL_NONLAZY=1 /usr/local/bin/perl -MExtUtils::Command::MM -e
test_harness(1, 'blib/lib', 'blib/arch') t/spamc_optL.t
t/spamc_optL..
1..16
# Running under perl version 5.008008 for freebsd
# Current time local: Sun Dec 30 00:13:31 2007
# Current time GMT:   Sun Dec 30 08:13:31 2007
# Using Test.pm version 1.25
/usr/local/bin/perl SATest.pl -Mredirect
-Olog/d.spamc_optL/spamd.err.1 -olog/d.spamc_optL/spamd.out.1 --
/usr/local/bin/perl -T -w ../spamd/spamd.raw -D -x -s stderr -C
log/test_rules_copy  --siteconfigpath log/localrules.tmp -p 62704 -A
127.0.0.1 -L --allow-tell -s log/d.spamc_optL/spamd.err.1.timestamped

../spamc/spamc -F data/spamc_blank.cf -d 127.0.0.1 -p 62704 -L spam 
data/spam/001
Killed 1 spamd instances
Waiting for spamd at pid 95478 to exit...
not ok 1
Checking learned spam
not ok 2
../spamc/spamc -F data/spamc_blank.cf -d 127.0.0.1 -p 62704 -L spam 
data/spam/001
# Failed test 1 in t/spamc_optL.t at line 20
Not found: learned spam = Message successfully un/learned
# Failed test 2 in t/SATest.pm at line 662
Output can be examined in:
not ok 3
Checking already learned spam
not ok 4
/usr/local/bin/perl -T -w ../sa-learn.raw -C log/test_rules_copy
--siteconfigpath log/localrules.tmp -p log/test_default.cf  --dump
magic
# Failed test 3 in t/spamc_optL.t at line 24
Not found: already learned spam = Message was already un/learned
# Failed test 4 in t/SATest.pm at line 662 fail #2
Output can be examined in:
ERROR: Bayes dump returned an error, please re-run with -D for more information
not ok 5
Checking spam in database
not ok 6
../spamc/spamc -F data/spamc_blank.cf -d 127.0.0.1 -p 62704 -L forget
 data/spam/001
# Failed test 5 in t/spamc_optL.t at line 28
Not found: spam in database = 1 0  non-token data: nspam
# Failed test 6 in t/SATest.pm at line 662 fail #3
Output can be examined in:
not ok 7
Checking forget spam
not ok 8
../spamc/spamc -F data/spamc_blank.cf -d 127.0.0.1 -p 62704 -L ham 
data/nice/001
# Failed test 7 in t/spamc_optL.t at line 32
Not found: forget spam = Message successfully un/learned
# Failed test 8 in t/SATest.pm at line 662 fail #4
Output can be examined in:
not ok 9
Checking learned ham
not ok 10
../spamc/spamc -F data/spamc_blank.cf -d 127.0.0.1 -p 62704 -L ham 
data/nice/001
# Failed test 9 in t/spamc_optL.t at line 36
Not found: learned ham = Message successfully un/learned
# Failed test 10 in t/SATest.pm at line 662 fail #5
Output can be examined in:
not ok 11
Checking already learned ham
not ok 12
/usr/local/bin/perl -T -w ../sa-learn.raw -C log/test_rules_copy
--siteconfigpath log/localrules.tmp -p log/test_default.cf  --dump
magic
# Failed test 11 in t/spamc_optL.t at line 40
Not found: already learned ham = Message was already un/learned
# Failed test 12 in t/SATest.pm at line 662 fail #6
Output can be examined in:
ERROR: Bayes dump returned an error, please re-run with -D for more information
not ok 13
Checking ham in database
not ok 14
../spamc/spamc -F data/spamc_blank.cf -d 127.0.0.1 -p 62704 -L forget
 data/nice/001
# Failed test 13 in t/spamc_optL.t at line 44
Not found: ham in database = 1 0  non-token data: nham
# Failed test 14 in t/SATest.pm at line 662 fail #7
Output can be examined in:
not ok 15
Checking learned ham
not ok 16
# Failed test 15 in t/spamc_optL.t at line 48
Not found: learned ham = Message successfully un/learned
# Failed test 16 in t/SATest.pm at line 662 fail #8
Output can be examined in:
 Failed 16/16 subtests

Test Summary Report
---
t/spamc_optL.t (Wstat: 0 Tests: 16 Failed: 16)
  Failed test number(s):  1-16
Files=1, Tests=16, 23 wallclock secs ( 0.01 usr  0.02 sys +  3.02 cusr
 1.05 csys =  4.10 CPU)
Result: FAIL
Failed 1/1 test 

Re: sa 32x-branch 'make test' fails @ t/spamc_optL.t (among others ...) on freebsd

2007-12-30 Thread snowcrash+sa
hi matthias,

 i use CPAN on a FreeBSD 4.8 Jail to install / upgrade SA and i never
 successfully run 'make test'
 the same goes for compiling from the sources.

 but i installed with  'notest install  ' in CPAN and have no
 Problems while running SA

i've also often found that failures in SA's 'make test' are just
failures in the tests, and that SA ends up running well, nonetheless.

but, atm, i've 133/2048 failures (e.g., here:
http://rafb.net/p/1puesA82.html), and that concerns me a bit.
especially since this is my 1st FreeBSD install of SA.

ideally, i'd like to get them figured out / fixed.

cheers!


Re: sa 32x-branch 'make test' fails @ t/spamc_optL.t (among others ...) on freebsd

2007-12-30 Thread snowcrash+sa
hi justin,

On Dec 30, 2007 4:16 AM, Justin Mason [EMAIL PROTECTED] wrote:
 could it be the use of 127.0.0.1, opening listening sockets there
 etc.? we have had issues
 with that and solaris zones.

one of the joys of FreeBSD v6.2R jails seems to be only one IP per
jail. (there are kernel patches available for 6.3 & 7.0 PRERELEASES;
dunno if official yet)

that said, i *CAN* ping 127.0.0.1 from within the jail.

i'll have to dig about to see if opening sockets on localhost in a
jail is problematic.  atm, i have no clue.

is there a straightforward way to have the tests use the jail's
assigned, non-localhost IP instead? or do i need to change the src?

cheers!


Re: sa 32x-branch 'make test' fails @ t/spamc_optL.t (among others ...) on freebsd

2007-12-30 Thread snowcrash+sa
noting that

(a) these errors have appeared before
(b) you've some suspicion that it may be related to issue w/ solaris zones
(c) y'all are goin' great-guns on -devel wrapping up bugs for 324

should i open a bug on this? or is it something that'll get some
attention anyway?

thanks!


Re: SA 32x build not finding openssl headers on FreeBSD?

2007-12-29 Thread snowcrash+sa
hi michael,

 I am the official ports maintainer for the SA port on Freebsd.

:-)

 I notice you are using the ports version of openssl,

yes

 but not the ports version of SA (you are using SVN) and you are mixing a ports
 version of openssl with the SVN version of SA.

since SA project is 'fast moving' in terms of fixes/patches, e.g. re:
sa-compile fixes etc, lots of what i want/need is NOT in the release
(yet) but IS in the 32x branch svn.

i tried, initially, to clone a local port to handle SVN rather than
release tarball, but borked it badly. :-/

so, i reverted to my tried-n-true-on-osx svn process.

after slogging about in src, i've stumbled on the fix, simply adding

  CFLAGS == CPPFLAGS

i.e., this works without a hitch (still using the port-installed openssl ...)

  setenv LDFLAGS   -L/usr/local/lib
  setenv CPPFLAGS  -I/usr/local/include
  setenv CFLAGS    $CPPFLAGS

  cd /usr/local/build/spamassassin
  perl Makefile.PL \
   PREFIX=/usr/local \
   DATADIR=/usr/local/etc/SA/Dist \
   CONFDIR=/usr/local/etc/SA/Local \
   LOCALSTATEDIR=/usr/local/etc/SA/Updates \
   ENABLE_SSL=yes

  cd /usr/local/build/spamassassin/spamc

  perl version.h.pl
  ./configure --enable-ssl

  cd ../
  make
  make install

kind of odd that the CFLAGS spec is req'd ... don't know whether
that's expected behavior /or something that can/should be
accommodated in the port.

cheers!


SA 32x build not finding openssl headers on FreeBSD?

2007-12-27 Thread snowcrash+sa
i'm migrating to SA 32x-branch/svn (r607171) on FreeBSD 62.

i've openssl installed from ports in prefix=/usr/local, i.e.,

  ssl libs in /usr/local/lib
  ssl incs in /usr/local/include/openssl

echoing my working procedure from osx, i

 cd spamassassin
 setenv LDFLAGS   -L/usr/local/lib
 setenv CPPFLAGS  -I/usr/local/include

 perl Makefile.PL PREFIX=/usr/local/spamassassin ENABLE_SSL=yes

 cd spamc
 perl version.h.pl

 ./configure --enable-ssl CFLAGS=-DSPAMC_SSL

 cd ../
 make

on freebsd only, this fails to find openssl headers,

 ...
 cp sa-compile blib/script/sa-compile
 /usr/local/bin/perl -MExtUtils::MY -e MY->fixin(shift)
blib/script/sa-compile
 make -f spamc/Makefile spamc/spamc
 gcc -DSPAMC_SSL -DSPAMC_SSL spamc/spamc.c spamc/getopt.c
spamc/libspamc.c spamc/utils.c  -o spamc/spamc -L/usr/local/lib -lssl
-lcrypto -lz
 In file included from spamc/spamc.c:22:
 spamc/utils.h:29:28: openssl/crypto.h: No such file or directory
 spamc/utils.h:30:25: openssl/pem.h: No such file or directory
 spamc/utils.h:31:25: openssl/ssl.h: No such file or directory
 spamc/utils.h:32:25: openssl/err.h: No such file or directory
 ...

what freebsd/spamassassin mysticism am i missing?  some other FLAG?
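
a quick sanity check is to confirm where the headers actually live before looking at the build itself. a minimal sketch, assuming a POSIX shell; `find_openssl_inc` is a hypothetical helper, not part of SpamAssassin.

```shell
# Hypothetical helper: print the include dir of the first prefix that holds
# openssl/ssl.h; return non-zero when none of the candidates has it.
find_openssl_inc() {
    for prefix in "$@"; do
        if [ -f "$prefix/include/openssl/ssl.h" ]; then
            printf '%s\n' "$prefix/include"
            return 0
        fi
    done
    return 1
}

# Example (not run here):
#   find_openssl_inc /usr/local /usr
```

whatever directory it prints has to reach the compiler as a -I option; in the gcc line above only -L/usr/local/lib made it through.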

thanks.


(old?) sa hang @ bayes: can't use estimation method for expiry

2007-09-19 Thread snowcrash+sa
Hi,

I'm updating SA & exploring/cleaning Bayes DBs on a slightly 'dusty' box.

after update to latest SA 32x-branch, --lint is OK.

but, currently, i get:

sa-learn --force-expire -D
...
[19416] dbg: bayes: found bayes db version 3
[19416] dbg: bayes: opportunistic call attempt skipped, found fresh
running expire magic token
[19416] dbg: config: score set 3 chosen.
[19416] dbg: learn: initializing learner
[19416] dbg: bayes: bayes journal sync starting
[19416] dbg: bayes: bayes journal sync completed
[19416] dbg: bayes: expiry starting
[19416] dbg: locker: safe_lock: created 
/etc/sa/local/.spamassassin/bayes.mutex
[19416] dbg: locker: safe_lock: trying to get lock on
/etc/sa/local/.spamassassin/bayes with 300 timeout
[19416] dbg: locker: safe_lock: link to
/etc/sa/local/.spamassassin/bayes.mutex: link ok
[19416] dbg: bayes: tie-ing to DB file R/W
/etc/sa/local/.spamassassin/bayes_toks
[19416] dbg: bayes: tie-ing to DB file R/W
/etc/sa/local/.spamassassin/bayes_seen
[19416] dbg: bayes: found bayes db version 3
[19416] dbg: locker: refresh_lock: refresh
/etc/sa/local/.spamassassin/bayes.mutex
[19416] dbg: bayes: expiry check keep size, 0.75 * max: 112500
[19416] dbg: bayes: token count: 218650, final goal reduction size: 
106150
[19416] dbg: bayes: first pass? current: 1190215099, Last:
1110223381, atime: 1261902, count: 28215, newdelta: 335417, ratio:
3.76218323586745, period: 43200
[19416] dbg: bayes: can't use estimation method for expiry,
unexpected result, calculating optimal atime delta (first pass)
[19416] dbg: bayes: expiry max exponent: 9

where it just 'sits' for awhile -- no errors in logs -- and,
eventually, 'releases' and completes without error.

this only seems to happen some, as yet unquantified, fraction of the
time ... sometimes (as moments ago) repeating the process results in no
such hang/delay.
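
the keep-size and goal-reduction figures in the debug output above are plain arithmetic on the token counts. a minimal sketch, assuming a maximum DB size of 150000 tokens; that number is only what 112500 / 0.75 implies, it is not read from the config.

```shell
# Numbers copied from the debug log; max_db_size is an assumption
# inferred from the "0.75 * max: 112500" line.
max_db_size=150000
token_count=218650

keep_size=$(( max_db_size * 3 / 4 ))           # 0.75 * max
goal_reduction=$(( token_count - keep_size ))  # tokens expiry wants to drop

echo "keep size:      $keep_size"
echo "goal reduction: $goal_reduction"
```

the second figure agrees with the 'final goal reduction size: 106150' line in the log.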

fyi,

 grep -i bayes local.cf
  use_bayes                           1
  use_bayes_rules                     1
  bayes_auto_learn                    1
  bayes_min_ham_num                   200
  bayes_min_spam_num                  200
  bayes_learn_during_report           1
  bayes_use_hapaxes                   1
  bayes_journal_max_size              102400 # in bytes
  bayes_expiry_max_db_size            15 # of tokens
  bayes_auto_expire                   1
  bayes_learn_to_journal              1
  bayes_path                          /etc/sa/local/.spamassassin/bayes
  bayes_auto_learn_threshold_nonspam  1.0
  bayes_auto_learn_threshold_spam     6.0

and,

 cd /etc/sa/.spamassassin
 ls -al
  total 36024
  drwxrwxr-x 10 sa   sa      340 2007-09-19 08:22 .
  drwxrwxr-x 39 sa   sa     1326 2007-09-19 06:59 ..
  -rw-rw  1 sa   sa    90112 2007-09-19 08:23 auto-whitelist
  -rw-rw  1 sa   sa 10244096 2007-06-28 07:09 auto-whitelist-old
  -rw-rw  1 sa   sa        6 2007-09-19 08:23 auto-whitelist.mutex
  -rw-rw  1 sa   sa       30 2007-09-19 08:21 bayes.mutex
  -rw---  1 root sa      816 2007-09-19 08:22 bayes_journal
  -rw-rw  1 sa   sa 20570112 2007-09-19 07:30 bayes_seen
  -rw-rw  1 sa   sa  5488640 2007-09-19 08:21 bayes_toks

i had NOT noticed this before on this box ... not clear if it's the
build or my local.cf's bayes-related settings.  or, perms, maybe?

i've found some old thread references to

  [19416] dbg: bayes: can't use estimation method for expiry, unexpected
   result, calculating optimal atime delta (first pass)

but not yet a resolution ... (still looking!)

hints?

thanks.


Re: (old?) sa hang @ bayes: can't use estimation method for expiry

2007-09-19 Thread snowcrash+sa
  where it just 'sits' for awhile -- no errors in logs -- and,
  eventually, 'releases' and completes without error.

 What are the messages when that happens?  It shouldn't just exit.

Here's a recent example,

 sa-learn --force-expire -D
  ...
  [19725] dbg: bayes: can't use estimation method for expiry,
unexpected result, calculating optimal atime delta (first pass)
(waits here ~ 2 minutes ...)
  [19725] dbg: bayes: expiry max exponent: 9
  [19725] dbg: bayes: atime token reduction
  [19725] dbg: bayes:  ===
  [19725] dbg: bayes: 43200 221166
  [19725] dbg: bayes: 86400 221038
  [19725] dbg: bayes: 172800 220701
  [19725] dbg: bayes: 345600 220382
  [19725] dbg: bayes: 691200 219516
  [19725] dbg: bayes: 1382400 217516
  [19725] dbg: bayes: 2764800 215351
  [19725] dbg: bayes: 5529600 212250
  [19725] dbg: bayes: 11059200 210410
  [19725] dbg: bayes: 22118400 203758
  [19725] dbg: bayes: couldn't find a good delta atime, need more
token difference, skipping expire
  [19725] dbg: bayes: expiry completed
  [19725] dbg: bayes: untie-ing
  [19725] dbg: bayes: files locked, now unlocking lock
  [19725] dbg: locker: safe_unlock: unlocked
/etc/sa/local/.spamassassin/bayes.mutex
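
the left-hand column of the 'atime token reduction' table above looks like the 43200-second period doubled at each step, up to the max exponent of 9. that is read off the log, not taken from the SA source, so treat this as a sketch; `atime_deltas` is a hypothetical name.

```shell
# Period and max exponent as printed in the debug output.
period=43200
max_exponent=9

# Print period * 2^i for i = 0 .. max exponent, one value per line.
atime_deltas() {
    i=0
    delta=$1
    while [ "$i" -le "$2" ]; do
        echo "$delta"
        delta=$(( delta * 2 ))
        i=$(( i + 1 ))
    done
}

atime_deltas "$period" "$max_exponent"
```

the ten values it prints run from 43200 to 22118400, the same deltas listed in the table.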



Re: (old?) sa hang @ bayes: can't use estimation method for expiry

2007-09-19 Thread snowcrash+sa
 Yeah, that's the expiry first pass.  As it said, it couldn't find a
 good atime delta to use for expiry, so it didn't do anything.

ok, so it's expected.  read on ...

 man sa-learn has a large amount of information about how all of this works.

Arguably not in normal-human-english, but, yes.

The 'trigger' for my question is the presence of "unexpected result"
in the report.

No, it doesn't SAY "error" -- but neither does the manpage literally
address "unexpected".  It implies something outside of the norm.  I'd
suggest something a bit clearer, especially given the existing 'old'
list-references to this *AS* an error, with nothing newer available
...

My $0.02. Thx.


two supposedly identical SA boxes, with slightly different report output -- help find the diff?

2007-08-28 Thread snowcrash+sa
hi,

grr. i'm at that resorting-to-visine stage of wtf ... :-/

i've

spamassassin --version
SpamAssassin version 3.2.4-r564346
  running on Perl version 5.8.8

with, among numerous other rules/plugins, FuzzyOcr/r330 installed.

i've just updated two supposedly identical boxes, building from clean
sources, and running the same setup scripts on both.

no errors in the installs.

on testing of FuzzyOcr image processing on one of its included test files with,

spamassassin -D -t -x  FuzzyOcr/samples/ocr-animated.eml

i see in the debug output the following report on one box,

  ...
  Content analysis details:   (38.2 points, 4.0 required)

   pts rule name              description
  ---- ---------------------- --------------------------------------------------
   4.2 MID_DEGREES            MID_DEGREES
   3.7 CTYPE_8SPACE_GIF       BODY: Stock spam image part 'Content-Type' found (8
                              spc)
   0.0 HTML_MESSAGE           BODY: HTML included in message
   1.5 BAYES_50               BODY: Bayesian spam probability is 40 to 60%
                              [score: 0.4467]
   1.7 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
   2.5 HTML_IMAGE_ONLY_16     BODY: HTML: images with 1200-1600 bytes of words
   1.2 SARE_GIF_ATTACH        FULL: Email has a inline gif
   1.5 MY_CID_AND_STYLE       SARE cid and style
   2.9 DRUGS_STOCK_MIMEOLE    Stock-spam forged headers found (5510)
    16 FUZZY_OCR_KNOWN_HASH   BODY: Image with known hash
                              []
                              [Words found:]
                              [investor in 1 lines]
                              [price in 2 lines]
                              [company in 1 lines]
                              [alert in 1 lines]
                              [valium in 1 lines]
                              [trade in 1 lines]
                              [banking in 1 lines]
                              [news in 1 lines]
                              [(13.5 word occurrences found)]


and, similarly on the other box,

  ...
  Content analysis details:   (38.5 points, 4.0 required)

   pts rule name              description
  ---- ---------------------- --------------------------------------------------
   3.7 MID_DEGREES            MID_DEGREES
   1.6 CTYPE_8SPACE_GIF       BODY: Stock spam image part 'Content-Type' found (8
                              spc)
   0.0 HTML_MESSAGE           BODY: HTML included in message
   1.5 BAYES_50               BODY: Bayesian spam probability is 40 to 60%
                              [score: 0.4467]
   1.5 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
   1.5 HTML_IMAGE_ONLY_16     BODY: HTML: images with 1200-1600 bytes of words
   1.2 SARE_GIF_ATTACH        FULL: Email has a inline gif
   1.5 MY_CID_AND_STYLE       SARE cid and style
   3.5 DRUGS_STOCK_MIMEOLE    Stock-spam forged headers found (5510)
    18 FUZZY_OCR_KNOWN_HASH   BODY: Image with known hash
                              []
                              [Words found:]
                              []
                              [(13.5 word occurrences found)]


NOTE the empty "Words found" detail in the second box's debug output :-/

trying to find what's causing the different output, i've pored over
the debug output, googl'd the lists, diff'd the config files, etc.

nada.  to my weary eye, all looks the same.

obviously, it's not.

any hints/suggestions as to what i might've missed? how to find it?
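
one way to take the eyeballing out of it is to fingerprint the config tree on each box and diff the two listings. a minimal sketch, assuming POSIX find/cksum/sort; `config_fingerprint` is a hypothetical helper and the directory in the example is a placeholder.

```shell
# Hypothetical helper: one "checksum size path" line per file under a
# directory, with paths relative to it and sorted by path, so the listings
# from two boxes can be compared with diff.
config_fingerprint() {
    ( cd "$1" && find . -type f -exec cksum {} + | sort -k 3 )
}

# Example (not run here; the path is a placeholder):
#   config_fingerprint /path/to/sa/config > box1.txt
```

this only compares files on disk. identical files can still give different reports if run-time state, such as a cache or database, differs between the boxes.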

thanks!


Re: two supposedly identical SA boxes, with slightly different report output -- help find the diff?

2007-08-28 Thread snowcrash+sa
hi,

(you 'busted out' of the thread ... replying back in it.)

 Disable the FuzzyOcr's result hash cache on both machines before testing for
 differences: you are looking at stale results.

 If these systems cached the results when the version or config of the two
 FuzzyOcr(s) were not the same, of course you see a difference...

You might have a point.

So, silly question:

HOW do I Disable the FuzzyOcr's result hash cache?

commenting out:

  #body FUZZY_OCR_KNOWN_HASH eval:dummy_check()

in FuzzyOcr.cf isn't doing it :-/

thanks!


Re: two supposedly identical SA boxes, with slightly different report output -- help find the diff?

2007-08-28 Thread snowcrash+sa
aha!

 in FuzzyOcr.cf,

-   focr_hashing_learn_scanned 1
+   focr_hashing_learn_scanned 0

then,

rm Fuzzy*db*


now, as expected ...

  18 FUZZY_OCR  BODY: Img with common spam text inside
[Words found:]
[investor in 1 lines]
[price in 2 lines]
[company in 1 lines]
[alert in 1 lines]
[valium in 1 lines]
[trade in 1 lines]
[banking in 1 lines]
[news in 1 lines]
[(13.5 word occurrences found)]


i did not realize that if the HASH pre-exists, then the image's total
score hits -- and is reused from the hash db, but that none of the
word-hit data is stored/reused.

got it,

thx!


Re: two supposedly identical SA boxes, with slightly different report output -- help find the diff?

2007-08-28 Thread snowcrash+sa
hi andy,

 For what it's worth, the fuzzyocr hashing is of very limited value, and in
 many cases is a severe performance hit. I found that scanning the hashes,
 due to the fuzzy nature, is more costly than just rescanning the file
 with OCR, as *each* *and* *every* hash must be checked iteratively.

now, *that's* an interesting point to consider.

i'd be interested in what, then, the 'goal' of the hashing/comparison *is*?

is it performance, and it just missed the mark for the reasons you
state?  or is it something else?

dunno.

but, your point bears some benchmarking ...

thx!


where'd SendmailID.pm go?

2007-08-27 Thread snowcrash+sa
hi,

i've a script that keeps me up to date with latest 32x-branch svn.

in today's DL/co of r570165


svn co http://svn.apache.org/repos/asf/spamassassin/branches/3.2 spamassassin


i note that,

rules/SIQ.pm
rules/SendmailID.pm

are no longer there (iirc, they were 'fairly recently' ...) .

i *do* find

rulesrc/sandbox/dos/SIQ.pm

but no trace of SendmailID.pm.

'egrep -i "rulesrc|Sendmail" *' on the src tree doesn't tell me anything ...

can someone clarify what's changed, and where'd they go? if the
change is doc'd somewhere, i've missed it ...

thanks!


Re: where'd SendmailID.pm go?

2007-08-27 Thread snowcrash+sa
fair enuf.

where are such removals documented?  my point being simply: it *was*
in the src tree, suddenly it isn't.  even if well-justified, shouldn't
that action be *mentioned* in Changelog?

also, when did plugins move (back?) to rulesrc/sandbox/... as opposed
to rules/...?


Re: where'd SendmailID.pm go?

2007-08-27 Thread snowcrash+sa
  also, when did plugins move (back?) to rulesrc/sandbox/... as opposed
  to rules/...?

 I suspect you had the output of a mkrules compilation step in
 your rules dir; they were always there, in the sandbox, but
 mkrules copies them into rules.

bingo.

i always build SA from src w/,

cd ${SA_SRC}
perl Makefile.PL ...
cd spamc
perl version.h.pl
./configure --enable-ssl ...
make

and note afterwards,

cd ${SA_SRC}
ls -1 rules/*.pm
rules/SIQ.pm
rules/sandbox-felicity.pm
rules/sandbox-hstern.pm

i did *not* realize that mkrules is the 'culprit', but i *do* see it
in Makefile.PL.

(need to look more closely at what's going on so as not to fubar my setup ... )

thanks!


Re: where'd SendmailID.pm go?

2007-08-27 Thread snowcrash+sa
 since rulesrc is independent of the SA distribution.

Good to know.

Perhaps I *should* know.  Can't find that stated/clarified anywhere in
the src tree.  I've looked repeatedly.  If it's supposed to be
obvious, i'm clueless.

My -- incorrect -- presumption has been, since it's DISTRIBUTED *with*
SA, it's change-managed along with it.

Could that be stated clearly somewhere?  Perhaps a README.RULES in the
src tree top?


Re: is there any processor-dependency to sa-compile?

2007-08-20 Thread snowcrash+sa
hi,

 it's compiled C code, so whatever affects portability of that will
 affect compiled rulesets too.

likely depends on choices of compile-time optimization, i think.

need to read up, and check if/what presumptions are made by sa-compile process.

i've cross-compiled across different arch's within a CPU family
before, so that's not an issue.  i've *not* done so across different
CPUs (e.g., PPC vs x86), so that'll need some investigation.

assuming 'all that' gets ironed out, is it sufficient to simply 'push'
the /compiled dir's contents to each box -- with a HUP of SA, i'd
guess?  or does each SA instance need to be otherwise 'informed' of
the presence/change of compiled rules/files?

thanks!


Re: is there any processor-dependency to sa-compile?

2007-08-20 Thread snowcrash+sa
hi,

 I think either different family, or different CPU arch, will be a
 problem to be honest...

yeah, probably right ... worth a look-see, though.

(or, i should simply build that 16-core Opteron box and be done with it ...)

 Yep, with a HUP.

thanks.

cheers!


is there any processor-dependency to sa-compile?

2007-08-19 Thread snowcrash+sa
as long as my

  SA-version
  included rulesets
  enabled plugins

are the SAME from arch/OS to arch/OS, is it OK to simply compile rules
once somewhere, and push them to each box?

or, *is* there some sort of processor/architecture, or other
environmental, dependency that throws a wrench into the works?
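
a cheap first gate before pushing anything is to compare OS name and machine type between the build box and the target. a minimal sketch; `platform_id` and `same_platform` are hypothetical helpers, and a match is only a necessary check, not proof that compiled rules will load (compiler, perl and libc versions matter too).

```shell
# Hypothetical helper: OS name plus machine type for the local box.
platform_id() {
    printf '%s/%s\n' "$(uname -s)" "$(uname -m)"
}

# Succeeds only when the two ids are identical strings.
same_platform() {
    [ "$1" = "$2" ]
}

# Example (not run here; the remote id would have to be collected on the
# other box, e.g. by running platform_id there):
#   same_platform "$(platform_id)" "$remote_id" || echo "do not push"
```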

thanks!


only (re)sa-compile channel files that have changed?

2007-08-13 Thread snowcrash+sa
i use sa-update to update/maintain three separate source channels of rules,

sudo -u spam sa-update --channelfile DIST-ch.conf
sudo -u spam sa-update --channelfile SARE-ch.conf --gpgkey 856AA88A
sudo -u spam sa-update --channelfile JMAS-ch.conf --gpgkey 6C6191E3

where, fwiw,

cat DIST-ch.conf
updates.spamassassin.org

cat SARE-ch.conf
70_sare_obfu.cf.sare.sa-update.dostech.net
72_sare_redirect_post3.0.0.cf.sare.sa-update.dostech.net
70_sare_evilnum0.cf.sare.sa-update.dostech.net
70_sare_evilnum1.cf.sare.sa-update.dostech.net
70_sare_bayes_poison_nxm.cf.sare.sa-update.dostech.net
70_sare_header.cf.sare.sa-update.dostech.net
70_sare_header_eng.cf.sare.sa-update.dostech.net
99_sare_fraud_post25x.cf.sare.sa-update.dostech.net
70_sare_spoof.cf.sare.sa-update.dostech.net
...

cat JMAS-ch.conf
sought.rules.yerp.org

works great manually /or via cron job.

i've *also* turned on,

# Rule2XSBody - speedup by compilation of ruleset to native code
loadplugin Mail::SpamAssassin::Plugin::Rule2XSBody

in my init.pre.

currently, my sa-update cron-job script detects ANY changes to ANY of
the existing channel files, and if a change exists, RE-compiles the
whole set of rules.

as the number of channels managed grows, the odds of a *single*
channel being updated increase, as, then, does the probability that a
RE-compile will be done.

inefficient.

i (think i) can simply cobble up a script to only re-compile rules for
those channel files that HAVE been updated/changed, but thought i'd ask
here first ...

*is* there a clever script or process already available that would
only re-compile those rules that NEED recompiling?
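
the change-detection half of such a script can be sketched as below. it only reports which files changed since the last run; whether sa-compile can then act on a subset is a separate question this does not answer. `changed_files` is a hypothetical helper, and the stamp file is an assumption of the sketch.

```shell
# Hypothetical helper: print the files under $1 that are new or modified
# since the fingerprint stored in $2, then refresh that fingerprint.
changed_files() {
    rules_dir=$1
    stamp=$2
    new=$(mktemp "${TMPDIR:-/tmp}/rules.XXXXXX") || return 2
    # one "checksum size path" line per file, sorted so comm can compare
    ( cd "$rules_dir" && find . -type f -exec cksum {} + | sort ) > "$new"
    [ -f "$stamp" ] || : > "$stamp"
    # keep lines found only in the new fingerprint; field 3 onward is the path
    comm -13 "$stamp" "$new" | cut -d ' ' -f 3-
    mv "$new" "$stamp"
}

# Example (not run here; both paths are placeholders):
#   changed_files /path/to/updates /path/to/stamp
```

deleted files are not reported, only new or modified ones, and the stamp file should live outside the rules directory so it does not fingerprint itself.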


thanks!


Re: only (re)sa-compile channel files that have changed?

2007-08-13 Thread snowcrash+sa
hi jason,

 If I understand correctly, part of the recompile process includes removing
 redundant regexps.  It can only do this if all rulesets are available at 
 compile
 time for comparison.

hm.  if that *is* the case, then you've got a point.

just seems that if we're sa-update'ing multiple times per day, or even
hour that, with enough rules under mgmt, we'll get to a point that
there's always a change, and thus we're simply constantly compiling.

gotta be a better way ...

 To reduce load, maybe you can compile on a different
 server and redistribute?

of course that's always a possibility. but, heh, i've already got
SA/ClamAV on a separate box from my mail server to 'reduce load' ...
at this rate, i'm gonna need a farm ;-)


Re: only (re)sa-compile channel files that have changed?

2007-08-13 Thread snowcrash+sa
 Unfortunately -- not yet.   It'll take code changes to sa-compile,
 specifically to cache the base strings somewhere so they don't have to
 be re-extracted next time.

 could you open an enhancement request on the bugzilla for this?

will do.

thanks!


Re: only (re)sa-compile channel files that have changed?

2007-08-13 Thread snowcrash+sa
 could you open an enhancement request on the bugzilla for this?

fyi: http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5594


which 'other' rules can/should be sa-compiled?

2007-08-02 Thread snowcrash+sa
i currently use a cron job to

  sa-update --channelfile /usr/local/etc/sa/update-DIST-ch.conf
  sa-update --channelfile /usr/local/etc/sa/update-SARE-ch.conf

pulling updates, as available, into my system-wide rules update directory.

if updates exist, then i compile them,

sa-compile --sudo -D

resulting in compiled rules that override the non-compiled distro'd files.

works fine here.  and, i /think/ is the generally intended (?)
behavior of sa-compile'ing.
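
the "if updates exist" test can hang off sa-update's exit status. a minimal sketch, going by my reading of the sa-update man page (0 = an update was installed, 1 = nothing new, 4 or higher = errors); `should_compile` is a hypothetical helper.

```shell
# Map an sa-update exit status to "recompile or not".
# 0 = updates were installed, 1 = nothing new; anything else is treated
# here as an error and never triggers a compile.
should_compile() {
    [ "$1" -eq 0 ]
}

# Example for a cron script (not run here):
#   sa-update --channelfile /usr/local/etc/sa/update-DIST-ch.conf
#   should_compile $? && sa-compile
```

with more than one channel file, each sa-update run has its own status, so the check needs to happen after each run rather than once at the end.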

question:

any reason NOT to ALSO compile other, non-SA/SARE rules?

e.g., FuzzyOcr could benefit from a performance boost.  just not clear
to me if it can be compiled, or benefit if it does.

comments?

thanks.


Re: which 'other' rules can/should be sa-compiled?

2007-08-02 Thread snowcrash+sa
got it.

rats!, nonetheless...

thx!


Re: installing patch

2007-08-02 Thread snowcrash+sa
atm, you need to patch the source, and rebuild ... e.g.,

cd /top/of/fresh/src/tree/for/spamassassin
wget -k -O bug5574_patch4077.patch \
  "http://issues.apache.org/SpamAssassin/attachment.cgi?id=4077&action=view"
patch -p0 < bug5574_patch4077.patch

config, build & install

fwiw, works peachy-keen with svn co of 32x branch head.

hth.


Re: sa v32x + Mail::SPF are installed; Mail::SPF::Query still required. really, or typo?

2007-07-30 Thread snowcrash+sa
ah. so 'tis 'just' that require. gr8.

i've become too attuned to the appearance of "fail" in --lint output ...

thanks!


sa v32x + Mail::SPF are installed; Mail::SPF::Query still required. really, or typo?

2007-07-29 Thread snowcrash+sa
i've sa v32-branch, r560837 installed.

i have perl 588 + Mail::SPF installed,

module_info Mail::SPF
Name:Mail::SPF
Version: v2.005
...

but NOT Mail::SPF::Query.

reading @ SA/INSTALL,

Either of Mail::SPF or Mail::SPF::Query can be used but Mail::SPF is
 preferred as it is the current reference implementation for RFC 4408.

and comments here,

http://www.gossamer-threads.com/lists/spf/devel/31745

i understand that M::S::Q is *no longer* required.

but, on --lint, i note,

...
[470] dbg: diag: module installed: Mail::SPF, version v2.005
[470] dbg: diag: module not installed: Mail::SPF::Query ('require' 
failed)
...

other than the above mention of "failed", all tests/finishes ok.

the 'require' failed originates @
./lib/Mail/SpamAssassin/Util/DependencyInfo.pm

  ...
  foreach my $moddef (@MODULES, @OPTIONAL_MODULES) {
    my $module = $moddef->{module};
    my $modver;
    if (eval ' require '.$module.'; $modver = $'.$module.'::VERSION; 1;')
    {
      $modver ||= '(undef)';
      $out .= "module installed: $module, version $modver\n";
    } else {
      $out .= "module not installed: $module ('require' failed)\n";
    }
  ...

but it's not immediately clear to me if M::S::Q *is* a *required*
dependency anywhere else ... or just a typo.
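
the same kind of probe can be run from the shell to see which of the two modules perl can actually load. a minimal sketch; `have_perl_module` is a hypothetical helper and it assumes perl is on the PATH.

```shell
# Hypothetical helper: succeed when perl can load the named module.
# It is a shell-side stand-in for the 'require' probe in the snippet above.
have_perl_module() {
    perl -M"$1" -e 1 > /dev/null 2>&1
}

# Example (not run here):
#   have_perl_module Mail::SPF || have_perl_module Mail::SPF::Query
```

since INSTALL says either module can be used, the example line succeeding with only Mail::SPF present is consistent with M::S::Q being optional.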

clarification?

thanks.