How to get SA ...

2006-12-14 Thread Tyler Nally
Hello all,

Included (in plain text) at the end of this message is the source of an
e-mail that I received yesterday.  It is clearly SPAM, and it looks like the
sender used some kind of header injection on their end to get the
e-mail on its way.

SA didn't recognize this as SPAM.  What can be done to trap a message like
this and keep it from reaching the e-mail inbox?
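
One possible check, sketched in Python rather than as a SpamAssassin rule: the To: header of the message quoted below contains quoted local parts such as "Subject:Never", so a filter could look for header names embedded inside recipient addresses. The function name `injected_headers` and the list of header names are assumptions made for illustration; this is not an existing SpamAssassin test.

```python
import re

# Header names that should never appear inside a recipient's local part.
# This list is an assumption chosen for the example, not an official one.
INJECTED = re.compile(
    r'"(content-transfer-encoding|content-type|mime-version|subject|bcc|cc|to|from)'
    r':[^"]*"@',
    re.IGNORECASE,
)


def injected_headers(to_header):
    """Return the header names found inside quoted local parts of a To: value."""
    return [match.group(1).lower() for match in INJECTED.finditer(to_header)]


# Sample value modelled on the To: header in the quoted spam.
sample_to = (
    '"Content-Type:text/plain"@metromedia1-hki.far-m.com, '
    '"Subject:Never"@metromedia1-hki.far-m.com, '
    '"bcc:someone"@yahoo.com, user@example.com'
)
print(injected_headers(sample_to))
```

A non-empty result would be a strong hint that a web form was abused to inject headers; how much score to give such a hit is a separate decision.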

Thanks a lot.. and thanks in advance.

Tyler Nally
[EMAIL PROTECTED]

--obvious spam follows--

Return-Path: <[EMAIL PROTECTED]>
Received: from localhost (yadler.com [127.0.0.1])
by mail.yadler.com (Postfix) with ESMTP id AF709106
for <[EMAIL PROTECTED]>; Wed, 13 Dec 2006 02:52:09 -0500 (EST)
X-Virus-Scanned: amavisd-new at yadler.com
Received: from mail.yadler.com ([127.0.0.1])
by localhost (superneo.yadler.com [127.0.0.1]) (amavisd-new, port 10024)
with LMTP id WwvayNWlduiX for <[EMAIL PROTECTED]>;
Wed, 13 Dec 2006 02:52:09 -0500 (EST)
Received: by mail.yadler.com (Postfix, from userid 99)
id 1AA03107; Wed, 13 Dec 2006 02:52:09 -0500 (EST)
X-Spam-Checker-Version: SpamAssassin 3.1.7 (2006-10-05) on
superneo.yadler.com
X-Spam-Level: **
X-Spam-Status: No, score=2.6 required=4.0 tests=BAYES_50,FORGED_RCVD_HELO,
HEAD_LONG autolearn=no version=3.1.7
Received: from mail.radiosuomipop.fi (metromedia1-hki.far-m.com
[80.64.11.164])
by mail.yadler.com (Postfix) with ESMTP id 9E69C106
for <[EMAIL PROTECTED]>; Wed, 13 Dec 2006 02:52:06 -0500 (EST)
Received: by mail.radiosuomipop.fi (Postfix, from userid 33)
id 16BE47B6; Wed, 13 Dec 2006 09:52:02 +0200 (EET)
To: [EMAIL PROTECTED],
"Content-Transfer-Encoding:quoted-printable"@metromedia1-hki.far-m.com,
"Content-Type:text/plain"@metromedia1-hki.far-m.com,
"Subject:Never"@metromedia1-hki.far-m.com,
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
"bcc:bitchesleftnut"@yahoo.com, [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL 

Parsing Email

2006-10-11 Thread Tyler Nally
Hello,

I have a project that I need to solve.  A client's fax machines
have been replaced with the phone company's fax server, which e-mails
the incoming fax (.tif) images to a specific e-mail address at the client's
place of business.

As it happens, the e-mail passes through a mail server that
inspects it for viruses and runs it through SpamAssassin before
forwarding it on to their machine.  The mail server that pre-processes
the client's e-mail is a machine I administer.

What I'd like to do... is capture the contents of these particular
fax e-mails as they pass through the machine I administer and:

  1- copy the fax images (detach the images from e-mail messages)
  and store these images on that server (whether as a file
  or put into a database as a blob)
  2- create a database record that will essentially catalog the
  incoming fax to associate a fax file image (or db blob ID)

  A- and also search a database for existing origination fax #'s
  so that the fax can be associated as to the right company
  that sent it.  In this case.. the DB used is a MySQL
  database that exists on this particular machine as well.


Now.. what I need help in understanding... is ... assuming that
I can handle each e-mail separately as it comes through, how do I
parse the e-mail (like the way Spamassassin does) to have the
ability to pull the component parts from the e-mail (from:,
subject:, and MIME-encapsulated fax image) in order to be able
to use these pieces (somehow) for the customer care module.
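
A minimal sketch of the parsing step, written in Python (not one of the languages mentioned above) using only its standard `email` package. The function name `split_fax_mail` is hypothetical, and the sketch assumes the fax arrives as an ordinary image attachment; storing the result in MySQL is left out.

```python
from email import message_from_bytes, policy


def split_fax_mail(raw_bytes):
    """Return (from, subject, [(filename, image_bytes), ...]) for one raw e-mail."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    images = []
    for part in msg.iter_attachments():
        # Keep only image attachments; a fax would normally be image/tiff.
        if part.get_content_maintype() == "image":
            images.append((part.get_filename(), part.get_content()))
    return str(msg["From"]), str(msg["Subject"]), images
```

A delivery program could read one message from standard input (`sys.stdin.buffer.read()`), pass the bytes to this function, write each image to disk, and then record the sender, subject, and file name in the database.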

I'm well versed in PHP... I used to do a lot of perl (many moons
ago) and I'd like to make this work without too awful much pain.

I think ultimately, I'll probably let the normal copy of the e-mail
go on to the customer's destination.  I'd add an extra Cc: that
goes to a specific e-mail account on the server, where anything
delivered to that account is strained by this e-mail
parsing program, which will split the e-mail up into its pieces
and store the chunks in a form that I can manipulate
later in the process.

Any help to point me in the right direction?

Thanks a lot

Tyler Nally


Re: How was this missed?

2006-04-13 Thread Tyler Nally
On Thursday 13 April 2006 11:55, [EMAIL PROTECTED] wrote:
> Theo Van Dinter wrote:
> > On Thu, Apr 13, 2006 at 10:39:29AM -0600,  wrote:
> >> Any idea how this one got through?
> >> 
> >> body BRIAN_PHONE_NUMBERS
> >>
> >> /2.0.6.9.8.4.2.3.2.7|2.0.6.3.3.3.0.0.5.1|2.0.6.9.8.4.0.1.0.6|3.3.8.3.5.7.9|2.0.6.3.3.8.6.0.6.1|2.0.6.2.0.2.2.0.3.3/

There's a ruleset I use from:

   http://www.emtinc.net/includes/chickenpox.cf

.. that checks for the d.i.f.f.e.r.e.n.t kinds of 
spacing like this... a lot of the spam that has
those kinds of characteristics will have several
of the CHICKENPOX_ rules that have fired positive.

It checks for some 60+ different patterns..

describe J_CHICKENPOX_12  1alpha-pock-2alpha
describe J_CHICKENPOX_13  1alpha-pock-3alpha
describe J_CHICKENPOX_14  1alpha-pock-4alpha
describe J_CHICKENPOX_15  1alpha-pock-5alpha
describe J_CHICKENPOX_16  1alpha-pock-6alpha
describe J_CHICKENPOX_17  1alpha-pock-7alpha
describe J_CHICKENPOX_18  1alpha-pock-8alpha
describe J_CHICKENPOX_19  1alpha-pock-9alpha
describe J_CHICKENPOX_110 1alpha-pock-10alpha
describe J_CHICKENPOX_111 1alpha-pock-11alpha
describe J_CHICKENPOX_21  2alpha-pock-1alpha
describe J_CHICKENPOX_22  2alpha-pock-2alpha
describe J_CHICKENPOX_23  2alpha-pock-3alpha
describe J_CHICKENPOX_24  2alpha-pock-4alpha
describe J_CHICKENPOX_25  2alpha-pock-5alpha
describe J_CHICKENPOX_26  2alpha-pock-6alpha
describe J_CHICKENPOX_27  2alpha-pock-7alpha
describe J_CHICKENPOX_28  2alpha-pock-8alpha
describe J_CHICKENPOX_29  2alpha-pock-9alpha
describe J_CHICKENPOX_210 2alpha-pock-10alpha
describe J_CHICKENPOX_31  3alpha-pock-1alpha
describe J_CHICKENPOX_32  3alpha-pock-2alpha
describe J_CHICKENPOX_33  3alpha-pock-3alpha
describe J_CHICKENPOX_34  3alpha-pock-4alpha
describe J_CHICKENPOX_35  3alpha-pock-5alpha
describe J_CHICKENPOX_36  3alpha-pock-6alpha
describe J_CHICKENPOX_37  3alpha-pock-7alpha
describe J_CHICKENPOX_38  3alpha-pock-8alpha
describe J_CHICKENPOX_39  3alpha-pock-9alpha
describe J_CHICKENPOX_41  4alpha-pock-1alpha
describe J_CHICKENPOX_42  4alpha-pock-2alpha
describe J_CHICKENPOX_43  4alpha-pock-3alpha
describe J_CHICKENPOX_44  4alpha-pock-4alpha
describe J_CHICKENPOX_45  4alpha-pock-5alpha
describe J_CHICKENPOX_46  4alpha-pock-6alpha
describe J_CHICKENPOX_47  4alpha-pock-7alpha
describe J_CHICKENPOX_48  4alpha-pock-8alpha
describe J_CHICKENPOX_51  5alpha-pock-1alpha
describe J_CHICKENPOX_52  5alpha-pock-2alpha
describe J_CHICKENPOX_53  5alpha-pock-3alpha
describe J_CHICKENPOX_54  5alpha-pock-4alpha
describe J_CHICKENPOX_55  5alpha-pock-5alpha
describe J_CHICKENPOX_56  5alpha-pock-6alpha
describe J_CHICKENPOX_57  5alpha-pock-7alpha
describe J_CHICKENPOX_61  6alpha-pock-1alpha
describe J_CHICKENPOX_62  6alpha-pock-2alpha
describe J_CHICKENPOX_63  6alpha-pock-3alpha
describe J_CHICKENPOX_64  6alpha-pock-4alpha
describe J_CHICKENPOX_65  6alpha-pock-5alpha
describe J_CHICKENPOX_66  6alpha-pock-6alpha
describe J_CHICKENPOX_71  7alpha-pock-1alpha
describe J_CHICKENPOX_72  7alpha-pock-2alpha
describe J_CHICKENPOX_73  7alpha-pock-3alpha
describe J_CHICKENPOX_74  7alpha-pock-4alpha
describe J_CHICKENPOX_75  7alpha-pock-5alpha
describe J_CHICKENPOX_81  8alpha-pock-1alpha
describe J_CHICKENPOX_82  8alpha-pock-2alpha
describe J_CHICKENPOX_83  8alpha-pock-3alpha
describe J_CHICKENPOX_84  8alpha-pock-4alpha
describe J_CHICKENPOX_91  9alpha-pock-1alpha
describe J_CHICKENPOX_92  9alpha-pock-2alpha
describe J_CHICKENPOX_93  9alpha-pock-3alpha
describe J_CHICKENPOX_101 10alpha-pock-1alpha
describe J_CHICKENPOX_102 10alpha-pock-2alpha
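
To illustrate what a description such as "1alpha-pock-2alpha" appears to mean (one letter, a punctuation "pock", then two letters), here is a rough Python sketch. The regular expression and the set of pock characters are guesses made for illustration; the real patterns in chickenpox.cf may well differ.

```python
import re


def pock_pattern(left, right, pocks=r"[.,'`:;!_*-]"):
    """Build a regex for `left` letters, one pock character, `right` letters.

    The default pock characters are an assumption, not the ruleset's list.
    """
    return re.compile(
        r"(?<![a-z])[a-z]{%d}%s[a-z]{%d}(?![a-z])" % (left, pocks, right),
        re.IGNORECASE,
    )


one_two = pock_pattern(1, 2)              # shape of "1alpha-pock-2alpha"
print(bool(one_two.search("buy v.ia here")))   # broken-up word: matches
print(bool(one_two.search("buy via here")))    # normal word: no match
```

Because each rule covers a single letter-count combination, a message full of broken-up words tends to trip several of them at once.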

-- 
Tyler Nally
[EMAIL PROTECTED]
317-989-2028


Re: Which Operating Systems Do You Use and Why?

2006-04-10 Thread Tyler Nally
On Sunday 09 April 2006 15:20, mouss wrote:

> No. white and black aren't colors. they are absence of colour:)

Well... according to physics... it really depends on what 
is delivering the pigments...

When you paint.. and you combine a bunch of colors.. the colors
get darker and darker.. to the point where they'll eventually
turn black if you add enough dark color... you can try it with
crayons, or sharpies, etc.  Combine red, blue, and green colors
using these and see if you end up with something very dark like
black.

When you combine light, three sources of mono-colored filtered
light (like on a tv) or a three-lens projection tv or even three
spotlights.. of red, blue, and green... project them onto the 
same spot.. and you have white.  

White happens when the whole spectrum of visible light is reflected
back into your eyes... technically white is *all colors* .. and
we know that because a prism will do the reverse and split white
light up into the ROYGBIV spectrum of all visible light.

Black .. is the non-color .. it happens when *no visible light*
is reflected back to your eye.  Like a darkened shaded room 
with no open windows to a bright outside world.  It's black
inside .. because there's a *void* of light.  You illuminate
a single light bulb or open the window.. what happens?  The
blackness/darkness flees and is replaced by light.

Black is to White as Cold is to Heat.  Black is a
"void" of color.  Cold is a "void" of heat.  It's not so much
that a refrigerator puts "cold/coolness" upon something,
resulting in it being cooler ... the way it works is by
pulling the heat from the item.  Cold is literally anti-heat.
Probably easier to think of as a temperature vacuum of sorts
... where it draws heat away from an object.

Now.. if you want to read something on the 'net that is 
absolutely hilarious... go out to google and search for
"Dark Suckers" -or- maybe "Dark Sucker Bulbs"... it's 
hilarious about how lights work by "sucking dark" and how
when a bulb gets full of enough "dark" they go out because
they can no longer suck any more dark.

If you don't think it's funny when you find it, then that's
proof that I'm a geek and easily entertained.

-- 
Tyler Nally
[EMAIL PROTECTED]
317-989-2028


Re: Problem with Bayes learning

2006-02-28 Thread Tyler Nally
On Tuesday 28 February 2006 10:46 pm, you wrote:

> I am new to spamassassin. Thank you so much for your help and Tyler too.

Thanks.. I'm not the expert.. I just use it!

> Bayes autolearn is enabled when I feed Bayes with the 1500 emails manually
> using the "sa-learn" command. Does it cause the problem?

I think that sa-learn... probably creates a lock file.  Assuming that 
sa-learn exits normally, I would think that it'd remove the lock file
when it's done.  I assume that it works this way because when you're
"sa-learn"-ing .. the auto-learn feature is unavailable for spamd to
record the bayes tokens (I think) because it can't get a lock on the
bayes structures to record them.  Once sa-learn halts and removes the
lock.. auto-learn should be available.

> I also checked the Bayes database directory and found two stale lock files
> "bayes.lock...". One is pretty old, almost 4 months and the other was
> created during I feed bayes this time. Could I delete them?

I'd say.. that you can toast the 4 month old one rather easily...
Watch for when sa-learn finishes.. and you should see the newer lock
file go away after its completion.  If it doesn't... then remove
that one as well.
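
A small Python sketch for spotting stale lock files before removing anything by hand. The function name `stale_locks` and the one-hour default are assumptions; the "bayes.lock*" file name pattern comes from the question quoted above.

```python
import time
from pathlib import Path


def stale_locks(bayes_dir, max_age_seconds=3600, now=None):
    """List bayes.lock* files in bayes_dir not modified for max_age_seconds."""
    now = time.time() if now is None else now
    return sorted(
        path
        for path in Path(bayes_dir).glob("bayes.lock*")
        if now - path.stat().st_mtime > max_age_seconds
    )
```

The function only reports candidates. Make sure no sa-learn run is still active before deleting whatever it lists.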

I don't think that, in the normal operation of spamassassin, the auto-learn
*write* to the bayes structures puts a lockfile on them.
At the same time... I've never explicitly watched the directory where
bayes lives .. to see if a lock file appears quickly and disappears just
as fast when it's done.

I do know.. that if I invoke *sa-learn*.. a lockfile will exist
while it's sa-learn'ing.. and then go away afterwards.  While it's
sa-learn'ing, I see the Spamassassin header tags show that autolearn
is "unavailable" during this time because it knows it can't open up
the bayes structures to write the tokens to them.


-- 
Tyler Nally
[EMAIL PROTECTED]


Re: Problem with Bayes learning

2006-02-28 Thread Tyler Nally
On Tuesday 28 February 2006 05:06 pm, Jonathan Nie wrote:
> Greetings!
>
> I got a problem when I try to feed Bayes with large number of emails
> (over 1500). It just hang there and I got the the following error
> messages from maillog file:
>
> .bayes: cannot open bayes databases /spamassassin/bayes_* R/W: lock
> failed: File exists
>
> Does anyone know how to fix it?

The "bayes" section of my spamassassin setup in local.cf looks like
this:

#--
bayes_path /etc/mail/spamassassin/bayes/bayes
bayes_file_mode 0777

use_bayes 1
#bayes_use_hapaxes 1

# Enable Bayes auto-learning
bayes_auto_learn  1

bayes_auto_learn_threshold_nonspam 0.1
bayes_auto_learn_threshold_spam    9.0
#--

Which for me.. would mean that I'd cd to:

   /etc/mail/spamassassin/bayes

.. and do a (as root):

   chmod 666 bayes*

... to allow any process with access to those
bayes files an opportunity to open and
read/write them.

Nobody else (that can login to the system) has access 
to the files on that file system so it should be safe 
(at least for me) to perform this and it not be a 
breach of security of some kind.

I'm pretty sure that the owner/group of the bayes
files is "spamd" so that it can access the files
as it needs.. and when I run sa-learn to harvest
other tokens, I run sa-learn as the same user as
well.

-- 
Tyler Nally
[EMAIL PROTECTED]


Re: Spamassassin does not learn

2006-02-28 Thread Tyler Nally
On Tuesday 28 February 2006 10:36 am, Egoitz Aurrekoetxea wrote:

> I'm using Spamassassin 3.0.3 on a Debian machine running spampd
> proxy. When I check my receiving mail's headers I see that when talks
> about autolearn always says no or failed, what could be the reason?

My server did the same thing for months ...

(yeah, yeah, yeah ... I know ... why so long?  Busy with real
job work and didn't get to it until a while later when I was
a little bored and thought I'd check out why it wasn't
*learning* as I thought it should ... o.k?)

... *until* I corrected how/where it thought the bayes database
was located on the machine.  Once I made the correction to
where it was (and the permission to open it), it started
learning upon restarting spamd (assuming the message satisfied the
threshold conditions, etc)...

My next big tasks will be to MySQL-ize Spamassassin so
that it's running from the database for customization
as well as MySQL-izing the Bayes database.  Those two
projects are a month (maybe two) away.

-- 
Tyler Nally
[EMAIL PROTECTED]


Re: _POSTAL isn't numeric ... /Conf.pm line 251

2006-02-13 Thread Tyler Nally
On Monday 13 February 2006 12:16 pm, Theo Van Dinter wrote:
> On Mon, Feb 13, 2006 at 12:11:16PM -0500, Tyler Nally wrote:
> > Argument "_POSTAL" isn't numeric in addition (+)
> > at /usr/lib/perl5/site_perl/5.8.6/Mail/SpamAssassin/Conf.pm line 251
> >
> > Is this a problem?  Is there a fix?
>
> You have a bad config line somewhere.  probably something like
> "score RULE _POSTAL ..."

Marvelous!  It was "score REMOVE _POSTAL" that had to be reconnected
between the RULE and _POSTAL.
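
For anyone chasing the same warning, a quick Python sketch that scans config lines for "score" entries whose values are not numeric. The function name `bad_score_lines` and the sample lines are made up for illustration, and the check is a simplification of what SpamAssassin itself accepts.

```python
def bad_score_lines(lines):
    """Return (line_number, text) for score lines with a non-numeric value."""
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.split("#", 1)[0].split()   # drop trailing comments
        if len(fields) < 2 or fields[0] != "score":
            continue
        # fields[1] is the rule name; everything after it should be a number.
        for value in fields[2:]:
            try:
                float(value)
            except ValueError:
                problems.append((number, line.strip()))
                break
    return problems


# Sample config lines; the first has a stray space inside the rule name.
sample_config = [
    "score REMOVE _POSTAL 0.1",
    "score SOME_OTHER_RULE 1.5",
    "# score COMMENTED_OUT x",
    "score FOUR_VALUES 0 0 1.2 1.3",
]
print(bad_score_lines(sample_config))
```

Running it over the lines of each .cf file in the config directory would point at the line that has the split rule name.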

Thanks much I appreciate it!

-- 
Tyler Nally
[EMAIL PROTECTED]


_POSTAL isn't numeric ... /Conf.pm line 251

2006-02-13 Thread Tyler Nally
Hmmm. upgraded to SA 3.1.0 and I get this when performing
a sa-learn:

Argument "_POSTAL" isn't numeric in addition (+) 
at /usr/lib/perl5/site_perl/5.8.6/Mail/SpamAssassin/Conf.pm line 251

Is this a problem?  Is there a fix?

-- 
Tyler Nally
[EMAIL PROTECTED]


Re: using LearnAsSpam IMAP folder

2005-12-04 Thread Tyler Nally
On Sunday 04 December 2005 05:59 pm, Pollywog wrote:

> I do that as well and have no problem, but when I use the fetchmail as
> shown at the URL I posted (the command goes into a crontab), fetchmail
> can't find the IMAP folder. 

I wonder if it's just the way you refer to the sub-folders with
fetchmail in order to fetch the right directories.  Fetchmail
being essentially a "dumb" utility .. you might have to provide it
with more information before it can find what YOU are looking for.

A quick scan of Google looking for "fetchmail IMAP" had someone 
calling fetchmail like this using the IMAP protocol and fetchmail
reading certain folders:

fetchmail -k -a -p IMAP \
-S localhost \
--smtpname [EMAIL PROTECTED] \
-u username \
--auth password \
-r "Inbox.folder1","Inbox.folder2","Inbox.etc" \
-v mailserver.domain.com

... with the "Inbox" representing the home "Maildir" path
and the "Inbox.folder1" representing the first named sub-folder
within the main maildir... and "Inbox.folder2" the 2nd, etc.

-- 
Tyler Nally
[EMAIL PROTECTED]


Re: using LearnAsSpam IMAP folder

2005-12-04 Thread Tyler Nally
On Sunday 04 December 2005 05:08 pm, Pollywog wrote:

> That did not do it, but I think you are close, that is is something along
> those lines.  I think the instructions I followed are not intended for
> Courier but for Cyrus.

I use Courier on this particular server .. and all Courier
Maildir folders (regardless of whether it's the main "Maildir"
or one of the subfolders) have "cur", "tmp", and "new"
subfolders.  So for sa-learn to harvest spam out of a
Courier/IMAP Maildir folder ... this is what I
have to do.

I use a similar mechanism where I keep track of the different
Ham and Spam folder names... and I use two different scripts
to sweep through the file system: a HAM sweep (going through
all of the cur, tmp, & new named sub-folders that are known to
have HAM in them) and another script that is set up to sweep
through the cur, tmp, & new SPAM sub-folders.

This is what most of my HAM folders sa-learning script looks like:

##NOTES---
$WORKDIR is the path of the main maildir (inbox) that sa-learn
is processing.  $DIRFILE is the name of the file that contains
the list of named sub-folders of the main Maildir.  $SALEARN is
the variable that contains the full path to sa-learn.  And since
this particular e-mail system is fully database driven, "vmail"
is the account of the virtual-mail while e-mail is processed.
##

OUTPUT="$WORKDIR/hamfolder.out"  # captures standard output output
WIPPUT="$WORKDIR/hamfolder.wip"  # captures current work-in-progress
ERRPUT="$WORKDIR/hamfolder.err"  # captures standard error output
WORKFIL="$WORKDIR/hamfolder.do"  # captures current named sub-folder

# remove the output files left over from the previous run
for OUTFILE in "$OUTPUT" "$WIPPUT" "$ERRPUT" "$WORKFIL"
do
  if [ -e "$OUTFILE" ]
  then
    echo "erasing $OUTFILE"
    rm -f "$OUTFILE"
  fi
done


DATE=`date`

OFLD="tmp cur new"

# $DIRFILE and $SALEARN are described in the notes above; $DIRPATH is
# assumed to be set elsewhere in the full script
for FOLDERIN in `cat "$DIRFILE"`
do
  for SUBFLDR in $OFLD
  do
    # sa-learn reads the directory to learn from out of the --folders file
    echo "$DIRPATH/$FOLDERIN/$SUBFLDR" > "$WORKFIL"
    echo " $DIRPATH/$FOLDERIN/$SUBFLDR -" >> "$WIPPUT"
    echo " $DIRPATH/$FOLDERIN/$SUBFLDR -" >> "$ERRPUT"
    echo " $DIRPATH/$FOLDERIN/$SUBFLDR -" >> "$OUTPUT"
    nice $SALEARN -u vmail --showdots --ham --folders="$WORKFIL" >> "$OUTPUT" 2>> "$ERRPUT"
    DATE=`date`
    echo "--DONE w/ $DIRPATH/$FOLDERIN/$SUBFLDR --" >> "$WIPPUT"
    echo "--DONE w/ $DIRPATH/$FOLDERIN/$SUBFLDR --" >> "$ERRPUT"
    echo "--DONE w/ $DIRPATH/$FOLDERIN/$SUBFLDR --" >> "$OUTPUT"
  done
done
echo "+-+" >> $OUTPUT
echo "| |" >> $OUTPUT
echo "|   Learning about HAM|" >> $OUTPUT
echo "| |" >> $OUTPUT
echo "|  DONE   |" >> $OUTPUT
echo "|     |" >> $OUTPUT
echo "+-+" >> $OUTPUT


The main difference between HAM and SPAM processing is that "--ham" is
changed to "--spam" in the sa-learn execution, and where you see
"ham" or "HAM" in the different variable and file names, it is changed
to "spam" or "SPAM" accordingly as well.

-- 
Tyler Nally
[EMAIL PROTECTED]


Re: using LearnAsSpam IMAP folder

2005-12-04 Thread Tyler Nally
On Sunday 04 December 2005 03:29 pm, Pollywog wrote:

> I forgot one thing that might be important: I am using Courier IMAP, not
> Cyrus
>
> courier-imap   3.0.8-4
> courier-maildrop 0.47-4

Courier doesn't put the e-mail *directly* into the folder that
carries the folder's name, but into one of three directories, "cur",
"tmp", or "new", inside of the named folder.  So, if you
have a folder "LearnAsSpam" inside of your inbox (Maildir/),
the path (I think) you'd need is...

   $HOME/Maildir/.LearnAsSpam/cur/

.. for the value that a sa-learn would have to be aimed towards
to pick up the contents thereof.
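
The same path rule as a small Python sketch. The helper name `learn_dirs` is hypothetical. It returns the "cur" and "new" directories of a named folder and leaves out "tmp", which in a Maildir only holds messages that are still being delivered.

```python
from pathlib import Path


def learn_dirs(home, folder):
    """Return the cur/ and new/ directories of a Courier Maildir sub-folder."""
    # Courier stores a sub-folder named X as the directory Maildir/.X
    base = Path(home) / "Maildir" / ("." + folder)
    return [base / "cur", base / "new"]


for directory in learn_dirs("/home/user", "LearnAsSpam"):
    print(directory)
```

Each returned directory can then be handed to sa-learn as the location to learn from.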

-- 
Tyler Nally
[EMAIL PROTECTED]


RE: Bayes question

2005-07-27 Thread Tyler Nally
Boy... anytime I've done some kind of network file sharing across
a system or two, I have never done it for performance reasons...
only for convenience's sake.  And even then, never with large files.

Almost a decade ago when I was performing massive COBOL database
conversions to load data into flat files to be imported into a
relational database, I noticed a significant decrease in the performance
of a machine that was accessing remotely stored files.  It was far
easier/faster to auto-ftp the half a gigabyte of information to the other
machine so that it had the information *local* and could therefore
access the data extremely quickly.   Depending on the machine and
its resources, I'd expect remote access to slow down its processing by
25-40% on average.

If the data remained on a remote machine, then the CPU has to use
its resources to handle the remote file system
as if it were a part of its own.  It is then at the whim of an NFS
file system handle that may or may not stay fresh.  Even if the
machines are separated by a couple feet of cable .. for me .. back
then ... NFS wasn't reliable enough for me to be able to bank on it
being up.  Because when the remote NFS file handle went stale, it
caused the local machine to hang and drag.  Maybe NFS is better now
than back then... I don't know.

The machine doesn't make a network *call* to the other machine to
borrow its resources; it uses its own resources to access the
remote files as if they were local, yet it does so over a network
cable rather than the typical high-speed motherboard bus that
would access the local hard drive.

So... the only way I'd do this in this day and age would be to have
the kind of hardware that you could build a multi-node supercomputer
where they all share the same hard drive over a fiber optic network
with lightning quick hard disks on the server node as it shares its
resources with the worker nodes.  In that case, the networking element
has been removed from the equation as the slowest link in the chain
of events.

On Wed, July 27, 2005 16:37, Alan Fullmer said:
> I attempted to do that once, with a network file system, but it didn't seem
> to know how to handle the locking properly.  I know I did something wrong,
> so if anyone else has a solution, I'd also be happy to hear it! :-)


-- 
Tyler Nally
[EMAIL PROTECTED]