Re: learn ham

2017-01-05 Thread Shawn Bakhtiar

> On Jan 5, 2017, at 8:54 AM, Dave Funk  wrote:
> 
> On Thu, 5 Jan 2017, Nicola Piazzi wrote:
> 
>> Each minute it learns the messages of the last minute, so it reads and
>> learns each message one time only.
>> The messages are ones sent from internal hosts, so it learns that their words are not spam
>> 
>> Internal messages are not spam
> 
> Until one of your users gets their account hacked/phished and spammers then 
> use it to abuse your server to send out megabytes of spam.
> (or they may have had an account on Yahoo that used the same password).
> 
> Careless users happen to the best of us. ;(
> 
> John's point is still valid; blind un-vetted automated Bayes learning is 
> asking for trouble.

I would have to agree and reinforce the message here: automated learning of 
SPAM/HAM is not a good idea. I have users dropping emails THEY HAVE SUBSCRIBED 
TO, and forgotten they did so, into their SPAM folder, and I would argue those are 
NOT SPAM. They actually contain a LOT of industry-standard nomenclature that, if 
trained as SPAM, would not necessarily make valid spam tokens.

Think about it: the best machine to tell whether something is SPAM or not is 
the human machine. Learning, in this regard, is telling SA that emails like this 
one, which I have specifically identified as SPAM, are the ones it should look out 
for. SA (in and of itself) does not make a judgement call on what is or is not SPAM. 
You need to do that. 
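
A minimal sketch of what "you make the call, then teach SA" can look like, assuming mail is only learned from folders a person has already reviewed. The folder paths are hypothetical; `sa-learn --spam` and `sa-learn --ham` are the learning switches SpamAssassin ships with. The function only builds the command and does not run it.

```python
import shlex


def sa_learn_command(verdict, mailbox_path):
    """Build an sa-learn command for mail a human has already classified."""
    if verdict not in ("spam", "ham"):
        raise ValueError("verdict must be 'spam' or 'ham'")
    return ["sa-learn", f"--{verdict}", mailbox_path]


# Hypothetical folder that an admin reviews by hand before anything is learned.
reviewed_spam = "/home/alice/Maildir/.Reviewed-Spam/cur"
print(shlex.join(sa_learn_command("spam", reviewed_spam)))
```

Keeping the verdict as an explicit argument makes it harder to wire this up to an unreviewed folder by accident.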

Keep teaching and pretty soon everything is in every pool (there is such a 
thing as knowing too much, so much so that you are left indecisive and 
perplexed at even the simplest problem). I think it's far better to have a 
smaller pool of tokens keyed with precision than a lot of tokens that, quite 
frankly, could go either way.



> 
> -- 
> Dave Funk  University of Iowa
> College of Engineering
> 319/335-5751   FAX: 319/384-0549   1256 Seamans Center
> Sys_admin/Postmaster/cell_admin   Iowa City, IA 52242-1527
> #include 
> Better is not better, 'standard' is better. B{



Re: Weird Spamassassin startup behaviour on Ubuntu 16.10

2016-12-08 Thread Shawn Bakhtiar
I believe in order for that to be true you need to use do-release-upgrade

https://help.ubuntu.com/lts/serverguide/installing-upgrading.html

That tool also handles the migration of the startup scripts etc. (regardless, 
upgrades like these always tend to leave loose threads).

For now, though, I would simply focus on what's wrong: add the missing 
After= directive to the [Unit] section of your systemd SA unit file and see if that 
solves the problem.

OT:
I would also, at this point, think about disabling Monit: take an 
inventory of the daemons you need running on that system and verify that 
systemd is aware of them. SysV init scripts can easily be integrated into 
systemd for any custom modules or products that are not yet systemd 
integrated.


On Dec 7, 2016, at 12:37 PM, Michael Heuberger 
<michael.heuber...@binarykitchen.com> wrote:

Right, it was upgraded from Ubuntu 14.10

I thought apt-get dist-upgrade + update + upgrade is supposed to migrate that 
stuff?


On 8/12/16 06:39, Shawn Bakhtiar wrote:
Good point. Although, since systemd is the default init system on Ubuntu 16.x, 
that raises the question: was the system upgraded, or is this a fresh install?

On Dec 7, 2016, at 7:59 AM, sha...@shanew.net wrote:

Since you appear to have both kinds of init files (sysV and systemd),
you may want to doublecheck which init system is actually running.
I'm pretty sure you can figure it out by running "dpkg -S /sbin/init"
(this tells you which package owns that file).  It will probably say
systemd-sysv, in which case you'll want to follow Shawn's
instruction.  If it says upstart, then you'll most likely need to edit
/etc/init.d/spamassassin instead.
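
A complementary check to the dpkg query above, sketched in Python: to my knowledge, systemd's own sd_booted() test is simply whether the directory /run/systemd/system exists. The function name and the overridable path parameter are my own, added so the check can be exercised on any machine.

```python
import os


def booted_with_systemd(marker="/run/systemd/system"):
    """Return True if the marker directory that systemd creates at boot exists."""
    return os.path.isdir(marker)


if booted_with_systemd():
    print("systemd is PID 1: edit the spamassassin.service unit")
else:
    print("not systemd: look at /etc/init.d/spamassassin instead")
```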

On Wed, 7 Dec 2016, Shawn Bakhtiar wrote:

Yeah... it's missing the After= directive in the [Unit] section, which
would have systemd wait until the other services (targets) are up. But also,
as Marc mentioned, I'm not sure why you would use Monit, since systemd (for all
its issues) does monitor daemons and makes sure to spawn them again if they
die for any reason.
Although, according to the Monit wiki, you can run it with systemd
(https://mmonit.com/wiki/Monit/Systemd); I'm just not sure what the advantages
are.
You should have something like this:
[Unit]
...
After=syslog.target network.target
...
You can add any target to the After= directive in [Unit] to make sure it's up
before SA starts.

 On Dec 6, 2016, at 8:39 PM, Michael Heuberger
 <michael.heuber...@binarykitchen.com> wrote:
Thanks Shawn
Here the contents on my server:
michael.heuberger@binarykitchen /l/s/system ❯❯❯ cat spamassassin.service
[Unit]
Description=Perl-based spam filter using text analysis
[Service]
Type=forking
PIDFile=/var/run/spamassassin.pid
EnvironmentFile=-/etc/default/spamassassin
ExecStart=/usr/sbin/spamd -d --pidfile=/var/run/spamassassin.pid $OPTIONS
ExecReload=/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target
Does this seem to be outdated and wrong?
- Michael
On 7/12/16 09:29, Shawn Bakhtiar wrote:
 With Ubuntu 16.10 you should be using systemd.
You can declare dependencies (the After= directive) to make
sure that all the services you need are started prior to the
service you want (SA, in this case).
Check your systemd service unit file:
/usr/lib/systemd/system/spamassassin.service (or wherever your
systemd unit files are stored on Ubuntu).
The content should be something like:
[Unit]
Description=Spamassassin daemon
After=syslog.target network.target
Wants=sa-update.timer
[Service]
EnvironmentFile=-/etc/sysconfig/spamassassin
ExecStart=/usr/bin/spamd $SPAMDOPTIONS
StandardOutput=null
StandardError=null
Restart=always
[Install]
WantedBy=multi-user.target
Notice that systemd waits for syslog and network to complete
before it launches spamassassin.
Checkout this document if you have not and are still using
upstart.
https://wiki.ubuntu.com/SystemdForUpstartUsers

 On Dec 6, 2016, at 9:40 AM, sha...@shanew.net wrote:
I recently set up an email server on Ubuntu 14.10 and kept being
frustrated that on boot various filter software and related milters
were regularly starting after sendmail, sometimes by as much as five
minutes.  We don't reboot that server very often, so it took a while
to test various fixes, but in the end I added the following lines to
the INIT INFO section of various milters (it's really only the first
one that matters for startup):
# X-Start-Before: sendmail
# X-Stop-After:   sendmail
If postfix uses an /etc/init.d script like sendmail does on 14.10,
check to see what the "Provides:" part of the INIT INFO is (probably
postfix), and add an X-Start-Before line with that value to the
spamassassin init script.  Or, if you just want to make sure that SA
starts before monit, use whatever the "Provides:" is set to in the
monit init script.
If you have a mix

Re: Weird Spamassassin startup behaviour on Ubuntu 16.10

2016-12-07 Thread Shawn Bakhtiar
Good point. Although, since systemd is the default init system on Ubuntu 16.x, 
that raises the question: was the system upgraded, or is this a fresh install? 

> On Dec 7, 2016, at 7:59 AM, sha...@shanew.net wrote:
> 
> Since you appear to have both kinds of init files (sysV and systemd),
> you may want to doublecheck which init system is actually running.
> I'm pretty sure you can figure it out by running "dpkg -S /sbin/init"
> (this tells you which package owns that file).  It will probably say
> systemd-sysv, in which case you'll want to follow Shawn's
> instruction.  If it says upstart, then you'll most likely need to edit
> /etc/init.d/spamassassin instead.
> 
> On Wed, 7 Dec 2016, Shawn Bakhtiar wrote:
> 
>> Yeah... it's missing the After= directive in the [Unit] section, which
>> would have systemd wait until the other services (targets) are up. But also,
>> as Marc mentioned, I'm not sure why you would use Monit, since systemd (for all
>> its issues) does monitor daemons and makes sure to spawn them again if they
>> die for any reason. 
>> Although, according to the Monit wiki, you can run it with systemd
>> (https://mmonit.com/wiki/Monit/Systemd); I'm just not sure what the advantages
>> are.
>> You should have something like this:
>> [Unit]
>> ...
>> After=syslog.target network.target
>> ...
>> You can add any target to the After= directive in [Unit] to make sure it's up
>> before SA starts.
>> 
>>  On Dec 6, 2016, at 8:39 PM, Michael Heuberger
>>  <michael.heuber...@binarykitchen.com> wrote:
>> Thanks Shawn
>> Here the contents on my server:
>> michael.heuberger@binarykitchen /l/s/system ❯❯❯ cat spamassassin.service
>> [Unit]
>> Description=Perl-based spam filter using text analysis
>> [Service]
>> Type=forking
>> PIDFile=/var/run/spamassassin.pid
>> EnvironmentFile=-/etc/default/spamassassin
>> ExecStart=/usr/sbin/spamd -d --pidfile=/var/run/spamassassin.pid $OPTIONS
>> ExecReload=/bin/kill -HUP $MAINPID
>> [Install]
>> WantedBy=multi-user.target
>> Does this seem to be outdated and wrong?
>> - Michael
>> On 7/12/16 09:29, Shawn Bakhtiar wrote:
>>  With Ubuntu 16.10 you should be using systemd. 
>> You can declare dependencies (the After= directive) to make
>> sure that all the services you need are started prior to the
>> service you want (SA, in this case).
>> Check your systemd service unit file:
>> /usr/lib/systemd/system/spamassassin.service (or wherever your
>> systemd unit files are stored on Ubuntu).
>> The content should be something like:
>> [Unit]
>> Description=Spamassassin daemon
>> After=syslog.target network.target
>> Wants=sa-update.timer
>> [Service]
>> EnvironmentFile=-/etc/sysconfig/spamassassin
>> ExecStart=/usr/bin/spamd $SPAMDOPTIONS
>> StandardOutput=null
>> StandardError=null
>> Restart=always
>> [Install]
>> WantedBy=multi-user.target
>> Notice that systemd waits for syslog and network to complete
>> before it launches spamassassin. 
>> Checkout this document if you have not and are still using
>> upstart.
>> https://wiki.ubuntu.com/SystemdForUpstartUsers
>> 
>>  On Dec 6, 2016, at 9:40 AM, sha...@shanew.net wrote:
>> I recently set up an email server on Ubuntu 14.10 and kept being
>> frustrated that on boot various filter software and related milters
>> were regularly starting after sendmail, sometimes by as much as five
>> minutes.  We don't reboot that server very often, so it took a while
>> to test various fixes, but in the end I added the following lines to
>> the INIT INFO section of various milters (it's really only the first
>> one that matters for startup):
>> # X-Start-Before: sendmail
>> # X-Stop-After:   sendmail
>> If postfix uses an /etc/init.d script like sendmail does on 14.10,
>> check to see what the "Provides:" part of the INIT INFO is (probably
>> postfix), and add an X-Start-Before line with that value to the
>> spamassassin init script.  Or, if you just want to make sure that SA
>> starts before monit, use whatever the "Provides:" is set to in the
>> monit init script.
>> If you have a mixture of SysV (regular) and upstart scripts, things get
>> more complicated (unless 16.10 introduces functionality to make
>> dependencies interoperable that doesn't exist in 14.10).
>> On Tue, 6 Dec 2016, Michael Heuberger wrote:
>> 
>>  Hi David
>

Re: Weird Spamassassin startup behaviour on Ubuntu 16.10

2016-12-07 Thread Shawn Bakhtiar
Yeah... it's missing the After= directive in the [Unit] section, which would 
have systemd wait until the other services (targets) are up. But also, as Marc 
mentioned, I'm not sure why you would use Monit, since systemd (for all its issues) 
does monitor daemons and makes sure to spawn them again if they die for any 
reason.

Although, according to the Monit wiki, you can run it with systemd 
(https://mmonit.com/wiki/Monit/Systemd); I'm just not sure what the advantages are.

You should have something like this:
[Unit]
...
After=syslog.target network.target
...

You can add any target to the After= directive in [Unit] to make sure it's up before 
SA starts.
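
A quick way to check whether a unit file declares any ordering is to parse it, sketched here with Python's configparser. This is only an approximation of systemd's own parser (an assumption on my part; it ignores drop-ins and line continuations), and the function name is hypothetical. As far as I know, systemd section names are case-sensitive, so the header has to be spelled [Unit].

```python
import configparser

# The unit file as posted in this thread (no After= line in [Unit]).
UNIT_TEXT = """\
[Unit]
Description=Perl-based spam filter using text analysis

[Service]
Type=forking
PIDFile=/var/run/spamassassin.pid
EnvironmentFile=-/etc/default/spamassassin
ExecStart=/usr/sbin/spamd -d --pidfile=/var/run/spamassassin.pid $OPTIONS
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
"""


def ordering_targets(unit_text):
    """Return the list of units named in After= under [Unit], or [] if none."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep key case: After, not after
    parser.read_string(unit_text)
    if not parser.has_section("Unit"):
        return []
    return parser.get("Unit", "After", fallback="").split()


print(ordering_targets(UNIT_TEXT) or "no After= ordering declared")
```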


On Dec 6, 2016, at 8:39 PM, Michael Heuberger 
<michael.heuber...@binarykitchen.com> wrote:


Thanks Shawn

Here the contents on my server:

michael.heuberger@binarykitchen /l/s/system ❯❯❯ cat spamassassin.service
[Unit]
Description=Perl-based spam filter using text analysis

[Service]
Type=forking
PIDFile=/var/run/spamassassin.pid
EnvironmentFile=-/etc/default/spamassassin
ExecStart=/usr/sbin/spamd -d --pidfile=/var/run/spamassassin.pid $OPTIONS
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target

Does this seem to be outdated and wrong?

- Michael

On 7/12/16 09:29, Shawn Bakhtiar wrote:
With Ubuntu 16.10 you should be using systemd.

You can declare dependencies (the After= directive) to make sure that all the 
services you need are started prior to the service you want (SA, in this case).

Check your systemd service unit file:
/usr/lib/systemd/system/spamassassin.service (or wherever your systemd unit 
files are stored on Ubuntu).

The content should be something like:

[Unit]
Description=Spamassassin daemon
After=syslog.target network.target
Wants=sa-update.timer

[Service]
EnvironmentFile=-/etc/sysconfig/spamassassin
ExecStart=/usr/bin/spamd $SPAMDOPTIONS
StandardOutput=null
StandardError=null
Restart=always

[Install]
WantedBy=multi-user.target


Notice that systemd waits for syslog and network to complete before it launches 
spamassassin.

Check out this document if you have not already and are still using upstart.
https://wiki.ubuntu.com/SystemdForUpstartUsers


On Dec 6, 2016, at 9:40 AM, sha...@shanew.net wrote:

I recently set up an email server on Ubuntu 14.10 and kept being
frustrated that on boot various filter software and related milters
were regularly starting after sendmail, sometimes by as much as five
minutes.  We don't reboot that server very often, so it took a while
to test various fixes, but in the end I added the following lines to
the INIT INFO section of various milters (it's really only the first
one that matters for startup):

# X-Start-Before: sendmail
# X-Stop-After:   sendmail

If postfix uses an /etc/init.d script like sendmail does on 14.10,
check to see what the "Provides:" part of the INIT INFO is (probably
postfix), and add an X-Start-Before line with that value to the
spamassassin init script.  Or, if you just want to make sure that SA
starts before monit, use whatever the "Provides:" is set to in the
monit init script.

If you have a mixture of SysV (regular) and upstart script, things get
more complicated (unless 16.10 introduces functionality to make
dependencies interoperable that doesn't exist in 14.10).

On Tue, 6 Dec 2016, Michael Heuberger wrote:

Hi David

I don't know. I'm not sure how I can find out whether it does some DNS/network 
stuff.

In my other response to John you can see that it takes about 5.69 sec to start 
spamassassin.

And no idea how to configure a SA startup dependency on the network being up. 
And shouldn't that come along with the package when installed via apt-get?

- Michael


On 6/12/16 11:47, David B Funk wrote:

Could it be some kind of interaction with other system services startup?
(in particular this feels like a network timeout issue).

One of the things SA does during its startup process is check to see if
DNS/network stuff is available.
If the system hasn't yet brought up the network stack when SA starts, it
may hang waiting for the network to stabilize.

On a running system, if you stop/restart SA do you see the same delay or
is it only on a cold start of the system?

Is it possible to configure a SA startup dependency on the network being
up?



--
Public key #7BBC68D9 at            |  Shane Williams
http://pgp.mit.edu/                |  System Admin - UT CompSci
=----------------------------------+-------------------------------
All syllogisms contain three lines |  sha...@shanew.net
Therefore this is not a syllogism  |  www.ischool.utexas.edu/~shanew



--

Binary Kitchen
Michael Heuberger
1/33 Parrish Road
Sandringham
Auckland 1025
(New Zealand)

Mobile (text only) ...  +64 21 261 89 81
Email   
mich...

Re: Weird Spamassassin startup behaviour on Ubuntu 16.10

2016-12-06 Thread Shawn Bakhtiar
With Ubuntu 16.10 you should be using systemd.

You can declare dependencies (the After= directive) to make sure that all the 
services you need are started prior to the service you want (SA, in this case).

Check your systemd service unit file:
/usr/lib/systemd/system/spamassassin.service (or wherever your systemd unit 
files are stored on Ubuntu).

The content should be something like:

[Unit]
Description=Spamassassin daemon
After=syslog.target network.target
Wants=sa-update.timer

[Service]
EnvironmentFile=-/etc/sysconfig/spamassassin
ExecStart=/usr/bin/spamd $SPAMDOPTIONS
StandardOutput=null
StandardError=null
Restart=always

[Install]
WantedBy=multi-user.target


Notice that systemd waits for syslog and network to complete before it launches 
spamassassin.
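
One hedged caveat, with a sketch: as far as I know, systemd's documentation describes network.target as ordering only after the network management stack has been started, while network-online.target is the target intended for "wait until the network is actually configured". Rather than editing the packaged unit, the ordering can go in a drop-in such as /etc/systemd/system/spamassassin.service.d/override.conf (the file that `systemctl edit spamassassin` opens). The helper below is hypothetical and only renders the text of such a drop-in; it does not write any file.

```python
def render_dropin(after_targets, wants_targets=()):
    """Render a [Unit] drop-in that adds ordering (and optional Wants=)."""
    lines = ["[Unit]", "After=" + " ".join(after_targets)]
    if wants_targets:
        lines.append("Wants=" + " ".join(wants_targets))
    return "\n".join(lines) + "\n"


# Assumption: spamd should wait for a configured network, not just the stack.
print(render_dropin(
    ["syslog.target", "network-online.target"],
    wants_targets=["network-online.target"],
))
```

After saving a drop-in by hand, `systemctl daemon-reload` is needed before the change takes effect.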

Check out this document if you have not already and are still using upstart.
https://wiki.ubuntu.com/SystemdForUpstartUsers


On Dec 6, 2016, at 9:40 AM, sha...@shanew.net wrote:

I recently set up an email server on Ubuntu 14.10 and kept being
frustrated that on boot various filter software and related milters
were regularly starting after sendmail, sometimes by as much as five
minutes.  We don't reboot that server very often, so it took a while
to test various fixes, but in the end I added the following lines to
the INIT INFO section of various milters (it's really only the first
one that matters for startup):

# X-Start-Before: sendmail
# X-Stop-After:   sendmail

If postfix uses an /etc/init.d script like sendmail does on 14.10,
check to see what the "Provides:" part of the INIT INFO is (probably
postfix), and add an X-Start-Before line with that value to the
spamassassin init script.  Or, if you just want to make sure that SA
starts before monit, use whatever the "Provides:" is set to in the
monit init script.

If you have a mixture of SysV (regular) and upstart script, things get
more complicated (unless 16.10 introduces functionality to make
dependencies interoperable that doesn't exist in 14.10).

On Tue, 6 Dec 2016, Michael Heuberger wrote:

Hi David

I don't know. I'm not sure how I can find out whether it does some DNS/network 
stuff.

In my other response to John you can see that it takes about 5.69 sec to start 
spamassassin.

And no idea how to configure a SA startup dependency on the network being up. 
And shouldn't that come along with the package when installed via apt-get?

- Michael


On 6/12/16 11:47, David B Funk wrote:

Could it be some kind of interaction with other system services startup?
(in particular this feels like a network timeout issue).

One of the things SA does during its startup process is check to see if
DNS/network stuff is available.
If the system hasn't yet brought up the network stack when SA starts, it
may hang waiting for the network to stabilize.

On a running system, if you stop/restart SA do you see the same delay or
is it only on a cold start of the system?

Is it possible to configure a SA startup dependency on the network being
up?



--
Public key #7BBC68D9 at            |  Shane Williams
http://pgp.mit.edu/                |  System Admin - UT CompSci
=----------------------------------+-------------------------------
All syllogisms contain three lines |  sha...@shanew.net
Therefore this is not a syllogism  |  www.ischool.utexas.edu/~shanew



Re: Anyone else just blocking the ".top" TLD?

2016-11-03 Thread Shawn Bakhtiar
1:22 smtp sendmail[14469]: u9GKpGUY014469: 
from=<t...@leaders2016.xyz>, size=0, class=0, nrcpts=0, proto=ESMTP, 
daemon=MTA, relay=[69.94.151.220]
Oct 16 14:40:48 smtp sendmail[15615]: u9GLehHT015615: 
from=<j...@leaders2016.xyz>, size=0, class=0, nrcpts=0, proto=ESMTP, 
daemon=MTA, relay=[69.94.151.222]

The IP range belongs to Lanset America Corporation (LANA), which is a 
second-rate email marketing corp.

I would suggest, if the need is there, opening up individual domains, not the 
entire TLD, unless you are certain your other countermeasures will be 
sufficient to catch the spam.



On Nov 3, 2016, at 9:40 AM, Vincent Fox <vb...@ucdavis.edu> wrote:

Indeed, that is what is happening.  I have had requests for
overrides.  I hate maintaining overrides if I no longer need to
even list the domain.  See driver.xyz for example which is legit.

This is an interesting statistics page I had not seen before:

https://ntldstats.com/fraud


Per that, TOP accounts for 64% of the problem.

SCIENCE is next at a mere 8%.

While XYZ comes in at #15 on the SURBL abused domains list
at present in raw numbers, as a percentage of its email volume
its abuse seems quite low.

From: Shawn Bakhtiar <shashan...@hotmail.com>
Sent: Thursday, November 3, 2016 9:33:59 AM
To: users@spamassassin.apache.org
Subject: Re: Anyone else just blocking the ".top" TLD?

Unless you have customers/employees/vendors complaining that they are not 
receiving legitimate email from that TLD, why would you unblock it?


On Nov 3, 2016, at 9:27 AM, Vincent Fox <vb...@ucdavis.edu> wrote:

Resurrecting thread

TOP remains at the err... top of abuse heap.

XYZ insights anyone?  They have been on my reject list
for a long time, but claim to be cleaning it up.  Thinking to
drop my shields on this one.

https://gen.xyz/blog/antiabuse

.

My current total-block list:
From:link                     REJECT
From:website                  REJECT
From:berlin                   REJECT
From:club                     REJECT
From:email                    REJECT
From:csr24.email              OK
From:guru                     REJECT
From:wang                     REJECT
From:xyz                      REJECT
From:driver.xyz               ACCEPT
From:photography              REJECT
From:rocks                    REJECT
From:click                    REJECT
From:xn--czrs0t               REJECT
From:xn--hxt814e              REJECT
From:xn--flw351e              REJECT
From:xn--qcka1pmc             REJECT
From:xn--45q11c               REJECT
From:xn--vermgensberatung-pwb REJECT
From:xn--vermgensberater-ctb  REJECT
From:xn--p1acf                REJECT
From:xn--vhquv                REJECT
From:xn--xhq521b              REJECT
From:xn--1qqw23a              REJECT
From:xn--kput3i               REJECT
From:xn--4gbrim               REJECT
From:xn--czr694b              REJECT
From:xn--80adxhks             REJECT
From:xn--ses554g              REJECT
From:xn--czru2d               REJECT
From:xn--rhqv96g              REJECT
From:xn--nqv7f                REJECT
From:xn--i1b6b1a6a2e          REJECT
From:xn--nqv7fs00ema          REJECT
From:xn--c1avg                REJECT
From:xn--d1acj3b              REJECT
From:xn--mgbab2bd             REJECT
From:xn--6frz82g              REJECT
From:xn--io0a7i               REJECT
From:xn--55qx5d               REJECT
From:xn--fiq64b               REJECT
From:xn--3bst00m              REJECT
From:xn--6qq986b3xl           REJECT
From:xn--fiq228c5hs           REJECT
From:xn--3ds443g              REJECT
From:xn--55qw42g              REJECT
From:xn--zfr164b              REJECT
From:xn--q9jyb4c              REJECT
From:xn--ngbc5azd             REJECT
From:xn--80asehdb             REJECT
From:xn--80aswg               REJECT
From:xn--unup4y               REJECT
From:ninja                    REJECT
From:gripe                    REJECT
From:loans                    REJECT
From:luxury                   REJECT
From:market                   REJECT
From:marketing                REJECT
From:pink                     REJECT
From:whoswho                  REJECT
From:work                     REJECT
From:cricket                  REJECT
From:xn--plai                 REJECT
From:review                   REJECT
From:country                  REJECT
From:kim                      REJECT
From:science                  REJECT
From:party                    REJECT
From:gq                       REJECT
From:top                      REJECT
From:uno                      REJECT
From:win                      REJECT
From:download                 REJECT
From:tk                       REJECT
From:pw                       REJECT
From:international            REJECT
From:slice.international      OK
From:date                     REJECT
From:gdn                      REJECT
From:pro                      REJECT
From:mm.law.pro               OK
From:npocpa.pro               OK
From:bid                      REJECT
From:trade                    REJECT
From:press                    REJECT
From:faith                    REJECT
From:racing                   REJECT
From:stream                   REJECT
From:diet                     REJECT
From:tokyo                    REJECT
From:accountant               REJECT
From:webcam                   REJECT
From:help                     REJECT
From:space                    REJECT
From:men                      REJECT
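
The override entries in the list above (driver.xyz, mm.law.pro and so on) only make sense if the most specific entry wins over the bare TLD. A minimal Python sketch of that lookup order, assuming most-specific-suffix-first matching; the function name, the small table excerpt, and the "OK" default are my own choices for illustration and not a statement of how any particular MTA implements it.

```python
# A few entries copied from the list above.
ACCESS = {
    "xyz": "REJECT",
    "driver.xyz": "ACCEPT",
    "top": "REJECT",
    "pro": "REJECT",
    "mm.law.pro": "OK",
}


def action_for(sender_domain, table=ACCESS, default="OK"):
    """Try the full domain first, then each parent suffix, down to the TLD."""
    labels = sender_domain.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in table:
            return table[candidate]
    return default


for domain in ("leaders2016.xyz", "mail.driver.xyz", "example.com"):
    print(domain, action_for(domain))
```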



Re: Anyone else just blocking the ".top" TLD?

2016-11-03 Thread Shawn Bakhtiar
Unless you have customers/employees/vendors complaining that they are not 
receiving legitimate email from that TLD, why would you unblock it?


On Nov 3, 2016, at 9:27 AM, Vincent Fox wrote:

Resurrecting thread

TOP remains at the err... top of abuse heap.

XYZ insights anyone?  They have been on my reject list
for a long time, but claim to be cleaning it up.  Thinking to
drop my shields on this one.

https://gen.xyz/blog/antiabuse

.

My current total-block list:
From:link                     REJECT
From:website                  REJECT
From:berlin                   REJECT
From:club                     REJECT
From:email                    REJECT
From:csr24.email              OK
From:guru                     REJECT
From:wang                     REJECT
From:xyz                      REJECT
From:driver.xyz               ACCEPT
From:photography              REJECT
From:rocks                    REJECT
From:click                    REJECT
From:xn--czrs0t               REJECT
From:xn--hxt814e              REJECT
From:xn--flw351e              REJECT
From:xn--qcka1pmc             REJECT
From:xn--45q11c               REJECT
From:xn--vermgensberatung-pwb REJECT
From:xn--vermgensberater-ctb  REJECT
From:xn--p1acf                REJECT
From:xn--vhquv                REJECT
From:xn--xhq521b              REJECT
From:xn--1qqw23a              REJECT
From:xn--kput3i               REJECT
From:xn--4gbrim               REJECT
From:xn--czr694b              REJECT
From:xn--80adxhks             REJECT
From:xn--ses554g              REJECT
From:xn--czru2d               REJECT
From:xn--rhqv96g              REJECT
From:xn--nqv7f                REJECT
From:xn--i1b6b1a6a2e          REJECT
From:xn--nqv7fs00ema          REJECT
From:xn--c1avg                REJECT
From:xn--d1acj3b              REJECT
From:xn--mgbab2bd             REJECT
From:xn--6frz82g              REJECT
From:xn--io0a7i               REJECT
From:xn--55qx5d               REJECT
From:xn--fiq64b               REJECT
From:xn--3bst00m              REJECT
From:xn--6qq986b3xl           REJECT
From:xn--fiq228c5hs           REJECT
From:xn--3ds443g              REJECT
From:xn--55qw42g              REJECT
From:xn--zfr164b              REJECT
From:xn--q9jyb4c              REJECT
From:xn--ngbc5azd             REJECT
From:xn--80asehdb             REJECT
From:xn--80aswg               REJECT
From:xn--unup4y               REJECT
From:ninja                    REJECT
From:gripe                    REJECT
From:loans                    REJECT
From:luxury                   REJECT
From:market                   REJECT
From:marketing                REJECT
From:pink                     REJECT
From:whoswho                  REJECT
From:work                     REJECT
From:cricket                  REJECT
From:xn--plai                 REJECT
From:review                   REJECT
From:country                  REJECT
From:kim                      REJECT
From:science                  REJECT
From:party                    REJECT
From:gq                       REJECT
From:top                      REJECT
From:uno                      REJECT
From:win                      REJECT
From:download                 REJECT
From:tk                       REJECT
From:pw                       REJECT
From:international            REJECT
From:slice.international      OK
From:date                     REJECT
From:gdn                      REJECT
From:pro                      REJECT
From:mm.law.pro               OK
From:npocpa.pro               OK
From:bid                      REJECT
From:trade                    REJECT
From:press                    REJECT
From:faith                    REJECT
From:racing                   REJECT
From:stream                   REJECT
From:diet                     REJECT
From:tokyo                    REJECT
From:accountant               REJECT
From:webcam                   REJECT
From:help                     REJECT
From:space                    REJECT
From:men                      REJECT



Re: DNS Terminology

2016-09-23 Thread Shawn Bakhtiar
A forwarding name server simply forwards (proxies) the query to an upstream 
recursive server.


On Sep 23, 2016, at 9:03 AM, RW wrote:

On Thu, 22 Sep 2016 20:24:21 -0700 (PDT)
John Hardin wrote:


Lists shouldn't have said "caching", that confuses the issue. Caching
and recursion are two different, unrelated pieces.

Focus on the "recursion" and "no forwarding" parts of that
recommendation.

I've been wondering whether recursive is actually the correct term.

As I understand it there are two types of DNS lookup:

 1. Iterative - where results are found by working down through
 multiple servers from the root servers.

 2. Recursive - where a request is made to a single nameserver which
 handles the whole look-up on behalf of a client.

What this turns on is whether a forwarding server is a distinct
class of nameserver or a type of recursive server. I think the
latter is most logical, since both provide a recursive interface.
Definitions of the term "recursive server" that I've seen  contrast it
only with "authoritative server".

One thing is certain, what you want is a name server that does
*iterative* lookups.

A forwarding server is best used when a firewall does not allow direct access 
for DNS queries on the egress (outbound) side. A forwarding server can be set up 
on the inside to point to a recursive server on the outside (or in the DMZ) and act as 
a proxy for internal hosts. A recursive server needs to be able to communicate 
unhindered with the world so it can follow the delegation chain from the root and 
TLD servers down to the authoritative host responsible for a given subdomain.

A recursive server does its lookups iteratively:
1) Get the root hints from file and pick a server for "." (one of the many). This dot is implied 
at the end of every domain name, i.e. www.example.com. <-- we 
simply never really type the last dot.
2) Ask the root server where to look for COM.
3) Ask the .COM servers where to look for EXAMPLE.
4) Ask the EXAMPLE.COM servers where to look for WWW.
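
The four steps above can be sketched as a toy walk in Python. Nothing here touches the network: the delegation table is a hand-written stand-in whose server names and address are the ones that appear in the drill output further down in this message, and the function names are hypothetical.

```python
# zone -> {name asked about: referral (or final answer) returned by that zone}
DELEGATIONS = {
    ".": {"com.": "a.gtld-servers.net."},
    "com.": {"example.com.": "a.iana-servers.net."},
    "example.com.": {"www.example.com.": "93.184.216.34"},
}


def zones_for(name):
    """www.example.com -> ['com.', 'example.com.', 'www.example.com.']"""
    labels = name.rstrip(".").split(".")
    return [".".join(labels[i:]) + "." for i in range(len(labels) - 1, -1, -1)]


def resolve_iteratively(name):
    """Walk from the root down, asking each zone about the next label."""
    steps = []
    zone = "."
    answer = None
    for target in zones_for(name):
        answer = DELEGATIONS[zone][target]
        steps.append((zone, target, answer))
        zone = target
    return answer, steps


answer, steps = resolve_iteratively("www.example.com")
for asked_zone, about, got in steps:
    print(f"asked {asked_zone!r} about {about!r}: {got}")
```

A forwarding server would skip this loop entirely and hand the whole question to an upstream server that runs it.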

A forwarding server simply forwards a (usually recursive) request to the next 
available upstream server, with some option to re-direct based on query (but 
that starts getting into multi views which is irrelevant here), and the 
recursive server simply sees the forwarding server as a client. It may be 
required based on firewall configuration (paranoid security specialists may not 
want to allow recursion from just any host on their network).

In regards to the OP and RBL lookups, it makes no difference whether there is a 
forwarding DNS in between the client (the spam blocking MTA) and the/a 
recursive DNS server, but in order for the RBL to work it will have to somehow 
get to a recursive DNS that can find and query the RBL, and that can be 
"proxied" by a forwarding server.

However, what will NOT work is asking an authoritative DNS server. Authoritative 
DNS servers strictly provide information for a given subdomain, and *SHOULD* 
not allow recursion (unless you want to participate in DNS 
reflection/amplification DDoS attacks, since authoritative servers must respond 
to queries from the world, i.e. any IP address that may ask).

A few simple drill/dig/nslookup queries would easily provide all the information 
necessary as to how the DNS pathway is set up.
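
To test the RBL path with dig or drill you need the query name in DNSBL form: the IPv4 octets reversed, followed by the list's zone. A small Python sketch; the function name is my own, zen.spamhaus.org is the list mentioned in this thread, and 127.0.0.2 is the conventional DNSBL test entry.

```python
import ipaddress


def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the name an MTA looks up to check an IPv4 address against a DNSBL."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


print(dnsbl_query_name("127.0.0.2"))
```

Feeding the resulting name to dig through the same resolver the MTA uses shows whether the lookup actually reaches the list.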

Here is what a drill -T for www.example.com looks 
like... notice the iterative recursion from com. all the way down to the host:

drill -T www.example.com
com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS g.gtld-servers.net.
com. 172800 IN NS f.gtld-servers.net.
com. 172800 IN NS e.gtld-servers.net.
com. 172800 IN NS d.gtld-servers.net.
com. 172800 IN NS m.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
com. 172800 IN NS l.gtld-servers.net.
com. 172800 IN NS j.gtld-servers.net.
com. 172800 IN NS h.gtld-servers.net.
com. 172800 IN NS k.gtld-servers.net.
com. 172800 IN NS i.gtld-servers.net.
com. 172800 IN NS c.gtld-servers.net.
example.com. 172800 IN NS a.iana-servers.net.
example.com. 172800 IN NS b.iana-servers.net.
www.example.com. 86400 IN A 93.184.216.34
example.com. 86400 IN NS a.iana-servers.net.
example.com. 86400 IN NS b.iana-servers.net.


And here is the same query using dig on my SPAM firewall for a known IP listed 
on zen.spamhaus.org; again, notice the recursion 
starting at root (.).

dig 

Re: Spam by IP-address? Spamassassin with geoiplookup?

2016-09-20 Thread Shawn Bakhtiar

> On Sep 20, 2016, at 8:13 AM, RW <rwmailli...@googlemail.com> wrote:
> 
> On Tue, 20 Sep 2016 14:34:02 +
> Shawn Bakhtiar wrote:
> 
>> If you are strictly looking to block by IP addresses this is a far
>> better task left to the firewall, and configured by networks not
>> individual IP addresses. 
> 
> It shouldn't really be about blocking, it's about biasing the score. 
> 
> 

I humbly disagree.

I find it interesting that most ISPs will block incoming connections on ports like 
80 so home users can't run their own web servers, effectively forcing them to 
use providers for services "in the name of security", but when it comes to 
outgoing connections they take no measures whatsoever.

Mind you, I'm not talking about blocking HTTP or DNS. I simply block them on the 
SMTP gateway (kernel-level firewall); this reduces directed spear phishing by a 
lot when I catch it early enough. Of course, it usually means getting into the 
office at 5 AM and wading through the honeypot email address to see where the 
next attack is coming from. :P




Re: Spam by IP-address? Spamassassin with geoiplookup?

2016-09-20 Thread Shawn Bakhtiar
If you are strictly looking to block by IP addresses this is a far better task 
left to the firewall, and configured by networks not individual IP addresses. 

There are many ranges which should not be sending email directly (IE those 
allocated by providers to home users). Unfortunately finding all of them and 
keeping the list valid is a full time job. 

I believe this is the point behind RBLs, but they can be a bit slow picking up 
on directed phishing attacks. 

In those cases I look up the IP address at ARIN or RIPE, find the segment, and 
if it's anything other than a real ISP I block the network from my mail 
server. A kernel firewall is magnitudes faster than SA and can be your first 
line of defense, the same way I use RBLs at the MTA before the mail even gets 
to SA.
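
Blocking by network rather than by individual address can be sketched in 
Python with the standard ipaddress module. This is only an illustration: the 
CIDR ranges are reserved documentation networks, and the helper names 
collapse_blocklist and is_blocked are made up here. A real setup would load 
the collapsed list into the kernel firewall instead of checking in Python.

```python
import ipaddress


def collapse_blocklist(cidrs):
    """Merge adjacent or overlapping CIDR strings into the fewest networks."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    return list(ipaddress.collapse_addresses(networks))


def is_blocked(ip, networks):
    """Return True if the address falls inside any blocked network."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


# Hypothetical blocklist built from WHOIS lookups (documentation ranges only).
blocked = collapse_blocklist(
    ["192.0.2.0/25", "192.0.2.128/25", "198.51.100.0/24"]
)
```

The two adjacent /25 entries collapse into a single /24, which keeps the 
firewall rule count down when a whole segment belongs to the same provider.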

I also agree, there is plenty of blame to go around for all countries. This is 
not a region specific issue (tho some tend to be more nefarious than others).


> On Sep 20, 2016, at 6:43 AM, Byung-Hee HWANG (황병희, 黃炳熙)  
> wrote:
> 
> Dear Thomas,
> 
> Thomas Barth  께서 쓰시길,
> 《記事 全文 <5eddfcdb-957c-e7c0-b133-a40c7ab37...@txbweb.de> 에서》:
> 
>> Hello,
>> 
>> is it possible to use geoiplookup with Spamassassin? I want to reject
>> all mails as spam not send in my country or another second country,
>> but accept whitelisted mailing list addresses. Any chance to use
>> geoiplookup for this? I want to exclude Spammer Countries e.g. China,
>> Thaiwan, India, etc...
> 
> There are many people to contribute for FOSS projects all around the
> world. You would be reconsideration about blocking by countries.
> 
> Sincerely,
> 
> -- 
> ^고맙습니다 _地平天成_ 감사합니다_^))//



Re: spamassassin and caching nameservers

2016-08-22 Thread Shawn Bakhtiar
Not sure if this helps, but I use BIND DLZ with a MySQL back-end as a DNSBL of 
last resort. We get the IP addresses from honeypot emails, and it works pretty 
well. I have a daemon running in the background that uses a few intermediary 
tables with metrics like last seen, rate, total count, etc. to make the final 
zone table which Sendmail queries (or SA if you wish).

http://bind-dlz.sourceforge.net/mysql_driver.html

How you populate the backend table is up to you. I’m sure there are lists you 
can download to populate the data, mitigating the need to make the DNS query, 
but I don’t use this as the first line of defense, it is our last line of 
defense before we engage SA.
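
A rough sketch of the "intermediary table to zone table" step, using an 
in-memory SQLite database as a stand-in for MySQL. The table names, columns, 
and thresholds are all hypothetical and are not the schema the DLZ driver 
expects; the driver runs whatever queries you configure against your own 
tables.

```python
import sqlite3


def create_tables(conn):
    """Create a raw honeypot hit log and the zone table derived from it."""
    conn.execute("CREATE TABLE honeypot_hits (ip TEXT, seen_at INTEGER)")
    conn.execute(
        "CREATE TABLE zone (ip TEXT PRIMARY KEY, hits INTEGER, last_seen INTEGER)"
    )


def rebuild_zone(conn, now, min_hits=3, max_age_seconds=86400):
    """Rebuild the zone table from the hit log; thresholds are illustrative."""
    conn.execute("DELETE FROM zone")
    conn.execute(
        """
        INSERT INTO zone (ip, hits, last_seen)
        SELECT ip, COUNT(*), MAX(seen_at)
        FROM honeypot_hits
        GROUP BY ip
        HAVING COUNT(*) >= ? AND MAX(seen_at) >= ?
        """,
        (min_hits, now - max_age_seconds),
    )
    conn.commit()
    return [row[0] for row in conn.execute("SELECT ip FROM zone ORDER BY ip")]
```

A background daemon could call rebuild_zone on a timer, so addresses that go 
quiet fall out of the zone on the next pass.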


On Aug 22, 2016, at 7:04 PM, Rob McEwen wrote:

On 8/22/2016 9:15 PM, Alex wrote:
Has anyone configured it as a local caching nameserver, and if so,
could you share your config?

Correct me if I'm wrong... but...

I'm almost positive that rbldnsd acts ONLY as an authoritative name server, and 
not ever as a caching name server. I don't think there is functionality to 
either fetch root hints or to do catch-all forwarding to an upstream DNS server 
for just any host names. Instead, it only serves up the zones that it is 
specifically told to serve at startup, using the physical source data files to 
which those zones point.

It was designed from the ground up only to serve as a dumbed down locally 
hosted DNS, only for serving DNSBLs where the data files are found locally. It 
makes up for the lack of more extensive DNS features with blazing speed and 
very low memory overhead.

--
Rob McEwen




Re: Matching infinite sets

2016-08-22 Thread Shawn Bakhtiar

On Aug 22, 2016, at 10:44 AM, Marc Perkel wrote:



On 08/22/16 09:06, Dianne Skoll wrote:
On Mon, 22 Aug 2016 09:03:38 -0700
Marc Perkel wrote:

The ones that are the same are of no interest. Only where it matches
one side and not the other.
But... but... that's exactly like Bayes if you throw out tokens whose
observed probability is not 0 or 1.

Also, in your list of tokens, they are all phrases ranging from 1 to 4 words,
and that's why you get good results.  Multiword Bayes is just as good,
and I know that from experience.



This is nothing like bayes. Bayes is creating a mental block. When I describe 
it to people who don't know bayes they immediately get it. If I describe it to 
people who know bayes - they confuse it. Bayes is a probability spectrum based 
on a frequency match on both sets. That's not even close to what I'm doing.


I think you've copied and pasted this same paragraph half a dozen times now, 
and the list has tried its best to accommodate your statement about "Bayes is 
creating a mental block", asking you pertinent questions that either remained 
unanswered or, when answered, produced conflicting statements, and when 
pressed you ended up showing that what you are doing is (at best) a slightly 
modified version.

However, I find the statement "When I describe it to people who don't know 
bayes they immediately get it" the most telling of them all. Of course people 
who don't know the probability theory will look at what you are doing and go 
"Wow!!! This is great!!" BECAUSE THEY DON'T KNOW.

People who know, obviously, recognize it for what it is, and you can claim as 
much as you like it's NOT, but at the end of the day, if it looks like a rose, 
smells like a rose (no matter what you call it) 'tis still a rose!

All you have to do is READ the Process section of the following link to see 
exactly how similar your explanation is (save one factor which is using phrases 
vs. words), which has already been explained as a feature of SA using 
multi-word tokens:
https://en.wikipedia.org/wiki/Naive_Bayes_spam_filtering



Also - some of what I'm doing is all combinations, not just sequential. So it's 
like a system that writes and scores it's own rules. I just throw data at it 
and it classifies it.

The real magic is the feedback learning. So as it identifies ham it learns new 
words and phrases that then match email from other people. So it learns how 
normal people speak, it learns how spammers speak, and it identifies the 
DIFFERENCES between the two. And it's completely automated.


--
Marc Perkel - Sales/Support
supp...@junkemailfilter.com
http://www.junkemailfilter.com
Junk Email Filter dot com
415-992-3400




Re: Matching infinite sets

2016-08-22 Thread Shawn Bakhtiar

> On Aug 22, 2016, at 8:09 AM, John Hardin  wrote:
> 
> On Mon, 22 Aug 2016, Antony Stone wrote:
> 
>> On Monday 22 August 2016 at 16:45:09, Dianne Skoll wrote:
>> 
>>> On Mon, 22 Aug 2016 07:34:00 -0700 Marc Perkel wrote:
> So.  What percentage of emails using your algorithm are actually
> decidable?
 
 Almost 100% if you look at a wide variety of tokens from multiple
 attributes.
>>> 
>>> I can't believe that, or I'm missing something.  Almost every spam I see
>>> contains words that also appear in ham.  Things like "this" or "invoice"
>>> or "regards" or "dear".
>>> 
>>> What am I missing?
>> 
>> I believe you're missing Marc's definition of "token".
> 
> ...and it looks like we're venturing into the "SA Bayes multiple-word token 
> support" realm (as a surrogate).
> 

Even with the multiple tokens combined into one fingerprint, you've changed 
little. No matter how you bound the token, the assumption that there are no 
SPAM emails that contain HAM content, and vice versa, is false. 

Regardless, that is NOT what you claimed before; you seem to be flip-flopping 
between definitions to suit your argument.


> -- 
> John Hardin KA7OHZhttp://www.impsec.org/~jhardin/
> jhar...@impsec.orgFALaholic #11174 pgpk -a jhar...@impsec.org
> key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
> ---
>  USMC Rules of Gunfighting #6: If you can choose what to bring to a
>  gunfight, bring a long gun and a friend with a long gun.
> ---
> 2 days until the 1937th anniversary of the destruction of Pompeii



Re: I have some bad news

2016-08-17 Thread Shawn Bakhtiar

On Aug 17, 2016, at 3:43 AM, Matus UHLAR - fantomas wrote:

On 16.08.16 20:06, Marc Perkel wrote:
What I'm doing is looking for fingerprints in email that intersect HAM and not 
in SPAM - which would be a HAM result.
If it matches SPAM and does NOT match HAM - then it's SPAM.

The magic is in the NOT matching on the other side.

so, if mail matches both hammy and spammy tokens (or token sets), you don't
classify at all?


I guess what is confusing me (and I imagine others, as alluded to by Matus) is 
the fact that you are describing a special condition of Bayes' probability 
theorem. You are testing two variables (match SPAM and match HAM) (not matching 
is simply the negation of matching) thus giving you four conditions:

1) SPAM  && HAM
2) SPAM   && ~HAM
3) ~SPAM && HAM
4) ~SPAM && ~HAM

Here is a great diagram to show the four probable conditions:
https://en.wikipedia.org/wiki/Bayes%27_theorem#/media/File:Bayes%27_Theorem_2D.svg

So (if I am correct) Matus is asking what if condition 1 is true? How are you 
classifying an email then? That is the state of most emails, and thus why 
Naive Bayes spam filtering is used: it generates a probability based on 
Bayes' probability theorem and is the conventional methodology to date. A 
rose by any other name.

Condition 4 is obvious: it's nothing you have ever seen, so classifying it as 
anything other than HAM would be a huge mistake (IMHO), and it is fully 
covered by the aforementioned theorem, as the probability of SPAM would 
(should) be 0. Same with condition 3: it never hits SPAM, so whether it 
matches HAM or not you're going to treat it as HAM anyway, same as condition 4.

That leaves condition 2, which (if I'm not mistaken) is "... it matches SPAM 
and does NOT match HAM - then it's SPAM.". That brings us back to Matus' 
question: what if the email contains a single HAM token? Two HAM tokens? This 
is exactly what Bayes' probability theorem is designed for. All you are doing 
is defining a special condition in which the HAM probability is ZERO.
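
To make the "special condition" point concrete, here is a minimal naive Bayes 
sketch in Python. It is not SA's Bayes code: it assumes equal priors and 
independent tokens, the function names and counts are invented, and returning 
0.0 when nothing matches simply mirrors the view above that condition 4 
should be treated as HAM.

```python
from math import prod


def token_spam_probability(token, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate P(spam | token) from training counts; None if never seen."""
    spam_rate = spam_counts.get(token, 0) / n_spam
    ham_rate = ham_counts.get(token, 0) / n_ham
    if spam_rate + ham_rate == 0:
        return None  # never seen in either corpus, so no evidence
    return spam_rate / (spam_rate + ham_rate)


def combine(probabilities):
    """Combine per-token probabilities, assuming tokens are independent."""
    known = [p for p in probabilities if p is not None]
    if not known:
        return 0.0  # condition 4: nothing matched, treated as ham
    spam = prod(known)
    ham = prod(1 - p for p in known)
    if spam + ham == 0:
        return 0.5  # "certain" spam and "certain" ham tokens cancel out
    return spam / (spam + ham)


def classify(tokens, spam_counts, ham_counts, n_spam, n_ham):
    """Return the combined spam probability for an email's tokens."""
    return combine(
        token_spam_probability(t, spam_counts, ham_counts, n_spam, n_ham)
        for t in tokens
    )


# Hypothetical training counts out of 10 spam and 10 ham messages.
SPAM_COUNTS = {"click": 8, "link": 6, "lunch": 1}
HAM_COUNTS = {"lunch": 9, "link": 2, "regards": 5}
```

With these counts an email containing only "click" (condition 2) comes out as 
certain spam and one containing only "regards" (condition 3) as certain ham, 
while a mix such as "link" plus "lunch" (condition 1) lands in between. The 
all-or-nothing cases are just the ends of the same probability scale.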

I think that's where I need to understand a bit more about what HAM means in 
this solution: does getting a hit on HAM somehow negate it being SPAM 
completely? In other words, if the email contains some set of tokens that are 
SPAM, yet only one HAM token, does that single HAM token make it not SPAM? If 
so, you have a long way to go in convincing me that this is a good solution.

So if I say to you, "Let's get some lunch" that's ham because spammers never 
say that, but normal people do. So the way to test what "spammers never say" is 
to store what they do say and see if it's NOT in the list. (Thus the infinite 
set)


Actually I get SPAM with that very set of tokens in it. If somehow the HAM 
rating of it overrides the SPAM, I don't believe it would have a desirable 
effect.

I get plenty of:

"
Hay Shawn,

Hope you have time to do some lunch, click on this link and check out my new 
pictures!

Wannabe Phisher
"

Based on your example there are plenty of HAM and SPAM tokens in there. "Click 
on this link" has a high probability of SPAM-e-ness; would it get HAMed based 
on "hope you have time to do lunch"? Or am I missing something?


Similarly, there's only so many ways to misspell viagra, and good email 
wouldn't have it spelled wrong.

Does that make sense?


Again, what you are saying makes sense in that it is a special condition of 
the probability theory. What does not make sense is why you would not simply 
use the probability theory that already encompasses that condition.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; 
http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Linux - It's now safe to turn on your computer.
Linux - Teraz mozete pocitac bez obav zapnut.



Re: I have some bad news

2016-08-16 Thread Shawn Bakhtiar
Marc,

Let me first say I am truly sorry to hear about your cancer. I lost my father 
to cancer just over a decade ago, after a long battle with sarcoma of the 
throat and tongue. So I pray and wish you the best.

I sent this to you in January 2016 (don't recall if I ever got a reply to it) 
but based on your document:

Set theory is not my strongest suit,  but your diagram looks incorrect:
http://www.junkemailfilter.com/patent/patent5.pdf

Let:

H be ham
S be spam
E be an email

Then you state that:
HE = (H ∩ E)
SE = (S ∩ E)

But then the next diagram shows that there is some solution in which 
(HE ∩ SE) exists, and thus there may be some set which is (HE \ SE), even 
though in the first diagram S and H do not intersect.

This is not logical. Either (H ∩ S) exists, in which case there are tokens 
common to the ham and spam token sets, or it does not, so which is it? In 
other words, if a token is both ham and spam how are you calculating its 
weight? Is it spam or ham?

Clearly it’s the latter (they do not intersect) as described in this:
http://www.junkemailfilter.com/patent/patent2.pdf

In which case you are simply looking to see if (H ∩ E) > (S ∩ E), which has 
nothing to do with what is not in the set, and there is indeed no (H ∩ S) or 
the negation or NOT which is (H \ S), so as everyone has been trying to 
explain, it has NOTHING to do with what is NOT matched.
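
The same argument in Python sets, reading HE and SE as the tokens an email 
shares with the ham set and the spam set. The token sets are invented for 
illustration and deliberately disjoint, as in the first diagram; this is not 
taken from the patent documents.

```python
def overlaps(ham_tokens, spam_tokens, email_tokens):
    """Return the email's overlap with each token set (HE and SE)."""
    he = ham_tokens & email_tokens
    se = spam_tokens & email_tokens
    return he, se


# Hypothetical token sets with no token in common.
H = {"lunch", "invoice", "regards"}
S = {"viagra", "click here", "winner"}
E = {"lunch", "click here", "regards"}

HE, SE = overlaps(H, S, E)

# If H and S share nothing, HE and SE cannot share anything either,
# so "in HE but not in SE" is simply HE.
no_common_tokens = (HE & SE) == set()
difference_is_he = (HE - SE) == HE

# What is left is a comparison of overlap sizes.
leans_ham = len(HE) > len(SE)
```

With disjoint H and S, the "NOT in the other set" step removes nothing, and 
the decision reduces to comparing how much of the email falls in each set.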

By the way, you can’t match an infinite set (well, theoretically but not 
actually).
https://en.wikipedia.org/wiki/Intersection_(set_theory)

Since the current Bayes learns both SPAM and HAM I imagine that it does a very 
similar thing, other than perhaps the larger multi word token sets, which seems 
a trivial thing to add, and available in other tool sets.


I'll only add this: if you believe that your SPAM has been greatly reduced, 
that's awesome! But have you really isolated it to this "new technique", or in 
playing around have you inadvertently changed something else that may have 
changed your results?

I am also not saying that you have not developed some "new technique", but that 
if you have, your description of it does not line up logically with the 
technique itself. Back in January you were looking to patent it; today you 
simply want it to live on. I suggest that if it is indeed the latter, then 
perhaps it's time to release the source code/scripts and let a few more eyes 
look at the logic to see exactly what it is doing that you believe is so 
different from what is out there.

Again, I pray and hope the best for you,
Shawn




On Aug 16, 2016, at 6:45 AM, Marc Perkel wrote:

Thanks for the encouragement Ted. Unfortunately I know way too much about 
mathematics and I have a deep understanding of probability spectrums. There's a 
curve and I'm going to be somewhere on it. If I'm lucky I might be here for 
some time. But my life is a casino right now. And yes - there is also a 
probability spectrum for any of us getting hit by a bus tomorrow as well. 
SpamAssassin is based on statistical probabilities.

I have to have a dual track strategy. One one hand I need to do what I can to 
move the curve into the future. But at the same time I need to accomplish thing 
that are important within a limited time slot as well.

Spam filtering isn't just another job to me. I actually have a passion for it. 
On a philosophical basis I look at the internet as the new nervous system for 
humanity and is now core to who we are as a species. And email is a very key 
technology in that nervous system.

In that context spam is like poison where predators suck some of the life out 
of humanity, and my real life has always been about the progress of the human 
race.

I am somewhat of a spam fighting savant. I actually run very little of my email 
through SpamAssassin, truth be told. Over the years I've thrown some ideas into 
the mix and sometimes they have been adopted to make SA better. Sometimes I 
just get shouted down by trolls and the ideas go no where.

At this point however there's a deadline and I have ideas that could be 
implemented in SA very very easily. In fact it was through SA that I discovered 
Redis, and SA already talks to redis.

Although my innovation is excellent as a programmer I'm mediocre. Never worked 
as a team. Easily frustrated. Probably somewhat autistic and somewhat arrogant. 
So mostly living in my own world doing my own development. I have my little 
online empire. I work from home. I make a great living. And I really like (most 
of) my customers and enjoy doing tech support. And it's allowed me a lot of 
free time to do things that I'm really interested in.

But my ideas are now my immortality, so I'm now releasing this to the world. 
And mostly this simple AI method that SA could easily implement.

This new spam filtering trick is not only extremely effective, it's extremely 
simple. I had it working in 2 days. The developers here could probably 
implement it in 1 day. 

Re: Using Postfix and Postgrey - not scanning after hold

2016-07-29 Thread Shawn Bakhtiar

> On Jul 29, 2016, at 10:42 AM, Reindl Harald <h.rei...@thelounge.net> wrote:
> 
> 
> Am 29.07.2016 um 19:26 schrieb Shawn Bakhtiar:
>> 
>>> On Jul 29, 2016, at 10:12 AM, @lbutlr <krem...@kreme.com> wrote:
>>> 
>>> On 29 Jul 2016, at 09:20, sha...@shanew.net wrote:
>>>> I would generalize that even more to say that greylisting should come
>>>> before any other content-based filtering (virus scanners, defanging,
>>>> etc.).
>>> 
>>> Greylisting is a great idea, in theory. In practice there are so many large 
>>> emailers who can’t do email properly that is causes more trouble than it 
>>> prevents.
>>> 
>> 
>> I second that. I've tried gray listing a couple of times, and all I got from 
>> my users (and as the logs end up showing) is that some emails systems do not 
>> re-attempt delivery in an adequate enough time to be relevant to our 
>> business processes. When purchasing is waiting for confirmation of a hot 
>> rush delivery from a new vendor, gray listing can more than cause a few 
>> calls to the IT department.
> 
> that's why you don't do *unconditional* greylisting

There is no way for me to know when a new vendor is set up and what their 
email system is set up to do. We purchase raw materials from all over the 
world, especially when the lab is trying out new formulations. Not sure how I 
could create a system that would conditionally deal with that. We purchase 
from mom and pop shops all the way up to multinational conglomerates. 

Invariably you will run into problems with gray listing; that has simply been 
my experience. To avoid those problems, you are now suggesting I set up 
even more systemic processes (conditional gray listing) to deal with it. How 
many more percentage points of accuracy do I get for all that? Well, not enough 
to make it worth the effort. 

SA blocks A LOT. RBLs help A LOT. Gray listing, well, just a little bit 
better, but with A LOT of headache. Not worth it.

> 
>> I also have to agree with John Hardin. At what point does the advice not 
>> become worth the slap in the face it comes with.
>> 
>> As much as your advice has its merits Harald, it is also, very narrow sighted
> 
> see next part of response too.
> 
> "narrow sighted" is really nonsense when one played around with all sorts of 
> configurations and orderings for months until making decisions how to setup 
> the systems
> 
>> The reality is most of us (the other 99%) are not dedicated mail admins
> 
> and hence that ones should listen was dedicated sysadmins spent thousands of 
> hours in rock stable system are explaining
> 
>> I for one am a software engineer
> 
> well, in fact i once was hired as software engineer before it took over the 
> CTO and sysadmin *additionally* and so i am not a *dedicated* mail admin 
> since there is database, voip, http and other services besides my development 
> job, but that don't change the fact that i spent counted 1200 hours alsone in 
> 2014 for the inbound MX where SA is only a small part of it after being 
> mailadmin over nearly 10 years anyways
> 
> so i pretend taht i know what i am talking about without being only mailadmin 
> and nothing else
> 



Re: Using Postfix and Postgrey - not scanning after hold

2016-07-29 Thread Shawn Bakhtiar

> On Jul 29, 2016, at 10:12 AM, @lbutlr  wrote:
> 
> On 29 Jul 2016, at 09:20, sha...@shanew.net wrote:
>> I would generalize that even more to say that greylisting should come
>> before any other content-based filtering (virus scanners, defanging,
>> etc.).
> 
> Greylisting is a great idea, in theory. In practice there are so many large 
> emailers who can’t do email properly that is causes more trouble than it 
> prevents.
> 
> 

I second that. I've tried gray listing a couple of times, and all I got from my 
users (and as the logs end up showing) is that some email systems do not 
re-attempt delivery in an adequate enough time to be relevant to our business 
processes. When purchasing is waiting for confirmation of a hot rush delivery 
from a new vendor, gray listing can cause more than a few calls to the IT 
department. 

I also have to agree with John Hardin. At what point does the advice not become 
worth the slap in the face it comes with?

As much as your advice has its merits Harald, it is also very narrow sighted. 
You often make assumptions about implementation, and when you find an 
implementation (or a person's understanding of it) contrary to your standards, 
you chalk it up to ignorance, and make sure the list knows what you think of 
that person. 

The reality is most of us (the other 99%) are not dedicated mail admins. I for 
one am a software engineer, who happens to also have to do network engineering, 
systems engineering, and wear half a dozen other hats. This is why I am on this 
list, so I can be kept abreast of what other people are dealing with and in 
the process learn something. 


> 
> 



Re: whitelist issues with sprintpcs.com

2016-07-05 Thread Shawn Bakhtiar
One possibility I don't see mentioned is to simply accept this at the MTA level.

I've often had to do this when a sending domain is misconfigured but is part of 
our legitimate senders. It obviously opens up doors you'll have to monitor in 
other ways, but in Sendmail it is as simple as adding the domains to the 
access db.

Then use something like the following to set a really low score on those emails:

https://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Plugin_AccessDB.html



On Jul 3, 2016, at 10:43 AM, Alex wrote:

Hi,

I'm trying to whitelist mail from sprintpcs.com in the best way
possible, but it's ignoring attempts at even using whitelist_from and
I don't know why. Perhaps it's something with the way the mail is
formatted? No SPF or DKIM available to be used.

These messages are being quarantined because people are using sending
photos in a quick text message without any subject or body content.

I've put up an example here and hoped someone could take a look.

http://pastebin.com/1vapSDdF

This appears to be the only available headers:

Received: from lxnsmsomta04.localdomain (smtp4a.mo.sprintpcs.com [66.1.208.13])
   by mail01.example.com (Postfix) with ESMTP id 7FF846800CC30
   for ; Sat, 25 Jun 2016 21:21:21 -0400 (EDT)
Received: from musreb31.nmcc.sprintspectrum.com (unknown [10.25.157.71])
   by lxnsmsomta04.localdomain (Postfix) with ESMTP id 64B18608C
   for ; Sat, 25 Jun 2016 20:19:20 -0500 (CDT)

The envelope-from looks okay, but the "From" is not formatted properly.

X-Envelope-From: <15556142...@pm.sprint.com>
From: 5556142...@pm.sprint.com

Thanks for any ideas.
Alex



Re: Which DNSBLs do you use?

2016-06-17 Thread Shawn Bakhtiar

> On Jun 17, 2016, at 7:25 AM, Vincent Fox  wrote:
> 
> Greylisting imo helps a lot with RBL lag.

It can, but it's definitely a double-edged sword. Depending on the way the 
remote MTA works, I've experienced emails being delayed for quite some time. I 
had a lot of users requesting to be removed from the graylist, and eventually 
decided to drop it. When you're waiting for the confirmation of a PO from a new 
vendor on raw materials you need for a batch being made tomorrow it can be very 
frustrating :)

With an outright rejection the MTA will let the remote client know the email 
was rejected, or the local client can go into the SPAM folder and find the 
email. With graylists, neither the sender nor the receiver may realize the 
status of the email.

> 
> Delay suspect IP long enough that by the time they retry, if they do,  they 
> are on half a dozen RBL and score high and reject.
> 
> Sent from my iPhone
> 
>> On Jun 17, 2016, at 13:23, Reindl Harald  wrote:
>> 
>> 
>> 
>> Am 17.06.2016 um 02:57 schrieb Alex:
 For example, 212.227.126.135, scores 4 out of a 100 on senderscore. It
 also currently hits just sorbs. The individual score for each would
 have to be so low, even with such a poor reputation, that it hardly
 makes it worthwhile. I can't reject just on the almost worst
 reputation as you can have or just on sorbs, and the combination of
 the two isn't significant enough either.
>> 
>> and hence you score several DNSBL *and* DNSWL and make decisions on the 
>> final score
>> 
>>> I also meant to point out that with a reputation like 4 out of a 100,
>>> you'd think it would be listed on more RBLs than just sorbs. Something
>>> is wrong there. A mail server doesn't receive an absolutely horrible
>>> reputation without being blacklisted elsewhere.
>> 
>> bla - it takes time until a IP makes it to different RBL's and hence use 
>> many of them with moderate scoring so that you can make useful decisions and 
>> liekly have new offenders on enugh most of the time
>> 
>>> Senderscore is not trustworthy
>> 
>> NO RBL alone is trustworthy, hence you score them in the MTA as well as in 
>> the contentfilter
>> 



Re: Which DNSBLs do you use?

2016-06-16 Thread Shawn Bakhtiar

> On Jun 16, 2016, at 7:54 AM, Merijn van den Kroonenberg  
> wrote:
> 
>> Agreed.
>> 
>> We use sendmail, and check our DNSBL's their, it is much more efficient to
>> use them before we ever engage SA. It is extremely rare to find an IP that
>> lands on a reputable DNSBL and in those cases we can whitelist. Of course
>> most of our traffic is B2B, not sure how effective this would be in B2C or
>> C2C.
> 
> What do you use in sendmail to check the blacklists?
> 
> And do you use scoring or just direct block when on a BL?
> 
> 
> 

I simply reject when an IP address is on a BL, no questions asked. I also 
reject if the host fails its reverse lookup. In cases where a vendor or 
customer has a misconfigured email server, we can whitelist and notify them. 
I've actually helped several of our customers who were having issues with their 
clients resolve bad configurations. 
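
For anyone wondering what those two checks amount to, here is a small Python 
sketch. It is not how Sendmail implements them; the function names are made 
up, any DNSBL answer is treated as a listing (real lists use the returned 
127.0.0.x value to signal list type or errors, which this ignores), and the 
resolver functions are passed in so the logic can be tried without touching 
the network.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the name to look up: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Treat any successful answer as a listing; a failed lookup means clean."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except OSError:
        return False
    return True


def reverse_lookup_matches(ip, reverse=socket.gethostbyaddr,
                           forward=socket.gethostbyname_ex):
    """Check that the PTR hostname resolves back to the connecting address."""
    try:
        hostname, _aliases, _addresses = reverse(ip)
        _name, _aliases, addresses = forward(hostname)
    except OSError:
        return False
    return ip in addresses


def should_reject(ip, zones, listed=is_listed, rdns_ok=reverse_lookup_matches):
    """Reject if any DNSBL lists the address or its reverse lookup fails."""
    return any(listed(ip, zone) for zone in zones) or not rdns_ok(ip)
```

Most DNSBLs list 127.0.0.2 as a permanent test entry, so is_listed("127.0.0.2", 
zone) is a handy way to confirm a zone is actually answering.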

The problem lies in that I have come across more than a few SPAM mail filtering 
services that don't have a correct configuration (i.e. things like the reverse 
lookup identifying a different host). A more nefarious case I've run across is 
a mail filtering service charging per outbound email, so clients use the 
service for inbound, but then use their own MTA to send (bypassing the 
ISPs) so they don't get charged.

Again, our servers only deal with B2B, not sure of the impact in B2C/C2C.

SA is process intensive; if you're looking to save CPU time, using BLs at the 
MTA process level is much faster (IMHO).

Re: Which DNSBLs do you use?

2016-06-16 Thread Shawn Bakhtiar

> On Jun 16, 2016, at 7:31 AM, Reindl Harald <h.rei...@thelounge.net> wrote:
> 
> 
> Am 16.06.2016 um 16:21 schrieb Shawn Bakhtiar:
>> Agreed.
>> 
>> We use sendmail, and check our DNSBL's their, it is much more efficient to 
>> use them before we ever engage SA. It is extremely rare to find an IP that 
>> lands on a reputable DNSBL and in those cases we can whitelist. Of course 
>> most of our traffic is B2B, not sure how effective this would be in B2C or 
>> C2C.
> 
> no difference - the majority of so blacklisted servers are infected enduser 
> machines which have no business to connect to any machine on port 25 and for 
> a well scored decision it don't matter anyways
> 

I disagree with "no difference". From a process perspective IMHO it's much faster 
to reject with postfix or sendmail than to engage a perl script (via pipe or 
tcp port no less) to check the email content before continuing to process. It 
adds a little bit more processing if they are not on the DNSBL, but saves a lot 
of processing if they are.

Which actually begs the OT question: Why is SA not written in C?

> also spammers don't care if you are business or not, easily to test with 
> spam-traps and how fast they are abused with all sort of junk
> 
>>> On Jun 16, 2016, at 7:16 AM, jaso...@mail-central.com wrote:
>>> 
>>> Fwiw, I've moved the DNSBL issue out of SA and put it 'in front' with 
>>> Postfix's postscreen.
> 
> postfix 'in front' has the job to complement and not replace blacklists in SA 
> since they still matter when some client don't reach the reject score but get 
> additional point in the content filtering
> 
>>> Instead of just *one* DNSBL, which is imo always  a risk, I use multiple 
>>> dnsbls, and weight them in scoring.
>>> 
>>> In my experience, it works fantastically well.
>>> 
>>> A great write up on the approach is here
>>> 
>>> http://rob0.nodns4.us/postscreen.html
>>> 
>>> OF course, that presumes Postfix.  You might me able to do the same with 
>>> other servers, or maybe don't have the option at all.
> 



Re: Which DNSBLs do you use?

2016-06-16 Thread Shawn Bakhtiar
Agreed.

We use sendmail, and check our DNSBLs there; it is much more efficient to use 
them before we ever engage SA. It is extremely rare to find a legitimate IP 
that lands on a reputable DNSBL, and in those cases we can whitelist. Of course 
most of our traffic is B2B; not sure how effective this would be in B2C or C2C.

> On Jun 16, 2016, at 7:16 AM, jaso...@mail-central.com wrote:
> 
> Fwiw, I've moved the DNSBL issue out of SA and put it 'in front' with 
> Postfix's postscreen.
> 
> Instead of just *one* DNSBL, which is imo always  a risk, I use multiple 
> dnsbls, and weight them in scoring.
> 
> In my experience, it works fantastically well.
> 
> A great write up on the approach is here
> 
>  http://rob0.nodns4.us/postscreen.html
> 
> OF course, that presumes Postfix.  You might me able to do the same with 
> other servers, or maybe don't have the option at all.
> 
> Jason



Re: Which DNSBLs do you use?

2016-06-14 Thread Shawn Bakhtiar
zen.spamhaus.org
bl.spamcop.net
b.barracudacentral.org
dnsbl.inksystems.com <-- private internal one derived from honeypot email 
addresses we have.

I have disabled dnsbl.sorbs.net as they are too aggressive for our purposes; 
they block a lot of Gmail et al, which a lot of our customers and vendors use.



> On Jun 14, 2016, at 4:46 AM, Heinrich Boeder  
> wrote:
> 
> Hi Folks,
> 
> I have been on this list for quiet some time now and the topic "DNSBL" was 
> discussed pretty often, but I was still wondering which DNSBLs you guys use 
> for your mail environment.
> 
> So here are my questions: Which DNSBLs do you use? Which one can you suggest 
> the most?
> 
> Kind Regards,
> 
> - heinrich
> 
> heinr...@heinrichboeder.com -- www.heinrichboeder.com
> key: 0xC15DAD56 -- 363D 5BC3 9C45 9D09 3D78  1C28 DB68 F047 C15D AD56
> 



Re: URIBL/DNSBL from a database

2016-02-15 Thread Shawn Bakhtiar
I use to spend a lot of time blocking hosts and subnets, using IP tables, of 
malicious providers who would let any tom, dick, and Harry (no pun intended) to 
host spam hosts/relays on their servers. What I ended up doing is also blocking 
a lot SMB vendors from sending legitimate emails to users because most SMBs 
outsource their services without really comprehending the consequences of the 
provider they choose, this is especially true for low tech industries such as 
toll and process manufacturing companies, and frankly led to a management 
nightmare.

There are A LOT more people out there, far greater than just the Googles and 
Yahoos of the world, and to block IP addresses/subnets without an automated 
system using definable metric (that usually is enterprise specific), invariably 
IT will be inundated with complaints about users not receiving legitimate 
vendor emails.

It is much more effective to use existing RBLs and supplement them with your 
own honeypot RBL, using metrics developed in house that reflect what your 
organization considers the critical mass of spam it can take. That, along with 
the proper training of SA, is perhaps the best defense you can have. Metrics 
like last seen, total count, and frequency seem to work best for me. My private 
RBL (based on honeypot addresses) can react faster than the big guys on both 
ends of the equation (to block and to release). It's not that Google doesn't 
sometimes land on my RBL; it's that it also drops off fast once they remedy the 
issue and the timeouts are reached.
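
To make the metrics idea concrete, here is a minimal Python sketch of a listing 
decision based on last seen, total count, and frequency. Everything in it is an 
assumption for illustration: the `HoneypotStats` record, the threshold values, 
and the two-day timeout are hypothetical and are not taken from the setup 
described above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class HoneypotStats:
    total_count: int      # messages this IP has sent to honeypot addresses
    first_seen: datetime  # first honeypot hit from this IP
    last_seen: datetime   # most recent honeypot hit from this IP


# Hypothetical thresholds; each organization would tune these to its own tolerance.
MIN_COUNT = 3
MIN_PER_DAY = 1.0
LISTING_TIMEOUT = timedelta(days=2)


def is_listed(stats: HoneypotStats, now: datetime) -> bool:
    """Return True if the IP should currently appear in the private RBL."""
    # Release: once nothing has been seen for the timeout period, drop the listing.
    if now - stats.last_seen > LISTING_TIMEOUT:
        return False
    # Frequency in hits per day, using at least one day to avoid tiny divisors.
    days_active = max((stats.last_seen - stats.first_seen).total_seconds() / 86400, 1.0)
    frequency = stats.total_count / days_active
    return stats.total_count >= MIN_COUNT and frequency >= MIN_PER_DAY
```

The timeout check is what lets a large provider fall off the list quickly once 
the abuse stops, while the count and frequency checks keep a single stray 
message from listing anyone.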



> On Feb 14, 2016, at 10:19 PM, Noel Butler  wrote:
> 
> On 15/02/2016 09:02, Reindl Harald wrote:
>> Am 14.02.2016 um 23:34 schrieb Noel Butler:
>>> On 14/02/2016 01:46, Alex wrote:
>>>> rejecting outright at the SMTP level for IPs reaching my honeypots
>>>> could be dangerous if not checked.
>>> how so? if your honey pots use specific non human used (ever) addresses,
>>> then there should never ever be a genuine mail destined for it.
>>> I dont care who the connector is, be it foobar.com or gmail.com if they
>>> relay it, they are listed, its where spamhaus and I always disagreed,
>>> because what they are doing is sending a clear message to spammers to
>>> simply "use gmail" to avoid being listed in spamhaus.
>>> You are never too big to be stuffed into a dnsbl, there are a number of
>>> well known bl's that have been around for over ten years that also take
>>> that approach.
>> you missed to say that you are the type RBL operator which lists whole
>> subnets (in not only personal RBL's) because you don't like specific
>> people on mailing-lists
> 
> 
> Ohh, so you wanna bring this up again in public do you, fine by me... lets 
> have some history though shall we Harry...
> 
> Most DNSBL's blacklist spam *and* abusive hosts, there is no question about 
> you spamming, I know you don't and would never do that, but you are/were a 
> very very aggressively abusive person - this is supported by all those 
> mailing lists bannings/moderations you've copped over recent years which we 
> need both hands to count, the listing I placed on you was not just because of 
> the abuse and blackmailing you leveled at me, but number of complaints we 
> received also.
> 
> Furthermore, most people who've had interactions with you over the past 
> couple of years, especially those that you've disagreed with also know how 
> you used to act, and occasionally still come close to, because you think you 
> are always right and anyone who disagrees with you is the anti christ or 
> something.
> 
> Ordinarily this does just warrant a /32 listing, however as a system 
> administrator with access to at least a /24, and evidence of your mailing 
> list ghost accounts, including at least one I recall from another IP in that 
> /24 a while back, yes, I took the step to block your /24.
> 
> 
>> also you don't realize that this don't stop any single mail from a
>> list sent by that person but just harms other domains using the SMTP
>> server
> 
> I realise a lot more than you think, as I've told you, and told you, and told 
> you, its up to lists what DNSBL's if any they use, but you are known to, on 
> the lists youve been moderated on, send abusive messages to recipients 
> directly since you can't via the lists
> so it does have a catching effect of those who use it.
> 
>> so *you* are hardly in the position for education about RBL's since
>> you don't care about any collateral damage but only your ego
> 
> You are entitled to your opinion, I care about valid collateral damage, if 
> you abuse an employers resources and your employers customers are caught up 
> on it, your employer, if they care, would take appropriate action, it is no 
> different than blocking a domain for spamming, forcing the host to clean up 
> its act and get rid of its spamming clients, of course at no time did I wish 
> to see your employment 

Re: URIBL/DNSBL from a database

2016-02-12 Thread Shawn Bakhtiar

On Feb 12, 2016, at 5:39 AM, Alex wrote:

Hi,

For some time now I've been cycling URLs and IPs through  a mariadb
database gathered from incoming mail on a honeypot I've created.
Surprising how many are received ahead of spamhaus/barracuda.

I'm looking for ideas on how to now make this information available to
spamassassin on my production system. I'd like to somehow export the
IPs, any URLs in the body, and email addresses to spam assassin.

DNSBLs are very effective at this task, and I would recommend using one before 
you filter the email with SA, unless uncertainty about the data means you 
specifically want to score rather than reject.


Is it possible for spamassassin to query a database directly?

It is:
https://spamassassin.apache.org/full/3.4.x/doc/Mail_SpamAssassin_Plugin_URIDNSBL.html

But even then I find it more effective to have the server running the DNSBL 
manage the block list using metrics such as the number of times the IP address 
appears, and/or by not recording IP addresses found in a whitelist table, etc. 
Once the IP gets into the DNSBL (either via blacklist or metric) there is no 
need for me to worry about SA; simply reject. I find URIs tend to change A LOT, 
so IP-based blocking can be much more effective. But I think that's more of a 
preference.


I'm familiar with how to create a uridnsbl, but is DNS the best
approach here? The info needs to be updated and reloaded rapidly, and
not all the info (URLs, emails) are conducive to being in DNS.


That's the way I do it, using BIND DLZ: http://bind-dlz.sourceforge.net/
We have a delegated subdomain off our main domain that hosts a DNS server used 
exclusively for the block list, created from incoming mail sent to honeypot 
email addresses (ones that either never were or are no longer valid). Again, I 
tend to focus on the IP address, not the URI, as I find that URIs are a dime a 
dozen and change quite frequently.

Is anyone else doing this, and are you just rejecting the IPs at the
SMTP level outright?

We use sendmail features to reject long before it gets to SA. It works better 
(IMHO) since there is much lower overhead in sendmail doing a quick DNS lookup 
than in engaging the milter that runs the email through its passes with SA.

http://weldon.whipple.org/sendmail/dnsbl.html

But in this case it's IP based only not URI based. For URI (especially ones 
that you'll want to regex) SA may be more effective.


Thanks,
Alex



Re: OUTPUT OF SPAMASSASSIN

2016-01-24 Thread Shawn Bakhtiar

> On Jan 24, 2016, at 11:29 AM, Martin Gregorie  wrote:
> 
> On Mon, 2016-01-25 at 00:07 +0530, Sarang Shrivastava wrote:
>> I am just a newbie who has started using SA. Someone on the mailing 
>> list suggested me to use -D option. So if this option is for 
>> debugging then how do we classify it ?
>> 
> You don't classify it: that's SA's job. It only scores messages and
> sets the Yes/No flag before adding the X-Spam-* headers to the message.
> Nothing else. What you do with mail that SA has classified as spam is
> the responsibility of your additional software and/or your users.
> 
> Simplest case: configure SA to add [SPAM] as the first word in the
> Subject header and let the users decide what to do with this mail.
> 
> Next easiest: If your users' mail readers can detect spam and put it in
> a spam folder, enable that feature for them once you've configured SA
> to set whatever indicator the mail reader uses for spam identification.
> 

The best option I’ve found is to use sieve (filter) scripts (we have a default 
set we enable for all new users) that simply move emails tagged as spam into a 
special folder for the user called SPAM. This gives them access to any false 
positives that may occur. 

Most common MDAs/LDAs have sieve script integration either by default or as a 
plugin module. For example, Cyrus has the timsieved server built in, and Dovecot 
has Pigeonhole, etc.
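
The filing decision such a filter makes is simple enough to sketch. The Python 
below is only an illustration of the logic, not a sieve script and not our 
actual rule set: it assumes SA is adding its default `X-Spam-Flag: YES` header 
to spam, and the function name and the INBOX fallback are hypothetical.

```python
from email import message_from_string


def target_folder(raw_message: str) -> str:
    """Pick the delivery folder based on the header SpamAssassin adds to spam."""
    msg = message_from_string(raw_message)
    flag = msg.get("X-Spam-Flag", "")  # header is absent on mail SA did not flag
    return "SPAM" if flag.strip().upper() == "YES" else "INBOX"


tagged = "Subject: cheap pills\nX-Spam-Flag: YES\n\nbody\n"
clean = "Subject: lunch?\n\nbody\n"
print(target_folder(tagged), target_folder(clean))
```

Because the mail is moved rather than deleted, a false positive is still 
sitting in the user's SPAM folder where they can find it.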


> Beyond that: write a spam quarantine subsystem and install it in the
> mail flow where it can inspect messages that have been classified by
> SA and quarantine or delete them. Of course, you'll also need some way
> that your users can retrieve misclassified spam, and provide you with
> feedback so you can correct misclassifications  
> 
> 
> Martin
> 
> 



Re: My new method for blocking spam - REVEALED!

2016-01-20 Thread Shawn Bakhtiar
Sorry.. how is this different than Naive Bayes filtering??

"Naive Bayes classifiers work by correlating the use of tokens (typically 
words, or sometimes other things), with spam and non-spam e-mails and then 
using Bayes' theorem to calculate a probability that an email is or is not 
spam."
-- https://en.wikipedia.org/wiki/Naive_Bayes_spam_filtering

"The set of fingerprints of the test message is intersected with the spam and 
ham corpora, creating subsets of matches. Then you do a set diff both ways 
(ham - spam) (spam - ham) and whichever side is bigger wins. Generally it will 
match on only one side or very predominantly on one side." -- Marc Perkel

You are still looking up words/phrases in a dictionary set, and coming up with 
a probability factor of which side it falls on (an application of Bayes' 
theorem).

Or did I miss something?



On Jan 20, 2016, at 9:17 AM, Wrolf wrote:

Good luck with your patent application, it should be in the infinitely elastic 
queue right after my perpetual motion machine.

Not sure how you will deal with the number of ham tokens in spam messages. Also 
not sure how much ham will get canned as spam - but then, maybe people 
shouldn't be sending each other poetry?

haiku by email
blossoms in my inbox
drink morning coffee


;-)


Wrolf
wr...@wrolf.net

On Wed, Jan 20, 2016 at 11:52 AM, Marc Perkel wrote:
OK - following up on this. I have my provisional patent filed. I'm still doing 
development to improve it and working on a licensing contract. But the license 
will be based on the Creative Commons patent with some restrictions added. 
Basically I want to get a license fee from the big guys and my spam filtering 
competitors. So unless you are in the spam filtering business or have more than 
10,000 email addresses it's not going to cost you anything.

I'm going to describe the concept here. I'm not going to share my code because 
my code is specific to my system and is a combination of bash scripts, Redis, 
Pascal, PHP, and Exim rules. And the open source programmers are likely to 
implement it better than I have. Basically I'm trying not to put myself out of 
business and this new method is a bigger breakthrough than Bayesian filtering.

Maybe I should call it a new plan for spam?

So - I'm just going to introduce the concept right now about how it works. Once 
you know what I'm doing it should be easy to implement, I had it working in a 
couple of days and I'm not an outstanding programmer. One thing to keep in mind 
is that this is a paradigm shift. It's not about matching - it's about NOT 
matching. And although it is far better at catching spam, its best feature is 
actively identifying good email.

The secret sauce

Suppose I get an email with the subject line "Let's get some lunch". I know 
it's a good email because spammers never say "Let's get some lunch". In fact 
there are an infinite number of words and phrases that are used in good email 
that are never ever used in spam. And if I'm using words and phrases never used 
in spam that are used in ham - it's good email. And similarly - if I'm using 
words and phrases that are used in spam and never used in ham - it's spam.

So - how do I get a list of words and phrases never used in spam? I create a 
list of words and phrases that are used in spam and check to see if it's not on 
the list.

What I do is tokenize the spammiest parts of the email, like the subject line, 
into phrases of 1, 2, 3, and 4 words.

the quick brown fox jumps over the lazy dog - becomes

"the" "quick" "the quick" "brown" "quick brown" "the quick brown" "fox" "brown 
fox" "quick brown fox" "the quick brown fox" "jumps" "fox jumps" "brown fox 
jumps" "quick brown fox jumps" "over" "jumps over" "fox jumps over" "brown fox 
jumps over" "the" "over the" "jumps over the" "fox jumps over the" "lazy" "the 
lazy" "over the lazy" "jumps over the lazy" "dog" "lazy dog" "the lazy dog" 
"over the lazy dog"

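A minimal Python sketch that reproduces the token list above, assuming simple 
whitespace splitting (the original code is not shared, so the function name 
and the splitting rule are guesses at the described behavior):

```python
def phrase_tokens(text: str, max_words: int = 4) -> list[str]:
    """Return every phrase of 1..max_words consecutive words, grouped by the word it ends on."""
    words = text.split()
    tokens = []
    for end in range(1, len(words) + 1):
        for size in range(1, max_words + 1):
            start = end - size
            if start < 0:
                break  # not enough preceding words for a phrase this long
            tokens.append(" ".join(words[start:end]))
    return tokens


tokens = phrase_tokens("the quick brown fox jumps over the lazy dog")
print(len(tokens))
print(tokens[:6])
```

For the nine-word example this yields 30 tokens, starting with "the", "quick", 
"the quick", in the same order as the list above.
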
These tokens are learned as ham or spam and added to sets. I'm using Redis to 
do this because it has extremely fast set operations. I don't know of anything 
other than Redis that can do this. So think about Redis as the way to implement 
this.

A new message comes in. It is tokenized and fingerprinted and hundreds of 
fingerprints are generated. Then it's all set operations. The set of 
fingerprints of the test message is intersected with the spam and ham corpora, 
creating subsets of matches. Then you do a set diff both ways (ham - spam) 
(spam - ham) and whichever side is bigger wins. Generally it will match on only 
one side or very predominantly on one side.
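
The set operations described here can be sketched with Python's built-in sets 
standing in for Redis. This is an illustration of the description only, not the 
author's implementation: the function name, the tiny example corpora, and the 
"unsure" result on a tie are all assumptions.

```python
def classify(tokens: list[str], spam_corpus: set[str], ham_corpus: set[str]) -> str:
    """Compare a message's fingerprints against learned spam and ham token sets."""
    fingerprints = set(tokens)
    # Intersect the message with each corpus to get the subsets of matches.
    spam_matches = fingerprints & spam_corpus
    ham_matches = fingerprints & ham_corpus
    # Set difference both ways: keep only tokens seen on one side and never the other.
    spam_only = spam_matches - ham_matches
    ham_only = ham_matches - spam_matches
    if len(spam_only) > len(ham_only):
        return "spam"
    if len(ham_only) > len(spam_only):
        return "ham"
    return "unsure"  # tie handling is not described in the post; this is an assumption


# Hypothetical corpora, far smaller than anything learned from real mail.
spam_corpus = {"free", "free pills", "buy now"}
ham_corpus = {"free", "lunch", "get some lunch"}
print(classify(["free", "pills", "free pills"], spam_corpus, ham_corpus))
print(classify(["get", "some", "lunch", "get some lunch"], spam_corpus, ham_corpus))
```

Note that "free" appears in both corpora, so it cancels out in the differences 
and contributes to neither side.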

So I'm not just tokenizing the subject, but also the first 25 words of the 
message, the text of links in the message, the name part of the from address, 
the header names, the attachment names, the PHP script if there is one, and 
various behavior