RESENDING: Scripts were blocked for security reasons
On 5/13/2017 4:56 PM, Dave Jones wrote:
What's the next priority now that the rsync and httpd configs are active?
I will work on the build next using this:
https://svn.apache.org/repos/asf/spamassassin/trunk/build/README
PREFACE: I'm working on the build. If you would like to help, we need
to coordinate first.
Here is the promised list of items I've identified.
To me, the priority would be getting ruleqa/masscheck better documented
and back up and running.
If we can get that system running smoother with a shorter lag to
publishing rules, I'd like to help more with it.
DONE - Touch a file called MIRROR.CHECK in
/var/www/bbmass.spamassassin.org/updates on SA-VM1 and test if it is
synced to the Mirrors. NOTE: I sync every 10 mins
- Document on the wiki that MIRRORED.BY contains the sa update mirror
contact names.
- Get the various files for running the sa-update aka bbmass website
into SVN. This would NOT be the update files but likely everything else
including the httpd.conf, MIRRORED.BY, etc.
- Get the email to root from sa-vm1 to go to sysadmins@ without
moderation so we have cron logs, etc. archived.
- KAM to get the passwords for crashplan for SA into the sysadmins repo,
encrypted, so that multiple people have access.
- Get the sa-update-mirror-check script (attached) running on SA-VM1 and
emailing sysadmins@ without moderation
- Get Darxus' rule update check script (attached) running on SA-VM1 and
emailing sysadmins@ without moderation. See SA Dev list example: Rule
updates are too old - 2017-05-08
- Get Darxus' check script updated for 3.4.2 and 3.3.2.
- Perhaps update the sa-update-mirror-check to use the MIRROR.CHECK with
a timestamp to confirm it's within a reasonable period of time.
- Find out who wrote the sa-update-mirror-check (likely in the list
archives), check the licensing on the post and ask the author to release
it to the public domain or under the Apache license. Then add
attribution/license/copyright and add it to the sysadmins repo.
- Ask Darxus if we can put his script in the repo as well, with
attribution/license/copyright as above
- Ask Darxus to turn off his script that runs on his infrastructure
- Identify what we used to provide on the old servers. Some things KAM
believes we had that need to be verified and likely expanded on:
o Masscheck RSYNC for people to send us their Masscheck Logs
o An email system that people could send a message to and get back the
results of checking that message
o Masscheck Corpora RSYNC or perhaps SSH for people to send us their
corpora for us to run our own Masscheck server. NOTE: This is the most
sensitive data we would have I believe since it is other people's real
mail.
o For the above, I think I myself have this set up. I'd like to
identify where and extend it / improve it / make sure it's working, etc.
o Look at the rsync MOTD[1]
o Masscheck stuff:
https://wiki.apache.org/spamassassin/NightlyMassCheck - KAM sent notes a
few days ago about how he got this running on spamassassin-vm. If that
doesn't suffice, please let me know.
- Identify what jm was using talon1.pccc.com to provide so I can mimic
it. His cron jobs were disabled last January but I think they were
running items related to masscheck.
- Get the RuleQA Website running again.
- Identify what the incoming.spamassassin.org server did/does/can do for
us. NOTE: It might be the same as below.
- Talk to Grant Kellar with Sonic about the traps they have in place and
where they are sent to make sure we are utilizing them.
- Clean up and remove unnecessary backup data on sa-vm1 - NO NEED TO BE
HASTY ON THIS, I'M JUST WRITING A COMPLETE LIST.
- Identify how much data storage we actually need, in case Infra can
shrink the storage allocated for sa-vm1
- Talk to AXB about SOUGHT and SOUGHT2
- Update the documentation for InfraNotes2017 with another pass of
updates about machines, etc.
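
For the MIRROR.CHECK timestamp item above, here is a rough sketch of what
the freshness test might look like. It is only an illustration, not
something that exists today. It assumes MIRROR.CHECK would hold a single
Unix epoch timestamp (for example written by cron on SA-VM1 with
"date +%s"), and that each mirror serves the file directly under its
MIRRORED.BY URL. MAX_AGE, is_fresh and check_mirror are made-up names,
and the 30 minute threshold is just a guess based on the 10 minute sync.

```shell
#!/bin/sh
# Sketch only. Assumption: MIRROR.CHECK contains one Unix epoch timestamp.
MAX_AGE=1800   # hypothetical threshold, in seconds

# is_fresh STAMP NOW MAX_AGE
# Succeeds when STAMP is all digits and no more than MAX_AGE seconds
# older than NOW. Empty or non-numeric input is treated as stale.
is_fresh () {
    stamp="$1"; now="$2"; max_age="$3"
    case "$stamp" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ $(( now - stamp )) -le "$max_age" ]
}

# check_mirror URL
# Fetches URL/MIRROR.CHECK (assumed location) and reports OK or STALE.
check_mirror () {
    url="${1%/}"
    stamp=$(curl -m 60 -s -S "$url/MIRROR.CHECK" 2>/dev/null | head -n 1 | tr -d '[:space:]')
    if is_fresh "$stamp" "$(date +%s)" "$MAX_AGE"; then
        echo "OK $url"
    else
        echo "STALE $url (MIRROR.CHECK='$stamp')"
        return 1
    fi
}
```

Something like "check_mirror http://mirror.example.org/updates" (a
placeholder URL) could then be called from the loop in the existing
sa-update-mirror-check script, next to the TESTSTRING check.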
[1]
corpus
nightly mass-check result upload area. It is password protected.
If you would like a password, please send a request to
p...@spamassassin.apache.org and request a "nightly" username and password.
submit
Score generation mass-check result upload area. It is password
protected. If you would like a password, please send a request to
p...@spamassassin.apache.org and request a "score generation" username
and password. Generally these are only granted after a mass-check
announcement has been made on the spamassassin developer mailing list.
anoncorpus
mass-check result download area, available via anonymous access.
--
Kevin A. McGrail
Asst. Treasurer, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
#!/usr/bin/perl
# host -t txt 2.3.3.updates.spamassassin.org
use strict;
use warnings;
use Net::DNS;
use POSIX qw(strftime);
use LWP::Simple;
############################### Checking updates
my $updatelog = '/home/darxus/progs/sa/updatever.txt';
my @versions = ('3.3.0','3.3.1','3.3.2');
my $debug = 0;
my $founderror = 0;
my $report = '';
print "Checking DNS records.\n";
our $res = Net::DNS::Resolver->new;
my $answer = '';
my %vers = ();
my $currentdate = strftime "%F", localtime;
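for my $saver (@versions) {
    # Update records are published under the reversed version number,
    # e.g. 3.3.2 -> 2.3.3.updates.spamassassin.org
    my $dnsver = join('.',reverse(split(/\./,$saver)));
    $vers{$saver} = dns("$dnsver.updates.spamassassin.org.",'txt');
    print "Latest update for SA $saver: $vers{$saver}, $currentdate\n" if $debug;
}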
my %oldmatch = ();
if (-e $updatelog) {
    open IN, '<', $updatelog or die "Couldn't read $updatelog: $!";
    while (my $line = <IN>) {
        chomp $line;
        my ($date, $saver, $updatever) = split(' ', $line);
        # Skip malformed lines and versions we no longer check.
        next unless defined $updatever and defined $vers{$saver};
        print "date: $date, SA version: $saver, update version: $updatever\n" if $debug;
        # Remember the earliest date on which the current update was already seen.
        if ($vers{$saver} eq $updatever and !exists $oldmatch{$saver}) {
            $oldmatch{$saver} = $date;
        }
    }
    close IN;
} else {
    print "No record of previous versions, creating new.\n";
}
my $body = '';
if (scalar keys %oldmatch > 0) {
    for my $saver (sort keys %oldmatch) {
        print "SpamAssassin version $saver has not had a rule update since $oldmatch{$saver}.\n" if $debug;
        $body .= "SpamAssassin version $saver has not had a rule update since $oldmatch{$saver}.\n";
        $founderror = 1;
    }
}
open OUT, ">>$updatelog" or die "Couldn't write to $updatelog: $!";
for my $version (sort keys %vers) {
    print OUT "$currentdate $version $vers{$version}\n";
}
close OUT;
print "Done with DNS.\n";
############################### Checking thresholds
$body .= "\n" unless ($body eq '');
my $baseurl = 'http://ruleqa.spamassassin.org/?daterev=';
my $datestamp = `date -d'yesterday' +%Y%m%d`;
chomp $datestamp;
my $dow = `date -d'yesterday' +%A`;
chomp $dow;
#print "datestamp: $datestamp, $dow\n";
my @datestamps;
push @datestamps, $datestamp;
#if ($dow ne 'Saturday') {
# $datestamp = `date +%Y%m%d -d"last saturday"`;
# chomp $datestamp;
# push @datestamps, $datestamp;
#}
our $url = '';
for my $datestamp (@datestamps) {
    $url = $baseurl . $datestamp;
    print "Retrieving url: $url\n";
    # get() returns undef on failure; use an empty page so the
    # 'not_found' check below reports the problem.
    my $content = get($url);
    $content = '' unless defined $content;
    print "Retrieved.\n";
    my $hamcount = 'not_found';
    my $spamcount = 'not_found';
    for my $line (split "\n", $content) {
        if ($line =~ m#\(all messages\)#) {
            ($spamcount,$hamcount) = (split(' ', $line))[1,2];
        }
    }
    print "spamcount / hamcount: $spamcount / $hamcount\n";
    if ($spamcount eq 'not_found' or $hamcount eq 'not_found') {
        $founderror = 1;
        $body .= "$datestamp: Could not find the ham / spam counts, probably an http error: $url\n";
    } elsif ($spamcount < 150000 or $hamcount < 150000) {
        $founderror = 1;
        $body .= "$datestamp: Spam or ham is below threshold of 150,000: $url\n";
    } else {
        $body .= "$datestamp: Spam and ham are above threshold of 150,000: $url\n";
    }
    $body .= "$datestamp: Spam: $spamcount, Ham: $hamcount\n";
}
############################### Reporting
print $body;
if ($founderror) {
    $body .= "\n\nThe spam and ham counts on which this script alerts are from\n$url\n"
        . "Click \"(source details)\" (it's tiny and low contrast).\n"
        . "It's from the second and third columns of the line that ends with\n"
        . "\"(all messages)\"\n\n"
        . "The source to this script is\nhttp://www.chaosreigns.com/sa/update-version-mon.pl\n\n"
        . "It looks like both the weekly and nightly masschecks need to have sufficient\n"
        . "corpora in order for an update to be generated.\n";
    #open OUT, "| /usr/bin/mail -s 'Rule updates are too old - $currentdate' darxus\@chaosreigns.com,dev\@spamassassin.apache.org,ruleqa\@spamassassin.apache.org";
    open OUT, "| /usr/bin/mail -s 'Rule updates are too old - $currentdate' darxus\@chaosreigns.com,dev\@spamassassin.apache.org"
        or die "Couldn't run /usr/bin/mail: $!";
    # open OUT, "| /usr/bin/mail -s 'Rule updates are too old - $currentdate' darxus\@chaosreigns.com";
    print OUT $body;
    # print OUT "\nLog of update versions:\n";
    # open IN, "<$updatelog";
    # while (my $line = <IN>) {
    #     print OUT $line;
    # }
    # close IN;
    close OUT;
} else {
    print "No problems.\n";
}
exit; ######################################################################
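# Return the first TXT value found for $name, or undef if the lookup fails.
sub dns {
    my ($name, $type) = @_;
    my $answer = $res->query($name, $type);
    # query() returns undef when the lookup fails or finds no records.
    return unless $answer;
    my @values = ();
    for my $rr ($answer->answer) {
        # The answer section can also hold CNAME records; only TXT has txtdata.
        next unless $rr->type eq 'TXT';
        push @values, $rr->txtdata;
    }
    return $values[0];
}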
#!/bin/bash
# script for cron job to monitor update mirrors
# Assumes the page in the MIRRORED.BY URLs contains a specified string
# Tries twice with a specified number of seconds time out
# The first time both tries fail, outputs a message about the failure
# Subsequent failures are silent
# First success after failures outputs a message that site is up again
# Settings go here
# MIRRORS    The URL for the mirror list
# TESTSTRING string to check for to see if a valid page was fetched
# LOGS       base name for temporary file that holds result of fetching page
# ERRORS     base name for error log file, extensions .1, .2, ... appended
# TIMEOUT    taking longer than this many seconds to fetch the page is treated as an error
HOME="/root"
MIRRORS="http://spamassassin.apache.org/updates/MIRRORED.BY"
TESTSTRING="Apache SpamAssassin Project updates"
LOGS=$HOME/testupdates/log
ERRORS=$HOME/testupdates/error
TIMEOUT=60
function processurl () {
    LOGFILE="$LOGS.$1"
    ERRFILE="$ERRORS.$1"
    URL="$2"
    /usr/bin/curl -m $TIMEOUT -s -S "$URL" > "$LOGFILE" 2>&1
    if /bin/grep -q "$TESTSTRING" "$LOGFILE"
    then
        # Mirror is up. If it was down before, announce recovery and clear the marker.
        if [ -e "$ERRFILE" ] ; then
            #echo "$URL is up again"
            EMAIL="Kevin A. McGrail \<kmcgr...@pccc.com\>" /usr/bin/mutt priv...@spamassassin.apache.org -s "SA-Update Mirror Check: $URL is up again" < /dev/null
            rm "$ERRFILE"
        fi
    else
        # Only act on the first failure; later failures stay silent.
        if [ ! -e "$ERRFILE" ] ; then
            # Second try before reporting the mirror as down.
            /usr/bin/curl -m $TIMEOUT -s -S "${URL}" > "$LOGFILE" 2>&1
            if /bin/grep -q "$TESTSTRING" "$LOGFILE"
            then
                return
            else
                #echo "$URL is down. See $ERRFILE for error message"
                mv -f "$LOGFILE" "$ERRFILE"
                EMAIL="Kevin A. McGrail \<kmcgr...@pccc.com\>" /usr/bin/mutt priv...@spamassassin.apache.org -s "SA-Update Mirror Check: $URL is down" -a "$ERRFILE" < /dev/null
            fi
        fi
    fi
}
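i=1
# Drop comment lines from MIRRORED.BY and keep only the URL at the start of each line.
/usr/bin/curl -s "$MIRRORS" | sed -e '/^\s*#/d' -e 's/^\s*\([a-zA-Z0-9_/:.-]*\).*/\1/' | while read url
do
    # Skip blank lines so processurl is never called without a URL.
    [ -n "$url" ] || continue
    processurl "$i" "$url"
    (( i++ ))
done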