On Sat, 22 Sep 2018, Dave Jones wrote:
On 9/20/18 2:50 PM, Fossies Administrator wrote:
Hi,
Some weeks ago I happened to look at the web server access log of the
SpamAssassin rules update mirror sa-update.fossies.org and was surprised
to find that at noon (midday) the log file was already much larger than
the roughly expected half of a complete daily log.
Out of curiosity I plotted the number of GET requests for update files
(tarballs) per hour and saw an interesting pattern with a large peak
between 6 and 7 a.m. (GMT+2). The main reason is probably the publication
time (mostly between 5 and 6 a.m. GMT+2) plus the delay until the users'
sa-update scripts run. The shape of the curves, with some curious (?)
minima, is still a little surprising to me, but it is constant and
reproducible.
A simple example text plot for a single day is attached (more accurate
plots are available under the URL given below).
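For reference, hourly counts like these could be produced along the
following lines. This is only a minimal sketch, not the script actually
used: it assumes the Apache combined log format and that update tarballs
end in ".tar.gz"; the function name requests_per_hour is made up for
illustration.

```shell
# Count GET requests for update tarballs per hour of day.
# Reads an access log on stdin (assumed: Apache combined log format,
# where $4 looks like [22/Sep/2018:06:15:02, $6 is "GET and $7 the path).
requests_per_hour() {
    awk '$6 == "\"GET" && $7 ~ /\.tar\.gz$/ {
             split($4, t, ":")     # t[2] is the two-digit hour
             count[t[2]]++
         }
         END { for (h in count) print h, count[h] }' | sort
}
```

It prints one "hour count" pair per line, which can be fed to any
plotting tool.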
More interesting and "irritating" was that during the main update period I
often found many entries (at least 100-1000) with the HTTP status 404
("Not Found"). That motivated me to write a primitive script to analyze the
cause by monitoring the update status and update times of the newly
published rules update files.
First I checked the local web log files, assuming that a 404 response to a
request for an update file means that an external client already knew
about a new file that the local mirror sa-update.fossies.org did not yet
have available, i.e. had not yet fetched (via rsync).
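A check of this kind might look as follows. It is a sketch under the same
assumptions as before (Apache combined log format, tarball names containing
".tar.gz"), not the actual analysis script; missing_update_files is a
hypothetical name.

```shell
# List update files that were requested but not found (HTTP 404),
# most frequently missed first. Reads an access log on stdin.
# Assumed combined log format: $7 = request path, $9 = status code.
missing_update_files() {
    awk '$9 == 404 && $7 ~ /\.tar\.gz/ { n[$7]++ }
         END { for (f in n) print n[f], f }' | sort -rn
}
```

A file that shows up here with a high count shortly after publication is
one that clients already knew about before the mirror had fetched it.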
In addition I queried the local DNS server (the hosting provider's) and the
DNS servers I found to be responsible for the domain spamassassin.org:
ns2.pccc.com.
ns2.ena.com.
c.auth-ns.sonic.net.
b.auth-ns.sonic.net.
a.auth-ns.sonic.net.
via the command
dig @<server> 3.3.3.updates.spamassassin.org txt +short
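Polling all five servers with that command can be wrapped in a small loop.
The sketch below is illustrative only: the function name poll_nameservers
and the DIG override variable are assumptions added here so that the loop
can be tried without network access.

```shell
# Nameservers as listed above.
NAMESERVERS="ns2.pccc.com ns2.ena.com c.auth-ns.sonic.net b.auth-ns.sonic.net a.auth-ns.sonic.net"

# Print "<server> <TXT value>" for each nameserver.
# DIG may be set to another command for dry runs (hypothetical hook).
poll_nameservers() {
    for server in $NAMESERVERS; do
        answer=$("${DIG:-dig}" "@$server" 3.3.3.updates.spamassassin.org txt +short | tr -d '"')
        echo "$server ${answer:-no-answer}"
    done
}
```

Running this every minute or so around publication time shows which
server picks up a new revision first.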
The plots and an extract of the script output can be found at
https://fossies.org/~schleusener/sa-update.mirror_analysis/
User: sa
PW: update
The main reason for the 404 errors seems to be that the mirroring script on
sa-update.fossies.org is started as a cron job only every 10 minutes.
It would probably be better to query the original nameservers directly (the
local nameserver answers according to the TTL, so its data can be up to one
hour stale) and to start an rsync job only if the response shows that a new
file is available.
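Such a check-before-sync step could be sketched like this. Everything
beyond the dig query itself is an assumption made for illustration: the
state file path, the function names, the DIG override and the run_rsync
placeholder, which stands in for the mirror's real rsync invocation.

```shell
# File remembering the last revision the mirror fetched (hypothetical path).
STATE_FILE="${STATE_FILE:-/var/tmp/sa-mirror.last-revision}"

# Placeholder: replace with the mirror's real rsync invocation.
run_rsync() { echo "rsync would run here"; }

# True if a non-empty published revision differs from the stored one.
revision_changed() {
    published=$1
    last=$(cat "$STATE_FILE" 2>/dev/null)
    [ -n "$published" ] && [ "$published" != "$last" ]
}

# Ask one of the original nameservers and sync only on a new revision.
sync_if_new() {
    published=$("${DIG:-dig}" @ns2.ena.com 3.3.3.updates.spamassassin.org txt +short | tr -d '"')
    if revision_changed "$published"; then
        run_rsync && echo "$published" > "$STATE_FILE"
    fi
}
```

Called from cron every minute, sync_if_new costs one DNS query per run and
starts rsync only when the TXT record has changed.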
If all mirror servers synchronize at least every 10 minutes, another idea
would be to set/change the DNS TXT entry only 10 minutes after the release
(availability) of a new update file.
I also found that the synchronization of the above DNS servers seems to be
delayed by some minutes. The "best" DNS server seems to be "ns2.ena.com",
since it is always the first one to provide the new version.
Maybe this behaviour is a little bit related to the current thread with the
subject "repeated sa-update problems" on the users list.
Regards
Jens
Very interesting and useful information. Thank you Jens.
I have put a 20 minute sleep in the script before the DNS updates happen to
give the mirrors time to update before sa-update starts looking for the new
ruleset.
I run ns2.ena.com and it's updating quickly because it's receiving the DNS
NOTIFY from the hidden master and performing a zone transfer immediately.
Now this will happen after a 20 minute delay. All other DNS servers must be
ignoring the NOTIFY and updating at the normal REFRESH interval in the SOA
record, which is 7200 seconds, so on average they will be about 1 hour
behind the hidden master.
[djones@djones5 trunk]$ svn diff
Index: build/mkupdates/mkupdate-with-scores
===================================================================
--- build/mkupdates/mkupdate-with-scores (revision 1841667)
+++ build/mkupdates/mkupdate-with-scores (working copy)
@@ -282,6 +282,8 @@
if [ $AUTOUPDATESDISABLED -eq 1 -a $REVERT_REVISION -eq 0 ]; then
echo "DNS updating disabled (auto update publishing disabled), skipping DNS reload"
else
+ # Wait 20 minutes for the mirrors to update via rsync
+ sleep 1200
# Newer versions >= 3.4.1 of SpamAssassin are CNAME'd to 3.3.3
/usr/local/bin/updateDNS.sh 3.3.3.updates TXT $REVISION
RC=$?
[djones@djones5 trunk]$ svn commit -m "Added DNS update delay to give time
for the mirrors to update via rsync before sa-update will start looking for
the new rule sets."
Sending build/mkupdates/mkupdate-with-scores
Transmitting file data .done
Committing transaction...
Committed revision 1841668.
Dave
After more than two weeks of observation I just want to confirm that your
measure has worked: since September 23, not a single 404 error for an
update file has occurred on the mirror server sa-update.fossies.org.
Jens