Attached is a proposed patch for /var/lib/spamassassin/rules_du_jour
which addresses the problem of Rules Emporium sometimes sending back a
refresh URL instead of a valid .cf file.  Basically, this patch greps
the downloaded file for the string "META HTTP-EQUIV", which should
never occur in a valid rules file but is part of the refresh page.  If
the downloaded file is a refresh page, it's deleted, the script waits 1
second, and it tries again, up to 3 times.  If the download still fails
after 3 tries, the bad file is deleted and the script moves on.
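
For reference, the page the DDoS shield sends back is a short HTML
document whose key line is a meta-refresh tag, something like this
(the exact attributes may vary, and I've left the target URL out):

    <META HTTP-EQUIV="Refresh" CONTENT="0; URL=...">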

You might try running rules_du_jour from a cron job with the -D option,
redirecting the output to a file under /tmp, and checking whether you
get any notices of the form "Download of .... FAILED after 3 tries", in
which case I've mis-diagnosed the problem somewhat.  In any event, the
problem file should be deleted rather than causing a --lint failure in
spamassassin.
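
Something along these lines in root's crontab would do for a test run
(the schedule, log file name and script path here are only examples;
adjust them to your installation, and prefix the command with "sh" if
your copy of the script isn't executable):

    # example only: run rules_du_jour nightly in debug mode, keep the output
    17 3 * * *  /var/lib/spamassassin/rules_du_jour -D > /tmp/rdj_debug.log 2>&1

Then grep the log for "FAILED after" the next day.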

-- 
Lindsay Haisley       |"Fighting against human |     PGP public key
FMP Computer Services |   creativity is like   |      available at
512-259-1190          |   trying to eradicate  |<http://pubkeys.fmp.com>
http://www.fmp.com    |       dandelions"      |
                      |     (Pamela Jones)     |


--- /root/rules_du_jour.orig    2007-06-17 21:01:24.000000000 -0500
+++ /var/lib/spamassassin/rules_du_jour 2007-06-28 14:07:37.000000000 -0500
@@ -780,7 +780,30 @@
         [ "${DEBUG}" ] && echo "Retrieving file from ${CF_URL}...";
        
         # send wget output to a temp file for grepping
-       HttpGet ${CF_URL} ${TMPDIR}/${CF_BASENAME};
+       #
+       # This while loop is a fix for Rules Emporium honey-pot DDoS
+       # shield as of 6/28/07.  Send comments and bugs to Lindsay Haisley,
+       # [EMAIL PROTECTED]
+       GET_COUNT=1;
+       MAX_GET_COUNT=4;
+       while [ ${GET_COUNT} -lt ${MAX_GET_COUNT} ]; do
+               HttpGet ${CF_URL} ${TMPDIR}/${CF_BASENAME};
+               if ${GREP} -iq 'META HTTP-EQUIV' ${TMPDIR}/${CF_BASENAME} ; then
+                       rm -f ${TMPDIR}/${CF_BASENAME};
+                       sleep 1;
+                       [ "${DEBUG}" ] && echo "Got refresh URL, pass ${GET_COUNT}...";
+                       GET_COUNT=`expr ${GET_COUNT} + 1`;
+               else
+                       [ "${DEBUG}" ] && echo "Rules file OK, pass ${GET_COUNT}...";
+                       GET_COUNT=`expr ${MAX_GET_COUNT} + 1`;
+               fi
+       done
+       if [ ${GET_COUNT} -eq ${MAX_GET_COUNT} ]; then
+               rm -f ${TMPDIR}/${CF_BASENAME};
+               GET_COUNT=`expr ${GET_COUNT} - 1`;
+               [ "${DEBUG}" ] && echo "Download of ${CF_BASENAME} FAILED after ${GET_COUNT} tries.  Skipping ...";
+       fi
+
 
         # Append these errors to a variable to be mailed to the admin (later in script)
         [ "${FAILED}" ] && RULES_THAT_404ED="${RULES_THAT_404ED}\n${CF_NAME} had an unknown error:\n${HTTP_ERROR}";

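To try the patch, save the diff above to a file (call it, say,
rdj-refresh.patch; the name is arbitrary) and apply it against the
installed script, roughly like this:

    cd /var/lib/spamassassin
    cp rules_du_jour rules_du_jour.orig
    patch rules_du_jour < rdj-refresh.patch

The backup copy is just there so you can roll back easily if need be.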