Attached is a proposed patch for /var/lib/spamassassin/rules_du_jour which addresses the problem of the refresh URL that Rules Emporium sometimes sends out instead of a valid .cf file. Basically, the patch greps the downloaded file for the string "META HTTP-EQUIV", which should never occur in a valid rules file but is part of the refresh URL. If the downloaded file is a refresh URL, it is deleted, and the script waits one second and tries again, up to 3 times. If the download still fails after 3 tries, the bad file is deleted and the script moves on.
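For illustration, here is a minimal standalone sketch of that detection test. The file name and the refresh-page content are hypothetical stand-ins, not taken from the patch:

```shell
# Sketch of the refresh-page check: a valid SpamAssassin .cf rules
# file never contains the HTML string "META HTTP-EQUIV", so its
# presence means we got the honey-pot redirect page instead.
f=/tmp/example_rules.cf   # hypothetical download target
printf '<META HTTP-EQUIV="refresh" CONTENT="0;URL=http://example.com/">\n' > "$f"
if grep -iq 'META HTTP-EQUIV' "$f"; then
    # Discard the bogus file so it can't cause a --lint failure
    echo "refresh page detected, discarding"
    rm -f "$f"
fi
```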
You might try running rules_du_jour from a cron job with the -D option, redirecting the output to a file in /tmp, and see whether you get any notices reading "Download of .... FAILED after 3 tries", in which case I've somewhat mis-diagnosed the problem. In any event, the problem file should be deleted rather than causing a --lint failure in spamassassin.

-- 
Lindsay Haisley        | "Fighting against human  | PGP public key
FMP Computer Services  |  creativity is like      | available at
512-259-1190           |  trying to eradicate     | http://pubkeys.fmp.com
http://www.fmp.com     |  dandelions"             |
                       |  (Pamela Jones)          |
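Something along these lines would do it; the schedule, log path, and rule file name below are examples of my own, not from rules_du_jour's documentation:

```shell
# Hypothetical crontab entry -- run nightly with debug output captured:
#   15 3 * * * /var/lib/spamassassin/rules_du_jour -D > /tmp/rdj.log 2>&1
#
# Afterward, check the log for the failure notice the patch emits:
grep 'FAILED after' /tmp/rdj.log 2>/dev/null || echo "no failed downloads logged"
```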
--- /root/rules_du_jour.orig	2007-06-17 21:01:24.000000000 -0500
+++ /var/lib/spamassassin/rules_du_jour	2007-06-28 14:07:37.000000000 -0500
@@ -780,7 +780,30 @@
     [ "${DEBUG}" ] && echo "Retrieving file from ${CF_URL}...";
     # send wget output to a temp file for grepping
-    HttpGet ${CF_URL} ${TMPDIR}/${CF_BASENAME};
+    #
+    # This while loop is a fix for Rules Emporium honey-pot DDoS
+    # shield as of 6/28/07.  Send comments and bugs to Lindsay Haisley,
+    # [EMAIL PROTECTED]
+    GET_COUNT=1;
+    MAX_GET_COUNT=4;
+    while [ ${GET_COUNT} -lt ${MAX_GET_COUNT} ]; do
+        HttpGet ${CF_URL} ${TMPDIR}/${CF_BASENAME};
+        if ${GREP} -iq 'META HTTP-EQUIV' ${TMPDIR}/${CF_BASENAME} ; then
+            rm -f ${TMPDIR}/${CF_BASENAME};
+            sleep 1;
+            [ "${DEBUG}" ] && echo "Got refresh URL, pass ${GET_COUNT}...";
+            GET_COUNT=`expr ${GET_COUNT} + 1`;
+        else
+            [ "${DEBUG}" ] && echo "Rules file OK, pass ${GET_COUNT}...";
+            GET_COUNT=`expr ${MAX_GET_COUNT} + 1`;
+        fi
+    done
+    if ${GREP} -iq 'META HTTP-EQUIV' ${TMPDIR}/${CF_BASENAME} ; then
+        rm -f ${TMPDIR}/${CF_BASENAME};
+        GET_COUNT=`expr ${GET_COUNT} - 1`;
+        [ "${DEBUG}" ] && echo "Download of ${CF_BASENAME} FAILED after ${GET_COUNT} tries.  Skipping ...";
+    fi
+
     # Append these errors to a variable to be mailed to the admin (later in script)
     [ "${FAILED}" ] && RULES_THAT_404ED="${RULES_THAT_404ED}\n${CF_NAME} had an unknown error:\n${HTTP_ERROR}";