Hello,

reviving an old thread.

I think I have found some of the problems:

1.

looking at old debug output, it seems that every time we had the problem,
sa-update first tried to fetch from http://sa-update.spamassassin.org

further fetching from other mirrors did not help, due to issues below.
I feel that I incorrectly blamed the other mirrors for this, sorry.

I don't have output from Sep 05 and Sep 20 stored anymore, but I can guess
it was the same problem.

I asked our Fortinet team to look at that issue.
I am currently unable to fetch the update from sa-update.spamassassin.org
- are there any download limits implemented on that server?


2.
when those problems happened, curl returned "18", in 3.4.0 shown as 4608
(18*256) which means:

      18     Partial file. Only a part of the file was transferred.

- just as it did today:

Dec 17 07:06:59.051 [11809] dbg: http: /usr/bin/curl -s -L -O --remote-time -g 
--max-redirs 2 --connect-timeout 30 --max-time 300 --fail -o 1849014.tar.gz -- 
http://sa-update.spamassassin.org/1849014.tar.gz
Dec 17 07:07:17.618 [11809] dbg: http: process [11812], exit status: exit 18

so only partial content is returned, but some content IS returned.
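
To illustrate how 4608 and 18 relate, here is a minimal Python sketch (not
sa-update code). It assumes that the 4608 logged by 3.4.0 is a raw POSIX wait
status, as Perl's $? reports it, where the exit code sits in the high byte.
The function name is made up for this example.

```python
def exit_code_from_wait_status(wait_status: int) -> int:
    """Extract the exit code from a raw 16-bit POSIX wait status.

    Assumption: the value is what Perl's $? would hold after the child
    process ended, i.e. exit code in the high byte, signal in the low 7 bits.
    """
    if wait_status & 0x7F:
        # The low 7 bits are non-zero when the process was killed by a signal,
        # in which case there is no meaningful exit code.
        raise ValueError("process was terminated by a signal")
    return (wait_status >> 8) & 0xFF


# 4608 is the status logged by 3.4.0; the high byte is curl's exit code.
print(exit_code_from_wait_status(4608))
```

Under that assumption, the "exit status: 4608" lines and the newer
"exit status: exit 18" line describe the same curl error.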

Since the while() loop checks for content returned, and since the http_get
function returns $out_fname even if curl did not return 0, sa-update
does NOT move on to the next mirror and the checksum comparison fails.

- the 3.4.0 sa-update did detect this problem and http_get only returned
 content when curl exited with status 0
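
The behaviour I would expect can be sketched roughly as follows. This is only
an illustration in Python, not the actual sa-update (Perl) logic: the function
names and the fetch() callback, which returns a (curl exit code, content)
pair, are hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical callback: fetch(url) returns (curl_exit_code, bytes_received).
Fetch = Callable[[str], Tuple[int, bytes]]


def first_good_download(mirrors: Iterable[str],
                        fetch: Fetch) -> Optional[Tuple[str, bytes]]:
    """Return (mirror, content) for the first mirror whose fetch exits with 0."""
    for mirror in mirrors:
        exit_code, content = fetch(mirror)
        if exit_code != 0:
            # Partial content (e.g. curl exit 18) is discarded, not returned,
            # so the loop moves on to the next mirror.
            continue
        if content:
            return mirror, content
    return None


def fake_fetch(url: str) -> Tuple[int, bytes]:
    """Stand-in for curl: the first mirror delivers one byte and exits 18."""
    if "spamassassin.org" in url:
        return 18, b"\x1f"
    return 0, b"complete archive bytes"


result = first_good_download(
    ["http://sa-update.spamassassin.org", "http://sa-update.ena.com"],
    fake_fetch,
)
```

The point is only that the decision to try the next mirror depends on the
exit code, not on whether any bytes arrived.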


3.
when curl returns 18 and leaves the target file on the filesystem, the file
does not have the original file's timestamp.

when the file exists, "-z filename" is appended to the curl command line,
which causes the download to fail (curl fetches nothing), since the stored
timestamp is newer.

the "-z" documentation says it fetches a file modified later or
before the given time, which is apparently not what we want, but:

the "-z" seems to cause an If-Modified-Since: header to be added to the
request, which means it only fetches files newer than the one we already have.

According to this, I believe that the "-z" option for CURL should be
dropped.
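
A small Python sketch of why the conditional request gets nothing back. It
only models the date comparison a server makes for If-Modified-Since, with a
made-up function name. The Last-Modified value is the one from the Oct 8
transfer quoted below; the If-Modified-Since value is an assumption (the
approximate time the partial file was written locally).

```python
from email.utils import parsedate_to_datetime


def server_would_send_body(last_modified: str, if_modified_since: str) -> bool:
    """Model of the If-Modified-Since check.

    True means the resource is newer than the client's date (full 200
    response); False means the server would answer 304 Not Modified with no
    body, so the local file is left as it is.
    """
    return (parsedate_to_datetime(last_modified)
            > parsedate_to_datetime(if_modified_since))


# Last-Modified as reported by the mirror for 1843052.tar.gz.
remote = "Mon, 08 Oct 2018 03:19:20 GMT"
# Assumed mtime of the 1-byte partial file left behind by the failed transfer.
partial_file_mtime = "Mon, 08 Oct 2018 14:16:19 GMT"

refetched = server_would_send_body(remote, partial_file_mtime)
```

Because the broken local file is newer than the real one on the mirror, the
comparison comes out false and the partial file is never replaced.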


On 20.09.18 16:05, Matus UHLAR - fantomas wrote:
I looked at update times and they are different each day - debian script
sleeps random number of seconds (up to one hour) in order to lower the
impact at mirror servers.

I have removed the "--fail" option from curl and will look at error message
if there's any.

I'll keep you updated and will file a bug report if I'm able to find out
anything useful.

On 08.10.18 16:43, Matus UHLAR - fantomas wrote:
I was able to repeat this problem now:

# /usr/bin/curl --verbose -L -O --remote-time -g --max-redirs 2 
--connect-timeout 30 --max-time 300 -o 1843052.tar.gz -- 
http://sa-update.spamassassin.org/1843052.tar.gz
* Hostname was NOT found in DNS cache
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                               Dload  Upload   Total   Spent    Left  Speed
0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   
Trying 64.142.56.146...
* Connected to sa-update.spamassassin.org (64.142.56.146) port 80 (#0)
GET /1843052.tar.gz HTTP/1.1
User-Agent: curl/7.38.0
Host: sa-update.spamassassin.org
Accept: */*

0     0    0     0    0     0      0      0 --:--:--  0:00:11 --:--:--     0< 
HTTP/1.1 200 OK
< Date: Mon, 08 Oct 2018 14:16:19 GMT
* Server Apache/2.4.6 (CentOS) is not blacklisted
< Server: Apache/2.4.6 (CentOS)
< Last-Modified: Mon, 08 Oct 2018 03:19:20 GMT
< ETag: "4600c-577af16429e00"
< Accept-Ranges: bytes
< Content-Length: 286732
< Content-Type: application/x-gzip
<
{ [data not shown]
0  280k    0     1    0     0      0      0 --:--:--  0:00:13 --:--:--     0* 
transfer closed with 286731 bytes remaining to read
* Closing connection 0
curl: (18) transfer closed with 286731 bytes remaining to read


# ls -l 1843052.tar.gz
-rw-r--r-- 1 root root      1 Oct  8 16:16 1843052.tar.gz

a look at today's debug log shows:

Oct  8 07:12:59.899 [20257] dbg: channel: selected mirror 
http://sa-update.spamassassin.org
Oct  8 07:12:59.899 [20257] dbg: http: url: 
http://sa-update.spamassassin.org/1843052.tar.gz
Oct  8 07:12:59.899 [20257] dbg: http: downloading to: 
/var/lib/spamassassin/3.004000/updates_spamassassin_org/1843052.tar.gz, new
Oct  8 07:12:59.899 [20257] dbg: util: executable for curl was found at 
/usr/bin/curl
Oct  8 07:12:59.899 [20257] dbg: http: /usr/bin/curl -s -L -O --remote-time -g 
--max-redirs 2 --connect-timeout 30 --max-time 300 -o 1843052.tar.gz -- 
http://sa-update.spamassassin.org/1843052.tar.gz
Oct  8 07:13:15.385 [20257] dbg: http: process [20258], exit status: 4608
Oct  8 07:13:15.385 [20257] dbg: channel: selected mirror 
http://sa-update.ena.com
Oct  8 07:13:15.385 [20257] dbg: http: url: 
http://sa-update.ena.com/1843052.tar.gz
Oct  8 07:13:15.385 [20257] dbg: http: downloading to: 
/var/lib/spamassassin/3.004000/updates_spamassassin_org/1843052.tar.gz, update
Oct  8 07:13:15.385 [20257] dbg: util: executable for curl was found at 
/usr/bin/curl
Oct  8 07:13:15.385 [20257] dbg: http: /usr/bin/curl -s -L -O --remote-time -g 
--max-redirs 2 --connect-timeout 30 --max-time 300 -o 1843052.tar.gz -z 
1843052.tar.gz -- http://sa-update.ena.com/1843052.tar.gz
Oct  8 07:13:15.889 [20257] dbg: http: process [20272], exit status: 0

It looks like an invalid file was downloaded from sa-update.spamassassin.org,
and while the next curl invocation succeeded with exit code 0, the file was not
overwritten:

# /usr/bin/curl -s -L -O --remote-time -g --max-redirs 2 --connect-timeout 30 
--max-time 300 -o 1843052.tar.gz -z 1843052.tar.gz -- 
http://sa-update.ena.com/1843052.tar.gz
# ls -l 1843052.tar.gz
-rw-r--r-- 1 root root 243 Oct  8 16:21 1843052.tar.gz
# /usr/bin/curl -s -L -O --remote-time -g --max-redirs 2 --connect-timeout 30 
--max-time 300 -o 1843052.tar.gz -z 1843052.tar.gz -- 
http://sa-update.ena.com/1843052.tar.gz
# ls -l 1843052.tar.gz
-rw-r--r-- 1 root root 243 Oct  8 16:21 1843052.tar.gz
# rm 1843052.tar.gz
# /usr/bin/curl -s -L -O --remote-time -g --max-redirs 2 --connect-timeout 30 
--max-time 300 -o 1843052.tar.gz -z 1843052.tar.gz -- 
http://sa-update.ena.com/1843052.tar.gz
# ls -l 1843052.tar.gz
-rw-r--r-- 1 root root 286732 Oct  8 05:19 1843052.tar.gz

(the file size changed to 243 because of my tests).

a further look at the logs shows that all failed downloads were from
sa-update.spamassassin.org:

Sep 28 07:43:07.888 [7018] dbg: http: /usr/bin/curl -s -L -O --remote-time -g 
--max-redirs 2 --connect-timeout 30 --max-time 300 -o 1842077.tar.gz -- 
http://sa-update.spamassassin.org/1842077.tar.gz
Sep 28 07:43:21.973 [7018] dbg: http: process [7019], exit status: 4608

Oct  5 06:35:10.552 [29702] dbg: http: /usr/bin/curl -s -L -O --remote-time -g 
--max-redirs 2 --connect-timeout 30 --max-time 300 -o 1842787.tar.gz -- 
http://sa-update.spamassassin.org/1842787.tar.gz
Oct  5 06:35:29.199 [29702] dbg: http: process [29705], exit status: 4608

Oct  7 07:17:37.644 [30424] dbg: http: /usr/bin/curl -s -L -O --remote-time -g 
--max-redirs 2 --connect-timeout 30 --max-time 300 -o 1843008.tar.gz -- 
http://sa-update.spamassassin.org/1843008.tar.gz
Oct  7 07:18:11.394 [30424] dbg: http: process [30427], exit status: 4608


btw, when repeating attempts to download from sa-update.spamassassin.org,
many of them failed.

It looks to me that:
1. sa-update.spamassassin.org has some rate limiting or problem with tcp/ip
 (accelerator?)

2. the script does NOT remove the downloaded file when the download fails,
 and curl is NOT instructed to do so.
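
For point 2, the cleanup I have in mind could look roughly like this. It is a
Python sketch under my own assumptions, not a patch for sa-update:
run_download() and demo() are hypothetical names, and the demo uses a tiny
Python child process in place of curl so that it runs without network access.

```python
import os
import subprocess
import sys
import tempfile


def run_download(cmd, out_path):
    """Run a download command; delete out_path unless it exited with 0."""
    result = subprocess.run(cmd)
    if result.returncode != 0 and os.path.exists(out_path):
        # A leftover partial file would otherwise be picked up by a later
        # "-z out_path" request and block the real download.
        os.remove(out_path)
    return result.returncode


def demo():
    """Simulate a transfer that writes one byte and then exits with 18."""
    with tempfile.TemporaryDirectory() as tmp:
        out_path = os.path.join(tmp, "partial.tar.gz")
        fake_curl = [
            sys.executable, "-c",
            "import sys; f = open(sys.argv[1], 'wb'); f.write(b'x'); "
            "f.close(); sys.exit(18)",
            out_path,
        ]
        code = run_download(fake_curl, out_path)
        return code, os.path.exists(out_path)
```

Here demo() returns the exit code of the simulated transfer and whether the
partial file is still present afterwards.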

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Posli tento mail 100 svojim znamim - nech vidia aky si idiot
Send this email to 100 your friends - let them see what an idiot you are
