Re: [Bug-wget] http server responding with 416 but file was not transferred completely

2017-09-14 Thread Josef Moellers
On 14.09.2017 17:06, Tim Rühsen wrote:
> On 09/14/2017 12:11 PM, Josef Moellers wrote:
>> On 14.09.2017 10:12, Tim Rühsen wrote:
>>> On 09/14/2017 09:53 AM, Josef Moellers wrote:
>>>> Hi,
>>>>
>>>> We have seen (at least) one server who has
>>>> Accept-Ranges: bytes
>>>> in his header but, upon receiving a request with
>>>> Range: bytes=23068672-
>>>> responds with
>>>> HTTP/1.1 416 Requested Range Not Satisfiable
>>>> although the file was not transferred completely!
>>>>
>>>> wget then claims that
>>>> The file is already fully retrieved; nothing to do.
>>>>
>>>> E.g.
>>>> run
>>>>   wget https://downloads.dell.com/FOLDER02721216M/1/SUU_14.12.200.69.ISO
>>>> then, after a couple of MB, abort the transfer and then continue the
>>>> download:
>>>>   wget --continue https://downloads.dell.com/FOLDER02721216M/1/SUU_14.12.200.69.ISO
>>>>
>>>> Maybe the check in src/http.c:
>>>> 3821   if (statcode == HTTP_STATUS_RANGE_NOT_SATISFIABLE
>>>> 3822   || (!opt.timestamping && hs->restval > 0 && statcode == HTTP_STATUS_OK
>>>> 3823   && contrange == 0 && contlen >= 0 && hs->restval >= contlen))
>>>>
>>>> should be changed and any HTTP_STATUS_RANGE_NOT_SATISFIABLE with an
>>>> incomplete file should show something like
>>>>
>>>> "download continue failed, file incomplete"
>>>
>>> Well, that would be ok for this special server.
>>>
>>> Normally 416 together with a server timestamp matching the file's
>>> timestamp means that the file is complete (as far as the client can
>>> judge from HTTP).
>>>
>>> IMO, if the server is broken (or misbehaves) then the server should be
>>> repaired, unless it is a very common misbehavior, in which case we could
>>> consider a workaround on the client side.
>>>
>>
>> So I humbly propose the attached patch.
>> I tried to create a pull request, but got a 403.
> 
> Thanks for the patch - I'll test it in the next few days.

I have attached a simple web server that simulates the error:
when a request with a Range header comes in, it replies with 416 and also
returns an insanely large Content-Length. You'll need glib2 and
microhttpd to build it.

I was able to reproduce the issue with this server and to verify that the
patch fixes it by causing wget to retry (until the --tries limit is exhausted).
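
The misbehavior described above can be illustrated with a small standalone sketch. It is not the code from the attached server (which uses glib2 and microhttpd); the helper name build_reply_head, its parameters, and the chosen oversized Content-Length value are assumptions made purely for illustration. It only formats the response head such a server might send.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the attached server.
   range_header is the value of the request's Range header, or NULL
   if the request had none.  real_len is the true file size.
   Returns the snprintf result (number of characters needed).  */
static int
build_reply_head (const char *range_header, long long real_len,
                  char *buf, size_t bufsize)
{
  if (range_header != NULL)
    {
      /* Misbehavior under test: every ranged request is answered with
         416, together with a Content-Length far larger than the file.
         The value used here is an arbitrary assumption.  */
      return snprintf (buf, bufsize,
                       "HTTP/1.1 416 Requested Range Not Satisfiable\r\n"
                       "Accept-Ranges: bytes\r\n"
                       "Content-Length: %lld\r\n\r\n",
                       9223372036854775807LL);
    }

  /* Requests without a Range header get an ordinary reply.  */
  return snprintf (buf, bufsize,
                   "HTTP/1.1 200 OK\r\n"
                   "Accept-Ranges: bytes\r\n"
                   "Content-Length: %lld\r\n\r\n",
                   real_len);
}
```

Calling build_reply_head ("bytes=23068672-", ...) yields the 416 head that triggers the problem in wget --continue, while passing NULL yields a normal 200 head.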

> BTW, we are currently working on Wget2, where we have a related issue, if
> you'd like to take a look at it: https://gitlab.com/gnuwget/wget2/issues/278

I'll do that.

Josef


416-webserver.tgz
Description: application/compressed-tar




Re: [Bug-wget] http server responding with 416 but file was not transferred completely

2017-09-14 Thread Tim Rühsen
On 09/14/2017 12:11 PM, Josef Moellers wrote:
> On 14.09.2017 10:12, Tim Rühsen wrote:
>> On 09/14/2017 09:53 AM, Josef Moellers wrote:
>>> Hi,
>>>
>>> We have seen (at least) one server who has
>>> Accept-Ranges: bytes
>>> in his header but, upon receiving a request with
>>> Range: bytes=23068672-
>>> responds with
>>> HTTP/1.1 416 Requested Range Not Satisfiable
>>> although the file was not transferred completely!
>>>
>>> wget then claims that
>>> The file is already fully retrieved; nothing to do.
>>>
>>> E.g.
>>> run
>>>   wget https://downloads.dell.com/FOLDER02721216M/1/SUU_14.12.200.69.ISO
>>> then, after a couple of MB, abort the transfer and then continue the
>>> download:
>>>   wget --continue https://downloads.dell.com/FOLDER02721216M/1/SUU_14.12.200.69.ISO
>>>
>>> Maybe the check in src/http.c:
>>> 3821   if (statcode == HTTP_STATUS_RANGE_NOT_SATISFIABLE
>>> 3822   || (!opt.timestamping && hs->restval > 0 && statcode == HTTP_STATUS_OK
>>> 3823   && contrange == 0 && contlen >= 0 && hs->restval >= contlen))
>>>
>>> should be changed and any HTTP_STATUS_RANGE_NOT_SATISFIABLE with an
>>> incomplete file should show something like
>>>
>>> "download continue failed, file incomplete"
>>
>> Well, that would be ok for this special server.
>>
>> Normally 416 together with a server timestamp matching the file's
>> timestamp means that the file is complete (as far as the client can
>> judge from HTTP).
>>
>> IMO, if the server is broken (or misbehaves) then the server should be
>> repaired, unless it is a very common misbehavior, in which case we could
>> consider a workaround on the client side.
>>
> 
> So I humbly propose the attached patch.
> I tried to create a pull request, but got a 403.

Thanks for the patch - I'll test it in the next few days.
BTW, we are currently working on Wget2, where we have a related issue, if
you'd like to take a look at it: https://gitlab.com/gnuwget/wget2/issues/278

With Best Regards, Tim





Re: [Bug-wget] Fwd: patch proposition

2017-09-14 Thread Avinash Sonawane
On Thu, Sep 14, 2017 at 1:28 PM, kalle  wrote:
> hello,

Hi Kalle!

> I hereby repeat my request (in standard English for Darshit):

Please don't take this the wrong way, but there are people from all around
the globe on bug-wget, so it would be great if we could extend the courtesy
of not wasting others' time and help each other understand better by
communicating in a common language (de facto English) instead of
regional ones!

Think of it this way: what if I had written this email in Hindi? Would it
have been more helpful? Or a plain nuisance? :)

Regards,
Avinash Sonawane (rootKea)
PICT, Pune
https://rootkea.wordpress.com



Re: [Bug-wget] http server responding with 416 but file was not transferred completely

2017-09-14 Thread Josef Moellers
On 14.09.2017 10:12, Tim Rühsen wrote:
> On 09/14/2017 09:53 AM, Josef Moellers wrote:
>> Hi,
>>
>> We have seen (at least) one server who has
>> Accept-Ranges: bytes
>> in his header but, upon receiving a request with
>> Range: bytes=23068672-
>> responds with
>> HTTP/1.1 416 Requested Range Not Satisfiable
>> although the file was not transferred completely!
>>
>> wget then claims that
>> The file is already fully retrieved; nothing to do.
>>
>> E.g.
>> run
>>   wget https://downloads.dell.com/FOLDER02721216M/1/SUU_14.12.200.69.ISO
>> then, after a couple of MB, abort the transfer and then continue the
>> download:
>>   wget --continue https://downloads.dell.com/FOLDER02721216M/1/SUU_14.12.200.69.ISO
>>
>> Maybe the check in src/http.c:
>> 3821   if (statcode == HTTP_STATUS_RANGE_NOT_SATISFIABLE
>> 3822   || (!opt.timestamping && hs->restval > 0 && statcode == HTTP_STATUS_OK
>> 3823   && contrange == 0 && contlen >= 0 && hs->restval >= contlen))
>>
>> should be changed and any HTTP_STATUS_RANGE_NOT_SATISFIABLE with an
>> incomplete file should show something like
>>
>> "download continue failed, file incomplete"
> 
> Well, that would be ok for this special server.
> 
> Normally 416 together with a server timestamp matching the file's
> timestamp means that the file is complete (as far as the client can
> judge from HTTP).
> 
> IMO, if the server is broken (or misbehaves) then the server should be
> repaired, unless it is a very common misbehavior, in which case we could
> consider a workaround on the client side.
> 

So I humbly propose the attached patch.
I tried to create a pull request, but got a 403.

Josef

Index: wget-1.19.1/src/http.c
===================================================================
--- wget-1.19.1.orig/src/http.c
+++ wget-1.19.1/src/http.c
@@ -3819,6 +3819,16 @@ gethttp (const struct url *u, struct url
     }
 
   if (statcode == HTTP_STATUS_RANGE_NOT_SATISFIABLE
+      && hs->restval < (contlen + contrange))
+    {
+      /* The file was not completely downloaded,
+         yet the server claims the range is invalid.
+         Bail out.  */
+      CLOSE_INVALIDATE (sock);
+      retval = RANGEERR;
+      goto cleanup;
+    }
+  if (statcode == HTTP_STATUS_RANGE_NOT_SATISFIABLE
       || (!opt.timestamping && hs->restval > 0 && statcode == HTTP_STATUS_OK
           && contrange == 0 && contlen >= 0 && hs->restval >= contlen))
     {
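
The condition this patch adds can be tried in isolation with the minimal sketch below. It is standalone C, not wget code: the function name resume_looks_incomplete and the macro STATUS_RANGE_NOT_SATISFIABLE are hypothetical, and the conventions that contlen is -1 when no length is known and contrange is 0 when no Content-Range was parsed are assumptions about the surrounding code.

```c
#include <stdbool.h>

#define STATUS_RANGE_NOT_SATISFIABLE 416

/* Hypothetical standalone mirror of the check added by the patch.
   restval   - bytes already present locally (the resume offset)
   contlen   - length reported by the server, assumed -1 if unknown
   contrange - start offset from Content-Range, assumed 0 if absent
   Returns true when a 416 arrives although fewer bytes are on disk
   than the server's reported length implies.  Note that the sum
   contlen + contrange is not guarded against overflow here.  */
static bool
resume_looks_incomplete (int statcode, long long restval,
                         long long contlen, long long contrange)
{
  return statcode == STATUS_RANGE_NOT_SATISFIABLE
         && restval < contlen + contrange;
}
```

With the offset from the report, resume_looks_incomplete (416, 23068672, 100000000, 0) is true, so the download would be treated as failed rather than complete. When the length is unknown (contlen == -1 under the assumption above) the check does not fire and the existing "already fully retrieved" branch is reached as before.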


Re: [Bug-wget] Fwd: patch proposition

2017-09-14 Thread Darshit Shah
* kalle  [170914 10:06]:
> hello,
> I hereby repeat my request (in standard English for Darshit):
> can someone please apply this patch?
> No one responded to my patch proposal last time.
> kalle
> 
> 
> -------- Forwarded Message --------
> Subject: patch proposition
> Date: Mon, 10 Jul 2017 19:36:56 +0200
> From: kalle
> To: Dale R. Worley
> 
> at the end of the second segment of the node '3 Recursive Download',
> which ends with the words "parsed and followed further", add:
> "However, wget by default will not follow links to a different host than
> the one the link was found on."
> 
> kalle
> 
> 
> On 08.07.2017 at 03:06, Dale R. Worley wrote:
> > The effective way to do this is to propose as a patch the specific changes
> > in the documentation that you would like to see.
> 
> 

Hi,

We can indeed do that. I'll try and fix the info pages accordingly.

Thanks for the readability improvements to the documentation.

-- 
Thanking You,
Darshit Shah
PGP Fingerprint: 7845 120B 07CB D8D6 ECE5 FF2B 2A17 43ED A91A 35B6




Re: [Bug-wget] http server responding with 416 but file was not transferred completely

2017-09-14 Thread Josef Moellers
On 14.09.2017 10:12, Tim Rühsen wrote:
> On 09/14/2017 09:53 AM, Josef Moellers wrote:
>> Hi,
>>
>> We have seen (at least) one server who has
>> Accept-Ranges: bytes
>> in his header but, upon receiving a request with
>> Range: bytes=23068672-
>> responds with
>> HTTP/1.1 416 Requested Range Not Satisfiable
>> although the file was not transferred completely!
>>
>> wget then claims that
>> The file is already fully retrieved; nothing to do.
>>
>> E.g.
>> run
>>   wget https://downloads.dell.com/FOLDER02721216M/1/SUU_14.12.200.69.ISO
>> then, after a couple of MB, abort the transfer and then continue the
>> download:
>>   wget --continue https://downloads.dell.com/FOLDER02721216M/1/SUU_14.12.200.69.ISO
>>
>> Maybe the check in src/http.c:
>> 3821   if (statcode == HTTP_STATUS_RANGE_NOT_SATISFIABLE
>> 3822   || (!opt.timestamping && hs->restval > 0 && statcode == HTTP_STATUS_OK
>> 3823   && contrange == 0 && contlen >= 0 && hs->restval >= contlen))
>>
>> should be changed and any HTTP_STATUS_RANGE_NOT_SATISFIABLE with an
>> incomplete file should show something like
>>
>> "download continue failed, file incomplete"
> 
> Well, that would be ok for this special server.
> 
> Normally 416 together with a server timestamp matching the file's
> timestamp means that the file is complete (as far as the client can
> judge from HTTP).
> 
> IMO, if the server is broken (or misbehaves) then the server should be
> repaired, unless it is a very common misbehavior, in which case we could
> consider a workaround on the client side.

OK, so I'll have a go at it.
Looks simple enough (famous last words ;-) )

Josef




[Bug-wget] Fwd: patch proposition

2017-09-14 Thread kalle
hello,
I hereby repeat my request (in standard English for Darshit):
can someone please apply this patch?
No one responded to my patch proposal last time.
kalle


-------- Forwarded Message --------
Subject: patch proposition
Date: Mon, 10 Jul 2017 19:36:56 +0200
From: kalle
To: Dale R. Worley

at the end of the second segment of the node '3 Recursive Download',
which ends with the words "parsed and followed further", add:
"However, wget by default will not follow links to a different host than
the one the link was found on."

kalle


On 08.07.2017 at 03:06, Dale R. Worley wrote:
> The effective way to do this is to propose as a patch the specific changes
> in the documentation that you would like to see.