Re: [Bug-wget] What are the tests testing?

2017-06-12 Thread Josef Moellers
Hello Tim,

Thanks for the reply.

On 10.06.2017 13:36, Tim Rühsen wrote:
> On Freitag, 9. Juni 2017 17:02:15 CEST Josef Moellers wrote:
>> Hi,
> 
> Hi Josef,
> 
>> I'm currently trying to build test suites for openQA.
>> One of the candidates is wget and, luckily, it already provides quite an
>> extensive test suite.
>> I have successfully built an RPM which has all that is needed for the tests.
>> One test, Test-ftp-iri-fallback.px, fails on SLES12-SP2 and I can't see
>> why. 
> 
> Look at tests/Test-ftp-iri-fallback.log, if you can't interpret the content 
> send it here.

I cannot find any such file; there is no "*.log" anywhere in the vicinity
of the tests.

Ah ... maybe I should have added that I'm working on a slightly older
version of wget: 1.14, which we ship with SLES12.

NB I run the tests by calling
run-px /var/opt/wget-tests
The tests are installed in /var/opt/wget-tests/tests and the wget binary
is in /var/opt/wget-tests/src (although I would have preferred to use
the system's own wget, but that's a thing to be considered later).

I want to run just the tests in an openQA environment to aid in
integration testing. To that end, I am building an RPM with just enough
to run the tests:
tests/run-px
tests/unit-tests
tests/Test-*
tests/*.pm
tests/Makefile (not used)
tests/WgetFeature.cfg
tests/WgetTest.pm.in
tests/certs/*
src/wget

>> Is there a list describing exactly what each test checks and what a
>> failure means?
> 
> Each test should self-contain a short description of its purpose; sometimes
> these are missing (accidentally).

The accident must have happened here ;-)
NB The only difference I find between the 1.14 and the 1.19 versions of
this test is that the 1.19 version has the "name" hash key in the
"FTPTest->new()" call.

> Test-ftp-iri-fallback tries to FTP-download a file containing non-ASCII 
> char(s).
> The behavior of Wget (with IRI support) is to convert the file name to UTF-8 
> for using with a RETR command.
> This should fail with a "550 file not found".

It does.

> Now Wget falls back to the unconverted file name and tries RETR again - this 
> should succeed (we told the FTP test server to know this file name).

This indeed succeeds, but in the end:

Test failed: file français.txt not downloaded
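
The two-step behaviour Tim describes above can be modelled in a few lines.
This is only an illustrative Python sketch, not wget's code: the function
name retrieve_with_fallback and the set that stands in for the FTP test
server are assumptions made for the example.

```python
def retrieve_with_fallback(name: str, local_encoding: str, server_files: set) -> bytes:
    """Try the UTF-8 form of the name first, then the locally encoded form.

    Hypothetical model: membership in server_files stands in for a
    successful RETR; a miss stands in for "550 File not found.".
    """
    utf8_name = name.encode("utf-8")
    original_name = name.encode(local_encoding)
    for candidate in (utf8_name, original_name):
        if candidate in server_files:
            return candidate
    raise FileNotFoundError(name)


# The test server only knows the ISO-8859-1 form of the file name.
server_files = {"français.txt".encode("iso-8859-1")}
served = retrieve_with_fallback("français.txt", "iso-8859-1", server_files)
print(served)
```

In this model the first attempt misses and the second one returns the
ISO-8859-1 bytes, which mirrors the 550 followed by a successful RETR.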


As I cannot find any log file, here's the output the test produced
(using cut-and-paste from the ssh tty):

Running Test-ftp-iri-fallback.px

Running test Test-ftp-iri-fallback
Calling ../src/wget --local-encoding=iso-8859-1 -S
ftp://localhost:39938/français.txt
--2017-06-09 16:42:53--  ftp://localhost:39938/fran%C3%A7ais.txt
   => â<80><98>français.txtâ<80><99>
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:39938... failed: Connection
refused.
Connecting to localhost (localhost)|127.0.0.1|:39938... connected.
Logging in as anonymous ...
220 GNU Wget Testing FTP Server ready.
--> USER anonymous^M

230 Anonymous user access granted.
--> SYST^M

215 UNIX Type: L8
--> PWD^M

257 "/"
--> TYPE I^M

200 TYPE changed to I.
==> CWD not needed.
--> SIZE français.txt^M

550 File or directory not found.
--> PASV^M

227 Entering Passive Mode (127,0,0,1,179,75)
--> RETR français.txt^M

Use of uninitialized value in string eq at
/var/opt/wget-tests/tests/FTPServer.pm line 251, <$socket> chunk 7.
550 File not found.

No such file â<80><98>français.txtâ<80><99>.

--2017-06-09 16:42:53--  ftp://localhost:39938/fran%E7ais.txt
   => â<80><98>français.txtâ<80><99>
Connecting to localhost (localhost)|127.0.0.1|:39938... connected.
Logging in as anonymous ...
220 GNU Wget Testing FTP Server ready.
--> USER anonymous^M

230 Anonymous user access granted.
--> SYST^M

215 UNIX Type: L8
--> PWD^M

257 "/"
--> TYPE I^M

200 TYPE changed to I.
==> CWD not needed.
--> SIZE français.txt^M

213 12
--> PASV^M

227 Entering Passive Mode (127,0,0,1,155,189)
--> RETR français.txt^M

150 Opening ASCII mode data connection.
Length: 12 (unauthoritative)

 0K   100% 2.95M=0s

226 File retrieval complete. Data connection has been closed.
2017-06-09 16:42:53 (2.95 MB/s) - â<80><98>français.txtâ<80><99> saved [12]

Test failed: file français.txt not downloaded
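
A note on the "â<80><98>" and "â<80><99>" sequences in the output above:
one plausible reading is that wget printed UTF-8 curly quotes and the
terminal rendered the bytes as Latin-1. The Python sketch below reproduces
that rendering; the helper name as_latin1_terminal is made up for the
example, and the assumption about the terminal is not confirmed by the log.

```python
# UTF-8 bytes of the curly quotes wget prints around a file name.
left_quote = "\u2018".encode("utf-8")   # b'\xe2\x80\x98'
right_quote = "\u2019".encode("utf-8")  # b'\xe2\x80\x99'


def as_latin1_terminal(raw: bytes) -> str:
    """Render bytes as Latin-1, showing C1 control bytes (0x80-0x9f) as <hex>."""
    parts = []
    for b in raw:
        if 0x80 <= b <= 0x9F:
            parts.append(f"<{b:02x}>")
        else:
            parts.append(bytes([b]).decode("latin-1"))
    return "".join(parts)


shown_left = as_latin1_terminal(left_quote)
shown_right = as_latin1_terminal(right_quote)
print(shown_left, shown_right)
```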


Thanks,

Josef



Re: [Bug-wget] What are the tests testing?

2017-06-12 Thread Josef Moellers
On 12.06.2017 09:23, Josef Moellers wrote:
> Hello Tim,
> 
> Thanks for the reply.

I ran just this test under strace:
top_srcdir=/var/opt/wget-tests
test=Test-ftp-iri-fallback.px
strace -fo test.out perl -I$top_srcdir/tests $top_srcdir/tests/$test \
  $top_srcdir

and it seems that it never tests for fran\347ais.txt, but it does test
for fran\303\247ais.txt:
3511  lstat("fran\303\247ais.txt", {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
3511  geteuid() = 0
3511  lstat("fran\303\247ais.txt", {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
3511  unlink("fran\303\247ais.txt") = 0

So, it does fall back to the UTF8 version of the filename but still
checks for the iso-8859-1 filename!
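
For reference, the two spellings seen in the strace output and in the
URLs are the same name in two encodings. A small Python sketch (the helper
strace_style is invented here to mimic strace's octal escapes) shows how
they relate:

```python
from urllib.parse import quote

name = "français.txt"
latin1 = name.encode("iso-8859-1")  # one byte for the c-cedilla
utf8 = name.encode("utf-8")         # two bytes for the c-cedilla


def strace_style(raw: bytes) -> str:
    """Escape non-ASCII bytes as backslash-octal, similar to strace output."""
    return "".join(chr(b) if b < 0x80 else f"\\{b:o}" for b in raw)


latin1_escaped = strace_style(latin1)
utf8_escaped = strace_style(utf8)
latin1_url = quote(latin1)  # percent-encoding as it appears in the URL
utf8_url = quote(utf8)

print(latin1_escaped, latin1_url)
print(utf8_escaped, utf8_url)
```

So fran\347ais.txt and fran%E7ais.txt are the ISO-8859-1 form, while
fran\303\247ais.txt and fran%C3%A7ais.txt are the UTF-8 form.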


Josef


Re: [Bug-wget] What are the tests testing?

2017-06-12 Thread Josef Moellers
On 12.06.2017 09:37, Josef Moellers wrote:
> On 12.06.2017 09:23, Josef Moellers wrote:
>> Hello Tim,
>>
>> Thanks for the reply.
> 
> I ran just this test under strace:
> top_srcdir=/var/opt/wget-tests
> test=Test-ftp-iri-fallback.px
> strace -fo test.out perl -I$top_srcdir/tests $top_srcdir/tests/$test
> $top_srcdir
> 
> and it seems that it never tests for fran\347ais.txt, but it does test
> for fran\303\247ais.txt:
> 3511  lstat("fran\303\247ais.txt", {st_mode=S_IFREG|0644, st_size=12,
> ...}) = 0
> 3511  geteuid() = 0
> 3511  lstat("fran\303\247ais.txt", {st_mode=S_IFREG|0644, st_size=12,
> ...}) = 0
> 3511  unlink("fran\303\247ais.txt") = 0
> 
> So, it does fall back to the UTF8 version of the filename but still
> checks for the iso-8859-1 filename!

Update^2:

If I add the option "-O fran${ccedilla_l1}ais.txt" to the cmdline, then
the test succeeds:
:
226 File retrieval complete. Data connection has been closed.
2017-06-12 09:38:43 (3.07 MB/s) - ‘fran\347ais.txt’ saved [12]

Test successful.
-end of output-

Josef


Re: [Bug-wget] What are the tests testing?

2017-06-12 Thread Tim Rühsen
Hi Josef,


On 06/12/2017 09:23 AM, Josef Moellers wrote:
>> Look at tests/Test-ftp-iri-fallback.log, if you can't interpret the content 
>> send it here.
> 
> I cannot find any such file; there is no "*.log" anywhere in the vicinity
> of the tests.

Ok, the .log files just contain the output of each single test when
tested with 'make check'. If you use run-px, copy & pasting from the
console is the right thing to do.

> Ah ... maybe I should have added that I'm working on a slightly older
> version of wget: 1.14, which we ship with SLES12.

So I compiled 1.14 (git tag v1.14) and used run-px to run the test suite
- but still can't reproduce the problem (Debian unstable here, `locale`
shows all set to 'en_US.UTF-8').

> 
> 227 Entering Passive Mode (127,0,0,1,155,189)
> --> RETR français.txt^M
> 
> 150 Opening ASCII mode data connection.
> Length: 12 (unauthoritative)
> 
>  0K   100% 2.95M=0s
> 
> 226 File retrieval complete. Data connection has been closed.
> 2017-06-09 16:42:53 (2.95 MB/s) - â<80><98>français.txtâ<80><99> saved [12]
> 
> Test failed: file français.txt not downloaded

My output looks identical except for these last lines:

227 Entering Passive Mode (127,0,0,1,175,123)
--> RETR français.txt

150 Opening ASCII mode data connection.
Length: 12 (unauthoritative)

 0K   100% 2.05M=0s

226 File retrieval complete. Data connection has been closed.
2017-06-12 09:39:27 (2.05 MB/s) - ‘fran\347ais.txt’ saved [12]

Test successful.


I can only guess:
- something with your locale (what does the 'locale' command output?)
- something with the iconv() function

Does the same test fail if you use Wget 1.19.1?


And from your 'update 2':
> If I add the option "-O fran${ccedilla_l1}ais.txt" to the cmdline,
> then the test succeeds:

Of course it does ;-) You simply created the expected output file...
(circumventing the real test.)

With Best Regards, Tim





Re: [Bug-wget] wget and srcset tag

2017-06-12 Thread Tim Rühsen
Hi Chris,


On 06/11/2017 05:24 PM, chris wrote:
> Hi,
> 
> I'm just wondering if I've possibly found a bug, unless I'm just doing
> something incorrectly (which I assume is more likely).
> 
> I grab my webpage using 'wget -T1 -t1 -E -k -H -nd -N -p -P site_output
> https://www.anfractuosity.com/projects/ultrasound-networking/ > note1 2>
> note2'
> 
> But I notice the srcset tags in the resulting downloaded files produce
> 'srcset="fsk.png.html 533w, fsk-266x300.png 266w" sizes="(max-width: 533px)
> 100vw, 533px" />' in the output index.html.
> 
> On the actual webpage it looks like "srcset="
> https://www.anfractuosity.com/wp-content/uploads/2014/02/fft.png 762w,"
> no .html extension on the .png.

You requested -E (--adjust-extension) and -k (--convert-links).
That would change the file name when the server tags the file as
content-type 'text/html'. You could see that in the debug output
(options -d or --debug).
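
As a rough illustration of the -E rule: according to the wget manual, the
option appends .html when a file of type text/html or
application/xhtml+xml is downloaded and the name does not already end in
.htm or .html. The Python sketch below is a simplified model of that one
rule only, not wget's implementation; adjusted_name is a hypothetical
helper, and -k link conversion and other content types are left out.

```python
import re

HTML_TYPES = {"text/html", "application/xhtml+xml"}
HTML_SUFFIX = re.compile(r"\.[Hh][Tt][Mm][Ll]?$")


def adjusted_name(local_name: str, content_type: str) -> str:
    """Hypothetical helper: append .html for HTML content lacking the suffix."""
    if content_type in HTML_TYPES and not HTML_SUFFIX.search(local_name):
        return local_name + ".html"
    return local_name


as_html = adjusted_name("fsk.png", "text/html")
as_png = adjusted_name("fsk.png", "image/png")
print(as_html, as_png)
```

Under this model a name like fsk.png only becomes fsk.png.html if some
response for it was labelled as HTML.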

> 
> Cheers
> Chris
> 

With Best Regards, Tim





Re: [Bug-wget] wget and srcset tag

2017-06-12 Thread chris
Hi Tim,

Thanks for your reply. I notice the following in the debug logs:

"""
will convert url
http://www.anfractuosity.com/wp-content/uploads/2014/02/fsk.png to local
site_output/fsk.png
will convert url
https://www.anfractuosity.com/wp-content/uploads/2014/02/fsk.png to local
site_output/fsk.png.html
"""

The difference between those URLs seems to be one is https and one isn't.
When I wget those URLs though, both seem to return a .png, with 'Length:
51068 (50K) [image/png]'.

So I'm a bit confused why I get the fsk.png.html URL.

cheers
Chris



Re: [Bug-wget] What are the tests testing?

2017-06-12 Thread Josef Moellers
On 12.06.2017 10:00, Tim Rühsen wrote:
> I can only guess:
> - something with your locale (what does the 'locale' command output?)

LANG=POSIX
LC_CTYPE=en_US.UTF-8
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

Changing everything to en_US.UTF-8 doesn't help:
I added "export LC...=en_US.UTF-8" lines prior to running the test.
I must admit, I never really understood this "locale" thingy; it bites
me whenever I come close.
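
For what it's worth, the output above follows the usual POSIX precedence:
LC_ALL wins if it is set and non-empty, then the specific LC_* variable,
then LANG. The locale command prints a value in double quotes when it is
only implied rather than set explicitly. A minimal Python sketch of that
lookup (effective_locale is a made-up name, and the "POSIX" default at the
end is an assumption, since the real default is implementation-defined):

```python
def effective_locale(category: str, env: dict) -> str:
    """Resolve a locale category: LC_ALL first, then LC_<category>, then LANG."""
    for var in ("LC_ALL", category, "LANG"):
        value = env.get(var, "")
        if value:  # unset and empty are treated alike
            return value
    return "POSIX"  # assumed default when nothing is set


# The settings from the locale output above.
env = {"LANG": "POSIX", "LC_CTYPE": "en_US.UTF-8", "LC_ALL": ""}
ctype = effective_locale("LC_CTYPE", env)
messages = effective_locale("LC_MESSAGES", env)
print(ctype, messages)
```

With these settings only LC_CTYPE resolves to en_US.UTF-8; every other
category falls through to LANG=POSIX.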

> - something with iconv() function
> 
> Does the same test fail if you use Wget 1.19.1 ?

I can't get it to build on SLES12 as it does not have libidn2(-devel)!

I'll keep on trying, but until then ...

> And from your 'update 2':
>> If I add the option "-O fran${ccedilla_l1}ais.txt" to the cmdline,
>> then the test succeeds:
> 
> Of course it does ;-) You simply created the expected output file...
> (circumventing the real test.)

:-(

Josef





Re: [Bug-wget] What are the tests testing?

2017-06-12 Thread Tim Rühsen
On 06/12/2017 10:42 AM, Josef Moellers wrote:
> LANG=POSIX
> LC_CTYPE=en_US.UTF-8
> LC_NUMERIC="POSIX"
> LC_TIME="POSIX"
> LC_COLLATE="POSIX"
> LC_MONETARY="POSIX"
> LC_MESSAGES="POSIX"
> LC_PAPER="POSIX"
> LC_NAME="POSIX"
> LC_ADDRESS="POSIX"
> LC_TELEPHONE="POSIX"
> LC_MEASUREMENT="POSIX"
> LC_IDENTIFICATION="POSIX"
> LC_ALL=
> 
> Changing everything to en_US.UTF-8 doesn't help:
> I added "export LC...=en_US.UTF-8" lines prior to running the test.

I used your settings, but still the tests succeed.

> I must admit, I never really understood this "locale" thingy; it bites
> me whenever I come close.

Just take your time and read up about charset conversion, it doesn't
take too long :-)

>> - something with iconv() function
>>
>> Does the same test fail if you use Wget 1.19.1 ?
> 
> I can't get it to build on SLES12 as it does not have libidn2(-devel)!
> 
> I'll keep on trying, but until then ...

There is no debug output by default with wget 1.14 (I didn't think of
it, sorry)... please add '-d' to the command line in
Test-ftp-iri-fallback.px, and send that output. Something must go wrong
there...

And then go back to v1.18 and try to build/test that (or possibly
v1.17.1). That should compile with libidn instead of libidn2.


With Best Regards, Tim





Re: [Bug-wget] What are the tests testing?

2017-06-12 Thread Josef Moellers
On 12.06.2017 11:48, Tim Rühsen wrote:
> On 06/12/2017 10:42 AM, Josef Moellers wrote:
>> On 12.06.2017 10:00, Tim Rühsen wrote:
>>> Hi Josef,
>>>
>>>
>>> On 06/12/2017 09:23 AM, Josef Moellers wrote:
 Hello Tim,

 Thanks for the reply.

 On 10.06.2017 13:36, Tim Rühsen wrote:
> On Freitag, 9. Juni 2017 17:02:15 CEST Josef Moellers wrote:
>> Hi,
>
> Hi Josef,
>
>> I'm currently trying to build test suites for openQA.
>> One of the candidates is wget and, luckily, it already provides quite an
>> extensive test suite.
>> I have successfully built an RPM which has all that is needed for the 
>> tests.
>> One test, Test-ftp-iri-fallback.px, fails on SLES12-SP2 and I can't see
>> why. 
>
> Look at tests/Test-ftp-iri-fallback.log, if you can't interpret the 
> content 
> send it here.

 I cannot find any such file, no *.log" anywhere in the vicinity of the
 tests.
>>>
>>> Ok, the .log files just contain the output of each single test when
>>> tested with 'make check'. If you use run-px, copy & pasting from the
>>> console is the right thing to do.
>>>
 Ah ... maybe I should have addede that I'm working on a slightly older
 version of wget: 1.14, which we ship with SLES12.
>>>
>>> So I compiled 1.14 (git tag v1.14) and used run-px to run the test suite
>>> - but still can't reproduce the problem (Debian unstable here, `locale`
>>> shows all set to 'en_US.UTF-8').
>>>

 227 Entering Passive Mode (127,0,0,1,155,189)
 --> RETR français.txt^M

 150 Opening ASCII mode data connection.
 Length: 12 (unauthoritative)

  0K   100% 2.95M=0s

 226 File retrieval complete. Data connection has been closed.
 2017-06-09 16:42:53 (2.95 MB/s) - â<80><98>français.txtâ<80><99> saved 
 [12]

 Test failed: file français.txt not downloaded
>>>
>>> My out put looks identical except the these last lines:
>>>
>>> 227 Entering Passive Mode (127,0,0,1,175,123)
>>> --> RETR français.txt
>>>
>>> 150 Opening ASCII mode data connection.
>>> Length: 12 (unauthoritative)
>>>
>>>  0K   100% 2.05M=0s
>>>
>>> 226 File retrieval complete. Data connection has been closed.
>>> 2017-06-12 09:39:27 (2.05 MB/s) - ‘fran\347ais.txt’ saved [12]
>>>
>>> Test successful.
>>>
>>>
>>> I just can guess:
>>> - something with your locale (what does the 'locale' command output ?)
>>
>> LANG=POSIX
>> LC_CTYPE=en_US.UTF-8
>> LC_NUMERIC="POSIX"
>> LC_TIME="POSIX"
>> LC_COLLATE="POSIX"
>> LC_MONETARY="POSIX"
>> LC_MESSAGES="POSIX"
>> LC_PAPER="POSIX"
>> LC_NAME="POSIX"
>> LC_ADDRESS="POSIX"
>> LC_TELEPHONE="POSIX"
>> LC_MEASUREMENT="POSIX"
>> LC_IDENTIFICATION="POSIX"
>> LC_ALL=
>>
>> Changing everything to en_US.UTF-8 doesn't help:
>> I added "export LC...=en_US.UTF-8" lines prior to running the test.
> 
> I used your settings, but still the tests succeed.
> 
>> I must admit, I never really understood this "locale" thingy, it bites
>> my whenever I come close.
> 
> Just take your time and read up about charset conversion, it doesn't
> take too long :-)
> 
>>> - something with iconv() function
>>>
>>> Does the same test fail if you use Wget 1.19.1 ?
>>
>> I can't get it to build on SLES12 as it does not have libidn2(-devel)!
>>
>> I'll keep on trying, but until then ...
> 
> There is no debug output by default with wget 1.14 (I didn't think of
> it, sorry)... please add '-d' to the command line in
> Test-ftp-iri-fallback.px, and send that output. Something must go wrong
> there...

It's attached as "out"

> 
> And then go back to v1.18 and try to build/test that (or possibly
> v1.17.1). That should compile with libidn instead of libidn2.

I'll first try to build 1.14 without any of our local patches. As you
say: it works on your system (maybe I should also try on a local Ubuntu
machine), so it's strange that it doesn't work on my VM. Maybe there's
something in one of the patches ...

Thanks for taking the time to help,

Josef

Running test Test-ftp-iri-fallback
Calling ../src/wget -d --local-encoding=iso-8859-1 -S 
ftp://localhost:41909/français.txt
Setting --local-encoding (localencoding) to iso-8859-1
Setting --server-response (serverresponse) to 1
DEBUG output created by Wget 1.14 on linux-gnu.

URI encoding = ‘iso-8859-1’
--2017-06-12 11:52:19--  ftp://localhost:41909/fran%C3%A7ais.txt
   => ‘français.txt’
Resolving localhost (localhost)... ::1, 127.0.0.1
Caching localhost => ::1 127.0.0.1
Connecting to localhost (localhost)|::1|:41909... Closed fd 3
failed: Connection refused.
Connecting to localhost (localhost)|127.0.0.1|:41909... connected.
Created socket 3.
Releasing 0x01a63100 (new refcount 1).
Logging in as anonymous ... 
220 GNU Wget Testing FTP Server ready.
--> USER anonymous

230 Anonymous user access granted.
--> SYST

215 UNIX Type: L8
--> PWD

257 "/"
-

Re: [Bug-wget] What are the tests testing?

2017-06-12 Thread Josef Moellers
On 12.06.2017 11:55, Josef Moellers wrote:
> On 12.06.2017 11:48, Tim Rühsen wrote:
>> On 06/12/2017 10:42 AM, Josef Moellers wrote:
>>> On 12.06.2017 10:00, Tim Rühsen wrote:
 Hi Josef,


 On 06/12/2017 09:23 AM, Josef Moellers wrote:
> Hello Tim,
>
> Thanks for the reply.
>
> On 10.06.2017 13:36, Tim Rühsen wrote:
>> On Freitag, 9. Juni 2017 17:02:15 CEST Josef Moellers wrote:
>>> Hi,
>>
>> Hi Josef,
>>
>>> I'm currently trying to build test suites for openQA.
>>> One of the candidates is wget and, luckily, it already provides quite an
>>> extensive test suite.
>>> I have successfully built an RPM which has all that is needed for the 
>>> tests.
>>> One test, Test-ftp-iri-fallback.px, fails on SLES12-SP2 and I can't see
>>> why. 
>>
>> Look at tests/Test-ftp-iri-fallback.log, if you can't interpret the 
>> content 
>> send it here.
>
> I cannot find any such file, no "*.log" anywhere in the vicinity of the
> tests.

 Ok, the .log files just contain the output of each single test when
 tested with 'make check'. If you use run-px, copy & pasting from the
 console is the right thing to do.

> Ah ... maybe I should have added that I'm working on a slightly older
> version of wget: 1.14, which we ship with SLES12.

 So I compiled 1.14 (git tag v1.14) and used run-px to run the test suite
 - but still can't reproduce the problem (Debian unstable here, `locale`
 shows all set to 'en_US.UTF-8').

>
> 227 Entering Passive Mode (127,0,0,1,155,189)
> --> RETR français.txt^M
>
> 150 Opening ASCII mode data connection.
> Length: 12 (unauthoritative)
>
>  0K   100% 
> 2.95M=0s
>
> 226 File retrieval complete. Data connection has been closed.
> 2017-06-09 16:42:53 (2.95 MB/s) - â<80><98>français.txtâ<80><99> saved 
> [12]
>
> Test failed: file français.txt not downloaded

 My output looks identical except for these last lines:

 227 Entering Passive Mode (127,0,0,1,175,123)
 --> RETR français.txt

 150 Opening ASCII mode data connection.
 Length: 12 (unauthoritative)

  0K   100% 2.05M=0s

 226 File retrieval complete. Data connection has been closed.
 2017-06-12 09:39:27 (2.05 MB/s) - ‘fran\347ais.txt’ saved [12]

 Test successful.
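
The two runs differ only in how the saved file name is rendered, so it may
help to see what each rendering corresponds to. The following Python sketch
is an illustration only and not part of the wget test suite; the helper name
percent_encode is made up here. It shows that '\347' is the octal form of the
single ISO-8859-1 byte for 'ç', that '%C3%A7' is the UTF-8 form of the same
character, and that 'â<80><98>' is what a UTF-8 left quote looks like when it
is read as ISO-8859-1.

```python
# Illustration only: how the same file name looks in the two encodings
# that show up in the logs above.
name = "français.txt"

latin1_bytes = name.encode("iso-8859-1")  # one byte (0xE7) for 'ç'
utf8_bytes = name.encode("utf-8")         # two bytes (0xC3 0xA7) for 'ç'


def percent_encode(raw):
    """Percent-encode every byte that is not printable ASCII (hypothetical helper)."""
    return "".join(
        chr(b) if 0x20 < b < 0x7F and b != ord("%") else "%%%02X" % b
        for b in raw
    )


# Octal escape of the ISO-8859-1 byte, as printed in the passing run.
octal_escape = "\\%03o" % latin1_bytes[4]

# A UTF-8 left single quote read as ISO-8859-1 gives 'â' plus two
# control bytes, which a terminal may print as <80><98>.
mojibake = "‘".encode("utf-8").decode("iso-8859-1")

if __name__ == "__main__":
    print(percent_encode(utf8_bytes))
    print(percent_encode(latin1_bytes))
    print(octal_escape)
    print(repr(mojibake))
```

If this reading is right, the passing run wrote the ISO-8859-1 name and the
failing run wrote something else, which would match a "file not downloaded"
verdict even though the transfer itself completed.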


 I can only guess:
 - something with your locale (what does the 'locale' command output ?)
>>>
>>> LANG=POSIX
>>> LC_CTYPE=en_US.UTF-8
>>> LC_NUMERIC="POSIX"
>>> LC_TIME="POSIX"
>>> LC_COLLATE="POSIX"
>>> LC_MONETARY="POSIX"
>>> LC_MESSAGES="POSIX"
>>> LC_PAPER="POSIX"
>>> LC_NAME="POSIX"
>>> LC_ADDRESS="POSIX"
>>> LC_TELEPHONE="POSIX"
>>> LC_MEASUREMENT="POSIX"
>>> LC_IDENTIFICATION="POSIX"
>>> LC_ALL=
>>>
>>> Changing everything to en_US.UTF-8 doesn't help:
>>> I added "export LC...=en_US.UTF-8" lines prior to running the test.
>>
>> I used your settings, but still the tests succeed.
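
For readers puzzled by the `locale` output above: under POSIX rules a
non-empty LC_ALL wins, then the specific LC_* variable, then LANG. The
sketch below is a simplified model of that lookup, not code from wget or
libc; the function name effective_locale is made up for this example.

```python
def effective_locale(category, env):
    """Return the locale name that applies to `category`.

    Simplified model of the POSIX precedence: LC_ALL, then the
    category variable itself, then LANG. Empty values count as unset.
    """
    for variable in ("LC_ALL", category, "LANG"):
        value = env.get(variable, "")
        if value:
            return value
    return "POSIX"  # default when nothing is set


# The settings reported above: only LC_CTYPE is set explicitly.
reported = {"LANG": "POSIX", "LC_CTYPE": "en_US.UTF-8", "LC_ALL": ""}

if __name__ == "__main__":
    for category in ("LC_CTYPE", "LC_MESSAGES", "LC_COLLATE"):
        print(category, "=", effective_locale(category, reported))
```

With the reported settings, character handling (LC_CTYPE) uses UTF-8 while
everything else falls back to POSIX, so the environment already had a UTF-8
character set before any extra exports were added.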
>>
>>> I must admit, I never really understood this "locale" thingy, it bites
>>> me whenever I come close.
>>
>> Just take your time and read up about charset conversion, it doesn't
>> take too long :-)
>>
 - something with iconv() function

 Does the same test fail if you use Wget 1.19.1 ?
>>>
>>> I can't get it to build on SLES12 as it does not have libidn2(-devel)!
>>>
>>> I'll keep on trying, but until then ...
>>
>> There is no debug output by default with wget 1.14 (I didn't think of
>> it, sorry)... please add '-d' to the command line in
>> Test-ftp-iri-fallback.px, and send that output. Something must go wrong
>> there...
> 
> It's attached as "out"
> 
>>
>> And then go back to v1.18 and try to build/test that (or possibly
>> v1.17.1). That should compile with libidn instead of libidn2.
> 
> I'll first try to build 1.14 without any of our local patches. As you
> say: it works on your system (maybe I should also try on a local Ubuntu
> machine), so it's strange that it doesn't work on my VM. Maybe there's
> something in one of the patches ...

FYI It's the attached patch which is supposed to fix CVE-2016-4971!

Without this patch, the test succeeds, with this patch, the test fails.

Josef
Index: wget-1.14/src/ftp.c
===
--- wget-1.14.orig/src/ftp.c
+++ wget-1.14/src/ftp.c
@@ -234,14 +234,15 @@ print_length (wgint size, wgint start, b
   logputs (LOG_VERBOSE, !authoritative ? _(" (unauthoritative)\n") : "\n");
 }
 
-static uerr_t ftp_get_listing (struct url *, ccon *, struct fileinfo **);
+static uerr_t ftp_get_listing (struct url *, struct url *, ccon *, struct fileinfo **);
 
 /* Retrieves a file with denoted parameters through opening an FTP
connection to the server.  It always closes the data connection,
a

Re: [Bug-wget] What are the tests testing?

2017-06-12 Thread Tim Rühsen
On 06/12/2017 02:19 PM, Josef Moellers wrote:
>>> And then go back to v1.18 and try to build/test that (or possibly
>>> v1.17.1). That should compile with libidn instead of libidn2.
>>
>> I'll first try to build 1.14 without any of our local patches. As you
>> say: it works on your system (maybe I should also try on a local Ubuntu
>> machine), so it's strange that it doesn't work on my VM. Maybe there's
>> something in one of the patches ...
> 
> FYI It's the attached patch which is supposed to fix CVE-2016-4971!
> 
> Without this patch, the test succeeds, with this patch, the test fails.

Thanks for letting us know.

Sigh, it means that someone (at SuSE ?) picked a patch that was made for
v1.18 and applied it to 1.14 without testing it (well, it is just a
'make check'). Smells somewhat like a greenhorn's mistake.

This makes me feel somewhat desperate :-(

SuSE should really thank you for working on openQA and finding this out!

> 
> Josef

With Best Regards, Tim





Re: [Bug-wget] wget and srcset tag

2017-06-12 Thread Tim Rühsen
On 06/12/2017 10:27 AM, chris wrote:
> Hi Tim,
> 
> Thanks for your reply, I notice the following in the debug logs:
> 
> """
> will convert url
> http://www.anfractuosity.com/wp-content/uploads/2014/02/fsk.png to local
> site_output/fsk.png
> will convert url
> https://www.anfractuosity.com/wp-content/uploads/2014/02/fsk.png to local
> site_output/fsk.png.html
> """
> 
> The difference between those URLs seems to be one is https and one isn't.
> When I wget those URLs though, both seem to return a .png, with 'Length:
> 51068 (50K) [image/png]'.
> 
> So I'm a bit confused why I get the fsk.png.html URL.

What version of wget are you using ? (1.19.1 here)

I tried some combinations of srcset (with https and http) and your
original options. I thought of an issue with redirection (because that's
an answer with text/html Content-Type).

Could you create a small reproducer page ? e.g. like

<img srcset="https://www.anfractuosity.com/wp-content/uploads/2014/02/fsk.png
533w,
http://www.anfractuosity.com/wp-content/uploads/2014/02/fsk-266x300.png
266w">


With whatever paths you are using for the .png files.
I don't want to download tons of files (limited bandwidth here).

> cheers
> Chris
> 
> On Mon, Jun 12, 2017 at 9:08 AM, Tim Rühsen  wrote:
> 
>> Hi Chris,
>>
>>
>> On 06/11/2017 05:24 PM, chris wrote:
>>> Hi,
>>>
>>> I'm just wondering if I've possibly found a bug, unless I'm just doing
>>> something incorrectly (which I assume is more likely).
>>>
>>> I grab my webpage using 'wget -T1 -t1 -E -k -H -nd -N -p -P site_output
>>> https://www.anfractuosity.com/projects/ultrasound-networking/ > note1 2>
>>> note2'
>>>
>>> But I notice the srcset tags in the resulting downloaded files produce
>>> 'srcset="fsk.png.html 533w, fsk-266x300.png 266w" sizes="(max-width:
>> 533px)
>>> 100vw, 533px" />' in the output index.html.
>>>
>>> On the actual webpage it looks like "srcset="
>>> https://www.anfractuosity.com/wp-content/uploads/2014/02/fft.png
>> 762w,"
>>> no .html extension on the .png.
>>
>> You requested -E (--adjust-extension) and -k (--convert-links).
>> That would change the file name when the server tags the file as
>> content-type 'text/html'. You could see that in the debug output
>> (options -d or --debug).
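
The renaming rule described here can be sketched as follows. This is a
rough Python model based on the documented behaviour of --adjust-extension,
not wget's actual code; the function name adjust_extension is made up for
this example.

```python
import re

# Content types for which the documented rule appends ".html".
HTML_TYPES = {"text/html", "application/xhtml+xml"}

# The local name is left alone if it already ends in .htm or .html.
HTML_SUFFIX = re.compile(r"\.[Hh][Tt][Mm][Ll]?$")


def adjust_extension(local_name, content_type):
    """Model of -E: append ".html" when the server says the body is HTML."""
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime in HTML_TYPES and not HTML_SUFFIX.search(local_name):
        return local_name + ".html"
    return local_name


if __name__ == "__main__":
    print(adjust_extension("fsk.png", "image/png"))
    print(adjust_extension("fsk.png", "text/html; charset=UTF-8"))
```

Under this model a name like fsk.png.html can only come from a response
that was labelled text/html, which is why the Content-Type of each response
in the debug output is the first thing to check.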
>>
>>>
>>> Cheers
>>> Chris
>>>
>>
>> With Best Regards, Tim





Re: [Bug-wget] wget and srcset tag

2017-06-12 Thread Chris
Hi Tim,

I just created a test page at -
https://www.anfractuosity.com/files/test2.html
where I still get the issue.

The version is 'GNU Wget 1.19.1 built on linux-gnu.'

cheers
Chris


On 12 June 2017 at 15:35, Tim Rühsen  wrote:

> On 06/12/2017 10:27 AM, chris wrote:
> > Hi Tim,
> >
> > Thanks for your reply, I notice the following in the debug logs:
> >
> > """
> > will convert url
> > http://www.anfractuosity.com/wp-content/uploads/2014/02/fsk.png to local
> > site_output/fsk.png
> > will convert url
> > https://www.anfractuosity.com/wp-content/uploads/2014/02/fsk.png to
> local
> > site_output/fsk.png.html
> > """
> >
> > The difference between those URLs seems to be one is https and one isn't.
> > When I wget those URLs though, both seem to return a .png, with 'Length:
> > 51068 (50K) [image/png]'.
> >
> > So I'm a bit confused why I get the fsk.png.html URL.
>
> What version of wget are you using ? (1.19.1 here)
>
> I tried some combinations of srcset (with https and http) and your
> original options. I thought of an issue with redirection (because that's
> an answer with text/html Content-Type).
>
> Could you create a small reproducer page ? e.g. like
> 
> <img srcset="https://www.anfractuosity.com/wp-content/uploads/2014/02/fsk.png
> 533w,
> http://www.anfractuosity.com/wp-content/uploads/2014/02/fsk-266x300.png
> 266w">
> 
>
> With whatever paths you are using for the .png files.
> I don't want to download tons of files (limited bandwidth here).
>
> > cheers
> > Chris
> >
> > On Mon, Jun 12, 2017 at 9:08 AM, Tim Rühsen  wrote:
> >
> >> Hi Chris,
> >>
> >>
> >> On 06/11/2017 05:24 PM, chris wrote:
> >>> Hi,
> >>>
> >>> I'm just wondering if I've possibly found a bug, unless I'm just doing
> >>> something incorrectly (which I assume is more likely).
> >>>
> >>> I grab my webpage using 'wget -T1 -t1 -E -k -H -nd -N -p -P site_output
> >>> https://www.anfractuosity.com/projects/ultrasound-networking/ > note1
> 2>
> >>> note2'
> >>>
> >>> But I notice the srcset tags in the resulting downloaded files produce
> >>> 'srcset="fsk.png.html 533w, fsk-266x300.png 266w" sizes="(max-width:
> >> 533px)
> >>> 100vw, 533px" />' in the output index.html.
> >>>
> >>> On the actual webpage it looks like "srcset="
> >>> https://www.anfractuosity.com/wp-content/uploads/2014/02/fft.png
> >> 762w,"
> >>> no .html extension on the .png.
> >>
> >> You requested -E (--adjust-extension) and -k (--convert-links).
> >> That would change the file name when the server tags the file as
> >> content-type 'text/html'. You could see that in the debug output
> >> (options -d or --debug).
> >>
> >>>
> >>> Cheers
> >>> Chris
> >>>
> >>
> >> With Best Regards, Tim
>
>


Re: [Bug-wget] Shouldn't wget strip leading spaces from a URL?

2017-06-12 Thread Ander Juaristi
Your shell strips all the additional spaces between command-line arguments, so
it's effectively like a browser ;)

How are you running wget?

On 06/06/17 01:12, L A Walsh wrote:
> if wget gets leading spaces in a URL, it complains:
>  "  http://www.kernel.org/pub/linux/utils/util-linux/v2.30: Scheme missing."
> 
> Isn't it required for a web client to strip leading spaces from
> URLs?
> 
> 
> 



Re: [Bug-wget] Shouldn't wget strip leading spaces from a URL?

2017-06-12 Thread L A Walsh



Ander Juaristi wrote:

Your shell strips all the additional spaces between command-line arguments, so
it's effectively like a browser ;)

How are you running wget?

---
W/cut+paste into target line, where URL is double-quoted.  More
often than not, I find it safer to double-quote a URL than not, because,
for example, shells react badly to embedded spaces, ampersands and question
marks.

	The URL I was copying was itself a link (not to the
same place, but to some link-counter), and to not activate the link
I had to start a bit before or after the link -- thus picked up a space.

Basically, it comes down to whether wget deliberately submits a
URL that it knows is incorrect or, as a good user agent, trims the leading
spaces.  Mindless obedience is almost never a good thing,
whereas common sense is usually valued (both of these are in
reference to how a program behaves).

It should take extra effort for wget to NOT strip 
leading+trailing spaces, since stripping those leading and 
trailing spaces is what users would be used to in a browser

AND because it would be the common-sense & user-friendly
thing to do.
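
The requested behaviour can be sketched in a few lines. This is a
hypothetical Python illustration, not wget code: the names has_scheme and
trim_url are made up, and the scheme check is a simplified stand-in for
whatever wget does before it reports "Scheme missing".

```python
import re

# Simplified scheme test: a letter, then letters/digits/+/./-, then "://".
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def has_scheme(url):
    """True if the string starts directly with something like 'http://'."""
    return SCHEME_RE.match(url) is not None


def trim_url(url):
    """Drop leading and trailing spaces, tabs and line breaks."""
    return url.strip(" \t\r\n")


pasted = "  http://www.kernel.org/pub/linux/utils/util-linux/v2.30"

if __name__ == "__main__":
    print(has_scheme(pasted))            # leading spaces hide the scheme
    print(has_scheme(trim_url(pasted)))  # trimmed copy parses normally
```

Trimming only the ends leaves any space inside the URL untouched, so it
does not change how embedded spaces are treated.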





Re: [Bug-wget] wget and srcset tag

2017-06-12 Thread Tim Rühsen
On Montag, 12. Juni 2017 17:07:30 CEST Chris wrote:
> Hi Tim,
> 
> I just created a test page at -
> https://www.anfractuosity.com/files/test2.html
> where I still get the issue.
> 
> The version is 'GNU Wget 1.19.1 built on linux-gnu.'

Thanks, Chris.

The issue is reproducible with latest git, thanks to your test page. 
I'll create a test case tomorrow and then we'll fix it.
It has something to do with If-Modified-Since. If you use 
--no-if-modified-since 
the links are converted correctly.

The good news is: Wget2 (https://gitlab.com/gnuwget/wget2) does it correctly 
:-)

With Best Regards, Tim

> 
> cheers
> Chris
> 
> On 12 June 2017 at 15:35, Tim Rühsen  wrote:
> > On 06/12/2017 10:27 AM, chris wrote:
> > > Hi Tim,
> > > 
> > > Thanks for your reply, I notice the following in the debug logs:
> > > 
> > > """
> > > will convert url
> > > http://www.anfractuosity.com/wp-content/uploads/2014/02/fsk.png to local
> > > site_output/fsk.png
> > > will convert url
> > > https://www.anfractuosity.com/wp-content/uploads/2014/02/fsk.png to
> > 
> > local
> > 
> > > site_output/fsk.png.html
> > > """
> > > 
> > > The difference between those URLs seems to be one is https and one
> > > isn't.
> > > When I wget those URLs though, both seem to return a .png, with 'Length:
> > > 51068 (50K) [image/png]'.
> > > 
> > > So I'm a bit confused why I get the fsk.png.html URL.
> > 
> > What version of wget are you using ? (1.19.1 here)
> > 
> > I tried some combinations of srcset (with https and http) and your
> > original options. I thought of an issue with redirection (because that's
> > an answer with text/html Content-Type).
> > 
> > Could you create a small reproducer page ? e.g. like
> > 
> > <img srcset="https://www.anfractuosity.com/wp-content/uploads/2014/02/fsk.png
> > 533w,
> > http://www.anfractuosity.com/wp-content/uploads/2014/02/fsk-266x300.png
> > 266w">
> > 
> > 
> > With whatever paths you are using for the .png files.
> > I don't want to download tons of files (limited bandwidth here).
> > 
> > > cheers
> > > Chris
> > > 
> > > On Mon, Jun 12, 2017 at 9:08 AM, Tim Rühsen  wrote:
> > >> Hi Chris,
> > >> 
> > >> On 06/11/2017 05:24 PM, chris wrote:
> > >>> Hi,
> > >>> 
> > >>> I'm just wondering if I've possibly found a bug, unless I'm just doing
> > >>> something incorrectly (which I assume is more likely).
> > >>> 
> > >>> I grab my webpage using 'wget -T1 -t1 -E -k -H -nd -N -p -P
> > >>> site_output
> > >>> https://www.anfractuosity.com/projects/ultrasound-networking/ > note1
> > 
> > 2>
> > 
> > >>> note2'
> > >>> 
> > >>> But I notice the srcset tags in the resulting downloaded files produce
> > >> 
> > >>> 'srcset="fsk.png.html 533w, fsk-266x300.png 266w" sizes="(max-width:
> > >> 533px)
> > >> 
> > >>> 100vw, 533px" />' in the output index.html.
> > >>> 
> > >>> On the actual webpage it looks like "srcset="
> > >>> https://www.anfractuosity.com/wp-content/uploads/2014/02/fft.png
> > >> 
> > >> 762w,"
> > >> 
> > >>> no .html extension on the .png.
> > >> 
> > >> You requested -E (--adjust-extension) and -k (--convert-links).
> > >> That would change the file name when the server tags the file as
> > >> content-type 'text/html'. You could see that in the debug output
> > >> (options -d or --debug).
> > >> 
> > >>> Cheers
> > >>> Chris
> > >> 
> > >> With Best Regards, Tim





Re: [Bug-wget] Shouldn't wget strip leading spaces from a URL?

2017-06-12 Thread Dale R. Worley
L A Walsh  writes:
> W/cut+paste into target line, where URL is double-quoted.  More often
> than not, I find it safer to double-quote a URL than not, because, for
> example, shells react badly to embedded spaces, ampersands and
> question marks.

But of course, no URL contains an embedded space.

If you double-quote, most shells have four special characters that are
processed within double-quotes:  " \ $ `  Of the four, only $ can appear
in a URL.

If you single-quote, most shells have only one special character, which
ends the string: '  Unfortunately, it's allowed in URLs.

So there's no quoting character which you can just put before and after
a URL and be sure that your shell won't damage the URL.
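
When the command line is generated by a program rather than typed, the
quoting can be delegated. As a small illustration (the URL is invented),
Python's standard shlex.quote wraps a string in single quotes and escapes
any embedded single quote, and shlex.split parses it back the way a POSIX
shell would:

```python
import shlex

# Invented URL containing characters that matter to a shell:
# a single quote, '?', '&' and '$'.
url = "http://example.com/it's?a=1&b=$HOME"

# Quote it for a POSIX shell; the embedded ' needs special handling,
# so plain wrapping in '...' would not be enough.
quoted = shlex.quote(url)

# Parsing the quoted form with shell rules gives the original back.
round_trip = shlex.split("wget " + quoted)

if __name__ == "__main__":
    print(quoted)
    print(round_trip)
```

This does not contradict the point above: the result is more than one pair
of quote characters around the URL, because the embedded single quote has
to be escaped separately.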

Dale



Re: [Bug-wget] Shouldn't wget strip leading spaces from a URL?

2017-06-12 Thread L A Walsh



Dale R. Worley wrote:

L A Walsh  writes:

W/cut+paste into target line, where URL is double-quoted.  More often
than not, I find it safer to double-quote a URL than not, because, for
example, shells react badly to embedded spaces, ampersands and
question marks.


But of course, no URL contains an embedded space.

---
Why not?

   John Mueller of Google posted a note about spaces in the URL on Google+. 
   You know, the URLs that look like www.domain.com/file name goes here.html.


   Should you fill those holes?

   John Mueller of Google said "the answer is not "no"" when it comes
   to the question "Should you encode spaces in URLs as "%20", "+" or
   as a space (" ")?"


But what would someone at google know?
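
For reference, the two encoded forms mentioned in that quote look like this
for the example file name. The snippet uses Python's standard urllib.parse
functions purely as an illustration; it says nothing about what wget itself
does with such input.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

name = "file name goes here.html"

percent_form = quote(name)    # spaces become %20
plus_form = quote_plus(name)  # spaces become + (form-style encoding)

if __name__ == "__main__":
    print(percent_form)
    print(plus_form)
    # '+' only means "space" to a decoder that expects form encoding.
    print(unquote(plus_form))
    print(unquote_plus(plus_form))
```

The two forms are not interchangeable: a plain percent-decoder leaves '+'
as a literal plus sign, so which form is right depends on where in the URL
the space appears and on who decodes it.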







[Bug-wget] [GSoC Update] Week 2

2017-06-12 Thread Didik Setiawan
=== SUMMARY ===
My public fork of the Wget2 project is available here [1]. I will keep
pushing my work there so anyone interested can track it. Feel free to
participate in the discussions with my mentors on the merge requests. Your
feedback is highly appreciated.


=== INTRODUCTION ===
The purpose of this project is to use Libmicrohttpd for the Wget2 test suite.  I
plan to do this by changing the functions wget_test_start_server() and
wget_test_stop_server() in src/libtest.c of Wget2. With this approach, I don't
need to change the existing tests, which call the internal server code through
the functions mentioned above. I have counted 36 test files which use
wget_test_start_server(). I must ensure that all of these tests pass.
As an installation prerequisite, I must ensure that Libmicrohttpd is included
when building the Wget2 binary, so I need to modify configure.ac. I will give a
proper warning about this requirement. There is a section in README.md where I
must tell users to provide Libmicrohttpd so that all tests run
correctly.
With Libmicrohttpd I can add new tests using features that are not yet
implemented in the old server code but are available in Libmicrohttpd, such as
HTTP authentication [2] and concurrent request checking.

Mentors:
Darshit Shah 
Ander Juaristi 


=== UPDATES ===
Things which were done in this week:

 * I have finished modifying configure.ac to include Libmicrohttpd in Wget2. I
   keep my work in this branch [3] of my repository.
 * I have ensured that 'make check' passes on several test machines,
   including Debian/GCC, Fedora/Clang, MingW64 and OSX.
   Fix from the previous week:
   - The previous work only asked for Libmicrohttpd to be installed as a
     requirement, but did not include it when building the Wget2 binary. Based
     on discussion with Christian Grothoff and Tim Rühsen, Libmicrohttpd still
     needs to be provided as a prerequisite for Wget2. For some operating
     systems, I need to provide a contrib script to resolve this issue.
 * Started working on wget_test_start_server(). Workflow to resolve this:
   - Disable the initial process for the HTTP server socket.
   - Disable _http_server_thread and instead call a new function which calls
     Libmicrohttpd.
   - Create an _http_server() function as a wrapper for Libmicrohttpd. There is
     also a function ahc_eco() which is used to create the proper HTTP response.


=== NEXT STEPS ===
Things to be done in the coming week:

 * Finish wget_test_start_server() so that it calls Libmicrohttpd as the
   service for wget_test(). Problems and questions that need to be resolved:
   - Decide on the best threading model for Libmicrohttpd. I am currently using
     MHD_USE_INTERNAL_POLLING_THREAD, which uses external select. I am still
     comparing it with the legacy code, which uses the Wget2 API
     wget_thread_start.
   - http_server_port is still hardcoded.
   - In ahc_eco(), the URL data is still matched statically against the
     requested URLs. In other words, it's hardcoded. This needs to be changed
     to a dynamic method to accommodate variadic data.
   - HTTPS has not been touched yet.
   - What to do with the FTP and FTPS functions? Libmicrohttpd only provides a
     service for HTTP. Do we need to keep the functions for FTP{s}, or remove
     them?
   - The last check failed when the test tried to resolve a URL with a question
     mark, e.g. "/subdir1/subpage1.html?query&param". When I debug, it returns
     just "/subdir1/subpage1.html", so the result is 404 Not Found. I also
     checked using the logging example source code provided in the
     Libmicrohttpd tutorial [4]. When I access it using an HTTP client such as
     Wget2 or Firefox, the result is still the same: the resulting URL omits
     the query part. I need to confirm with the Libmicrohttpd side whether this
     is intended behaviour or not.
 * Make sure the whole test suite runs correctly.
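
The 404 described in the list above can be modelled outside of C. The
Python sketch below is only an analogy, under the assumption (suggested by
the debug output in this report) that the request handler receives the path
and the query separately; the names RESPONSES and lookup are made up and
are not part of Wget2 or Libmicrohttpd. It also shows a table lookup in
place of hardcoded URL comparisons.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical table of test URLs and bodies, filled at run time
# instead of being compared one by one in the handler.
RESPONSES = {
    "/subdir1/subpage1.html?query&param": "sub page body",
    "/index.html": "index body",
}


def lookup(path, query=""):
    """Rebuild the full request target before searching the table."""
    target = path + ("?" + query if query else "")
    return RESPONSES.get(target)


parts = urlsplit("/subdir1/subpage1.html?query&param")

# Query arguments without '=' are kept as names with empty values.
arguments = parse_qsl(parts.query, keep_blank_values=True)

if __name__ == "__main__":
    print(parts.path, parts.query)
    print(lookup(parts.path))               # path alone: no match
    print(lookup(parts.path, parts.query))  # path plus query: match
    print(arguments)
```

If Libmicrohttpd does hand the handler only the path, the query part would
have to be collected separately (its API has MHD_get_connection_values with
MHD_GET_ARGUMENT_KIND for that, as far as I know) and joined back on before
matching.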


[1]: https://gitlab.com/dstw/wget2
[2]: 
https://www.gnu.org/software/libmicrohttpd/manual/libmicrohttpd.html#microhttpd_002ddauth
[3]: https://gitlab.com/dstw/wget2/tree/use-mhd
[4]: https://www.gnu.org/software/libmicrohttpd/tutorial.html#logging_002ec

Regards,
Didik Setiawan