On closer inspection, I've found that the results from Wget and Firefox are
very different. Neither is perfect, but the Wget results are definitely
wrong. Here are the results from both:

 

=================================================

wget --append-output=Wget_Google.log --show-progress --no-directories --adjust-extension --directory-prefix=download/Google2 --convert-links --backup-converted --page-requisites --span-hosts http://www.Google.com

 

Contents of folder "download\Google2":

2016 12 07  19:00             5,482      googlelogo_white_background_color_272x92dp.png
2019 01 02  11:10            11,587      index.html
2019 01 02  11:10            11,437      index.html.orig
2016 12 16  06:30            12,263      nav_logo229.png
2018 11 16  04:00             6,913      robots.txt
               5 File(s)         47,682 bytes

 

Wget log: See attached "Wget_Google.log".

Result of save as viewed in Firefox: See attached "Google from Wget.png".

=================================================

Firefox at https://www.google.com/

File > Save Page As > Save as type: Web Page, complete

 

Contents of folder:

2019 01 02  11:15           222,403      Google2.htm

2019 01 02  11:15    <DIR>          Google2_files

 

Contents of subfolder "Google2_files"

2019 01 02  11:15           140,084      cbgapi.loaded_0
2019 01 02  11:15            13,504      googlelogo_color_272x92dp.png
2019 01 02  11:15            85,565      msb_wizaaabdasyncdvlfootiflipv6lummusfxz7cCd
2019 01 02  11:15           140,913      rsAA2YrTv-X7m9A6GmnfpSsKdPIfvIYg06ZQ
2019 01 02  11:15           403,380      rsACT90oGMg6Rr6Oa277nSkJoiMyEfVXOeOQ
               5 File(s)        783,446 bytes

 

Result of save as viewed in Firefox: See attached "Google from Firefox.png".

=================================================

Actual appearance of the webpage: See attached "Google original.png".

=================================================

 

Observations:

  * The main file saved by Firefox is 218 KB; the one saved by Wget is
    only 12 KB.

  * Firefox saves five additional files and Wget only three (not counting
    the index.html.orig backup), and none of them share a filename.

  * Firefox gets the page layout right, including the header and footer,
    but for some reason doesn't show the logo. Wget appears to have
    downloaded a different page: the whole layout is different, although
    it got the logo right.
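
To make the comparison above repeatable, here is a minimal Python sketch that tallies the two saved folders. The function names `summarize` and `compare` are made up for illustration, and the folder paths in the usage note are the ones from this message; nothing here is part of Wget or Firefox.

```python
from pathlib import Path


def summarize(folder):
    """Return (file_count, total_bytes, sorted_names) for every file under `folder`."""
    files = [p for p in Path(folder).rglob("*") if p.is_file()]
    names = sorted(p.name for p in files)
    return len(files), sum(p.stat().st_size for p in files), names


def compare(wget_dir, browser_dir):
    """Report file counts, byte totals, and the file names both saves share."""
    w_count, w_bytes, w_names = summarize(wget_dir)
    b_count, b_bytes, b_names = summarize(browser_dir)
    return {
        "wget": (w_count, w_bytes),
        "browser": (b_count, b_bytes),
        "shared_names": sorted(set(w_names) & set(b_names)),
    }
```

Calling `compare("download/Google2", "Google2_files")` would give the same kind of totals as the "File(s) ... bytes" lines in the listings, plus the list of names the two saves have in common.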

 

What do I need to do for Wget to get the page correctly?

 

Thank you.

 

=================================================


From: [email protected] [mailto:[email protected]] 
Sent: Wednesday, January 2, 2019 04:50
To: '[email protected]'
Subject: How to simulate "Save as webpage, complete"?

 

Hi, not a bug, but a question:

 

The command:

wget --no-directories --adjust-extension --directory-prefix _files --convert-links --page-requisites --span-hosts http://www.Google.com

 

saves the Google homepage as "index.html" along with associated files, all
together in the folder "_files". The result works nicely, but what I want is
for "index.html" to be in one folder and the associated files to be in a
subfolder of that called "_files". This is what a browser does when one asks
it to "save as webpage, complete." How do I simulate that behavior with
Wget?
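
To make the layout I am after concrete, here is a rough Python sketch of a post-processing step that could be run after the Wget command above. It is only an illustration under assumptions, not a Wget feature: the name `promote_page` is made up, and it assumes that after --convert-links and --no-directories the links in "index.html" to downloaded files are bare file names inside src or href attributes. Links that Wget has escaped or that appear elsewhere (for example inside CSS) would not be touched.

```python
import re
from pathlib import Path


def promote_page(assets_dir, page="index.html"):
    """Move `page` from `assets_dir` into its parent folder and repoint its links."""
    assets = Path(assets_dir)
    source = assets / page
    # surrogateescape lets arbitrary bytes survive the decode/encode round trip.
    html = source.read_bytes().decode("utf-8", "surrogateescape")

    # Names of the files that stay behind in the assets folder.
    names = sorted(p.name for p in assets.iterdir() if p.is_file() and p.name != page)

    for name in names:
        # Match src="name" or href='name' exactly, so absolute URLs are left alone.
        pattern = re.compile(r'((?:src|href)\s*=\s*)(["\'])' + re.escape(name) + r'\2')
        html = pattern.sub(
            lambda m, n=name: f"{m.group(1)}{m.group(2)}{assets.name}/{n}{m.group(2)}",
            html,
        )

    target = assets.parent / page
    target.write_bytes(html.encode("utf-8", "surrogateescape"))
    source.unlink()  # the page now lives one level up
    return target
```

With the command above, `promote_page("_files")` would leave "index.html" in the current folder and everything else in "_files", which is the arrangement a browser produces.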

 

The manual entry for -P / --directory-prefix says "the directory prefix is
the directory where all other files and subdirectories will be saved."
Because of the word "other," I thought this would do what I want, but it
didn't. It put all the files in the same directory, including "index.html".

 

I am using Wget v. 1.20, the Windows binary provided by Jernej Simončič
at www.eternallybored.org/misc/wget/, and running it in a Command Prompt
window on Windows 7.

 

Thanks for your help.

 

Attachment: Wget_Google.log