Just to add a different angle - what is the reason you're trying to download all the pages? If it's a bunch of static content on your own website, and you have SSH access, it would make much more sense to copy the webroot directory directly via rsync or scp.
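For example, something along these lines (a minimal sketch; /var/www/html and user@my.website.com are placeholders for your actual webroot and SSH login):

  # copy the webroot over SSH; -a preserves permissions/timestamps,
  # -v is verbose, -z compresses in transit
  rsync -avz user@my.website.com:/var/www/html/ ./site-backup/

  # or, with scp, recursively copy the directory
  scp -r user@my.website.com:/var/www/html ./site-backup

Either way you get the files exactly as they sit on the server, rather than whatever a crawler happens to reach.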
If you're dealing with dynamic content (e.g. something backed by a database), that may not work. But without knowing more details about why you're doing this and what your goal is, it's hard to say more.

- Paul

On Tue, Dec 27, 2022 at 4:18 PM American Citizen <[email protected]> wrote:
> Hi
>
> I used wget recently to try to download all 26 or 27 pages of my
> website, but it seems to miss about 40% of the pages.
>
> Does anyone have the CLI command line which captures 100% of a website's
> URLs?
>
> I tried the typical
>
> %wget -r --tries=10 https://my.website.com/ -o logfile
>
> as suggested in the "man wget" command, but it did NOT capture all the
> webpages. I even tried a wait parameter, but that only slowed things up
> and did not remedy the missing web subpages issue.
>
> I appreciate any tips so that ALL of the website data can be captured by
> wget. Yes, I am aware of the robots.txt restricting downloadable
> information.
>
> - Randall
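For what it's worth, if you do stay with wget, a recursive download is usually given a few more options than plain -r. The flags below are all standard wget options, and the URL is just the example from the quoted message:

  wget --mirror --page-requisites --convert-links --no-parent \
       --tries=10 https://my.website.com/ -o logfile

--mirror turns on recursion with infinite depth plus timestamping, --page-requisites also fetches the CSS/images/scripts each page needs, --convert-links rewrites links for local browsing, and --no-parent keeps the crawl inside the starting directory. Pages reachable only through JavaScript, forms, or links disallowed by robots.txt will still be missed by any crawler-based approach.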
