I'm not sure if this applies to your situation, but wget can only fetch 
elements that are linked directly in a page's HTML. wget cannot run 
JavaScript, so it has no way to discover elements that are loaded by 
JavaScript after the page loads.
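
One quick way to check whether that is what's happening: fetch a page's raw 
HTML exactly as wget sees it, and count the links present in the static 
markup. A rough sketch below, reusing the placeholder URL from your message; 
the grep pattern is only a crude approximation of link extraction:

    # Fetch the raw HTML as wget sees it (no JavaScript is executed)
    wget -qO- https://my.website.com/ > page.html

    # Count anchor tags present in the static HTML
    grep -o '<a [^>]*href=' page.html | wc -l

If pages you can reach in a browser never show up as href targets in the raw 
HTML, wget cannot find them, and you would need a tool that renders 
JavaScript instead. If the links do appear but wget still skips them, it may 
be worth trying "wget --mirror --page-requisites", which recurses to 
unlimited depth and also fetches embedded assets.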





------- Original Message -------
On Tuesday, December 27th, 2022 at 4:18 PM, American Citizen 
<[email protected]> wrote:


> Hi
> 
> I used wget recently to try to download all 26 or 27 pages of my
> website, but it seems to miss about 40% of the pages.
> 
> Does anyone have a CLI command that captures 100% of a website's
> URLs?
> 
> I tried the typical
> 
> %wget -r --tries=10 https://my.website.com/ -o logfile
> 
> as suggested in the "man wget" page, but it did NOT capture all the
> web pages. I even tried a wait parameter, but that only slowed things
> down and did not remedy the missing subpages issue.
> 
> I would appreciate any tips so that ALL of the website data can be
> captured by wget. Yes, I am aware that robots.txt can restrict what is
> downloadable.
> 
> - Randall
