I'm attempting to download a site that is an instruction manual.
Its URL is of the form
   http://example.com/index.html
That page has several links whose target URLs are of the form
   http://example.com/page1.html
   http://example.com/page2.html
   http://example.com/page3.html
  etc.

I want a single local HTML file containing all the pages of the site.
Where <http://example.com/index.html> links to <http://example.com/pageN.html> I want my local file to use internal references instead, as illustrated below.
There are also links of the form
   http://some_where_else.com/pagex.html
which I do not want to download.
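
Concretely, where a downloaded page contains a link such as
   <a href="http://example.com/page1.html">Chapter 1</a>
I would like owl.html to carry an internal reference along the lines of
   <a href="#page1">Chapter 1</a>
(the anchor name "#page1" and the link text "Chapter 1" are only
invented here for illustration).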

I tried
wget -r -l 2 -O owl.html --no-parent http://example.com/index.html
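(My understanding of those options: -r with -l 2 recurses two levels
deep, --no-parent keeps wget from climbing above index.html, and -O
concatenates everything retrieved into the single file owl.html.)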
It *almost* worked as intended.
I did get all the text of the site.

HOWEVER:
  1. I also got the text of <http://some_where_else.com/pagex.html>.
  2. Where <http://example.com/index.html> referenced
     <http://example.com/pageN.html>, the links in owl.html still
     pointed at the original site rather than at internal references
     within owl.html (see the sketch after this list).
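
From reading the manual, my best guess at a closer attempt is the
following, though it produces a directory tree of separate files
rather than the single owl.html I am after:
   wget -r -l 2 --no-parent --convert-links --domains example.com \
        http://example.com/index.html
Here --convert-links should rewrite the links among the downloaded
pages into relative local ones, and --domains example.com should keep
wget away from some_where_else.com. As I read the manual, wget will
not combine --convert-links with -O once more than one document is
retrieved, so producing the single file is exactly the part I cannot
see how to do.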

Can wget actually do what I want?
If so, how?
TIA