Ignore my last message.  For some reason, you need the --no-cookies
option for wget to recurse on this website.  No idea why.

So I ended up using something like

wget -x -m -k -H -Dclojure.org --no-cookies --html-extension http://clojure.org

You may have to play around a bit to pull down exactly what you want.
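
In case the short flags above are hard to read, here is the same command spelled out with wget's long option names. This is only a sketch: the TARGET and DOMAIN variables and the build_mirror_cmd function are names I made up for illustration, and it just prints the command so you can check it before running anything.

```shell
#!/bin/sh
# Illustrative names only; adjust to the site you want to mirror.
TARGET="http://clojure.org"
DOMAIN="clojure.org"

# Print the mirror command using long option names:
#   --force-directories  (-x) create a directory hierarchy, host name included
#   --mirror             (-m) recursive download with timestamping
#   --convert-links      (-k) rewrite links so the local copy can be browsed
#   --span-hosts         (-H) allow recursion onto other hosts...
#   --domains            (-D) ...but only hosts within this domain
#   --no-cookies              do not use cookies
#   --html-extension          save HTML pages with an .html suffix
#                             (newer wget releases call this --adjust-extension)
build_mirror_cmd() {
  echo wget --force-directories --mirror --convert-links \
       --span-hosts --domains="$DOMAIN" --no-cookies --html-extension "$TARGET"
}

# Dry run: show the command without downloading anything.
build_mirror_cmd
```

Once the printed command looks right, piping it to a shell (build_mirror_cmd | sh) should start the download.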

On Oct 12, 12:06 pm, Rob Lachlan <robertlach...@gmail.com> wrote:
> Many of the links off of the main page are ajax.  I don't think that
> wget can scrape that very easily.
>
> On Oct 12, 11:10 am, Tassilo Horn <tass...@member.fsf.org> wrote:
>
> > jingguo <yaojing...@gmail.com> writes:
>
> > Hi!
>
> > > When programming in Clojure, I use the Reference documentation on
> > > clojure.org a lot. But my network condition is horrible. So I want to
> > > make an off-line copy. I have tried "wget -x -m -k http://clojure.org"
> > > to make a mirror of clojure.org. But it does not work.
>
> > At first, I thought it was due to a restrictive robots.txt on the clojure
> > server, so I tried
>
> >   wget -e robots=off -x -m -k --wait 1 http://clojure.org/
>
> > but that also downloads just the index page...
>
> > But I guess for programming clojure, you mostly want only the API docs,
> > and that can be mirrored perfectly fine with
>
> >   wget -xmk http://clojure.github.com/clojure/
>
> > HTH,
> > Tassilo

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
