On 7 October 2011 16:44, Gary Gregory <garydgreg...@gmail.com> wrote:
> On Fri, Oct 7, 2011 at 8:51 AM, sebb <seb...@gmail.com> wrote:
>
>> On 7 October 2011 13:44, Gary Gregory <garydgreg...@gmail.com> wrote:
>> > On Fri, Oct 7, 2011 at 2:01 AM, Henri Yandell <flame...@gmail.com> wrote:
>> >
>> >> wget doesn't seem to like the url. curl is happy to do it, but it
>> >> doesn't do -r afaik.
>> >>
>> >
>> > Here is what I get with wget. How do I make it get the embedded URLs? I
>> > don't care if it's curl, wget, or foobar, I just want instructions that
>> > work. After figuring out all the Maven nonsense, now this. Sigh.
>>
>> It's easy enough to loop around the non-Maven files in the directory
>> if you cannot get the index parsing to work.
>>
>> Or even use Lynx on p.a.o and browse to the directory, and download from
>> there.
>>
>> If you cannot get it to work, let me know and I can help later (about
>> to be busy).
>>
>
> Yes please. :( It's this kind of ridiculous hoop jumping that makes me put
> this task on the back burner, the one that's in the shed, deep in the woods.

The problem seems to be that the Nexus server has a robots.txt which
disallows that directory. wget obeys robots.txt during recursive
retrieval, so it stops after fetching index.html.

The following works for me:

wget -r -l 1 -np -nH -nd -nv -e robots=off --wait 10 --no-check-certificate URL

-r                      recursive
-l 1                    recurse 1 level deep
-np                     no parent (never ascend to the parent directory)
-nH                     don't create host directories
-nd                     don't create directories
-nv                     non-verbose (less output, but not fully quiet)
-e robots=off           ignore robots.txt
--wait 10               wait 10 seconds between retrievals
--no-check-certificate  don't verify the server certificate

> Gary
>
>
>> > wget -np -r --no-check-certificate https://repository.apache.org/content/repositories/orgapachecommons-027/commons-io/commons-io/2.1/
>> > --2011-10-07 12:41:31--  https://repository.apache.org/content/repositories/orgapachecommons-027/commons-io/commons-io/2.1/
>> > Resolving repository.apache.org... 140.211.11.57
>> > Connecting to repository.apache.org|140.211.11.57|:443... connected.
>> > WARNING: cannot verify repository.apache.org's certificate, issued by `/C=US/ST=Arizona/L=Scottsdale/O=GoDaddy.com, Inc./OU=http://certificates.godaddy.com/repository/CN=Go Daddy Secure Certification Authority/serialNumber=07969287':
>> >   Self-signed certificate encountered.
>> > HTTP request sent, awaiting response... 200 OK
>> > Length: unspecified [text/html]
>> > Saving to: `repository.apache.org/content/repositories/orgapachecommons-027/commons-io/commons-io/2.1/index.html'
>> >
>> >     [ <=> ] 27,475      --.-K/s   in 0.001s
>> >
>> > 2011-10-07 12:41:31 (22.9 MB/s) - `repository.apache.org/content/repositories/orgapachecommons-027/commons-io/commons-io/2.1/index.html' saved [27475]
>> >
>> > Loading robots.txt; please ignore errors.
>> > --2011-10-07 12:41:31--  https://repository.apache.org/robots.txt
>> > Connecting to repository.apache.org|140.211.11.57|:443... connected.
>> > WARNING: cannot verify repository.apache.org's certificate, issued by `/C=US/ST=Arizona/L=Scottsdale/O=GoDaddy.com, Inc./OU=http://certificates.godaddy.com/repository/CN=Go Daddy Secure Certification Authority/serialNumber=07969287':
>> >   Self-signed certificate encountered.
>> > HTTP request sent, awaiting response... 200 OK
>> > Length: unspecified [text/plain]
>> > Saving to: `repository.apache.org/robots.txt'
>> >
>> >     [ <=> ] 86          --.-K/s   in 0s
>> >
>> > 2011-10-07 12:41:31 (2.28 MB/s) - `repository.apache.org/robots.txt' saved [86]
>> >
>> > FINISHED --2011-10-07 12:41:31--
>> > Downloaded: 2 files, 27K in 0.001s (22.3 MB/s)
>> >
>> > Gary
>> >
>> >
>> >> I used to use the grab_releases.sh script in
>> >> committers/tools/releases/, but it's based on the Apache web server
>> >> autoindex and needs changing to work with Nexus' format.
>> >>
>> >> Hen
>> >>
>> >> On Thu, Oct 6, 2011 at 5:58 PM, Gary Gregory <garydgreg...@gmail.com> wrote:
>> >> > Hi All,
>> >> >
>> >> > The instruction on https://wiki.apache.org/commons/UsingNexus say:
>> >> >
>> >> > wget -np -r https://repository.apache.org/content/repositories/orgapachecommons-098/org/apache/commons/commons-foo/1.1/
>> >> >
>> >> > Which for IO 2.1 means:
>> >> >
>> >> > wget -np -r --no-check-certificate https://repository.apache.org/content/repositories/orgapachecommons-027/commons-io/commons-io/2.1/
>> >> >
>> >> > When I do that from my home dir on p.a.o I get the index.html and that's it.
>> >> > uh?
>> >> >
>> >> > Are these instructions up to date?
>> >> >
>> >> > --
>> >> > E-Mail: garydgreg...@gmail.com | ggreg...@apache.org
>> >> > JUnit in Action, 2nd Ed: <http://goog_1249600977>http://bit.ly/ECvg0
>> >> > Spring Batch in Action: <http://s.apache.org/HOq>http://bit.ly/bqpbCK
>> >> > Blog: http://garygregory.wordpress.com
>> >> > Home: http://garygregory.com/
>> >> > Tweet! http://twitter.com/GaryGregory
>> >> >
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>> >> For additional commands, e-mail: dev-h...@commons.apache.org
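
For anyone who wants to try the wget invocation from the reply above without retyping it, here is a minimal sketch that wraps it in a small script. The function name fetch_staged and the STAGING_URL and RUN variables are made up for illustration; the default URL is simply the IO 2.1 staging directory quoted in this thread. By default the script only prints the command it would run, and it calls wget only when RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: wraps the wget flags suggested in this thread.
# fetch_staged, STAGING_URL and RUN are hypothetical names, not part of any
# Apache or Nexus tooling.

# Default to the IO 2.1 staging directory quoted earlier in the thread.
STAGING_URL="${STAGING_URL:-https://repository.apache.org/content/repositories/orgapachecommons-027/commons-io/commons-io/2.1/}"

fetch_staged() {
    # Build the command once: recurse one level, stay below the given
    # directory, save files flat into the current directory, ignore
    # robots.txt, and pause 10 seconds between requests.
    set -- wget -r -l 1 -np -nH -nd -nv -e robots=off --wait 10 \
        --no-check-certificate "$1"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"        # actually download
    else
        echo "$@"   # dry run: show the command for review
    fi
}

fetch_staged "$STAGING_URL"
```

Run it as is to review the command, then as `RUN=1 sh script.sh` to download. Note that --no-check-certificate disables certificate verification, so it is worth dropping if your wget can verify the server's certificate.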