On 7 October 2011 16:44, Gary Gregory <garydgreg...@gmail.com> wrote:
> On Fri, Oct 7, 2011 at 8:51 AM, sebb <seb...@gmail.com> wrote:
>
>> On 7 October 2011 13:44, Gary Gregory <garydgreg...@gmail.com> wrote:
>> > On Fri, Oct 7, 2011 at 2:01 AM, Henri Yandell <flame...@gmail.com> wrote:
>> >
>> >> wget doesn't seem to like the url. curl is happy to do it, but it
>> >> doesn't do -r afaik.
>> >>
>> >
>> > Here is what I get with wget. How do I make it get the embedded URLs? I
>> > don't care if it's curl, wget, or foobar, I just want instructions that
>> > work. After figuring out all the Maven nonsense, now this. Sigh.
>>
>> It's easy enough to loop around the non-Maven files in the directory
>> if you cannot get the index parsing to work.
>>
>> Or even use Lynx on p.a.o and browse to the directory, and download from
>> there.
>>
>> If you cannot get it to work, let me know and I can help later (about
>> to be busy).
>>
>
> Yes please. :( It's this kind of ridiculous hoop jumping that makes me put
> this task on the back burner, the one that's in the shed, deep in the woods.

The problem seems to be that the Nexus server has a robots.txt which
disallows downloads from that directory. wget honours robots.txt when
recursing, which is why you only got index.html and robots.txt.

The following works for me:

wget -r -l 1 -np -nH -nd -nv -e robots=off --wait 10 --no-check-certificate URL

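-r                      recursive retrieval
-l 1                    recurse at most 1 level deep
-np                     no parent (never ascend to the parent directory)
-nH                     don't create host directories
-nd                     don't create directories (save everything in the current dir)
-nv                     non-verbose (less output, but not silent like -q)
-e robots=off           ignore robots.txt
--wait 10               wait 10 seconds between retrievals
--no-check-certificate  don't validate the server certificate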
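
If it helps, the same command can be wrapped in a small shell function.
This is only a sketch: the name fetch_staged and the DRY_RUN switch are
made up for illustration, and I am assuming the staging URL should always
end in a slash (wget's -np treats the last path component as a file name
otherwise, so sibling directories would be in scope).

```shell
#!/bin/sh
# fetch_staged (hypothetical name): run the wget command above on one URL.
# Set DRY_RUN=1 to print the command instead of running it.
fetch_staged() {
    url=$1
    # Make sure the URL ends in "/" so that -np stays inside this directory.
    case $url in
        */) ;;
        *) url="$url/" ;;
    esac
    # Reuse the positional parameters to hold the full command line.
    set -- wget -r -l 1 -np -nH -nd -nv -e robots=off --wait 10 \
        --no-check-certificate "$url"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Usage (downloads into the current directory):
#   fetch_staged https://repository.apache.org/content/repositories/orgapachecommons-027/commons-io/commons-io/2.1/
```

Run it from an empty directory, since -nd drops every file straight into
the current one.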

> Gary
>
>
>>
>> > wget -np -r --no-check-certificate https://repository.apache.org/content/repositories/orgapachecommons-027/commons-io/commons-io/2.1/
>> > --2011-10-07 12:41:31--  https://repository.apache.org/content/repositories/orgapachecommons-027/commons-io/commons-io/2.1/
>> > Resolving repository.apache.org... 140.211.11.57
>> > Connecting to repository.apache.org|140.211.11.57|:443... connected.
>> > WARNING: cannot verify repository.apache.org's certificate, issued by `/C=US/ST=Arizona/L=Scottsdale/O=GoDaddy.com, Inc./OU=http://certificates.godaddy.com/repository/CN=Go Daddy Secure Certification Authority/serialNumber=07969287':
>> >   Self-signed certificate encountered.
>> > HTTP request sent, awaiting response... 200 OK
>> > Length: unspecified [text/html]
>> > Saving to: `repository.apache.org/content/repositories/orgapachecommons-027/commons-io/commons-io/2.1/index.html'
>> >
>> >     [ <=> ] 27,475      --.-K/s   in 0.001s
>> >
>> > 2011-10-07 12:41:31 (22.9 MB/s) - `repository.apache.org/content/repositories/orgapachecommons-027/commons-io/commons-io/2.1/index.html' saved [27475]
>> >
>> > Loading robots.txt; please ignore errors.
>> > --2011-10-07 12:41:31--  https://repository.apache.org/robots.txt
>> > Connecting to repository.apache.org|140.211.11.57|:443... connected.
>> > WARNING: cannot verify repository.apache.org's certificate, issued by `/C=US/ST=Arizona/L=Scottsdale/O=GoDaddy.com, Inc./OU=http://certificates.godaddy.com/repository/CN=Go Daddy Secure Certification Authority/serialNumber=07969287':
>> >   Self-signed certificate encountered.
>> > HTTP request sent, awaiting response... 200 OK
>> > Length: unspecified [text/plain]
>> > Saving to: `repository.apache.org/robots.txt'
>> >
>> >     [ <=> ] 86          --.-K/s   in 0s
>> >
>> > 2011-10-07 12:41:31 (2.28 MB/s) - `repository.apache.org/robots.txt' saved [86]
>> >
>> > FINISHED --2011-10-07 12:41:31--
>> > Downloaded: 2 files, 27K in 0.001s (22.3 MB/s)
>> >>
>> >
>> > Gary
>> >
>> >
>> >> I used to use the grab_releases.sh script in
>> >> committers/tools/releases/, but it's based on the Apache web server
>> >> autoindex and needs changing to work with Nexus' format.
>> >>
>> >> Hen
>> >>
>> >> On Thu, Oct 6, 2011 at 5:58 PM, Gary Gregory <garydgreg...@gmail.com>
>> >> wrote:
>> >> > Hi All,
>> >> >
>> >> > The instruction on https://wiki.apache.org/commons/UsingNexus say:
>> >> >
>> >> > wget -np -r https://repository.apache.org/content/repositories/orgapachecommons-098/org/apache/commons/commons-foo/1.1/
>> >> >
>> >> > Which for IO 2.1 means:
>> >> >
>> >> > wget -np -r --no-check-certificate https://repository.apache.org/content/repositories/orgapachecommons-027/commons-io/commons-io/2.1/
>> >> >
>> >> > When I do that from my home dir on p.a.o I get the index.html and
>> >> > that's it. uh?
>> >> >
>> >> > Are these instructions up to date?
>> >> >
>> >> > --
>> >> > E-Mail: garydgreg...@gmail.com | ggreg...@apache.org
>> >> > JUnit in Action, 2nd Ed: http://bit.ly/ECvg0
>> >> > Spring Batch in Action: <http://s.apache.org/HOq>http://bit.ly/bqpbCK
>> >> > Blog: http://garygregory.wordpress.com
>> >> > Home: http://garygregory.com/
>> >> > Tweet! http://twitter.com/GaryGregory
>> >> >
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>> >> For additional commands, e-mail: dev-h...@commons.apache.org
>> >>
>> >>
>> >
>> >
>> >
>>
>>
>>
>
>
>
