________________________________
From: Alan Gauld <alan.ga...@btinternet.com>
To: tutor@python.org
Sent: Wed, May 18, 2011 4:40:19 PM
Subject: Re: [Tutor] can I walk or glob a website?


"Dave Angel" <da...@ieee.org> wrote

>> "Albert-Jan Roskam" <fo...@yahoo.com> wrote
>>> How can I walk (as in os.walk) or glob a website?
>> 
>> I don't think there is a way to do that via the web.

> It has to be (more or less) possible.  That's what google does for their
> search engine.

Google trawls the site following links. If that's all he wants then it's fairly
easy. I took it he wanted to actually trawl the server, getting *all* the PDF
files, not just the published PDFs...

Depends what the real requirement is.

===> No, I meant only the published ones. I would consider it somewhat
dodgy/unethical/whatever-you-wanna-call-it to download unpublished stuff.
Indeed I only need published data.
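
For the published-only case, "following links" could look roughly like the
sketch below. It is only an illustration, assuming Python 3 and nothing beyond
the standard library; the names LinkParser, fetch_html and find_pdfs are made
up for this example and are not from any existing package. It starts at one
page, follows <a href> links that stay on the same host, and collects every
link whose path ends in .pdf. It does not consult robots.txt or pause between
requests, which a polite crawler should do.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Return the page text, or '' if the URL is not HTML or fails to load."""
    try:
        with urlopen(url, timeout=10) as resp:
            if "html" not in resp.headers.get("Content-Type", ""):
                return ""
            return resp.read().decode("utf-8", errors="replace")
    except OSError:  # covers URLError, HTTPError and socket timeouts
        return ""


def find_pdfs(start_url, fetch=fetch_html, max_pages=100):
    """Breadth-first walk of same-host links; return sorted PDF URLs found."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pdfs = set()
    pages = 0
    while queue and pages < max_pages:
        page = queue.popleft()
        pages += 1
        parser = LinkParser()
        parser.feed(fetch(page))
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            url = urldefrag(urljoin(page, href)).url
            if urlparse(url).netloc != host or url in seen:
                continue  # off-site (or mailto:, javascript:) or already handled
            seen.add(url)
            if urlparse(url).path.lower().endswith(".pdf"):
                pdfs.add(url)       # record the PDF, do not fetch it here
            else:
                queue.append(url)   # an ordinary page: visit it later
    return sorted(pdfs)
```

Calling find_pdfs("http://www.example.com/") (a placeholder address) would
return the list of PDF URLs reachable by links from that page; downloading
each one is then a separate urlopen call per URL. The max_pages limit is
there so the walk stops on large sites.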


Alan G. 

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
