Hi

On 10 October 2012 21:35, Benjamin Fishbein <bfishbei...@gmail.com> wrote:
> I've been scraping info from a website with a url program I wrote. But now I 
> can't open their webpage, no matter which web browser I use. I think they've 
> somehow blocked me. How can I get back in? Is it a temporary block? And can I 
> get in with the same computer from a different wifi?

It's hard to know for certain what they've done; perhaps they've blocked
your IP. You can try connecting from another IP and see if that works.

Two points:
1) If you're going to be scraping websites, you should always play
nice with the web server -- throttle your requests (put some random
delay between them) so they don't hammer the server too hard.  Not
doing this will enrage any webmaster, who will be very quick to figure
out why the website is being hammered and from where (the IP), and then
block you.  You'd probably do the same if you ran a website and
noticed some particular IP hammering your site.
2) You should ideally always respect websites' wishes regarding bots
and scraping.  If they don't want automated bots scraping them,
then you really should not scrape that site.  And if you're going to
disregard their wishes and scrape it anyway (not recommended), then
all bets are off: you'll have to fly "under the radar" and ensure
that your scraping app looks as much like a browser as possible
(probably using modified headers that look like what a browser would
send) and behaves as much like a human operator driving a browser as
possible, or you'll find yourself blocked, as you've experienced above.
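
To make the "play nice" advice concrete, here is a rough sketch of a
polite fetch loop, assuming Python 3 and only the standard library.  It
checks each URL against the site's robots.txt rules and sleeps for a
random interval after every request.  The names USER_AGENT, fetch and
scrape_politely, and the 2-5 second delay range, are my own placeholders
and not anything from your program, so adjust them to suit.

```python
import random
import time
import urllib.request
import urllib.robotparser

# Hypothetical identifier for your scraper; pick something honest.
USER_AGENT = "my-tutor-scraper"


def fetch(url):
    """Download one page, identifying ourselves via the User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def scrape_politely(urls, robots, fetch=fetch, sleep=time.sleep,
                    min_delay=2.0, max_delay=5.0):
    """Fetch only the URLs robots.txt allows, pausing randomly in between.

    `robots` is a urllib.robotparser.RobotFileParser that has already
    been loaded with the site's rules.  The delay range is an assumed
    default, not a value any particular site asks for.
    """
    pages = {}
    for url in urls:
        if not robots.can_fetch(USER_AGENT, url):
            continue  # the site asked bots to stay out of this path
        pages[url] = fetch(url)
        # Random delay so the requests don't hammer the server.
        sleep(random.uniform(min_delay, max_delay))
    return pages
```

To use it against a real site you would first load the rules, e.g.
create a RobotFileParser, call set_url() with the site's robots.txt
address, then read(), and pass that object in as `robots`.  The `fetch`
and `sleep` parameters are only there so the loop can be tried out
without touching the network.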

Walter
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
