I went to the URL you posted, and it looks like that error is the
content you should be receiving. Try clearing your browser cache; you
could be loading a cached page.
Charles
yookyung wrote:
I am trying to crawl webpages in citeseer domain (a collection of research
papers mostly in computer science).
I have used the following code snippet.
#
import urllib
sock = urllib.urlopen("http://citeseer.ist.psu.edu")
webcontent = sock.read().split('\n')
sock.close()
print webcontent
###
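
For anyone running the snippet above on Python 3, here is a rough equivalent. It is only a sketch: the function name fetch_lines, the 10-second timeout, and the UTF-8 fallback are my own assumptions, not part of the original post. In Python 3, urlopen lives in urllib.request and read() returns bytes, so the body has to be decoded before it is split into lines.

```python
import urllib.request


def fetch_lines(url, timeout=10):
    """Return the body of `url` as a list of text lines (hypothetical helper)."""
    # The response object is a context manager, so it is closed automatically.
    with urllib.request.urlopen(url, timeout=timeout) as sock:
        raw = sock.read()
        # Use the charset the server declares; fall back to UTF-8 (an assumption).
        charset = sock.headers.get_content_charset() or "utf-8"
    # Decode the bytes, then split into lines as the original snippet does.
    return raw.decode(charset, errors="replace").split("\n")
```

Calling fetch_lines("http://citeseer.ist.psu.edu") and printing the result should then behave like the original code. If the server answers with an HTTP error status, urlopen raises urllib.error.HTTPError instead of returning the page.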