On Feb 1, 3:34 am, asit <lipu...@gmail.com> wrote:
> I hv been developing a link scanner. Here the objective is to
> recursively scan a particular web site.
>
> During this, my script met http://images.google.co.in/imghp?hl=en&tab=wi
> and passed it to the scan function, whose body is like this..
>
> def scan(site):
>

So you have this:

    site=http://images.google.co.in/imghp?hl=en&tab=wi

??

>     log=open(logfile,'a')
>     log.write(site + "\n")
>     site = "http://" + site.lower()
>

So now:

    site = "http://" + "http://images.google.co.in/imghp?hl=en&tab=wi"

Hmmm...let's see what happens when I run the following program:

    import urllib

    site = "http://" + "http://images.google.co.in/imghp?hl=en&tab=wi"
    html = urllib.urlopen(site)

--output:--
Traceback (most recent call last):
  File "6test.py", line 4, in ?
    html = urllib.urlopen(site)
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/urllib.py", line 82, in urlopen
    return opener.open(url)
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/urllib.py", line 190, in open
    return getattr(self, name)(url)
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/urllib.py", line 303, in open_http
    h = httplib.HTTP(host)
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/httplib.py", line 1097, in __init__
    self._setup(self._connection_class(host, port, strict))
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/httplib.py", line 586, in __init__
    self._set_hostport(host, port)
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/httplib.py", line 598, in _set_hostport
    raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
httplib.InvalidURL: nonnumeric port: ''
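
The "nonnumeric port" complaint makes sense once you look at how the doubled URL gets split: with "http://http://images...", the host part is just "http:", and whatever follows the colon (nothing) is taken to be the port. One way around it might be to add the scheme only when it is missing. Here is a rough sketch of that idea. Note the assumptions: normalize_url is a made-up helper name, not something from your script, and I'm using Python 3's urllib.parse here (not the Python 2.4 urllib from the traceback) just to show how the URL is split.

```python
from urllib.parse import urlsplit


def normalize_url(site):
    """Hypothetical helper: prepend "http://" only if no scheme is present."""
    # Compare case-insensitively, but return the URL with its case intact.
    if site.lower().startswith(("http://", "https://")):
        return site
    return "http://" + site


# What the unconditional prefix in scan() produces:
doubled = "http://" + "http://images.google.co.in/imghp?hl=en&tab=wi"
print(urlsplit(doubled).netloc)   # the "host" is only "http:", with an empty port

# With the check, an already-absolute URL is left alone:
fixed = normalize_url("http://images.google.co.in/imghp?hl=en&tab=wi")
print(urlsplit(fixed).netloc)
```

Whether scan() should ever receive a bare host name at all depends on how your link extraction works, so treat the prefixing branch as optional.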