On Feb 1, 3:34 am, asit <lipu...@gmail.com> wrote:
> I have been developing a link scanner. Here the objective is to
> recursively scan a particular web site.
>
> During this, my script met http://images.google.co.in/imghp?hl=en&tab=wi
> and passed it to the scan function, whose body is like this..
>
> def scan(site):
>

So you have this:

site=http://images.google.co.in/imghp?hl=en&tab=wi

??



>     log=open(logfile,'a')
>     log.write(site + "\n")
>     site = "http://" + site.lower()
>

So now:

site = "http://" + "http://images.google.co.in/imghp?hl=en&tab=wi"

Hmmm...let's see what happens when I run the following program:


import urllib

site = "http://" + "http://images.google.co.in/imghp?hl=en&tab=wi"
html = urllib.urlopen(site)

--output:--
Traceback (most recent call last):
  File "6test.py", line 4, in ?
    html = urllib.urlopen(site)
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/urllib.py", line 82, in urlopen
    return opener.open(url)
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/urllib.py", line 190, in open
    return getattr(self, name)(url)
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/urllib.py", line 303, in open_http
    h = httplib.HTTP(host)
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/httplib.py", line 1097, in __init__
    self._setup(self._connection_class(host, port, strict))
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/httplib.py", line 586, in __init__
    self._set_hostport(host, port)
  File "/Library/Frameworks/Python.framework/Versions/2.4//lib/python2.4/httplib.py", line 598, in _set_hostport
    raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
httplib.InvalidURL: nonnumeric port: ''
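
The error message makes sense once you look at how the doubled URL gets
split. Everything between the first "//" and the next "/" is taken as the
host, so the host comes out as "http:", and the part after the colon (the
port) is an empty string. One way around it, as a rough sketch, is to add
the prefix only when the link does not already have one. Note that
add_scheme is a name I made up for illustration, not something from your
script, and I am assuming you only care about http and https links:

```python
try:
    from urllib.parse import urlsplit   # Python 3
except ImportError:
    from urlparse import urlsplit       # Python 2

def add_scheme(site):
    """Prepend "http://" only if site does not already start with a scheme."""
    lowered = site.lower()
    if lowered.startswith("http://") or lowered.startswith("https://"):
        return site
    return "http://" + site

# The doubled URL from the test program above.
doubled = "http://" + "http://images.google.co.in/imghp?hl=en&tab=wi"

# Index 1 of the split result is the host part, which is where the
# empty port comes from.
print(urlsplit(doubled)[1])

# A link that already has a scheme is returned unchanged;
# a bare host/path gets "http://" prepended.
print(add_scheme("http://images.google.co.in/imghp?hl=en&tab=wi"))
print(add_scheme("images.google.co.in/imghp?hl=en&tab=wi"))
```

Unlike your scan function, this sketch does not call lower() on the URL it
returns, because lowercasing the path or query string can change which
page the link points to.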


--
http://mail.python.org/mailman/listinfo/python-list
