Tempo wrote:
> Hello. I am getting an error and it has gotten me stuck. I think the
> best thing I can do is post my code and the error message and thank
> everybody in advanced for any help that you give this issue. Thank you.
>
> #############
> Here's the code:
> #############
>
> import urllib2
> import re
> import xlrd
> from BeautifulSoup import BeautifulSoup
>
> book = xlrd.open_workbook("ige_virtualMoney.xls")
> sh = book.sheet_by_index(0)
> rx = 1
> for rx in range(sh.nrows):
>     u = sh.cell_value(rx, 0)
>     page = urllib2.urlopen(u)
>     soup = BeautifulSoup(page)
>     p = soup.findAll('span', "sale")
>     p = str(p)
>     p2 = re.findall('\$\d+\.\d\d', p)
>     for price in p2:
>         print price
>
> ######################
> Here are the error messages:
> ######################
>
> Traceback (most recent call last):
>   File "E:\Python24\scraper.py", line 16, in -toplevel-
>     page = urllib2.urlopen(u)
>   File "E:\Python24\lib\urllib2.py", line 130, in urlopen
>     return _opener.open(url, data)
>   File "E:\Python24\lib\urllib2.py", line 350, in open
>     protocol = req.get_type()
>   File "E:\Python24\lib\urllib2.py", line 233, in get_type
>     raise ValueError, "unknown url type: %s" % self.__original
> ValueError: unknown url type: List

You were expecting u to be a URL string like "http://google.com", but
urlopen() was handed something else: the last line of the traceback
shows the offending value as "List". I'm not familiar with the xlrd
package, so I can't say whether cell_value() is returning a list
instead of a plain cell value (in which case the value you want is
probably in element 0), or whether that cell simply contains the text
"List", as a column heading would.

Put a print statement before your call to urlopen(), like:

    print u

You'll likely discover your error.

--
Paul McNett
http://paulmcnett.com
http://dabodev.com

--
http://mail.python.org/mailman/listinfo/python-list
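
The print check suggested above can be taken one step further by filtering out cells that are not URLs before anything is fetched. This is only a minimal sketch and not part of the original script: looks_like_url() and split_cells() are hypothetical helpers, the sample values are invented for illustration (apart from "List", which comes from the traceback), and it assumes the URLs in the spreadsheet start with http:// or https://. No spreadsheet or network access is involved, so it runs on its own.

```python
from __future__ import print_function  # lets print() behave the same on Python 2

try:
    text_types = (str, unicode)  # Python 2: byte strings and unicode strings
except NameError:
    text_types = (str,)          # Python 3: str is the only text type


def looks_like_url(value):
    """Return True only for text that starts with an http or https scheme."""
    if not isinstance(value, text_types):
        return False
    return value.strip().lower().startswith(("http://", "https://"))


def split_cells(values):
    """Separate cell values into (urls, rejected) so the rejects can be inspected."""
    urls, rejected = [], []
    for value in values:
        if looks_like_url(value):
            urls.append(value.strip())
        else:
            rejected.append(value)
    return urls, rejected


# Hypothetical column contents: a heading cell, a URL, a number, an empty cell.
sample_cells = ["List", "http://google.com", 42.0, ""]
urls, rejected = split_cells(sample_cells)

print("would fetch:", urls)
for value in rejected:
    # repr() shows the type as well as the content (quoted string, float, list, ...)
    print("skipping:", repr(value))
```

In the original loop, the same test could guard the call to urllib2.urlopen(u), so that a heading row or an empty cell is reported and skipped instead of raising ValueError.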