python version 2.2 ... but I'm betting there's a bug in the parser.

From: "David A. Desrosiers" <[EMAIL PROTECTED]>

Just installed plucker and plucker-desktop; was surprised how full of holes the latter seems to be. But now I've figured out how plucker itself works and am rolling my own. I'm very impressed, in general!

What version of Python?

Consider this small example file:
<html>
<font
face="Comic Sans MS,helvetica"
<!-- color="darkcyan" -->
>
Greetings!
</font>
</html>
If I run this through the parser as is, or with the color line uncommented, the parser vomits, but if I delete the line, it works fine. Clearly, deleting a commented-out line should not affect the parser, eh?
(Let's ignore the fact that this isn't strictly legal HTML.)
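Since deleting the line makes the problem go away, one possible stopgap is to strip comments from the page before it ever reaches the parser. The following is only a minimal sketch of that idea, not anything Plucker does itself: the function name strip_html_comments and the regular expression are my own, and a plain regex like this will also remove comment-like text inside scripts or similar blocks.

```python
import re

# Matches "<!-- ... -->", including comments that span several lines.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_html_comments(text):
    """Return text with every <!-- ... --> comment removed (hypothetical helper)."""
    return COMMENT_RE.sub("", text)


# The sample file from this message.
sample = """<html>
<font
face="Comic Sans MS,helvetica"
<!-- color="darkcyan" -->
>
Greetings!
</font>
</html>
"""

cleaned = strip_html_comments(sample)
print(cleaned)
```

This only reproduces the "delete the line" case by hand; it does nothing for the case where the color attribute is actually present.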
Here's the parser output:
migod@olympus(35): pluck
Pluckerdir is '/home/migod/.plucker'...
---- 0 collected, 1 to do ----
Processing plucker:/home.html...
Retrieved ok.
Parsed ok; added 1 document link.
---- 1 collected, 1 to do ----
Processing http://plg.uwaterloo.ca/~migod/techie/temp-clieLinux.html...
Retrieved ok.
Error: Unknown error parsing document http://plg.uwaterloo.ca/~migod/techie/temp-clieLinux.html:
Traceback (innermost last):
  File "/usr/local/src/plucker-1.2//parser/python/PyPlucker/Parser.py", line 27, in generic_parser
    parser = TextParser.StructuredHTMLParser (url, data, headers, config, attributes)
  File "/usr/local/src/plucker-1.2//parser/python/PyPlucker/TextParser.py", line 899, in __init__
    self.feed (text)
  File "/usr/lib/python1.5/site-packages/xml/parsers/sgmllib.py", line 465, in finish_endtag
    self.handle_endtag(tag, method)
  File "/usr/local/src/plucker-1.2//parser/python/PyPlucker/TextParser.py", line 1005, in handle_endtag
    if method: sgmllib.SGMLParser.handle_endtag(self, tag, method)
  File "/usr/lib/python1.5/site-packages/xml/parsers/sgmllib.py", line 476, in handle_endtag
    method()
  File "/usr/local/src/plucker-1.2//parser/python/PyPlucker/TextParser.py", line 1406, in end_font
    self._doc.unset_forecolor (forecolor)
  File "/usr/local/src/plucker-1.2//parser/python/PyPlucker/TextParser.py", line 515, in unset_forecolor
    if self._attributes.pop_forecolor (value):
  File "/usr/local/src/plucker-1.2//parser/python/PyPlucker/TextParser.py", line 251, in pop_forecolor
    foreres = self._forecolor[-1] != self._forecolor[-2]
IndexError: list index out of range
Parsing failed.
---- all 1 pages retrieved and parsed ----
Writing out collected data...
Writing document 'MyCoolLinks' to file /home/migod/.plucker/MyCoolLinks.pdb
Converting plucker:/home.html...
No default charset
Wrote 1 <= plucker:/~special~/index
Wrote 2 <= plucker:/home.html
Wrote 3 <= plucker:/~special~/pluckerlinks
Wrote 12 <= plucker:/~special~/links1
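
The last frame of the traceback compares the top two entries of self._forecolor, which raises IndexError whenever that list holds fewer than two items. Why the list is that short here is not visible in the output; my guess is that the push for <font> and the pop for </font> got out of step. Below is a minimal sketch of a guard around that comparison. Only the comparison itself comes from the traceback; the ColorStack class, its method bodies, and the choice to return False on a short stack are assumptions for illustration, not the actual TextParser.py code.

```python
class ColorStack:
    """Hypothetical stand-in for the attribute stack used in TextParser.py."""

    def __init__(self):
        self._forecolor = []

    def push_forecolor(self, value):
        self._forecolor.append(value)

    def pop_forecolor_unguarded(self):
        # Same comparison as the traceback line; fails on a short stack.
        changed = self._forecolor[-1] != self._forecolor[-2]
        self._forecolor.pop()
        return changed

    def pop_forecolor(self):
        # Assumed guard: with fewer than two entries there is nothing to
        # fall back to, so report "no color change" and leave the stack alone.
        if len(self._forecolor) < 2:
            return False
        changed = self._forecolor[-1] != self._forecolor[-2]
        self._forecolor.pop()
        return changed


stack = ColorStack()
stack.push_forecolor("black")

try:
    stack.pop_forecolor_unguarded()
except IndexError as exc:
    print("unguarded pop failed:", exc)

print("guarded pop returned:", stack.pop_forecolor())
```

A guard like this would only hide the symptom; if the stack really is unbalanced, the underlying cause is in how the malformed <font> tag is handled when it opens.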

