I've checked in a number of changes to the parser, mainly cleanup.
I'm only half-done with this, but it's in an internally consistent
state, and I thought I'd check it in so that people can look for
problems I haven't spotted.

Dirk, you'll want to look at what I've done in ImageParser.py.
I've defined a base class (ImageParser), and made pil2 and netpbm2
subclasses of that class.  Generic code moves into the base
class, and each specific subclass just implements the "size" and
"convert" methods.  I've also incorporated your "try_reduce_bpp"
and "try_reduce_dimension" flags.  (I like "try_reduce_dimension"
so much I think it should be on by default.)  I was going to turn
the Windows image parser into a subclass of ImageParser, but I can't
really test it and thought I'd wait for you to do that.
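
For anyone who hasn't opened ImageParser.py yet, here is a rough sketch
of the shape I mean.  It is not the checked-in code: the constructor
arguments, the "fit" method, and the FakeParser subclass are made up for
illustration, and I'm assuming here that "try_reduce_dimension" means
"scale the image down to fit the maximum size".  Only the ImageParser
name and the "size"/"convert" split come from the description above.

```python
class ImageParser:
    """Illustrative base class: generic logic lives here."""

    def __init__(self, data, try_reduce_dimension=True):
        # 'data' and the keyword argument are hypothetical names.
        self.data = data
        self.try_reduce_dimension = try_reduce_dimension

    def size(self):
        """Return (width, height); each subclass must implement this."""
        raise NotImplementedError

    def convert(self, width, height, bpp):
        """Produce the converted image; each subclass must implement this."""
        raise NotImplementedError

    def fit(self, max_width, max_height, bpp):
        """Generic code (hypothetical): shrink if allowed, then convert."""
        width, height = self.size()
        too_big = width > max_width or height > max_height
        if self.try_reduce_dimension and too_big:
            # Keep the aspect ratio while fitting inside the bounding box.
            scale = min(max_width / width, max_height / height)
            width = max(1, int(width * scale))
            height = max(1, int(height * scale))
        return self.convert(width, height, bpp)


class FakeParser(ImageParser):
    """Stand-in for a real subclass such as the PIL or netpbm one."""

    def size(self):
        return (320, 200)

    def convert(self, width, height, bpp):
        return ("converted", width, height, bpp)
```

A subclass only has to know how to measure and convert; everything
about when to shrink stays in one place in the base class.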

I've attached the cvs log entry for this checkin below.

Bill

--------------
Writer.py:  Removed SimpleMapping and Resolver from Writer.py, replacing both
            with Mapper.
            Removed use of PluckerLinks -- functionality now implemented by Mapper.
PluckerDocs.py:  Changed the way that references to image documents are
                 preserved in parsed HTML; see PluckerDocs.py:register_document for
                 more info.
                 Changed octal function code references to hex in PluckerTextParagraph.
                 Re-did the way chunks are maintained in PluckerTextDocument.
ImageParser.py:  Added ImageParser class to ImageParser.py.
                 Re-wrote NewNetPBMImageParser and NewPythonImagingLibraryParser to be
                 subclasses of ImageParser.
                 Added support for WIDTH and HEIGHT attributes in IMG tags.
                 Added support for SECTION attribute in image tags, which allows a
                 cropped portion of the image to be included.
                 Added general support for 'try_reduce_dimension' and 'try_reduce_bpp'.
Retriever.py:  Added Accept header for outgoing GET message.
Spider.py:  Added 'as_dict()' method to SpiderLinks.
            Changed the way that image links are handled, to simplify the loop a bit.
AliasList.py:  Added "as_dict()" method which returns the contents as a Python dict.
Parser.py:  Cleaned up code a bit.  Removed extra return value (typically 0).
Generally:  Changed use of string.atoi() to int() wherever possible.
            Changed string-style methods to class-style methods.
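
To make the "Generally" items concrete, here is a small before/after
sketch.  The "parse_int" and "split_fields" helpers are made-up names,
not functions from the parser, and I'm assuming that "class-style
methods" means calling methods on the string object itself rather than
going through the string module.

```python
# Old style, using functions from the string module:
#     import string
#     width = string.atoi(string.strip(text))
#     parts = string.split(line, ",")

def parse_int(text):
    """New style: the int() builtin plus a method on the string itself."""
    return int(text.strip())


def split_fields(line):
    """New style: call split() on the string instead of string.split()."""
    return line.split(",")
```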
