Girish Redekar wrote:
I'm trying to build a search engine in Python and am stuck at the point where I parse HTML to get useful text. Ideally, one should be able to extract the text (outside the HTML tags) along with its position (for phrase searches) and font size (to weight words appropriately).

Word weighting should be based on semantics, not style.
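To make this concrete, here is a rough sketch of what I mean, using only html.parser from the standard library. It records each word with its position (for phrase searches) and a weight taken from the enclosing tags rather than from the font size. The weight table, the class name WeightedTextParser and the function index_html are invented for illustration; tune or replace them as needed.

```python
from html.parser import HTMLParser

# Hypothetical weights per semantic tag; the values are illustrative only.
TAG_WEIGHTS = {"title": 5.0, "h1": 4.0, "h2": 3.0, "h3": 2.0,
               "strong": 1.5, "em": 1.5, "b": 1.5}
# Text inside these elements is not visible content, so it is ignored.
SKIPPED_TAGS = {"script", "style"}
# Void elements never get an end tag, so they are not tracked as open.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class WeightedTextParser(HTMLParser):
    """Collect (position, word, weight) tuples from an HTML document."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open elements
        self.words = []      # list of (position, word, weight)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, so that
        # unclosed inner elements do not corrupt the stack.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        if any(t in SKIPPED_TAGS for t in self.open_tags):
            return
        # A word gets the highest weight among its enclosing elements.
        weight = max((TAG_WEIGHTS.get(t, 1.0) for t in self.open_tags),
                     default=1.0)
        for word in data.split():
            self.words.append((len(self.words), word.lower(), weight))


def index_html(html):
    """Return a list of (position, word, weight) for the given HTML."""
    parser = WeightedTextParser()
    parser.feed(html)
    parser.close()
    return parser.words
```

Calling index_html(page_source) gives a flat list that can be fed into an inverted index. Real-world pages are often badly formed, so a more tolerant parser may be a better fit than html.parser for production use.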

However, if you really need it, there is the cssutils package for CSS parsing.
I'm writing a CSS parser, too:

It uses PLY, so it should be easy to read and modify.
It is still at a very early stage.

> [...]

Regards,
Manlio Perillo
Web-SIG mailing list