I do want to ask, have any of you actually done any of this sort of thing before?
It always seems that every time I run a badly formatted page through Tidy, it's a crapshoot whether it will end up with the visual representation that was originally intended.

You guys are trying to create an elegant solution for an ideal world without fully understanding how all the pieces you are putting together interact. IF YOU ACTUALLY STUDY Gecko or KHTML, you'll find that they go to great lengths to make sure special cases are taken care of, without disturbing the way the stream is fed into the parser.

This isn't as simple as you guys are making it out to be, and it's all because it was never done RIGHT to begin with. It's not that this is impossible, but you are creating bigger headaches further down the road, and for what? To make sure this fits into your world view of how things should be? Come on, guys. THINK.

-Thom

On 5 Mar 2007 02:53:09 -0800, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> or pass html through html tidy first.
>
> It appears unnecessary to me to go that way because it first parses
> HTML into a tree, then fixes some things and writes out HTML just to
> parse it again...
>
> I have read through the rules html tidy uses and in most cases the
> following rules will have the same or a quite similar result (ok it
> needs more testing with badly designed pages):
>
> * if the closing tag does not match the opening tag, search outwards
>   until you find one (if you don't find, ignore)
> * be lazy with missing quotes in tag attributes
> * convert all tag names and attribute names to upper case
> * ignore <html>, <head>, <body> (except for attributes)
> * some tags always go to the HEAD section (e.g. <title>, <meta>)
>   wherever they appear
> * ignore unknown tags
>
> As soon as I have new more or less stable code, I will upload a
> snapshot and you can look into it.
>
> -- hns
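
To make the discussion concrete, here is a rough sketch of how the quoted rules might look in code. This is not hns's implementation (which has not been posted yet), only an illustration in Python. The tag sets `KNOWN_TAGS`, `HEAD_ONLY`, `WRAPPERS` and `VOID`, and the `Node`, `LenientBuilder`, `parse` and `render` names, are all hypothetical. Tokenizing is left to the standard library's `html.parser`, which already tolerates unquoted attribute values.

```python
from html.parser import HTMLParser

# Hypothetical tag sets, chosen only for this illustration.
KNOWN_TAGS = {"P", "B", "I", "DIV", "SPAN", "A", "BR", "TITLE", "META"}
HEAD_ONLY = {"TITLE", "META"}       # always filed under HEAD
WRAPPERS = {"HTML", "HEAD", "BODY"}  # ignored except for attributes
VOID = {"META", "BR"}                # never have a closing tag


class Node:
    def __init__(self, name, attrs=None):
        self.name = name
        self.attrs = attrs or {}
        self.children = []  # mix of Node objects and text strings


class LenientBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.head = Node("HEAD")
        self.body = Node("BODY")
        self.stack = [self.body]  # open elements, innermost last

    def handle_starttag(self, tag, attrs):
        name = tag.upper()  # rule: upper-case tag and attribute names
        upper_attrs = {k.upper(): v for k, v in attrs}
        if name in WRAPPERS:
            # rule: ignore <html>, <head>, <body>, but keep attributes
            target = {"HEAD": self.head, "BODY": self.body}.get(name)
            if target is not None:
                target.attrs.update(upper_attrs)
            return
        if name not in KNOWN_TAGS:
            return  # rule: ignore unknown tags (their text is kept)
        node = Node(name, upper_attrs)
        # rule: some tags always go to HEAD, wherever they appear
        parent = self.head if name in HEAD_ONLY else self.stack[-1]
        parent.children.append(node)
        if name not in VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        name = tag.upper()
        # rule: search outwards for a matching open tag; if none is
        # found, ignore the closing tag. Index 0 (BODY) is never popped.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].name == name:
                del self.stack[i:]  # implicitly closes inner elements
                return

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].children.append(data)


def parse(html):
    builder = LenientBuilder()
    builder.feed(html)
    builder.close()
    return builder


def render(node):
    """Serialize a node back to markup (attributes omitted for brevity)."""
    if isinstance(node, str):
        return node
    inner = "".join(render(child) for child in node.children)
    return f"<{node.name}>{inner}</{node.name}>"


if __name__ == "__main__":
    doc = parse("<html><body bgcolor=white><p><b>bold <i>both</b> tail</p>"
                "<title>T</title><blink>x</blink></body>")
    print(render(doc.head))
    print(render(doc.body))
```

Note that the mis-nested `<i>` is simply closed together with `<b>`, so " tail" loses its italics. Gecko and KHTML may handle that case differently, which is the sort of special case that decides whether a page still looks the way it was intended.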
_______________________________________________ Discuss-gnustep mailing list Discuss-gnustep@gnu.org http://lists.gnu.org/mailman/listinfo/discuss-gnustep