Phil Ringnalda wrote:

> Nikolas 'Atrus' Coukouma wrote:
>
>> Using @rel with any linking element is perfectly valid and has been for
>> years.
>> @rel not being supported for anything other than the link element itself
>> has also been an outstanding bug for just as long. There's a lot of debate
>> attached to at least one Mozilla bug (#57399 [1] - filed on 2000-10-20).
>>
>> Can we agree that this should be supported, but currently isn't? Unless
>> there's a compelling reason not to, I think we might as well allow
>> autodiscovery via either element. Any implementation guide should
>> recommend duplicating the information in the interest of autodiscovery
>> actually working.
>>
>> [1] https://bugzilla.mozilla.org/show_bug.cgi?id=57399
>
> -1 to saying in the spec that you can use either element, and in the
> guide saying to use both if you want it to work, not just look pretty.

You're absolutely right. I was thinking in more immediate terms before,
but if we're going to make this part of the Atom working group, land of
well-defined and reasonable specs, everything should work.

> As I remember it, when RSS autodiscovery started this cowpath,
> aggregator developers generally didn't have an SGML parser handy, and
> weren't especially happy about the idea of having to write their own
> HTML parser. Finding one (or a few) of relatively few <link>s in the
> first bit of the document feels a lot easier than having to look at
> every <a> in the whole document.
>
> Now? I'd say most don't have an SGML parser handy, and won't be
> especially happy about writing their own HTML parser. It's fairly rare
> for someone to comment out bits of their <head>, and quite common for
> them to comment out huge swaths of their <body>, including things a
> template came with, like <a href="../xml/index.atom" rel="feed">Atom
> feed</a>, with no thought that something will be seeing and using that
> invisible link with an incorrect path. I added Atom autodiscovery to
> my current aggregator, Feed on Feeds, with a ten second
> copy/paste/change mime-type of the results of it using a regular
> expression on the HTML. If instead I had to correctly parse the entire
> HTML document, I'd... switch to something in Python, I guess.

Is there something wrong with the HTML parsers?

Perl has HTML::Parser
Python has htmllib.py
Ruby has ymHTML and a port of the Python library called html-parser
PHP has PHP-HTML
Common Lisp has phtml
The W3C provides a simple parser written in C

I'm sure I can find more, but I think the above is a sufficiently long
list to illustrate my point.
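
To make that concrete, here is a minimal sketch of autodiscovery over both
<link> and <a> using a stock HTML parser. Assumptions, none of which come
from the thread: it uses `html.parser` from the current Python standard
library (htmllib was dropped in Python 3), the names `FeedLinkFinder` and
`find_feeds` and the example.org URLs are made up for illustration, and the
rule for what counts as a feed link (rel="feed", or rel="alternate" with a
feed MIME type) is my guess at a reasonable rule, not anything a spec says.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed set of feed MIME types; adjust to taste.
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Hypothetical helper: collect feed URLs from <link> and <a> elements."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names for us.
        if tag not in ("link", "a"):
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # rel is a space-separated token list; valueless attributes are None.
        rels = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower().strip()
        if "feed" in rels or ("alternate" in rels and mime in FEED_TYPES):
            self.feeds.append(urljoin(self.base_url, href))


def find_feeds(html_text, base_url):
    """Return absolute feed URLs found in html_text, in document order."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html_text)
    finder.close()
    return finder.feeds


page = """<html><head>
<link rel="alternate" type="application/atom+xml" href="/index.atom">
</head><body>
<!-- <a href="../xml/index.atom" rel="feed">Atom feed</a> -->
<a rel="feed" href="feeds/comments.atom">Comments</a>
</body></html>"""

print(find_feeds(page, "http://example.org/blog/"))
# ['http://example.org/index.atom', 'http://example.org/blog/feeds/comments.atom']
```

Note that the commented-out <a> is never reported: a real parser hands the
comment over as a comment, so the stale link that a regular expression might
pick up is simply not seen as an element.
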

> Then, since I foolishly took the Firefox bug for better autodiscovery,
> I'll also need to do it where I do have an excellent HTML parser, but
> I have to do it on every single page that every single Firefox user
> loads, whether or not they have any interest in feeds, or subscribed
> to the feed ten thousand loads of that particular page ago. <link> is
> easy, we've got a DOMLinkAdded event and most pages have very few of
> them. <a>? Well, the performance hit probably won't be noticeable on
> most pages.

This is a single XPath query. Gecko has native support for it. I'm not
sure about the others, but Sarissa is a fine library for DOM manipulation
(including XSLT and XPath) from JavaScript and it works with IE, KHTML,
Opera ...

> Phil Ringnalda

Of course, if your XML library copes with all the errors present in
normal HTML, it's probably nicer to use than any HTML parser.

The point here is that most developers have access to an HTML parser. I
admit that these parsers might need patching, but at least 90% of the
work is done. I'll try to find time to examine each of these libraries
and make any changes needed. Hopefully they're already in good shape or
the authors are open to this sort of thing. If all else fails, there's
forking.

If the problem is ignorance, I'll happily maintain a list. I'm also
willing to write some sample implementations in all of the languages I
listed before and more.

I don't think this is terribly difficult. In fact, I just took a shot at
altering Feed on Feeds to support this and found it incredibly easy.

patch: http://zaphod.student.umd.edu/~atrus/FoF_mod/a-support.patch

There's other stuff in the same directory if you want to poke at it. The
changes just use PHP-HTML, which I mentioned earlier.

Cheers,
-Nikolas 'Atrus' Coukouma
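
For reference, the "single XPath query" mentioned above might look roughly
like this. It is only a sketch under stated assumptions: it uses Python's
`xml.etree.ElementTree`, which implements a limited XPath subset and needs
well-formed XHTML as input; the function name `feed_anchors` and the sample
document are invented for illustration; and the predicate matches only an
exact rel="feed", not a multi-token rel value.

```python
import xml.etree.ElementTree as ET

# Prefix map for the XHTML namespace; the prefix "x" is arbitrary.
XHTML = {"x": "http://www.w3.org/1999/xhtml"}


def feed_anchors(xhtml_text):
    """Hypothetical helper: hrefs of every <a rel="feed"> in an XHTML document."""
    root = ET.fromstring(xhtml_text)
    # One query: every <a> anywhere in the tree whose rel is exactly "feed".
    return [a.get("href") for a in root.findall(".//x:a[@rel='feed']", XHTML)]


doc = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Example</title></head>
<body><p>
<a href="index.atom" rel="feed">Atom feed</a>
<a href="about.html">About</a>
</p></body>
</html>"""

print(feed_anchors(doc))
# ['index.atom']
```
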