Re: Does xml:base apply to type=html content?
At 19:17 06/04/04, Anne van Kesteren wrote: Quoting James Holderness [EMAIL PROTECTED]: If `Content-Location` is not usable or can't be used consistently on a website (for example, using it for both Atom and HTML content) I suggest we specify something that is consistent with what browsers do. And perhaps try to obsolete the relevant header if possible... Isn't this something the HTTP WG should be doing? Just for the record, there is currently no HTTP WG. The mailing list of the former HTTP WG still exists. As this was an IETF WG, and is an IETF mailing list (hosted by W3C), you can easily join and bring this issue up. Here's the necessary data: List-Id: ietf-http-wg.w3.org List-Help: http://www.w3.org/Mail/ List-Subscribe: mailto:[EMAIL PROTECTED] List-Unsubscribe: mailto:[EMAIL PROTECTED] Regards, Martin. I guess so. The HTML WG (W3C, same concept) should be doing a lot of things as well. That doesn't mean it actually happens... -- Anne van Kesteren http://annevankesteren.nl/
Re: Does xml:base apply to type=html content?
Quoting A. Pagaltzis [EMAIL PROTECTED]: * James Holderness [EMAIL PROTECTED] [2006-04-04 03:25]: The way I see it, until a standards body tells us otherwise, we are obliged to support the Content-Location header unless we can provide a very good argument for ignoring it. +1, standards aren’t there for people to cherry-pick the parts they find convenient or useful. If interoperable implementations for standards are not possible, they are useless. The goal of having standards is interoperable implementations. Opera has removed support for Content-Location (or at least partially, not sure of the details) for the same reasons as Firefox. (This also isn't really about convenient or useful...) -- Anne van Kesteren http://annevankesteren.nl/
Re: Does xml:base apply to type=html content?
Anne van Kesteren wrote: Quoting James Holderness [EMAIL PROTECTED]: I think the issue of neutral link bookmarking is unlikely to be a problem for Atom aggregators though. Server bugs are another thing, but I think most feeds will be broken without an explicit xml:base anyway, so maybe that's not worth worrying about. I'm not sure though. Should the WG recommend ignoring Content-Location as a base URI, or should aggregators follow the RFC exactly as specified? If `Content-Location` is not usable or can't be used consistently on a website (for example, using it for both Atom and HTML content) I suggest we specify something that is consistent with what browsers do. And perhaps try to obsolete the relevant header if possible... The problem is not with Content-Location, but with RFC 3986. It says that same-document references must be resolved with respect to the base URI. It adds that when those references are resolved, they should not result in a new retrieval action, but that does not help with things like bookmarking (as James pointed out), and is almost impossible to implement. -- Sjoerd Visscher http://w3future.com/weblog/
Re: Does xml:base apply to type=html content?
Anne van Kesteren wrote: If `Content-Location` is not usable or can't be used consistently on a website (for example, using it for both Atom and HTML content) I suggest we specify something that is consistent with what browsers do. And perhaps try to obsolete the relevant header if possible... Isn't this something the HTTP WG should be doing? If they have plans to obsolete the header then I think it makes sense for us to discourage its use, but so far I've seen nothing suggesting they're even contemplating that. The way I see it, until a standards body tells us otherwise, we are obliged to support the Content-Location header unless we can provide a very good argument for ignoring it. The Firefox developers may well have done that for web browsers, but I'm not yet convinced that their issues are necessarily applicable to aggregators. Regards James
Re: Does xml:base apply to type=html content?
* James Holderness [EMAIL PROTECTED] [2006-04-04 03:25]: The way I see it, until a standards body tells us otherwise, we are obliged to support the Content-Location header unless we can provide a very good argument for ignoring it. +1, standards aren’t there for people to cherry-pick the parts they find convenient or useful. Regards, -- Aristotle Pagaltzis // http://plasmasturm.org/
Re: Does xml:base apply to type=html content?
Quoting James Holderness [EMAIL PROTECTED]: I think the issue of neutral link bookmarking is unlikely to be a problem for Atom aggregators though. Server bugs are another thing, but I think most feeds will be broken without an explicit xml:base anyway, so maybe that's not worth worrying about. I'm not sure though. Should the WG recommend ignoring Content-Location as a base URI, or should aggregators follow the RFC exactly as specified? If `Content-Location` is not usable or can't be used consistently on a website (for example, using it for both Atom and HTML content) I suggest we specify something that is consistent with what browsers do. And perhaps try to obsolete the relevant header if possible... -- Anne van Kesteren http://annevankesteren.nl/
Re: Does xml:base apply to type=html content?
I also understand there is some debate whether supporting Content-Location is a good idea at all (at least in web browsers). Firefox at one point started adding support, but they determined that it caused problems with broken servers (mostly IIS I believe). As far as I know, support for Content-Location was reverted because in Firefox same-document references are broken when used with some sort of base URI. I.e. if you have a document with `<a href="#x">` and Content-Location: http://y/z, Firefox will go to the anchor named x in the document at http://y/z, instead of to the anchor named x in the current document. -- Sjoerd Visscher http://w3future.com/weblog/
Re: Does xml:base apply to type=html content?
Sjoerd Visscher wrote: As far as I know, support for Content-Location was reverted because in Firefox same-document references are broken when used with some sort of base URI. I.e. if you have a document with `<a href="#x">` and Content-Location: http://y/z, Firefox will go to the anchor named x in the document at http://y/z, instead of to the anchor named x in the current document. That's not quite the way I understood it. The bug that ended it all can be seen here: https://bugzilla.mozilla.org/show_bug.cgi?id=241981 I think comment #5 sums it up best. Basically, fragment links on a content-negotiated page would expose (via the Content-Location support) the URI of the actual page rather than the original neutral URI. This becomes an issue when you start bookmarking such links and sending them to your friends whose browsers don't necessarily support the same content types as yours. As for the other issues with Content-Location and broken servers, you can see most of the details here: https://bugzilla.mozilla.org/show_bug.cgi?id=109553 It appears to be mostly problems with Microsoft and Oracle servers. The solution at the time was to keep adding those servers to a blacklist (I died a little inside when I read that). Ultimately that became unnecessary when they backed out the fix. I think the issue of neutral link bookmarking is unlikely to be a problem for Atom aggregators though. Server bugs are another thing, but I think most feeds will be broken without an explicit xml:base anyway, so maybe that's not worth worrying about. I'm not sure though. Should the WG recommend ignoring Content-Location as a base URI, or should aggregators follow the RFC exactly as specified? Regards James Holderness
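The same-document case discussed in this message can be reproduced with any RFC 3986 resolver. A minimal sketch, assuming Python's standard `urllib.parse`; the retrieval URI `http://example.org/page` and the helper `looks_like_same_document` are hypothetical, and only `#x` and `http://y/z` come from the example in the thread:

```python
from urllib.parse import urljoin, urldefrag

retrieval_uri = "http://example.org/page"  # hypothetical URI the user actually requested
content_location = "http://y/z"            # header value from the example in the thread

# Resolve the fragment-only reference against the Content-Location base.
resolved = urljoin(content_location, "#x")
print(resolved)  # http://y/z#x


def looks_like_same_document(target, current):
    """Naive check: two URIs name the same document if they match once fragments are stripped."""
    return urldefrag(target).url == urldefrag(current).url


# A client that compares against the URI it retrieved sees a different document,
# so it navigates (or bookmarks) http://y/z#x instead of scrolling in place.
print(looks_like_same_document(resolved, retrieval_uri))     # False
print(looks_like_same_document(resolved, content_location))  # True
```

The second comparison is the one RFC 3986 implies (compare against the base URI), which is why a client honoring Content-Location has to remember the base it used and not just the address it fetched.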
Re: Does xml:base apply to type=html content?
Friday, March 31, 2006, 3:31:12 AM, A. Pagaltzis wrote: In that scenario, either the tag soup from the other feeds must be fixed up so the view can be rendered as XHTML (which supports xml:base in content) XHTML 1.0 doesn't support xml:base does it? As I understand it, only specs that say that they support xml:base allow you to put xml:base on their elements, but any spec that allows URIrefs has the concept of a base-URI, so for envelope specs such as Atom, you'd expect xml:base in the envelope to set the base-URI for the content. -- Dave
Re: Does xml:base apply to type=html content?
Friday, March 31, 2006, 4:34:48 AM, you wrote: The escaped HTML content contained within the content element that David was originally concerned with is more than likely a copy of all or part of the elements and content contained inside the body tag of the external document referenced by an associated link element, and therefore no guarantee that the xml:base of the atom feed is going to be anywhere even close to accurate. That might be exactly the case where the xml:base is useful: the content came from different places, had relative URI-refs, so the xml:base was set on each entry to the source URIs of each document so that the relative links work[*] in both cases. [*] in theory. -- Dave
Re: Does xml:base apply to type=html content?
Oh, I agree that if this can be done in a consistent manner, then using the xml:base attribute on each //entry/content element would be absolutely WONDERFUL. The trick is to figure out how to ensure a consistent level of availability and accuracy of the value of each instance of content/@xml:base.

In thinking this through a bit, as long as the original URI is intact, and the URI itself contains the correct URI for each resource that is referenced, then the ability exists via various mechanisms in XSLT 2.0 to build a simple transformation file that could parse any document, well-formed or not, and extract all of the original URIs, returning them in a simple RDF reference index, to then be used to determine what base URI should be used, which resources are external, and build out the result in a way that keeps the 404s to an absolute minimum.

Actually, what we really need is a nice SHA1 Decentralized HASH BASH weekend session to index all known resources, integrating the result string into every known file format specification and ensuring that a resource always knows how to find his way back to where he/she/it came from. How 'bout you guys build that system, and in the meantime I'll write some quick and dirty transformation files to tide us over 'til you're done, K? I mean, what else are you gonna do after APP releases... You'll have LOADS of extra time on your hand. ;) K, Ready, set, go!

On 3/31/06, David Powell [EMAIL PROTECTED] wrote: Friday, March 31, 2006, 4:34:48 AM, you wrote: The escaped HTML content contained within the content element that David was originally concerned with is more than likely a copy of all or part of the elements and content contained inside the body tag of the external document referenced by an associated link element, and therefore no guarantee that the xml:base of the atom feed is going to be anywhere even close to accurate. That might be exactly the case where the xml:base is useful: the content came from different places, had relative URI-refs, so the xml:base was set on each entry to the source URIs of each document so that the relative links work[*] in both cases. [*] in theory. -- Dave

-- M:D/M. David Peterson http://www.xsltblog.com/
Re: Does xml:base apply to type=html content?
hand*s*. On 3/31/06, M. David Peterson [EMAIL PROTECTED] wrote: I mean, what else are you gonna do after APP releases... You'll have LOADS of extra time on your hand. ;) K, Ready, set, go! -- M:D/M. David Peterson http://www.xsltblog.com/
RE: Does xml:base apply to type=html content?
However, exempting `@type='html'` content from xml:base processing won't help. Agreed and an excellent point. I guess that the end-result of this is that regardless of how one wants to interpret any of the relevant specs on this issue, a client should assume that xml:base applies to URI references in @type=html content. -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of A. Pagaltzis Sent: Thursday, March 30, 2006 6:31 PM To: Atom Syntax Subject: Re: Does xml:base apply to type=html content? * Sean Lyndersay [EMAIL PROTECTED] [2006-03-31 04:00]: This is unfortunate, because HTML itself only allows base elements in the header (one per page). So if anyone wants to build a client that displays more than one item at a time using a standard HTML renderer (and most clients render HTML using someone else's renderer, not their own), they have to go groveling in HTML to do URL fixup (or use iframes). That's exactly the problem currently facing Liferea. However, exempting `@type='html'` content from xml:base processing won't help. If the items can come from multiple feeds, such as is supported by Liferea, then mixing items from an Atom feed that uses xml:base and other feeds automatically runs into the same issue. In that scenario, either the tag soup from the other feeds must be fixed up so the view can be rendered as XHTML (which supports xml:base in content), or URL fixup needs to be done on the content from the Atom feed so it can be passed to a tag soup renderer. Regards, -- Aristotle Pagaltzis // http://plasmasturm.org/
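The "URL fixup" option mentioned above, rewriting relative references in type="html" content to absolute ones before handing the markup to a tag soup renderer, could look roughly like the following. This is only a sketch under stated assumptions: the function name `absolutize` and the example URIs are hypothetical, and the regular expression handles just quoted `href` and `src` attributes. A real client would more likely use a proper HTML parser.

```python
import re
from urllib.parse import urljoin

# Matches href="..." / src='...' with a quoted value; unquoted values are NOT handled.
_URL_ATTR = re.compile(
    r"""(?<![\w-])((?:href|src)\s*=\s*)(["'])(.*?)\2""",
    re.IGNORECASE | re.DOTALL,
)


def absolutize(html, base_uri):
    """Return html with quoted href/src values resolved against base_uri."""
    def fix(match):
        prefix, quote, value = match.group(1), match.group(2), match.group(3)
        return f"{prefix}{quote}{urljoin(base_uri, value)}{quote}"
    return _URL_ATTR.sub(fix, html)


# base_uri would be the xml:base in effect on the atom:content element (hypothetical value here).
snippet = '<a href="../a.html">x</a> <img src=\'pic.png\'>'
print(absolutize(snippet, "http://example.org/blog/2006/"))
```

Once every item's content has been rewritten this way, items from different feeds can share one HTML page without needing more than one base element.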
RE: Does xml:base apply to type=html content?
I haven't looked in detail at how IE does on the xml:base conformance tests, since the current beta has no support for xml:base. In light of that fact, I'm glad we failed outright instead of halfway; halfway would have been weird :). We're actually implementing xml:base support right now (and in the process, fixing the relative URL issue that Sam Ruby pointed out in our normalization format), so we'll be broken on those conformance tests for a while. The fix won't make it out in the next public release, but it should make the one after that. I'll let you know how we do on those tests when the code is done. Sean -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of James Holderness Sent: Thursday, March 30, 2006 9:24 PM To: Atom Syntax Subject: Re: Does xml:base apply to type=html content? Sean Lyndersay wrote: In my own case (IE7), this isn't that big a deal because we have to grovel in HTML for many other reasons, but I suspect it'd be a pain for other clients. Looking at the results of the Atom XmlBaseConformanceTests [1], most of the clients tested seemed capable of handling relative references inside HTML to some extent. Even the ones that don't necessarily pass all the tests at least get enough right to suggest that they're on the right track. IE7 is actually one of the few clients that I would consider to have failed outright. Is the latest beta any better at handling xml:base or do these problems still exist? Regards James [1] http://www.intertwingly.net/wiki/pie/XmlBaseConformanceTests
Re: Does xml:base apply to type=html content?
Friday, March 31, 2006, 11:02:18 AM, Sean Lyndersay wrote: I haven't looked in detail at how IE does on the xml:base conformance tests, since the current beta has no support for xml:base. In light of that fact, I'm glad we failed outright instead of halfway; halfway would have been weird :). We're actually implementing xml:base support right now (and in the process, fixing the relative URL issue that Sam Ruby pointed out in our normalization format), so we'll be broken on those conformance tests for a while. The fix won't make it out in the next public release, but it should make the one after that. I'll let you know how we do on those tests when the code is done. Great. It would be good if you could preserve the effective base-URI of feeds and entries, so that applications using Atom extensions that contain relative URIRefs can resolve them into URIs. I suppose that it could be done by pinning an absolute xml:base onto the channel and item elements. -- Dave
Re: Does xml:base apply to type=html content?
* David Powell [EMAIL PROTECTED] [2006-03-31 09:55]: XHTML 1.0 doesn't support xml:base does it? As I understand it, only specs that say that they support xml:base allow you to put xml:base on their elements, but any spec that allows URIrefs has the concept of a base-URI, so for envelope specs such as Atom, you'd expect xml:base in the envelope to set the base-URI for the content. To be honest, I’m not sure about the precise spec interactions for this case. What I do know however is that Gecko respects xml:base in XHTML content. Regards, -- Aristotle Pagaltzis // http://plasmasturm.org/
Re: Does xml:base apply to type=html content?
* M. David Peterson [EMAIL PROTECTED] [2006-03-31 07:55]: I'm speaking in terms of mashups... If a feed comes from one source, then I would agree... but mashups from both a syndication as well as an application standpoint are becoming the primary focus of EVERY major vendor. It's in this scenario that I see the problem of assuming the xml:base in current context has any value whatsoever. Pick a planet, any planet, and my point suddenly and immediately becomes relevant. No. That is only a problem if you just mash markup together without taking care to preserve base URIs by adding xml:base at the junction points as necessary. Copying an atom:entry from one feed to another correctly requires that you query the base URI which is in effect in the scope of the atom:entry in the source feed, and add an xml:base attribute to that effect to the copied atom:entry in the destination feed. If you do this, any xml:base attributes within the copy of the atom:entry will continue to resolve correctly. It’s much easier to get right than copying markup without violating namespace-wellformedness, even. Regards, -- Aristotle Pagaltzis // http://plasmasturm.org/
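The junction-point rule described above (query the base URI in effect on the source atom:entry, then pin it on the copy) can be sketched as follows. This is an illustration only, assuming Python's standard `xml.etree.ElementTree`; the helper names `in_scope_bases` and `copy_entry` and all example URIs are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
ATOM = "{http://www.w3.org/2005/Atom}"


def in_scope_bases(root, document_uri):
    """Yield (element, base URI in effect on that element) for the whole tree."""
    stack = [(root, document_uri)]
    while stack:
        elem, inherited = stack.pop()
        # An xml:base value may itself be relative; resolve it against the inherited base.
        base = urljoin(inherited, elem.get(XML_BASE, ""))
        yield elem, base
        stack.extend((child, base) for child in elem)


def copy_entry(entry, base_in_effect, dest_feed):
    """Copy an atom:entry into dest_feed, pinning its effective base as an absolute xml:base."""
    clone = copy.deepcopy(entry)
    clone.set(XML_BASE, base_in_effect)
    dest_feed.append(clone)
    return clone


source = ET.fromstring(
    '<feed xmlns="http://www.w3.org/2005/Atom" xml:base="http://example.org/blog/">'
    '<entry xml:base="2006/03/"><content type="html">&lt;img src="pic.png"&gt;</content></entry>'
    "</feed>"
)
dest = ET.fromstring('<feed xmlns="http://www.w3.org/2005/Atom"/>')

copies = [
    copy_entry(elem, base, dest)
    for elem, base in in_scope_bases(source, "http://example.org/feed.atom")
    if elem.tag == ATOM + "entry"
]
print(copies[0].get(XML_BASE))  # http://example.org/blog/2006/03/
```

Because the copied entry now carries an absolute xml:base, any relative xml:base values deeper inside it keep resolving to the same URIs regardless of where the destination feed is served from.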
Re: Does xml:base apply to type=html content?
On Mar 31, 2006, at 7:01 AM, A. Pagaltzis wrote: * M. David Peterson [EMAIL PROTECTED] [2006-03-31 07:55]: I'm speaking in terms of mashups... If a feed comes from one source, then I would agree... but mashups from both a syndication as well as an application standpoint are becoming the primary focus of EVERY major vendor. It's in this scenario that I see the problem of assuming the xml:base in current context has any value whatsoever. No. That is only a problem if you just mash markup together without taking care to preserve base URIs by adding xml:base at the junction points as necessary. Copying an atom:entry from one feed to another correctly requires that you query the base URI which is in effect in the scope of the atom:entry in the source feed, and add an xml:base attribute to that effect to the copied atom:entry in the destination feed. If you do this, any xml:base attributes within the copy of the atom:entry will continue to resolve correctly. It’s much easier to get right than copying markup without violating namespace-wellformedness, even.

Exactly. When creating a mashup feed, there are any number of things that the ... masher(?) has to be careful of--for example:

* Getting namespace prefixes right
* Creating an atom:source element and putting the right data into it
* Ensuring that all entries use the same character encoding
* Ensuring that the xml:lang in context is correct
* Ensuring that the xml:base in context is correct
* If any of the source data isn't Atom, ensuring that all the required elements exist (...even if the source data IS Atom--you never know when you're going to aggregate from an invalid Atom feed--then you have to decide whether to fix the entry or drop it to make your output correct)

If we start assuming that mashers can't do those correctly, then we may as well not be using Atom, or even XML. If we did a proper job of specifying Atom, then we should be able to hold publishers' feet to the fire and make them get their feeds right.

In Atom, xml:base is the mechanism used to determine base URIs.
Re: Does xml:base apply to type=html content?
Hey Aristotle,

Firstly, thanks for your recent effort to let the folks at O'Reilly know that some serious issues have been introduced into the system since the upgrade to Movable Type. I can't tell you how aggravated I was to get an email from Lawrence Lessig last Saturday, which was in response to the email I sent letting him know I had made an announcement regarding the stream and download... His paraphrased response was: What Post? All I get is a Page Not Found Error.

Of course the reason he got this was the fact that MovableType refuses to pull their head out and realize that if I update the post, that doesn't mean I want you to change the name of the html file as well!, and because I added a few extra links and such, that's exactly what took place, breaking the link I had sent him. I originally thought, well, maybe they're simply building in a versioning system of sorts, incrementing the file name by one number with each update, but in fact that's not what they're doing at all, as it's quite a random process as to what name it chooses to rename it to, often reverting back to the original name it had it set to, e.g. title_of_post_1.html will be changed to title_of_post_2.html and then back to title_of_post_1.html, or sometimes it will drop the number off entirely -- title_of_post.html -- and it's all COMPLETELY random, as you can make 15 updates to a post and nothing will change, and then suddenly it will decide to rename itself and break any and all links that you sent out, or that are being stored on del.icio.us, or whatever else.

Anyway, the point is that the reason everything is currently broken is because they moved things to a MovableType system 2 weeks ago, and in doing so it's been a bit of a mess as of late. To Justin's credit though, he's been fixing each and every problem as soon as that problem is made known, so if nothing else, at least there's someone both competent and willing to do what needs to be done to keep the system running.

And in fact, one of the problems (although fortunately there have been MANY, so it's not as bad as it could have been) came from yours truly, as I copied over the link for the popup Free Culture reader/player, and like an idiot didn't check to see if the place that provided the copy-paste code had quoted all of their attributes, and as such came to discover yesterday that, in fact, this was the cause of breaking the feed. How can a blogging engine written in the supposed Text Processing Wonder Language, and even further, one that has been around the game as long as anyone has, not even check for simple things like unquoted attributes, and instead output broken XML? Okay, rant now over... :)

On 3/31/06, A. Pagaltzis [EMAIL PROTECTED] wrote: No. That is only a problem if you just mash markup together without taking care to preserve base URIs by adding xml:base at the junction points as necessary.

Well yeah... but again, pick a planet, any planet. ;) I guess that's kind of my point... Even those of us who make code both our life and our livelihood have continued to run into too many issues, and have just given up, or simply decided "Let's just wait 'til Atom releases and then build our systems from the Atom feeds provided by the same folks". In fact, it seems to me that this might be the exact thing taking place, as I seem to be noticing that the overall quality has been getting better as of late. That said, I'm not sure if my concerns are founded on much of anything beyond trying to ensure that the little things that might get in the way can be cleared up and simply not be an issue any longer.

As such, if all that's been proven is that my comments and base:concerns are in fact completely off base... ;) SWEET!!! :D

Copying an atom:entry from one feed to another correctly requires that you query the base URI which is in effect in the scope of the atom:entry in the source feed, and add an xml:base attribute to that effect to the copied atom:entry in the destination feed. If you do this, any xml:base attributes within the copy of the atom:entry will continue to resolve correctly.

Yeah, I can totally see that... In fact I am trying to pull together some XSLT 2.0 transform code that can be used to, in essence, canonicalize the usage of xml:base and the associated URIs, and if I can get a solid hour this morning to finish it off, then I will post the location to this thread so you all can tear it to pieces, such that we quickly develop a simple utility that can be pushed out to the masses to build a local cache, and keep the base and relative URIs pristine clean. This all relates to both the ChannelXML and AtomicRSS code that are my current areas of focus, so

It's much easier to get right than copying markup without violating namespace-wellformedness, even.

I agree. As long as all of the proper URIs are in place, it's just a matter of determining the proper base value in regards to the URIs that each will relate to, and for the most part being able to call it good. You up for a PowerHack ExtremeXSLT session sometime in the next few hours? :D
Re: Does xml:base apply to type=html content?
Cool... Let's do it... I'm starting with Atom as I have already been working on an RSS to Atom transform, and it only takes 10 or so test feeds for you to realize this isn't a little 15-minute "throw it together and expect it to just work" type project, which by simple habit, when you have control over the source XML such that you can ensure a proper level of quality, that's definitely one of the habits that I am learning to unlearn, as the "once it works, it's almost certain to always work" just doesn't work with each of the RSS 0.XX.XXX.XX formats being thrown at you without any sort of sense of comfort that "There IS an end to all of this", as I'm coming to believe this is no longer a creature comfort I will be enjoying in the land of XSLT for quite some time ;) Let me get a few lines of code together, and send it back to this post... any and all comments, and even furthermore, testing anyone can throw at it will be GREATLY appreciated :)

On 3/31/06, Antone Roundy [EMAIL PROTECTED] wrote: On Mar 31, 2006, at 7:01 AM, A. Pagaltzis wrote: * M. David Peterson [EMAIL PROTECTED] [2006-03-31 07:55]: I'm speaking in terms of mashups... If a feed comes from one source, then I would agree... but mashups from both a syndication as well as an application standpoint are becoming the primary focus of EVERY major vendor. It's in this scenario that I see the problem of assuming the xml:base in current context has any value whatsoever. No. That is only a problem if you just mash markup together without taking care to preserve base URIs by adding xml:base at the junction points as necessary. Copying an atom:entry from one feed to another correctly requires that you query the base URI which is in effect in the scope of the atom:entry in the source feed, and add an xml:base attribute to that effect to the copied atom:entry in the destination feed. If you do this, any xml:base attributes within the copy of the atom:entry will continue to resolve correctly. It's much easier to get right than copying markup without violating namespace-wellformedness, even.

Exactly. When creating a mashup feed, there are any number of things that the ... masher(?) has to be careful of--for example:

* Getting namespace prefixes right
* Creating an atom:source element and putting the right data into it
* Ensuring that all entries use the same character encoding
* Ensuring that the xml:lang in context is correct
* Ensuring that the xml:base in context is correct
* If any of the source data isn't Atom, ensuring that all the required elements exist (...even if the source data IS Atom--you never know when you're going to aggregate from an invalid Atom feed--then you have to decide whether to fix the entry or drop it to make your output correct)

If we start assuming that mashers can't do those correctly, then we may as well not be using Atom, or even XML. If we did a proper job of specifying Atom, then we should be able to hold publishers' feet to the fire and make them get their feeds right.

In Atom, xml:base is the mechanism used to determine base URIs.

-- M:D/M. David Peterson http://www.xsltblog.com/
Re: Does xml:base apply to type=html content?
On Mar 30, 2006, at 9:20 PM, James M Snell wrote: I would agree that, as a best practice, the xml:base should appear on the content element, but implementations need to be prepared to use whatever the in-scope URI is (e.g. if no xml:base is specified, relative refs in the content will be relative to Content-Location or the feed's Request URI). Maybe. Highly error-prone. In other words, consumers of the feed *have* to assume that the current xml:base in context is going to be correct and publishers of the feed simply have to be responsible for Doing The Right Thing. Agreed. I think providing an xml:base in your feed is a best practice. -Tim
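The fallback order under discussion (explicit xml:base first, then Content-Location, then the request URI) can be sketched like this. It is a rough illustration, not a statement of what any particular aggregator does; the function `effective_base`, its `honor_content_location` switch, and the example URIs are all hypothetical.

```python
from urllib.parse import urljoin


def effective_base(request_uri, content_location=None, xml_bases=(),
                   honor_content_location=True):
    """Compute the base URI for content, given what the response and the feed supply.

    xml_bases lists the xml:base values on the ancestor chain, outermost first;
    each one may be relative to the base established before it.
    """
    base = request_uri
    # A relative Content-Location is interpreted relative to the request URI.
    if honor_content_location and content_location:
        base = urljoin(base, content_location)
    for value in xml_bases:
        base = urljoin(base, value)
    return base


# No xml:base at all: the result depends entirely on the HTTP layer.
print(effective_base("http://example.org/feeds/atom", content_location="atom.en.xml"))

# An absolute xml:base on the feed makes the HTTP layer irrelevant.
print(effective_base("http://example.org/feeds/atom",
                     content_location="http://cdn.example.net/f/atom.xml",
                     xml_bases=["http://example.org/blog/", "2006/03/"]))
```

The second call shows why an explicit absolute xml:base is the safer practice for publishers: the answer no longer changes with how, or from where, the feed was served.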
Re: Does xml:base apply to type=html content?
Good enough for me :) (although, I had already been convinced of this by the rest of you as well)

On 3/31/06, Tim Bray [EMAIL PROTECTED] wrote: On Mar 30, 2006, at 9:20 PM, James M Snell wrote: I would agree that, as a best practice, the xml:base should appear on the content element, but implementations need to be prepared to use whatever the in-scope URI is (e.g. if no xml:base is specified, relative refs in the content will be relative to Content-Location or the feed's Request URI). Maybe. Highly error-prone. In other words, consumers of the feed *have* to assume that the current xml:base in context is going to be correct and publishers of the feed simply have to be responsible for Doing The Right Thing. Agreed. I think providing an xml:base in your feed is a best practice. -Tim

-- M:D/M. David Peterson http://www.xsltblog.com/
Re: Does xml:base apply to type=html content?
On Mar 30, 2006, at 10:30 PM, James M Snell wrote: Antone Roundy wrote: [snip] 2) If you're consuming Atom and you encounter a relative URI, how should you choose the appropriate base URI with which to resolve it? I think there are only three remotely possible answers to #2: xml:base (including the URI from which the feed was retrieved if xml:base isn't explicitly defined), the URI of the self link, and the URI of the alternate link. Given that Atom explicitly supports xml:base, if it's explicitly defined, it's difficult to justify ignoring it in favor of anything else. There is no basis in any of the specs for using the URI of the self or alternate link as a base uri for resolving relative references in the content. The process for resolving relative references is very clearly defined. Right--my point is: 1) If the original publisher made the mistake of using relative references without explicitly setting xml:base (figuring that consumers could resolve the references relative to the location of the feed), and then the feed got moved or mirrored, one would certainly fail at finding the things the publisher intended to point to if the URI from which the feed was retrieved was used as the base URI, but might succeed by using the self link as the base URI. (I do not advocate doing this as default behavior, as stated below). 2) If the original publisher made the mistake of not even thinking about relative references in the content and therefore didn't set xml:base, the relative references may very well be relative to the location pointed to by the alternate link. For example, the person generating the content may have been thinking "my blog entry will appear at http://example.org/blog/2006/03/foo.html, so I can use the relative URL ../../../img/button.gif to point to the image at http://example.org/img/button.gif".
If the alternate link points to http://example.org/blog/2006/03/foo.html, then the consumer that wants to find the image will only succeed by using the alternate link as the base URI. (I do not advocate doing this as default behavior, as stated below). Moral of this story: failing to explicitly set xml:base is bad because it tempts consumers to ignore the spec in order to get what they want. I do not advocate ignoring the spec as default behavior. But honestly, I might give the user of a consuming application the option of overriding the default behavior on specific feeds if they know that the publisher makes the mistake of publishing links relative to the self or alternate link without setting xml:base. I'd LIKE to be able to hold the publisher's feet to the fire and make them fix the feed, but sometimes my users hold MY feet to the fire and make me give them usable workarounds. Antone
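The button.gif example can be checked with a few lines of Python. This is only a sketch: the mirror URI is an assumption made up for illustration, and `urljoin` stands in for whatever RFC 3986 resolver a consumer actually uses.

```python
from urllib.parse import urljoin

relative_ref = "../../../img/button.gif"

# The alternate link from the example in the message.
alternate_link = "http://example.org/blog/2006/03/foo.html"

# Hypothetical location of a mirrored copy of the feed (not from the thread).
mirrored_feed_uri = "http://mirror.example.net/feeds/example/blog/atom.xml"

# Resolving against the alternate link finds the image the author meant.
via_alternate = urljoin(alternate_link, relative_ref)

# Resolving against the URI the feed was retrieved from points at the mirror.
via_feed = urljoin(mirrored_feed_uri, relative_ref)

print(via_alternate)  # http://example.org/img/button.gif
print(via_feed)       # http://mirror.example.net/img/button.gif
```

Both results are spec-conformant resolutions; only the choice of base differs, which is why an explicit xml:base removes the guesswork.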
Re: Does xml:base apply to type=html content?
Quoting A. Pagaltzis [EMAIL PROTECTED]: * David Powell [EMAIL PROTECTED] [2006-03-31 09:55]: XHTML 1.0 doesn't support xml:base does it? As I understand it, only specs that say that they support xml:base allow you to put xml:base on their elements, but any spec that allows URIrefs has the concept of a base-URI, so for envelope specs such as Atom, you'd expect xml:base in the envelope to set the base-URI for the content. To be honest, I’m not sure about the precise spec interactions for this case. What I do know however is that Gecko respects xml:base in XHTML content. Opera 9 weeklies should do the same... -- Anne van Kesteren http://annevankesteren.nl/
Re: Does xml:base apply to type=html content?
Tim Bray wrote: On Mar 30, 2006, at 9:20 PM, James M Snell wrote: I would agree that, as a best practice, the xml:base should appear on the content element, but implementations need to be prepared to use whatever the in-scope URI is (e.g. if no xml:base is specified, relative refs in the content will be relative to Content-Location or the feeds Request URI). Maybe. Highly error-prone. Not sure what you mean by highly error-prone, but I do know that support for Content-Location in aggregators is essentially non-existent. I've run tests on 16 different aggregators and Snarfer was the only one that supported Content-Location as a base URI. Thunderbird was the next best in that it made use of the Location header when there was a redirect. A couple of others at least used the request URI. However the rest either used the feed alternate link, the element alternate link or the server hostname. Two didn't seem to support relative URIs at all. Aggregators tested: Blogbridge, Bloglines, BottomFeeder, FeedDemon, FeedReader, Google Reader, GreatNews, JetBrains Omea, Netvibes, Newsgator Online, NewzCrawler, RSSBandit, RSSOwl, Sharpreader, Snarfer, and Thunderbird. I also understand there is some debate whether supporting Content-Location is a good idea at all (at least in web browsers). Firefox at one point started adding support, but they determined that it caused problems with broken servers (mostly IIS I believe). I know there was some discussion about doing server detection and working around those servers, but some other issue came up that made them give up the whole idea (I can't remember what). I'm not sure whether any of these issues apply to feed readers. Regards James
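For readers who want to see what "supporting Content-Location as a base URI" would mean in code, here is a minimal sketch. It assumes the consumer honors Content-Location at all, which is exactly what the thread disputes, and the function name `effective_base_uri` is hypothetical.

```python
from urllib.parse import urljoin

def effective_base_uri(request_uri, content_location=None, xml_base=None):
    """Return the base URI for a feed document, innermost source winning.

    Order assumed here: request URI, then Content-Location (if honored),
    then an xml:base on the document element. Each value may itself be
    relative, so it is resolved against the base established so far.
    """
    base = request_uri
    if content_location:
        base = urljoin(base, content_location)
    if xml_base:
        base = urljoin(base, xml_base)
    return base

# Request URI only.
print(effective_base_uri("http://example.org/feed"))
# A relative Content-Location is resolved against the request URI.
print(effective_base_uri("http://example.org/feed", "/feeds/atom.xml"))
# A relative xml:base is resolved against whatever base is in effect.
print(effective_base_uri("http://example.org/feed", "/feeds/atom.xml", "blog/"))
```

An aggregator that skips the Content-Location step simply never passes that argument, which matches the behavior James reports for most of the clients he tested.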
RE: Does xml:base apply to type=html content?
This is unfortunate, because HTML itself only allows base elements in the header (one per page). So if anyone wants to build a client that displays more than one item at a time using a standard HTML renderer (and most clients render HTML using someone else's renderer, not their own), they have to go groveling in HTML to do URL fixup (or use iframes). In my own case (IE7), this isn't that big a deal because we have to grovel in HTML for many other reasons, but I suspect it'd be a pain for other clients. My own reading goes like this: Since xml:base is an XML concept, it should apply only to relative references in XML content (including XHTML). From the XML perspective, the HTML content is just a string, so the xml:base should not apply. Sean -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tim Bray Sent: Thursday, March 23, 2006 10:49 AM To: David Powell Cc: Atom Syntax Subject: Re: Does xml:base apply to type=html content? On Mar 23, 2006, at 10:03 AM, David Powell wrote: xml:base applies to type=xhtml content, but I'm not sure whether it is supposed to apply to escaped type=html content? I reckon that it does. RFC4287, section 2: Any element defined by this specification MAY have an xml:base attribute [W3C.REC-xmlbase-20010627]. When xml:base is used in an Atom Document, it serves the function described in section 5.1.1 of [RFC3986], establishing the base URI (or IRI) for resolving any relative references found within the effective scope of the xml:base attribute. Seems pretty clear to me. Yes, the base URI of that HTML is now whatever xml:base said it was -Tim
Re: Does xml:base apply to type=html content?
* Sean Lyndersay [EMAIL PROTECTED] [2006-03-31 04:00]: This is unfortunate, because HTML itself only allows base elements in the header (one per page). So if anyone wants to build a client that displays more than one item at a time using a standard HTML renderer (and most client render HTML using someone else's renderer, not their own), they have to go groveling in HTML to do URL fixup (or use iframes). That’s exactly the problem currently facing Liferea. However, exempting [EMAIL PROTECTED]'html'` content from xml:base processing won’t help. If the items can come from multiple feeds, such as is supported by Liferea, then mixing items from an Atom feed that uses xml:base and other feeds automatically runs into the same issue. In that scenario, either the tag soup from the other feeds must be fixed up so the view can be rendered as XHTML (which supports xml:base in content), or URL fixup needs to be done on the content from the Atom feed so it can be passed to a tag soup renderer. Regards, -- Aristotle Pagaltzis // http://plasmasturm.org/
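The "URL fixup" both messages mention could look roughly like the sketch below. It is an illustration only, not how Liferea or IE7 do it: the names `UrlFixup`, `fix_urls` and `URI_ATTRS` are hypothetical, only `href` and `src` are rewritten, and the fragment is re-serialized by Python's standard `html.parser`, so markup details such as attribute quoting may change.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: only these attributes carry URIs worth rewriting.
URI_ATTRS = {"href", "src"}

class UrlFixup(HTMLParser):
    """Re-emit an HTML fragment with relative URIs made absolute."""

    def __init__(self, base_uri):
        super().__init__(convert_charrefs=False)
        self.base_uri = base_uri
        self.out = []

    def _emit_tag(self, tag, attrs, closer):
        parts = [tag]
        for name, value in attrs:
            if value is None:            # attribute without a value
                parts.append(name)
                continue
            if name in URI_ATTRS:
                value = urljoin(self.base_uri, value)
            parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + closer)

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

def fix_urls(fragment, base_uri):
    parser = UrlFixup(base_uri)
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)

item = '<p>See <a href="img/a.png">this</a> &amp; <img src="/b.gif"></p>'
print(fix_urls(item, "http://example.org/blog/"))
```

Because each item is rewritten against its own base URI before display, items from different feeds can then share one rendered page without relying on a single HTML base element.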
Re: Does xml:base apply to type=html content?
I have to wonder why xml:base would apply to anything other than the hardline schema-specific @href attribute values of the structured document in which the schema directly applies to. Extending this, a good portion of an Atom document is fairly rigid in regards to what is and is not allowed until you reach the content element. Within the content element can be basically anything as long as it's either
- non-escaped plain text with a @type value set to text,
- escaped text, with a @type set to a valid 'text' mime-type,
- entity escaped with @type set to html,
- xhtml wrapped in a properly xhtml namespaced div with @type set to xhtml,
- base64 encoded with @type set to the proper media type, or
- it's xml with @type set to a proper XML mime-type.

In each of these cases, the only one that should have even a remote chance of the current value of the @xml:base in current context applying to is inline xml. But given the fact that those of us who are inlining xml (that isn't xhtml pulled from a referenced document) are doing so using a completely different namespace, schema, etc... then the chances that the current @xml:base value in context even making it into the related xml before being replaced by another @xml:base value is not all that great. And if it does? Then its context document is going to be its containing Atom file, in which xml:base would apply, but to what? It's in a different namespace, has a different schema that applies to it, which would then mean that the chances of the Atom savvy processor understanding that a particular element or attribute value is a URI, and should therefore apply the current @xml:base value in context to these values obviously is not something that fits within the confines of the Atom specification given the fact that there's no guarantee that a schema language it even partially understands is going to be applied to the contained content to act as a URI-guide for the now legally Blind as a BAtom processor. ;)

With all of this stated, if you're not all already sick of me, here's one last final point to help push you over the edge ;) :D The escaped HTML content contained within the content element that David was originally concerned with is more than likely a copy of all or part of the elements and content contained inside the body tag of the external document referenced by an associated link element, and therefore no guarantee that the xml:base of the atom feed is going to be anywhere even close to accurate. Of course for the Atom feed to validate correctly, the link element's @rel value will need to be either 'alternate', 'via', 'related', or a spec conforming IRI, as 'enclosure', if inline, is base64 encoded, and 'self'? Well now that wouldn't apply correctly to a link/@rel who has a grandparent by the name of feed, now would it :)

So this all brings us down to the last possible scenario... The @src of the content element. It would seem to me that if there is an @xml:base value currently in context, then as soon as it reaches the '<' character of the opening content element, it no longer has jurisdiction... Kind of like a Canadian mounty has to call it quits once He/She reaches to CA/USA borderline... Or something like that anyway :) Peace, Love, and all the Atomic Joy you can handle is wished upon all of you :)

On 3/30/06, Sean Lyndersay [EMAIL PROTECTED] wrote: [snip]

-- M:D/M. David Peterson http://www.xsltblog.com/
Re: Does xml:base apply to type=html content?
@href attribute *or other attribute or elements who's value CAN or MUST be a URI/IRI* On 3/30/06, M. David Peterson [EMAIL PROTECTED] wrote: I have to wonder why xml:base would apply to anything other than the hardline schema-specific @href attribute values of the structured document in which the schema directly applys to. [snip] -- M:D/M. David Peterson http://www.xsltblog.com/
Re: Does xml:base apply to type=html content?
Oopps Canadian *M*ount*ie* Sorry Tim! :) On 3/30/06, M. David Peterson [EMAIL PROTECTED] wrote: [snip] Kind of like a Canadian mounty has to call it quits once He/She reaches to CA/USA borderline... [snip] -- M:D/M. David Peterson http
Re: Does xml:base apply to type=html content?
In retrospect, it likely would have been a good idea for us to have covered this in the Atom spec. The definition of xml:base does include a statement that [t]he base URI for a URI reference appearing in text content is the base URI of the element containing the text. That would include URI references contained within the escaped HTML markup of Text constructs and the content element. - James Sean Lyndersay wrote: [snip]
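The scoping rule James cites (the base URI of text content is the base URI of the containing element) can be sketched with Python's standard ElementTree. The feed below is a trimmed, made-up sample that omits elements a valid Atom feed requires, and `content_bases` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
ATOM = "{http://www.w3.org/2005/Atom}"

def content_bases(feed_xml, retrieval_uri):
    """Return (content element, in-scope base URI) pairs for a feed."""
    results = []

    def walk(elem, base):
        # An xml:base on this element is resolved against the inherited base;
        # with no xml:base the inherited base is kept unchanged.
        base = urljoin(base, elem.get(XML_BASE, ""))
        if elem.tag == ATOM + "content":
            results.append((elem, base))
        for child in elem:
            walk(child, base)

    walk(ET.fromstring(feed_xml), retrieval_uri)
    return results

# Illustrative sample only; not a complete, valid Atom document.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom" xml:base="http://example.org/blog/">
  <entry xml:base="2006/03/">
    <content type="html">&lt;img src="button.gif"&gt;</content>
  </entry>
  <entry>
    <content type="html">&lt;a href="about.html"&gt;About&lt;/a&gt;</content>
  </entry>
</feed>"""

for content, base in content_bases(FEED, "http://mirror.example.net/atom.xml"):
    print(base, "->", content.text)
```

Under this reading, the escaped HTML in each content element is resolved against the base printed next to it, regardless of where the feed was fetched from.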
Re: Does xml:base apply to type=html content?
Then it should be a best practice that if they invoke this, the xml:base value should be set upon the element containing the text, in this case, the content element. Obviously you can't simply assume that the current xml:base in context has any direct relation, and therefore value to the current entry/content in context, as, using Aristotle's use case (and a billion others just like it -- if not a billion now, it won't be too long before that number is quite realistic, and in fact only scratching the Atom feed surface of the not too distant future), there is no way that one can simply assume that the current @xml:base value is legit. It seems to me that this current definition of xml:base didn't take into consideration the fact that the world would soon be revolving around XML mashups, all of which can contain any number of possible combinations of URI's of which may have absolutely nothing even remotely in common with another. Seems like maybe it's time for a quick update to the xml:base definition, as this is not just an issue that affects Atom syndication feeds. On 3/30/06, James M Snell [EMAIL PROTECTED] wrote: [snip] -- M:D/M. David Peterson http://www.xsltblog.com/
Re: Does xml:base apply to type=html content?
Sean Lyndersay wrote: In my own case (IE7) case, this isn't that big a deal because we have to grovel in HTML for many other reasons, but I suspect it'd be pain for other clients. Looking at the results of the Atom XmlBaseConformanceTests [1], most of the clients tested seemed capable of handling relative references inside HTML to some extent. Even the ones that don't necessarily pass all the tests at least get enough right to suggest that they're on the right track. IE7 is actually one of the few clients that I would consider to have failed outright. Is the latest beta any better at handling xml:base or do these problems still exist? Regards James [1] http://www.intertwingly.net/wiki/pie/XmlBaseConformanceTests
Re: Does xml:base apply to type=html content?
Antone Roundy wrote: [snip] 2) If you're consuming Atom and you encounter a relative URI, how should you choose the appropriate base URI with which to resolve it? I think there are only three remotely possible answers to #2: xml:base (including the URI from which the feed was retrieved if xml:base isn't explicitly defined), the URI of the self link, and the URI of the alternate link. Given that Atom explicitly supports xml:base, if it's explicitly defined, it's difficult to justify ignoring it in favor of anything else. There is no basis in any of the specs for using the URI of the self or alternate link as a base uri for resolving relative references in the content. The process for resolving relative references is very clearly defined. If xml:base isn't explicitly defined, there may be some justification for using the self link rather than the URI from which the feed was retrieved. It's sloppy on the publisher's part, but might be more likely to succeed in practice. -1. The alternate link is only a possible choice if there is at least one alternate link, and if either there is only one, or there are more than one, and all of them point to documents in the same directory. I'd say it's a fairly weak choice. Conclusion: you've got to resolve relative URIs with respect to SOMETHING, and clearly the best choice is xml:base if it's explicitly defined. If not, the self link and the URI from which the feed is retrieved each have some merit. Wrong. You've got to resolve relative URI's with respect to the proper base URI. Let's reserve the sloppy guessing hacks for specs that actually need them. - James
Re: Does xml:base apply to type=html content?
Yeah, I 100% agree with you on ALL of this... The break down was more to showcase "here's the best case scenario that's even remotely possible" but we all know that "remotely possible" and "real world reliable" are near polar opposites... Actually, I think what you have laid out here has quite a bit of real world merit. Definitely something that can be used to build some sort of foundation on in regards to best practice type efforts. In fact, if you take a look at my last follow-up it's obvious we're both in the same chapter, you're just near the end, and I just turned the first page. :)

On 3/30/06, Antone Roundy [EMAIL PROTECTED] wrote: On Mar 30, 2006, at 8:34 PM, M. David Peterson wrote: ...the content element can be basically anything as long as its either - non-escaped plain text with a @type value set to text, - escaped text, with a @type set to a valid 'text' mime-type - entity escaped with @type set to html, - xhtml wrapped in a properly xhtml namespaced div with @type set to xhtml, - base64 encoded with @type set to the proper media type, or - its xml with @type set to a proper XML mime-type. In each of these cases, the only one that should have even a remote chance of the current value of the @xml:base in current context applying to is inline xml

The escaped HTML content contained within the content element that David was originally concerned with is more than likely a copy of all or part of the elements and content contained inside the body tag of the external document referenced by an associated link element, and therefore no guarantee that the xml:base of the atom feed is going to be anywhere even close to accurate.

On what basis are you concluding that Atom publishers are more likely to be smart enough to set xml:base correctly when publishing inline XML than when publishing escaped HTML? What if the source material is tag soup HTML? You could clean it up and turn it into XHTML or publish it as is as escaped HTML. Either option is valid, and may be preferable in some situations. I don't see how any assumptions can be made about the publisher's ability to set xml:base correctly based on the content type. If you're assuming that xml:base is going to be set only at the top of the Atom document, then it may very well fail to be correct for a lot of the content. But xml:base may also be set on the entry or content element, and could easily be set correctly based on the publisher's knowledge of the appropriate base URI for the content.

Anyway, theoretical arguments aside, there are two questions to answer for the real world: 1) If you're publishing Atom, in which content @types can you use relative URIs with reasonable confidence that consumers will apply the base URI correctly? 2) If you're consuming Atom and you encounter a relative URI, how should you choose the appropriate base URI with which to resolve it?

I think there are only three remotely possible answers to #2: xml:base (including the URI from which the feed was retrieved if xml:base isn't explicitly defined), the URI of the self link, and the URI of the alternate link. Given that Atom explicitly supports xml:base, if it's explicitly defined, it's difficult to justify ignoring it in favor of anything else. If xml:base isn't explicitly defined, there may be some justification for using the self link rather than the URI from which the feed was retrieved. It's sloppy on the publisher's part, but might be more likely to succeed in practice. The alternate link is only a possible choice if there is at least one alternate link, and if either there is only one, or there are more than one, and all of them point to documents in the same directory. I'd say it's a fairly weak choice.

Conclusion: you've got to resolve relative URIs with respect to SOMETHING, and clearly the best choice is xml:base if it's explicitly defined. If not, the self link and the URI from which the feed is retrieved each have some merit. If that's the correct answer for #2, then in a reasonably perfect world, the answer to #1 should be that relative URIs should be safe anywhere as long as you're explicitly (and correctly!) defining xml:base. In the real world, I'd guess that more consuming applications will get it right in inline XML than in escaped HTML.

-- M:D/M. David Peterson http://www.xsltblog.com/
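Antone's ranking of candidate base URIs, including his condition on when the alternate link is usable at all, can be written down as a small sketch. It encodes his proposal only, which other posters in the thread reject as guessing; the function names are hypothetical, and `xml_base` is assumed to be already resolved to an absolute URI.

```python
import posixpath
from urllib.parse import urlsplit

def alternate_is_usable(alternate_links):
    """True if there is at least one alternate link and all of them
    point to documents in the same directory on the same host."""
    dirs = set()
    for link in alternate_links:
        parts = urlsplit(link)
        dirs.add((parts.scheme, parts.netloc, posixpath.dirname(parts.path)))
    return len(dirs) == 1

def candidate_bases(retrieval_uri, xml_base=None, self_link=None,
                    alternate_links=()):
    """List base URI candidates in the order the message ranks them."""
    if xml_base:
        # An explicit xml:base settles the question.
        return [xml_base]
    candidates = [retrieval_uri]          # what the specs actually define
    if self_link and self_link != retrieval_uri:
        candidates.append(self_link)      # heuristic for moved/mirrored feeds
    if alternate_is_usable(alternate_links):
        candidates.append(alternate_links[0])  # weakest heuristic
    return candidates

print(candidate_bases(
    "http://mirror.example.net/atom.xml",
    self_link="http://example.org/atom.xml",
    alternate_links=["http://example.org/blog/a.html",
                     "http://example.org/blog/b.html"],
))
```

A consumer following the specs strictly would use only the first element of the returned list; the later entries are the per-feed workarounds Antone says users sometimes demand.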
Re: Does xml:base apply to type=html content?
Yeah, agreed... In fact, I think at this stage of the game my inexperience in understanding all that must be considered during the development of a standard as far-reaching as Atom both is and, even more so, will be, is beginning to show through. Nonetheless, this is an area I want to learn as much about as I possibly can, so I'm going to simply chill out in the background and take notes for a bit, as you all obviously understand these things FAR beyond what I could even imagine at this stage. My pen and notepad are now firmly in place of where my keyboard once was... well, speaking in terms of just slightly in the future... like right now... :) On 3/30/06, James M Snell [EMAIL PROTECTED] wrote: I would agree that, as a best practice, the xml:base should appear on the content element, but implementations need to be prepared to use whatever the in-scope URI is (e.g. if no xml:base is specified, relative refs in the content will be relative to Content-Location or the feed's Request URI). In other words, consumers of the feed *have* to assume that the current xml:base in context is going to be correct and publishers of the feed simply have to be responsible for Doing The Right Thing. - James M. David Peterson wrote: Then it should be a best practice that if they invoke this, the xml:base value should be set upon the element containing the text, in this case, the content element. Obviously you can't simply assume that the current xml:base in context has any direct relation, and therefore value, to the current entry/content in context, as, using Aristotle's use case (and a billion others just like it -- if not a billion now, it won't be too long before that number is quite realistic, and in fact only scratching the Atom feed surface of the not too distant future), there is no way that one can simply assume that the current @xml:base value is legit. 
It seems to me that this current definition of xml:base didn't take into consideration the fact that the world would soon be revolving around XML mashups, all of which can contain any number of possible combinations of URIs, which may have absolutely nothing even remotely in common with one another. Seems like maybe it's time for a quick update to the xml:base definition, as this is not just an issue that affects Atom syndication feeds. On 3/30/06, James M Snell [EMAIL PROTECTED] wrote: In retrospect, it likely would have been a good idea for us to have covered this in the Atom spec. The definition of xml:base does include a statement that "[t]he base URI for a URI reference appearing in text content is the base URI of the element containing the text." That would include URI references contained within the escaped HTML markup of Text constructs and the content element. - James Sean Lyndersay wrote: This is unfortunate, because HTML itself only allows base elements in the header (one per page). So if anyone wants to build a client that displays more than one item at a time using a standard HTML renderer (and most clients render HTML using someone else's renderer, not their own), they have to go groveling in HTML to do URL fixup (or use iframes). In my own case (IE7), this isn't that big a deal because we have to grovel in HTML for many other reasons, but I suspect it'd be a pain for other clients. My own reading goes like this: Since xml:base is an XML concept, it should apply only to relative references in XML content (including XHTML). From the XML perspective, the HTML content is just a string, so the xml:base should not apply. Sean -Original Message- From: [EMAIL PROTECTED] On Behalf Of Tim Bray Sent: Thursday, March 23, 2006 10:49 AM To: David Powell Cc: Atom Syntax Subject: Re: Does xml:base apply to type=html content? 
On Mar 23, 2006, at 10:03 AM, David Powell wrote: xml:base applies to type=xhtml content, but I'm not sure whether it is supposed to apply to escaped type=html content? I reckon that it does. RFC4287, section 2: Any element defined by this specification MAY have an xml:base attribute [W3C.REC-xmlbase-20010627]. When xml:base is used in an Atom Document, it serves the function described in section 5.1.1 of [RFC3986], establishing the base URI (or IRI) for resolving any relative references found within the effective scope of the xml:base attribute. Seems pretty clear to me. Yes, the base URI of that HTML is now whatever xml:base said it was. -Tim -- M:D/ M. David Peterson http://www.xsltblog.com/ -- M:D/ M. David Peterson http://www.xsltblog.com/
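For a client in the position Sean describes, the "URL fixup" step might look roughly like the following Python sketch. It assumes the consumer has already unescaped the type=html content and worked out the base URI in scope; the names `BaseFixer` and `absolutize` are hypothetical, only `href` and `src` attributes are rewritten, and comments and script/style content are not handled, so this is an outline of the idea and not a production sanitizer.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: only these attributes carry URI references we care about.
URL_ATTRS = {"href", "src"}


class BaseFixer(HTMLParser):
    """Re-emit HTML with relative href/src values resolved against a base."""

    def __init__(self, base):
        super().__init__(convert_charrefs=True)
        self.base = base
        self.out = []

    def _tag(self, tag, attrs, closing=""):
        parts = [tag]
        for name, value in attrs:
            if value is None:  # attribute without a value, e.g. <input disabled>
                parts.append(name)
                continue
            if name in URL_ATTRS:
                value = urljoin(self.base, value)
            parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + closing + ">")

    def handle_starttag(self, tag, attrs):
        self._tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._tag(tag, attrs, " /")

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Character references were decoded by the parser; re-escape on output.
        self.out.append(escape(data, quote=False))


def absolutize(html_text, base):
    fixer = BaseFixer(base)
    fixer.feed(html_text)
    fixer.close()
    return "".join(fixer.out)
```

Calling `absolutize(fragment, base)` once per entry, with the base in scope for that entry's content element, sidesteps the one-base-per-page limit of HTML when several items are rendered in a single page.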
Re: Does xml:base apply to type=html content?
I'm speaking in terms of mashups... If a feed comes from one source, then I would agree... but mashups, from both a syndication as well as an application standpoint, are becoming the primary focus of EVERY major vendor. It's in this scenario that I see the problem of assuming the xml:base in current context has any value whatsoever. Pick a planet, any planet, and my point suddenly and immediately becomes relevant. On 3/30/06, Antone Roundy [EMAIL PROTECTED] wrote: On Mar 30, 2006, at 10:00 PM, M. David Peterson wrote: Then it should be a best practice that if they invoke this, the xml:base value should be set upon the element containing the text, in this case, the content element. Obviously you can't simply assume that the current xml:base in context has any direct relation, and therefore value, to the current entry/content in context, as, using Aristotle's use case (and a billion others just like it -- if not a billion now, it won't be too long before that number is quite realistic, and in fact only scratching the Atom feed surface of the not too distant future), there is no way that one can simply assume that the current @xml:base value is legit. I disagree. The best practice should be to set xml:base explicitly in any document using relative URIs, and at any point in the document where the relative URIs appear, ensure that the xml:base in context is the correct base URI by overriding it if necessary. If this practice is followed, and only if this practice is followed, then consumers will be able to reliably resolve relative URIs. I see no justification for assuming that the xml:base in context is invalid and using some other base URI just because xml:base is set somewhere other than the containing element. It's a pretty sorry world if we not only assume, but operate on the assumption that publishers are and will continue to be that inept. 
Just to amplify one point: you can't simply assume that the current xml:base in context has any direct relation... What you can't simply assume is that the xml:base in context does NOT have any direct relation to the content. Part of the point of XML is that we'll all be better off if consumers rely on publishers doing things correctly (in this case, getting xml:base right) and hold publishers to it until they get it right. Antone -- M:D/ M. David Peterson http://www.xsltblog.com/
Re: Does xml:base apply to type=html content?
On 31/3/06 3:08 PM, Antone Roundy [EMAIL PROTECTED] wrote: The escaped HTML content contained within the content element that David was originally concerned with is more than likely a copy of all or part of the elements and content contained inside the body tag of the external document referenced by an associated link element, and therefore there is no guarantee that the xml:base of the Atom feed is going to be anywhere even close to accurate. I'm doing something similar right now, scraping some website that doesn't provide feeds for what I want. I check the HTML of the page I scraped and if it has a base I use that, else I use the URL I used to fetch the page. The tag soup I extract for each entry contains relative references. I really don't want to go fixing that tag soup so I just stick that base URL into xml:base for each entry (and not just at the top of the feed, because I'm scraping paginated results). e.
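The scraping approach described here could be sketched in Python along these lines. Nothing below comes from the poster's actual code: `BaseFinder`, `page_base`, and `make_entry` are hypothetical names, and the sketch assumes the first `<base href>` found in the page wins and that the extracted tag soup is stored untouched as type=html content.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


class BaseFinder(HTMLParser):
    """Remember the href of the first <base> element, if any."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag == "base" and self.href is None:
            self.href = dict(attrs).get("href")


def page_base(page_html, fetch_url):
    """Use the page's own <base href> if present, else the URL it was fetched from."""
    finder = BaseFinder()
    finder.feed(page_html)
    return urljoin(fetch_url, finder.href) if finder.href else fetch_url


def make_entry(title, tag_soup, base):
    """Build an atom:entry with xml:base set per entry, leaving the tag soup as is."""
    entry = ET.Element("{%s}entry" % ATOM_NS, {XML_BASE: base})
    ET.SubElement(entry, "{%s}title" % ATOM_NS).text = title
    content = ET.SubElement(entry, "{%s}content" % ATOM_NS, {"type": "html"})
    content.text = tag_soup  # escaped automatically when serialized
    return entry
```

Setting xml:base on each entry, and not once at the top of the feed, matters here because entries scraped from different result pages can have different bases.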
Does xml:base apply to type=html content?
xml:base applies to type=xhtml content, but I'm not sure whether it is supposed to apply to escaped type=html content? I reckon that it does. Has anybody come across this? Any opinions? -- Dave