Re: [whatwg] A new attribute for <video> and low-power devices
On 18/5/09 22:28, Benjamin M. Schwartz wrote:
> Then I will attempt to convince you. Suppose the additional attribute is a boolean called "decorative", defaulting to "false" if not present.

Note that /if/ ARIA roles are ultimately incorporated in text/html, that would more or less duplicate role="presentation". http://www.w3.org/TR/wai-aria/#presentation

-- Benjamin Hawkes-Lewis
Re: [whatwg] External document subset support
Hello,

I don't want to go too far off topic here, but I'll respond to the points as I do think it illustrates one of the uses of entities (localization)--which would apply to some degree in XHTML (at least for entities) as well as in XML.

Kristof Zelechovski wrote:
> Using entities in XSL to share code was my mistake once too; it is similar to using data members not wrapped in properties in data types. XSL itself provides a better structured approach for code reuse.

Unless you're talking about variables, I guess I'd need elaboration, but I don't want to go too far off track on list here...

> Being able to use localized programming language constructs is at the same time trivial (replace this with that),

I think that depends on how familiar the script and language are to you (cognates help many non-English Europeans, whereas the same does not apply elsewhere). To take some of my wife's family's younger cousins, for example, who are not particularly educated yet who use computers as many Chinese do, they found it much easier to get a grasp of this "Chinese XHTML" than the English one, even though they had had some previous English instruction.

I think actual research would need to be done on this, since it is well possible that only programmer types make it past the barrier to entry, and then, they may be even more inclined to dismiss the benefits for others less skilled; i.e., "I did it, so others should", or they want to get away from their linguistic background distinctiveness, or have perhaps irrational fears that this would lead to their people being satisfied with lower standards, etc. (just as many oppose bilingual education even while it may even help transition students to the mainstream language).

> expensive (you have to translate the documentation)

Not sure what you mean by cost of translating the documentation. Cost for whom? If your code is intended for that audience--e.g., Chinese code at a Chinese website--who needs to translate anything? On the contrary, they avoid the need to translate...

> and not that useful (you freeze the language and cut the programmers off from the recent developments in the language).

I don't think it would be that hard to update the translating template--it's not that difficult. But I'm definitely not talking about relying on this anyways. There are big advantages to having a common language as far as the ability to learn from others' code from people around the world, etc. But just as I replied to someone on another list who said this was not "semantic", this is very much semantic to those for whom it is their native language--perhaps even more in the spirit of pure XML (though Babelizing semantics even further, no doubt, if people actually started using this on a large scale, as search engines would have to be aware of either the post-transformation result or the localized XML, etc.).

> Languages tend to use English keywords regardless of the culture of their designer because: 1. no matter how deep you go, there is always a place where you have to switch to English in order to refer to some precedent technology,

Yes, like in my use of (though no doubt browsers could be fairly trivially programmed to recognize localized processing instructions, as well). Anyways, again, I'm in favor of a common language, and would even hope very much that countries around the world could democratically agree on an official standard (including possibly English, which if its use is as widespread and popular as its proponents believe, should have little problem obtaining a democratic majority) so that children will everywhere begin earlier to have access to such a common language. Nevertheless, if you're a beginner, having to deal with one line of English is a lot easier than having to deal with a whole syntax in English, if that's not your native language.

I think the fact that a number of open source projects I've encountered still have not only comments but also even variables in the programmer's original language is evidence that there is some desire for convenient localization. If you have tools that translate it before serving the code, it is still available anyways.

> 2. the English words/roots used in the language design often have a slightly different meaning from the English source,

Maybe, but it is much easier to learn a few exceptions which are probably at least related in meaning, than to have to learn something completely foreign. Would you like to learn an Arabic-script XHTML, even if there was a one-to-one mapping from your keyboard already? Of course you could, but you have to admit it would take you a little time, especially if you were not already inclined to do coding/markup. It's not only a vocabulary issue here, but a script issue too--moreover, using that script may force you to switch between your keyboard layouts each time you want to make a document.

> 3. they are suffic
[whatwg] Reserving "id" attribute values?
To comply with XML ID requirements and to facilitate future transitions to XML, can HTML 5 explicitly encourage id attribute values to follow the XML ID pattern (e.g., disallowing a digit as the starting character)?

Also, a minor erratum: the link http://www.whatwg.org/specs/web-apps/current-work/#refsCSS21 is broken (in section 3.2).

Brett
Re: [whatwg] longdesc [was: A new attribute for <video> and low-power devices]
On Mon, May 18, 2009 at 8:08 PM, Jim Jewett wrote:
> The 99% misused is at best debatable. I'm pretty sure that using a
> longer human-readable description instead of a URL was once
> (admittedly long ago) recommended.

In HTML 3.2, longdesc didn't exist: http://www.w3.org/TR/REC-html32#img

In HTML 4, it's required to be a URI: http://www.w3.org/TR/html4/struct/objects.html#adef-longdesc-IMG

Your other points seem reasonable, though. I'll also note, for the record, that since r25335 in August 2007, MediaWiki doesn't generate bogus longdescs all over the place, so that might improve the statistics if they were run again today.
[whatwg] longdesc [was: A new attribute for <video> and low-power devices]
> In the ~0.1% of images where
> longdesc= is used, it's misused literally over 99% of the time:
> http://blog.whatwg.org/the-longdesc-lottery

Responding for the archive; that blog post keeps getting cited, but it isn't up to Mark's usual standards. longdesc is not a success story, but neither is it the miserable failure suggested by those numbers.

The 99.9% unused is (or at least was) probably close to correct, and is a good thing. I just checked the front page of CNN, where there are 137 images, of which at most one would benefit from a longdesc -- and even that one is pretty questionable.

The 99% misused is at best debatable. I'm pretty sure that using a longer human-readable description instead of a URL was once (admittedly long ago) recommended. It worked at least as well with the browsers I tested with at the time.

Blanks should be treated the same way as blank alts -- an explicit statement that this image does not need a long description.

URLs which are redundant to something else in the area are actually a good thing, since that "something" isn't standardized. (aria-describedby should offer a better solution going forward.)

http://wiki.whatwg.org/wiki/Longdesc_usage makes it clear that useful (if not pedantically correct) usage is much greater than 1% of the actual usage. Not as high as it should be, certainly, but still better than, say, the percentage of tables which represent data rather than layout.

-jJ
Re: [whatwg] A new attribute for <video> and low-power devices
On Mon, May 18, 2009 at 5:28 PM, Benjamin M. Schwartz wrote:
> Authors who are only testing on modern desktops will, as you say, likely
> ignore this issue. I therefore fully expect that they will never set this
> attribute.

Isn't that like saying that authors who are only testing on normal browsers will likely ignore the longdesc= attribute? It seems like most authors do just ignore it, but the ones who don't get it wrong far more often than they get it right. In the ~0.1% of images where longdesc= is used, it's misused literally over 99% of the time: http://blog.whatwg.org/the-longdesc-lottery

It thus ends up being so useless for users that even if you do provide a good longdesc, no one will actually use it. There's so little signal and so much noise that screenreader users just don't bother checking it, if they even know that it exists.

It thus seems like it would be prudent to wait on implementation experience to see if a new attribute is actually needed here. Adding attributes that don't affect most users is a recipe for widespread misuse. In the worst case, browsers might very well refuse to support the attribute because it's come into wide misuse before any browser actually supports it, so supporting it breaks sites. (I'm pretty sure there are examples of this happening, although I can't think of any offhand.)
Re: [whatwg] A new attribute for <video> and low-power devices
Simon Pieters wrote:
> On Mon, 18 May 2009 18:59:01 +0200, Benjamin M. Schwartz wrote:
>
>> Simon Pieters wrote:
>>> If there is a controls attribute or if scripting is disabled, show controls, else use author-provided scripted button (if any) to play the video.
>>
>> Consider a webpage in which a side-effect of clicking on some scripted button is to trigger a small animation (using <video>) elsewhere on the page. If your browser is configured to show <video> full-screen, this webpage will become nearly unusable, because the small animation will take over the screen every time you click on a button.
>
> I'm not convinced that this will be a problem in practice.
>
>> I am proposing an additional attribute for <video> so that the browser will know not to do that.
>
> I'm not convinced that an additional attribute would solve the problem: it is likely that some authors would use the attribute incorrectly, because it doesn't have any effect in their primary testing environment. If an author sets the attribute where it shouldn't be set, it effectively makes the video unavailable to users whose UA acts upon the attribute, which seems bad.

Then I will attempt to convince you. Suppose the additional attribute is a boolean called "decorative", defaulting to "false" if not present.

Authors who are only testing on modern desktops will, as you say, likely ignore this issue. I therefore fully expect that they will never set this attribute. If the attribute is not set, then most browsers should assume that the video may be of some significance, and ensure that the user can play it.

I think the risk of authors accidentally setting "decorative" on critical videos is small. I also think that if a popular mobile browsing platform were to respect this flag, major websites would use it correctly and user experience would be improved.

> I think a more effective solution is to give a non-modal message to the user saying "This page is trying to play a video. Press the Foo key to play.", or similar.

Are you going to pop up a message of this kind for every <video> tag on every page? A page decorated with many small <video> tags in place of animated GIFs is going to be quite difficult to use in a mobile browser where each one is associated with a different approval dialog, and approving causes them to take over the 4-inch screen.

--Ben
Re: [whatwg] A new attribute for <video> and low-power devices
On Mon, 18 May 2009 18:59:01 +0200, Benjamin M. Schwartz wrote:

> Simon Pieters wrote:
>> If there is a controls attribute or if scripting is disabled, show controls, else use author-provided scripted button (if any) to play the video.
>
> Consider a webpage in which a side-effect of clicking on some scripted button is to trigger a small animation (using <video>) elsewhere on the page. If your browser is configured to show <video> full-screen, this webpage will become nearly unusable, because the small animation will take over the screen every time you click on a button.

I'm not convinced that this will be a problem in practice.

> I am proposing an additional attribute for <video> so that the browser will know not to do that.

I'm not convinced that an additional attribute would solve the problem: it is likely that some authors would use the attribute incorrectly, because it doesn't have any effect in their primary testing environment. If an author sets the attribute where it shouldn't be set, it effectively makes the video unavailable to users whose UA acts upon the attribute, which seems bad.

I think a more effective solution is to give a non-modal message to the user saying "This page is trying to play a video. Press the Foo key to play.", or similar.

-- Simon Pieters Opera Software
Re: [whatwg] DOMTokenList is unordered but yet requires sorting
DOMTokenList, as an object, is semantically unordered, therefore an arbitrary ordering can be used for enumeration. The item method of DOMTokenList provides an enumerator and imposes such an ordering. Since no other enumerator is available to counter the claim, it may be tempting to say, as a simplification, that DOMTokenList itself is ordered. I would rather discourage you from putting it that way though because it precludes inventing another enumerator in the future or as an extension. HTH, Chris
Re: [whatwg] A new attribute for <video> and low-power devices
Simon Pieters wrote:
> If there is a controls attribute or if scripting is disabled, show
> controls, else use author-provided scripted button (if any) to play the
> video.

Consider a webpage in which a side-effect of clicking on some scripted button is to trigger a small animation (using <video>) elsewhere on the page. If your browser is configured to show <video> full-screen, this webpage will become nearly unusable, because the small animation will take over the screen every time you click on a button.

I am proposing an additional attribute for <video> so that the browser will know not to do that.

--Ben
Re: [whatwg] DOMTokenList is unordered but yet requires sorting
On Mon, May 18, 2009 at 00:18, Simon Pieters wrote:
> Imagine if it is specified that the order is not relevant and implementations can use any order (so long as it's stable). So one UA uses one order and another uses another. Then one of those UAs becomes very popular. Web pages start to depend on the order of the popular UA (e.g. they use the first item and expect it to be the "right" one). Now those pages don't work in the less popular UA and that UA vendor has to reverse engineer the popular UA and implement the same order.
>
> The above has happened with the DOM Core .attributes attribute, IIRC.

It also happened to for-in order for JS objects.

Simon, I think you have convinced me at least. I therefore think that a better wording in the spec is to say that DOMTokenList acts as a sorted set.

-- erik
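To make the "sorted set" wording concrete, here is a minimal sketch of a token collection whose enumeration order does not depend on insertion order. The class name SortedTokenSet and its methods are hypothetical and are not the DOMTokenList interface from the spec; the sketch only shows why a fixed sort order removes the cross-browser ordering difference described above.

```javascript
// Hypothetical illustration only: not the DOMTokenList API.
class SortedTokenSet {
  constructor(text = "") {
    this.tokens = [];
    // Split on whitespace and ignore empty pieces.
    for (const token of text.split(/\s+/)) {
      if (token) this.add(token);
    }
  }

  // Adds a token if it is not already present, then restores sorted order.
  add(token) {
    if (!this.tokens.includes(token)) {
      this.tokens.push(token);
      this.tokens.sort(); // default sort compares UTF-16 code units
    }
  }

  // Returns the token at the given position, or null when out of range.
  item(index) {
    return index >= 0 && index < this.tokens.length ? this.tokens[index] : null;
  }

  get length() {
    return this.tokens.length;
  }
}

// Two differently ordered inputs enumerate identically.
const first = new SortedTokenSet("menu active  big");
const second = new SortedTokenSet("big menu active menu");
console.log(first.item(0), second.item(0));
```

Because item() always walks the same sorted array, a page that reads the first item gets the same answer in every implementation that follows this rule.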
Re: [whatwg] A new attribute for <video> and low-power devices
On Mon, 18 May 2009 16:59:03 +0200, Benjamin M. Schwartz wrote:

> Simon Pieters wrote:
>> Is there a problem with always falling back to the poster image and just play the video (full-screen or on-top) when the user indicates he wants to see the video?
>
> If every menu button has a <video> tag associated with it to show a little 3D animation, then (a) how do you indicate to the user that it is a video without disrupting the page layout?,

You just show the poster image.

> and (b) how do you allow the user to request playback without interfering with the function of the button?

If there is a controls attribute or if scripting is disabled, show controls, else use author-provided scripted button (if any) to play the video.

If the device has a way to show context menus, then you could start playback from the context menu. You could also have a separate view that lists all media on the page, similar to Firefox's Media tab in View Page Info.

-- Simon Pieters Opera Software
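The rule Simon describes for choosing how playback is started can be written down as a small decision function. This is only a sketch under stated assumptions: the function name playbackAffordance, the returned strings, and the plain object standing in for a <video> element are all invented for illustration and do not come from any browser or specification.

```javascript
// Hypothetical sketch of the rule quoted above; names are illustrative only.
// `video` is a plain object standing in for a <video> element.
function playbackAffordance(video, scriptingEnabled) {
  // "If there is a controls attribute or if scripting is disabled, show controls"
  if (video.controls || !scriptingEnabled) {
    return "native-controls";
  }
  // "else use author-provided scripted button (if any) to play the video"
  return "author-scripted-button";
}

console.log(playbackAffordance({ controls: false }, true));
```

The context-menu and media-list ideas from the same message would be additional entry points on top of this rule rather than replacements for it.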
Re: [whatwg] A new attribute for <video> and low-power devices
Simon Pieters wrote:
> Is there a problem with always falling back to the poster image and just
> play the video (full-screen or on-top) when the user indicates he wants
> to see the video?

If every menu button has a <video> tag associated with it to show a little 3D animation, then (a) how do you indicate to the user that it is a video without disrupting the page layout?, and (b) how do you allow the user to request playback without interfering with the function of the button?

--Ben
Re: [whatwg] Annotating structured data that HTML has no semantics for
On May 18, 2009, at 16:05, Eduard Pascual wrote:

On Mon, May 18, 2009 at 10:38 AM, Henri Sivonen wrote:

(If we were limited to reasoning about something that we don't have experience with yet, I might believe that people can't be too inept to use prefix-based indirection. However, a decade of actual evidence shows that actual behavior defies reasoning here and prefix-based indirection is something that both authors and implementors get wrong over and over again.)

Curious: you refer to "a decade of actual evidence", but you fail to refer to any actual evidence. I'm eager to see that evidence; could you share it with us? Thank you.

I thought everyone had seen the confusion. There are pointers at http://wiki.whatwg.org/wiki/Namespace_confusion The wiki page is less than a decade old, so its length isn't quite that impressive.

I have been a Java programmer for some years, and still find that convention absurd, horrible, and annoying. I'll agree that CURIEs are ugly, and maybe hard to understand, but reversed domains are equally ugly and hard to understand.

Problems shared by CURIEs, URIs and reverse DNS names:
* Long.
* Identifiers outlive organization charts.

Ehm. CURIEs ain't really long: the main point of prefixes is to make them as short as reasonably possible.

You need to consider the length of the prefix declarations, too.

Problems that reverse DNS names and URIs don't have but CURIEs have:
* Prefix-based indirection.

Indirection can't be taken as a problem when most currently used RDFa tools don't use it at all (which proves that they can work without relying on it).

What do you mean? Current RDFa tools don't use prefixes?

(I understand that if the microdata syntax offered no advantages over RDFa, then it would be a wasted effort to diverge.

Which are the advantages it offers?

The syntax is simpler for the use cases it was designed for. It uses a simpler conceptual model (trees as opposed to graphs). It allows short token identifiers. It doesn't use prefix-based indirection. It doesn't violate the DOM Consistency Design Principle.

Ok, the syntax is simpler for a subset of the use cases; but it leaves entirely out the rest of use cases.

What are the rest of the use cases? Why weren't they put forward when Hixie asked for use cases?

The DOM Consistency again is not an advantage of the microdata syntax because this could have been fulfilled with other syntaxes as well.

It's an advantage over RDFa-in-XHTML-served-as-text/html. It's not an advantage over microformats or may not be an advantage over a speculative yet undefined variation of RDFa.

It seems to me that it avoids much of what microformats advocates find objectionable

Could you specify, please? Do you mean anything else than WHATWG's almost irrational hate toward CURIEs and everything that involves prefixes?

RDFa uses a data model that is an overkill for the use cases.

Which use cases?

http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-April/019374.html

No, it *can't* represent a full RDF model: it has already been shown several times on this thread.

That's a feature.

What?? Being unable to deal with all the use cases is a feature??

Being simpler while addressing all the use cases is a feature.

Wait. Are you referring to microdata as an incremental improvement over RDFa?? IMO, it's rather a decremental enworsement.

That depends on the point of view. I'm sensing two major points of view:
1) Graphs are more general than trees. Hence, being able to serialize graphs is better.
2) Graphs are more general than trees. Hence, graphs are harder to design UIs for, harder to traverse and harder for authors to grasp. Hence, if trees are enough to address use cases, we should only enable trees to be serialized.

¬¬ Again, what's your basis to decide that "trees are enough to address use cases"?? Of course, they are enough to solve some use cases, but the convenience of dealing with just trees is not worth sacrificing the needs of those use cases you are arbitrarily deciding to ignore.

I don't see anything on http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-April/019374.html that doesn't boil down to trees or simple key-value pairs attached to an item. I subscribe to view #2, and it seems that trees are indeed enough for the use cases (that were stipulated by the pro-graph people!).

- Microdata can't represent the full RDF data model (while RDFa can): some complex structures are just not expressible with microdata.

That's not a use case. That's "theoretical purity".

It's not "theoretical purity", it's something simpler: *extensibility*. And, with over two decades between versions of the specs, this is a strong requirement: if a problem is noticed after HTML5 becomes "the standard", it's essential to be able to solve it without waiting 10 or 20 years for HTML6 to come out.

Well, you have to commit to some bounds on extensibility
Re: [whatwg] A new attribute for <video> and low-power devices
On Mon, 18 May 2009 16:22:29 +0200, Benjamin M. Schwartz wrote:

> As I have mentioned earlier, there are some devices that will be unable to render <video> faithfully inline, due to the limitations of hardware video accelerators. However, it occurs to me that there are two essentially different uses for <video>:
>
> 1. Important content for the webpage. An example would be the central video on a web page whose purpose is to allow users to view that video. This is currently done principally using Adobe Flash and (to a lesser extent) tags.
>
> 2. Incidental animations. Examples include decorative elements in a web page's interface, animated sidebar advertisements, and other small page elements of this kind. This was historically a popular use for animated-GIF, though Flash has largely overtaken it here as well.
>
> In case 1, a browser on a low-powered device may show the video "full-screen or in an independent resizable window" (to quote the spec). The browser might also show the video at the specified size, but on top of the page, rather than at its "correct" location in the middle of the rendering stack.
>
> However, for case 2, showing the video full-screen or moving it to the top of the rendering stack would clearly be a bad idea, as the video does not contain the content of interest to the user. In this case, if browsers cannot display the video as specified, they should probably fall back to the "poster" image.
>
> With the current <video> tag definition, browsers will have to grow ugly heuristics for this case, based on video's size, aspect ratio, "loop", and "controls". To avoid this heuristic hack, I suggest that <video> gain an additional attribute to indicate which behavior is preferable. A boolean attribute like "decorative", "incidental", or "significant" would greatly assist browsers in determining the correct behavior.

Is there a problem with always falling back to the poster image and just play the video (full-screen or on-top) when the user indicates he wants to see the video?

-- Simon Pieters Opera Software
[whatwg] A new attribute for <video> and low-power devices
As I have mentioned earlier, there are some devices that will be unable to render <video> faithfully inline, due to the limitations of hardware video accelerators. However, it occurs to me that there are two essentially different uses for <video>:

1. Important content for the webpage. An example would be the central video on a web page whose purpose is to allow users to view that video. This is currently done principally using Adobe Flash and (to a lesser extent) tags.

2. Incidental animations. Examples include decorative elements in a web page's interface, animated sidebar advertisements, and other small page elements of this kind. This was historically a popular use for animated-GIF, though Flash has largely overtaken it here as well.

In case 1, a browser on a low-powered device may show the video "full-screen or in an independent resizable window" (to quote the spec). The browser might also show the video at the specified size, but on top of the page, rather than at its "correct" location in the middle of the rendering stack.

However, for case 2, showing the video full-screen or moving it to the top of the rendering stack would clearly be a bad idea, as the video does not contain the content of interest to the user. In this case, if browsers cannot display the video as specified, they should probably fall back to the "poster" image.

With the current <video> tag definition, browsers will have to grow ugly heuristics for this case, based on video's size, aspect ratio, "loop", and "controls". To avoid this heuristic hack, I suggest that <video> gain an additional attribute to indicate which behavior is preferable. A boolean attribute like "decorative", "incidental", or "significant" would greatly assist browsers in determining the correct behavior.

--Ben
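The difference between the proposed attribute and the heuristic it is meant to replace can be sketched as two small functions. Everything here is an assumption made for illustration: "decorative" is only the attribute proposed in this message, the function names and returned strings are invented, the size threshold is arbitrary, and plain objects stand in for <video> elements.

```javascript
// Hypothetical sketch: "decorative" is the attribute proposed in this thread,
// not part of any specification. Plain objects stand in for <video> elements.
function chooseRendering(video, canRenderInline) {
  if (canRenderInline) return "inline";
  // Case 2: incidental animation, so fall back to the poster image.
  if (video.decorative) return "poster";
  // Case 1: important content, shown full-screen or on top of the page.
  return "fullscreen-or-on-top";
}

// The kind of guess a browser would need without the attribute.
// The size threshold is invented purely for illustration.
function guessDecorative(video) {
  const small = video.width * video.height < 160 * 120;
  return small && Boolean(video.loop) && !video.controls;
}

const menuAnimation = { decorative: true, width: 64, height: 48, loop: true, controls: false };
const mainMovie = { decorative: false, width: 640, height: 360, loop: false, controls: true };
console.log(chooseRendering(menuAnimation, false), chooseRendering(mainMovie, false));
```

With an explicit flag the first function needs no guessing; the second function shows how many unrelated properties a browser would otherwise have to combine, and any such combination can misclassify a small but important video.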
Re: [whatwg] External document subset support
Using entities in XSL to share code was my mistake once too; it is similar to using data members not wrapped in properties in data types. XSL itself provides a better structured approach for code reuse. Being able to use localized programming language constructs is at the same time trivial (replace this with that), expensive (you have to translate the documentation) and not that useful (you freeze the language and cut the programmers off from the recent developments in the language). Languages tend to use English keywords regardless of the culture of their designer because: 1. no matter how deep you go, there is always a place where you have to switch to English in order to refer to some precedent technology, 2. the English words/roots used in the language design often have a slightly different meaning from the English source, 3. they are sufficiently few to be learned easily; it may be harder to grasp what they actually mean in the particular context. (Toy languages for children make an exception, of course; however, even children tend to mock them nowadays.) Best regards, Chris
Re: [whatwg] Annotating structured data that HTML has no semanticsfor
Being unable to deal with all use cases sometimes is a feature. For example, regular expressions are unable to recognize all recursive languages; it is a feature. As a compensation for that loss, they do not suffer from the halting problem. HTH, Chris
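A small illustration of the trade-off, using balanced parentheses as the standard example of a language with unbounded nesting. This is a sketch with invented names (depthTwo, isBalanced): the pattern below uses only concatenation, grouping and repetition, so it can accept nesting up to a fixed depth, while the explicit counter handles any depth.

```javascript
// A pattern built from grouping and repetition only: it accepts sequences of
// parentheses nested at most two levels deep, and nothing deeper.
const depthTwo = /^(\((\(\))*\))*$/;

// A counter-based recognizer accepts balanced parentheses of any depth.
function isBalanced(text) {
  let depth = 0;
  for (const ch of text) {
    if (ch === "(") {
      depth += 1;
    } else if (ch === ")") {
      depth -= 1;
      if (depth < 0) return false; // closing parenthesis with no opener
    } else {
      return false; // only parentheses are part of this toy language
    }
  }
  return depth === 0;
}

console.log(depthTwo.test("(())"), depthTwo.test("((()))"), isBalanced("((()))"));
```

Supporting a third level in the pattern means writing a longer pattern, and no finite pattern of this kind covers every depth; that bounded expressiveness is the limitation the message describes as a feature.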
Re: [whatwg] Annotating structured data that HTML has no semantics for
On May 18, 2009, at 6:05 AM, Eduard Pascual wrote:

> On Mon, May 18, 2009 at 10:38 AM, Henri Sivonen wrote:
>> On May 14, 2009, at 23:52, Eduard Pascual wrote:
>>> On Thu, May 14, 2009 at 3:54 PM, Philip Taylor wrote:
>>> It doesn't matter one syntax or another. But if a syntax already exists (RDFa), building a new syntax should be properly justified.
>>
>> It was at the start of this thread: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-May/019681.html
>
> Ian's initial message goes step by step through the creation of this new syntax; but does *not* mention at all *why* it was being created in the first place. The insight into the choices taken is indeed a good thing, and I thank Ian for it; but he omitted to provide insight into the first choice taken: discarding the multiple options already available (not only Microformats and RDFa, but also other less discussed ones such as eRDF, EASE, etc).

I think Ian did explain why he discarded RDFa as an option. In the email linked above, Ian Hickson wrote:

Another solution we could consider is RDFa:

http://damowmow.com/";> Hedral Hedral is a male american domestic shorthair, with a fluffy black fur with white paws and belly.

This unfortunately also has a number of problems.

- it uses prefixes, which most authors simply do not understand, and which many implementors end up getting wrong (e.g. SearchMonkey hard-coded certain prefixes in its first implementation, Google's handling of RDF blocks for license declarations is all done with regular expressions instead of actually parsing the namespaces, etc). Even if implemented right, namespaces still lead to flaky copy-and-paste behaviour.

- it sometimes uses rel="" and sometimes uses property="" and it's hard to know when to use one or the other.

- it introduces much more power than is necessary to solve this problem.

I believe Microformats were discarded as a solution because the proposed use case was as follows:

USE CASE: Annotate structured data that HTML has no semantics for, and which nobody has annotated before, and may never again, for private use or use in a small self-contained community.

But Microformats are only intended for widely used and generally agreed upon public vocabularies. The Microformats process is not applicable to private-use/small-community vocabularies. And Microformats define specific vocabularies, not a general way to add new kinds of semantic markup. I expect Microformats experts would agree with this assessment.

So I think it is clear why neither Microformats nor RDFa were seen as suitable solutions to the use case, even if the matter was addressed somewhat briefly.

Regards, Maciej
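The "flaky copy-and-paste" concern about prefixes can be illustrated with a short sketch of CURIE expansion. The helper expandCurie, the prefix "pet" and the example.org vocabulary URI are all hypothetical; the only point is that a prefixed name is meaningless without the prefix declaration that was in scope where it was written.

```javascript
// Hypothetical helper: expands a CURIE such as "pet:name" using a prefix map.
// Returns null when the prefix has no declaration in scope.
function expandCurie(curie, prefixes) {
  const colon = curie.indexOf(":");
  if (colon === -1) return null;
  const prefix = curie.slice(0, colon);
  const reference = curie.slice(colon + 1);
  if (!Object.prototype.hasOwnProperty.call(prefixes, prefix)) return null;
  return prefixes[prefix] + reference;
}

// Declarations in scope in the original document (invented vocabulary URI).
const originalScope = { pet: "http://example.org/vocab/pet#" };
// A fragment pasted into another page, where the declaration was left behind.
const pastedScope = {};

console.log(expandCurie("pet:name", originalScope));
console.log(expandCurie("pet:name", pastedScope));
```

In the second call the markup itself is unchanged, yet the property can no longer be resolved, which is the indirection problem the quoted message raises. A full URI or a reverse DNS name carries its meaning with it when copied.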
Re: [whatwg] Annotating structured data that HTML has no semantics for
On Mon, May 18, 2009 at 10:38 AM, Henri Sivonen wrote: > On May 14, 2009, at 23:52, Eduard Pascual wrote: > >> On Thu, May 14, 2009 at 3:54 PM, Philip Taylor >> wrote: >> It doesn't matter one syntax or another. But if a syntax already >> exists (RDFa), building a new syntax should be properly justified. > > It was at the start of this thread: > http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-May/019681.html Ian's initial message goes step by step through the creation of this new syntax; but does *not* mention at all *why* it was being created on the first place. The insight into the choices taken is indeed a good think, and I thank Ian for it; but he omitted to provide insight into the first choice taken: discarding the multiple options already available (not only Microformats and RDFa, but also other less discussed ones such as eRDF, EASE, etc). Sure, there has been a lot of discussion on this topic; and it's possible that the choice was taken as part of such discussions. In any case, I think Ian should have clearly stated the reasons to build a brand new solution when many others have been out for a while and users have been able to try and test them. Please keep in mind that I'm not critizicing the choice itself (at least, not now), but the lack of information and reasoning behind that choice. > >> As >> of now, the only supposed benefit I have heard of for this syntax is >> that it avoids CURIEs... yet it replaces them with reversed domains?? >> Is that a benefit? > > There's no indirection. A decade of Namespaces in XML shows that both > authors and implementors have trouble getting prefix-based indirection > right. Really? I haven't seen any hint about that. Sure, there will be some people who have trouble understanding namespaces, just like there is some people who have trouble understanding why something like "foobar" is wrong. Please, could you quote a source for that claim? 
I could also claim something like "fifteen years of Java show that reversed domains are error-prone and harmful", and even argue about it; but this kind of arguments, without a serious analisis or study to back them, are completely meaningless and definitely subjective. > > (If we were limited to reasoning about something that we don't have > experience with yet, I might believe that people can't be too inept to use > prefix-based indirection. However, a decade of actual evidence shows that > actual behavior defies reasoning here and prefix-based indirection is > something that both authors and implementors get wrong over and over again.) Curious: you refer to "a decade of actual evidence", but you fail to refer to any actual evidence. I'm eager to see that evidence; could you share it with us? Thank you. > >> I have been a Java programmer for some years, and >> still find that convention absurd, horrible, and annoying. I'll agree >> that CURIEs are ugly, and maybe hard to understand, but reversed >> domains are equally ugly and hard to understand. > > Problems shared by CURIEs, URIs and reverse DNS names: > * Long. > * Identifiers outlive organization charts. Ehm. CURIEs ain't really long: the main point of prefixes is to make them as short as reasonably possible. Good identifiers outlive bad organization charts. Good organization outlives bad identifiers. Good organization and good identifier tend to outlive the context they are used in. > > Problems that reverse DNS names don't have but CURIEs and URIs do have: > * "http://"; 7 characters of even extra length. > * Affordance of dereferencability when mere identifier sementics are meant. A CURIE (at least as typed by an author) doesn't have the "http://": it is a prefix, a colon, and whatever goes after it. Once resolved (ie: after replacing the prefix and colon by what the prefix represents) what you get is no longer a CURIE, but a URI like the ones you'd type in your browser or inside a link's href attribute. 
Dereferenceability is not a problem in itself: having more than what is strictly needed can be either irrelevant or an advantage, not a problem. Of course, it *may* be the cause of some actual problem, but in that case you should rather describe the problem itself, so it can be evaluated.

> Problems that reverse DNS names and URIs don't have but CURIEs have:
> * Prefix-based indirection.

Indirection can't be taken as a problem when most currently used RDFa tools don't use it at all (which proves that they can work without relying on it). Sure, it's not as big an advantage as some may claim it to be. But the ability of indirection itself, even if not 100% guaranteed to work, is an actual advantage. As a real world example, I have been able to learn about vocabularies I didn't know by following the "links" on prefix declarations in documents using them.

> * Violation of the DOM Consistency Design Principle if xmlns:foo used.

*if* xmlns:foo is used. Very strong emphasis on the conditional, and on the multiple possibilities that have already been propo
[whatwg] DOM3 Load and Save for simple parsing/serialization?
One more thought... While it is great that innerHTML is being officially standardized, I'm afraid it would be rather hackish to have to use it for parsing and serializing dynamically created content which wasn't destined to make it immediately into the document, if at all. Has any thought been given to standardizing on at least a part of DOM Level 3 Load and Save in HTML5?

The API, if simply applied to serialization, would look like this:

    var ser = DOMImplementationLS.createLSSerializer();
    var str = ser.writeToString(document);

and like this for parsing to the DOM:

    var lsParser = DOMImplementationLS.createLSParser(1, null); // 1 for synchronous; null for no schema type
    var lsInput = DOMImplementationLS.createLSInput();
    lsInput.stringData = '';
    var doc = lsParser.parse(lsInput);

If a revision to the DOM3 module is not in order (which, e.g., simplifies the parsing from a string for simple cases) and the above is considered too cumbersome, maybe some other cross-browser standard could be agreed upon? I think using DOM3 would facilitate readily adding additional aspects of the module in the future (as ECMAScript seems to be positively albeit slowly expanding to ever new uses) and offer familiarity for those working in other contexts with DOM Level 3, while ECMAScript users can still wrap these in their own simpler functions. However, I can also see the desire for something simpler (as I say, maybe an addendum to the L&S module). But I do hope something might be considered, since I find this to be a quite frequent need and do not like relying on feature-checking for non-standard methods in the various browsers as well as being unclear on how to future-proof my code to work with standards-compliant browsers...

thanks,
Brett
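As a rough illustration of the "wrap these in their own simpler functions" idea and of the feature-checking mentioned above, here is a minimal sketch. The function name `serializeNode` and its `env` parameter are hypothetical, introduced only so the code can be exercised outside a browser. It assumes that, in ECMAScript, the DOMImplementationLS methods are reached through `document.implementation`, and it falls back to `XMLSerializer` where Load and Save is missing.

```javascript
// Hypothetical wrapper: serialize a DOM node with whichever API the host offers.
// `env` defaults to the global object; passing a stub makes the function testable.
function serializeNode(node, env) {
  env = env || globalThis;
  // Prefer DOM Level 3 Load and Save where the implementation object exposes it.
  const impl = env.document && env.document.implementation;
  if (impl && typeof impl.createLSSerializer === 'function') {
    return impl.createLSSerializer().writeToString(node);
  }
  // Otherwise fall back to the XMLSerializer object that many browsers ship.
  if (typeof env.XMLSerializer === 'function') {
    return new env.XMLSerializer().serializeToString(node);
  }
  throw new Error('No DOM serializer available in this environment');
}
```

In a browser this would be called as `serializeNode(document)`; the explicit `env` argument only exists to show the branching with stand-in objects.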
Re: [whatwg] Link rot is not dangerous
On May 18, 2009, at 14:45, Dan Brickley wrote: On 18/5/09 10:34, Henri Sivonen wrote: It seems to me that the positions that RDF applications should "Follow Their Nose" and that link rot is not dangerous (to RDF) are contradictory positions. That's a strong claim. There is certainly a balance to be found between taking advantage of de-referencable URIs and relying on their de-referencability. De-referencing is a privilege not a right, after all. If there's value in apps dereferencing namespace URIs, those URIs going undereferencable leads to loss of value. Hence, link rot would cause loss of value i.e. be 'dangerous' by breaking something. If I lost control of xmlns.com tomorrow, and it became un-rescuably owned by offshore spam-virus-malware pirates, that doesn't change history. For nine years, the FOAF documentation has lived there, and we can use URIs to ask other services about what they saw during that period: http://web.archive.org/web/*/http://xmlns.com/foaf/0.1/ Do any RDF consumer apps that dereference namespace URIs actually fall back on web.archive.org? If I'm a FOAF author, what recourse do I have if URI dereferencing-based functionality breaks in some apps due to xmlns.com going unavailable when other apps have hard-coded xmlns.com URIs so if I simply changed my predicates I'd break existing apps? At least authors who rely on Y!/AOL/Google serving JS libraries can start using a copy of any JS library on another CDN without changing how the script runs. Since there is useful information to know about FOAF properties and terms from its schema and human-oriented docs, it would be a shame if people ignored that. Since domain names can be lost, it would also be a shame if directly de-referencing URIs to the schema was the only way people could find that info. Fortunately, neither is the case. I wasn't talking about people but about apps dereferencing NS URIs to enable their functionality. 
That link rot hasn't been a practical problem to the Semantic Web community suggests that applications don't really Follow Their Nose in practice. Can anyone point me to a deployed end user application that uses RDF internally and Follows Its Nose? The search site, sindice.com does this: Thanks. Whether you consider sindice.com end-user facing or not, I don't know. I wouldn't characterize it as an end-user app. It exposes terms like "RDF" and "triples" and shows qnames to the user. -- Henri Sivonen hsivo...@iki.fi http://hsivonen.iki.fi/
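To make the fallback question concrete, here is a minimal sketch of what such a lookup order could look like in an application. Everything in it is hypothetical: `loadVocabulary`, the `overrides` map, and the injected `fetchText` function are names invented for the example and do not describe any existing RDF tool. The archive URL pattern is copied from the link quoted in this message and may well return an index of snapshots rather than the vocabulary document itself.

```javascript
// Hypothetical lookup order for a vocabulary document:
//   1. a local override list, 2. the live namespace URI, 3. an archived copy.
// `fetchText` is injected by the caller: (url) => string, or null on failure.
function loadVocabulary(nsUri, overrides, fetchText) {
  if (Object.prototype.hasOwnProperty.call(overrides, nsUri)) {
    return { source: 'override', body: overrides[nsUri] };
  }
  const live = fetchText(nsUri);
  if (live !== null) {
    return { source: 'live', body: live };
  }
  // Assumption: the archive is addressed with the pattern quoted in the thread.
  const archived = fetchText('http://web.archive.org/web/*/' + nsUri);
  if (archived !== null) {
    return { source: 'archive', body: archived };
  }
  // Nothing found: the identifier still works as a name, but cannot be followed.
  return { source: 'none', body: null };
}
```

The last branch is the case under debate: predicates keep comparing equal as strings, while any functionality that depended on dereferencing is lost.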
Re: [whatwg] Annotating structured data that HTML has no semantics for
Henri Sivonen wrote: The interesting question here is whether there's a better system. 1) Centralized allocation of short names. Sounds like "urn:" to me. Registry is defined in RFC 3406. 2) Prefixing a short name by (an abbreviation of) the name of the vocabulary, which makes the probability of collision negligible once the designer has googled to check the probable absence of public collisions at minting time (e.g. "openid.delegate"). Too fragile for disambiguation for my taste. That depends on the choice of the URI scheme. I guess one could use e.g. "data:,foo" URIs as a namespace URI, but why not just use "foo"? URIs give you the choice of having something easily referenceable (if you want), or not. Problems that reverse DNS names and URIs don't have but CURIEs have: * Prefix-based indirection. HTML developers regularly have to deal with a much more complicated indirection mechanism (CSS). This would be a persuasive argument if we were reasoning about a feature we don't have experience with yet. However, experience shows prefix-based indirection is too hard. If at the same time CSS isn't too hard, I just have to accept the evidence from the real world even if it defies reasoning. No, I don't think we have evidence that prefix-based indirection is too hard. There are way too many people getting it right. ... Either @prefix or RDFa-profiles would break the network effects of the deployment of outside-of-REC RDFa-in-XHTML-as-text/html, so if breaking network effects is on the table in the form of @prefix and RDFa-profiles, I don't see why microdata wouldn't be on the table as far as network effects go. Introducing @prefix will be much simpler to deploy than introducing a completely different system. That being said, I do agree that the current situation is a mess, and that the RDFa-in-XHTML spec has created it. Given the current situation, the simplest possible solution probably is to live with it, and use xmlns declarations in HTML for the purpose of RDFa as well. 
BR, Julian
Re: [whatwg] Link rot is not dangerous
On 18/5/09 10:34, Henri Sivonen wrote: On May 15, 2009, at 19:20, Manu Sporny wrote: There have been a number of people now that have gone to great lengths to outline how awful link rot is for CURIEs and the semantic web in general. This is a flawed conclusion, based on the assumption that there must be a single vocabulary document in existence, for all time, at one location. The "flawed" conclusion flows out of "Follow Your Nose" advocacy, and is not flawed if one takes "Follow Your Nose" seriously. It seems to me that the positions that RDF applications should "Follow Their Nose" and that link rot is not dangerous (to RDF) are contradictory positions. That's a strong claim. There is certainly a balance to be found between taking advantage of de-referencable URIs and relying on their de-referencability. De-referencing is a privilege not a right, after all. If I lost control of xmlns.com tomorrow, and it became un-rescuably owned by offshore spam-virus-malware pirates, that doesn't change history. For nine years, the FOAF documentation has lived there, and we can use URIs to ask other services about what they saw during that period: http://web.archive.org/web/*/http://xmlns.com/foaf/0.1/ Since there is useful information to know about FOAF properties and terms from its schema and human-oriented docs, it would be a shame if people ignored that. Since domain names can be lost, it would also be a shame if directly de-referencing URIs to the schema was the only way people could find that info. Fortunately, neither is the case. That link rot hasn't been a practical problem to the Semantic Web community suggests that applications don't really Follow Their Nose in practice. Can anyone point me to a deployed end user application that uses RDF internally and Follows Its Nose? The search site, sindice.com does this: "Yes Sindice dereferences URIs it finds in RDF instance data, including class and property URIs. 
It performs OWL reasoning using the retrieved information, mostly to infer additional triples based on subclass and subproperty relationships. Doing this helps us to increase recall in queries." (from Richard Cyganiak, who I asked offlist for confirmation) Whether you consider sindice.com end-user facing or not, I don't know. I put it in roughly the same category as Google's Social Graph API. But it's a non-trivial implementation that aggregates and integrates a lot of data. BTW here's another use case for identifying properties and classes by URI: we can decentralise the translation of their labels into other languages. Here are some Korean descriptions of FOAF, for example: http://svn.foaf-project.org/foaftown/foaf18n/foaf-kr.rdf cheers, Dan
Re: [whatwg] Annotating structured data that HTML has no semantics for
On May 18, 2009, at 12:18, Julian Reschke wrote: Henri Sivonen wrote: There's no indirection. A decade of Namespaces in XML shows that both authors and implementors have trouble getting prefix-based indirection right. It's true that people get this wrong again and again. But it's also true that lots of developers understand it once for all, and then consistently get it right. The interesting question here is whether there's a better system. 1) Centralized allocation of short names. 2) Prefixing a short name by (an abbreviation of) the name of the vocabulary, which makes the probability of collision negligible once the designer has googled to check the probable absence of public collisions at minting time (e.g. "openid.delegate"). I have been a Java programmer for some years, and still find that convention absurd, horrible, and annoying. I'll agree that CURIEs are ugly, and maybe hard to understand, but reversed domains are equally ugly and hard to understand. Problems shared by CURIEs, URIs and reverse DNS names: * Long. * Identifiers outlive organization charts. That depends on the choice of the URI scheme. I guess one could use e.g. "data:,foo" URIs as a namespace URI, but why not just use "foo"? Problems that reverse DNS names and URIs don't have but CURIEs have: * Prefix-based indirection. HTML developers regularly have to deal with a much more complicated indirection mechanism (CSS). This would be a persuasive argument if we were reasoning about a feature we don't have experience with yet. However, experience shows prefix-based indirection is too hard. If at the same time CSS isn't too hard, I just have to accept the evidence from the real world even if it defies reasoning. The syntax is simpler for the use cases it was designed for. It uses a simpler conceptual model (trees as opposed to graphs). It allows short token identifiers. It doesn't use prefix-based indirection. It doesn't violate the DOM Consistency Design Principle. 
(devil's advocate argument) - so how does the syntax behave for those use cases it *hasn't* been designed for? That's hard to test, because the use case search has been exhausted for the moment. It seems we'd need to wait to see new use cases to pop up. RDFa uses a data model that is an overkill for the use cases. It would be interesting to understand which use cases that RDFa can do are not supported by "microdata" (I don't understand enough about the subject to try myself), and whether the potential advantage of having a simpler model outweighs the disadvantage of not using network effects and creating a competing syntax. Are there use cases of RDFa that are currently known but that the call for use cases didn't turn up? Either @prefix or RDFa-profiles would break the network effects of the deployment of outside-of-REC RDFa-in-XHTML-as-text/html, so if breaking network effects is on the table in the form of @prefix and RDFa-profiles, I don't see why microdata wouldn't be on the table as far as network effects go. -- Henri Sivonen hsivo...@iki.fi http://hsivonen.iki.fi/
Re: [whatwg] External document subset support
On May 18, 2009, at 11:50, Brett Zamir wrote: Henri Sivonen wrote: On May 18, 2009, at 09:36, Brett Zamir wrote: Section 10.1, "Writing XHTML documents" observes: "According to the XML specification, XML processors are not guaranteed to process the external DTD subset referenced in the DOCTYPE." While this is true, since no doubt the majority of web browsers are already able to process external stylesheets or scripts, might the very useful feature of external entity files, be employed by XHTML 5 as a stricter subset of XML (similar to how XML Namespaces re-annexed the colon character) in order to allow this useful feature to work for XHTML (to have access to HTML entities or other useful entities for one, as well as enable a poor man's localization, etc.)? See http://hsivonen.iki.fi/no-dtd/ which explains why DTDs don't work for the Web in the general case. While that is a thoughtful and helpful article, your arguments there mostly relate to validation from a central spec. No, my arguments don't relate to validation but to having to dereference a URI that isn't under the author's control and that gets copied around as boilerplate. Also, as far as heavy server loads for frequent DTDs, entities could be deliberately not defined at a resolvable URL. There are existing XML doctypes out there with resolvable URIs, so you'd need a blacklist to bootstrap such a solution. The same problems of denial-of-service could exist with stylesheet requests, script requests, etc. No, styles and scripts are commonly site-specific, so there isn't a Web-wide single point of failure whose URI gets copied around as boilerplate. Even some sites, like Yahoo, have encouraged referring to their frequently accessed external files to take advantage of caching. At least the serving infrastructure for those URIs has been designed for high load unlike the server for many existing DTD URIs out there. 
Furthermore, JS libraries have obvious functionality in existing browsers, so it's unlikely that authors would reference JS libraries as part of boilerplate without actually intending to take the perf hit of loading the library. The spec could even insist on same-domain, though I don't see any need for that. Without same-origin (as in not even performing a CORS GET), you'd need to blacklist at least w3.org due to existing references out there. (Note that for security, same-origin/CORS is must-have anyway.) I also disagree with throwing our hands up in the air about character entities (or thinking that the (English-based) HTML ones are sufficient). That's a text input method issue that needs to be solved on the authoring side for text input of all kinds--not just text input for writing XML in a text editor. Moreover, the browser with the largest market share offers such support already, and those who depend on it may already view other browsers not supporting the standard as "broken". IE doesn't support XHTML or SVG which are the popular XML formats one might want to load into a browsing context. Loading same-origin DTDs for the purpose of localization is a semi-defensible case, but it's a lot of complexity for a use case that is way on the wrong side of 80/20 on the Web scale. How so? Localized sites are a minority on the Web, and chances that localized Web apps would switch to a client-side localization method that relies on server-side negotiation of the localization and requires XML to work seem dim. Even if it is a niche group which uses TEI, Docbook, etc. or who wants to be able to build say a browser extension which can take advantage of their rich semantics, this is still a use for citizens of the web. If you need a browser extension for content, you shut out users of browsers that don't have the particular extension available. It's like using Flash. 
If people can push forward with backwards-incompatible technologies like the video element, 3d-animation, or whatever, it seems not much to ask to support the humble external entity file... :) The upside of video and 3D is much more significant than the upside of supporting external DTDs. Besides, if the use case for DTDs is localization within an origin, the server can perform the XML parse and reserialize into DTDless XML. (That's how I've implemented this pattern in the past without client-side support.) That is assuming people are aware of scripting and have access to such resources. Localization with DTDs but without scripting is already tricky, since one would need to tweak conneg. Furthermore, localization with DTDs makes more sense for Web app UIs than static content, and Web apps typically have server-side program code anyway. Wasn't it one of the aims of the likes of XSL, XQuery, and XForms to use a syntax which doesn't require knowledge of an unrelated scripting language (and those are pretty complex examples unlike e
[whatwg] File package protocol and manifest support?
While this may be too late in the game to bring up, I'd very much be interested (and think others would be too) to have a standard means of representing not only individual files, but also groups of files on the web. One application of this would be for a web user to be able to do the following (taking advantage of both offline applications and related somewhat to custom protocols):

1) Click a link in a particular protocol containing a list of files or leading to a manifest file which contains a list of files. Very importantly, the files would NOT need to be from the same site.
2) If the files have not been downloaded already, the browser accesses the files (possibly first decompressing them) to store for offline use.
3) If the files were XML/XHTML, take advantage of any attached XSL, XQuery, or CSS in reassembling them.
4) If the files were SQL, reassemble them in a table-agnostic manner--e.g., allow the user to choose which columns to view and in which order and how many records at a time (including allowing a single-record "flashcard"-like view), also allowing for automated generation of certain columns using JavaScript.
5) If the files included templates, use these for the display and populate for the user to view.
6) Bring the user to a particular view of the pages, starting for example, at a particular paragraph indicated by the link or manifest file, highlight the document or a portion of the targeted page with a certain font and color, etc.

It seems limiting that while we can reference individual sites' data at best targeting an existing anchor or predefined customizability, we do not have any built-in way to bookmark and share views of that data over the web.
In considering building a Firefox extension to try this as a proof of concept, METS (http://www.loc.gov/standards/mets/ ) seems to have many aspects which could be useful as a base in such a standard, including the useful potential of enabling links to be described for files which may not exist as hyperlinks within the files (i.e., XLink linkbases). Besides this offline packages use, such a language might work just as well to build a standard for hierarchical sitemaps, linkbases, or Gopher 2.0 (and not being limited to its usual web view, equivalent of "icon view" on the desktop, but conceivably allowing "column browser" or tree views for hierarchical data ranging from interlinked genealogies to directories along the lines of http://www.dmoz.org/ or http://dir.yahoo.com ), including for representing files on one's own local system yet leading to other sites. The same manifest files might be browseable directly (e.g., Gopher-mode), being targeted to contiguously lead to other such manifest file views until reaching a document (the Gopher-view could optionally remain in sight as the end document loaded), or, as mentioned above, as a cached and integrated offline application (especially where compressed files and SQL were involved). Brett
Re: [whatwg] Annotating structured data that HTML has no semantics for
Henri Sivonen wrote: There's no indirection. A decade of Namespaces in XML shows that both authors and implementors have trouble getting prefix-based indirection right. It's true that people get this wrong again and again. But it's also true that lots of developers understand it once for all, and then consistently get it right. The interesting question here is whether there's a better system. I have been a Java programmer for some years, and still find that convention absurd, horrible, and annoying. I'll agree that CURIEs are ugly, and maybe hard to understand, but reversed domains are equally ugly and hard to understand. Problems shared by CURIEs, URIs and reverse DNS names: * Long. * Identifiers outlive organization charts. That depends on the choice of the URI scheme. Problems that reverse DNS names don't have but CURIEs and URIs do have: * "http://"; 7 characters of even extra length. * Affordance of dereferencability when mere identifier semantics are meant. Again, that depends on the URI scheme. Problems that reverse DNS names and URIs don't have but CURIEs have: * Prefix-based indirection. HTML developers regularly have to deal with a much more complicated indirection mechanism (CSS). * Violation of the DOM Consistency Design Principle if xmlns:foo used. I think there is consensus that this is a drawback, but not about how significant this is. The syntax is simpler for the use cases it was designed for. It uses a simpler conceptual model (trees as opposed to graphs). It allows short token identifiers. It doesn't use prefix-based indirection. It doesn't violate the DOM Consistency Design Principle. (devil's advocate argument) - so how does the syntax behave for those use cases it *hasn't* been designed for? Compared to microformats, microdata defines the processing model and conformance criteria. The microformats community has failed to provide processing model and conformance criteria on similar level of detail. Indeed. 
The processing model side is perceived to be such a serious issue that the lack of a unified microformats parsing spec is cited as a motivation to use RDFa instead of microformats. Indeed. RDFa uses a data model that is an overkill for the use cases. It would be interesting to understand which use cases that RDFa can do are not supported by "microdata" (I don't understand enough about the subject to try myself), and whether the potential advantage of having a simpler model outweighs the disadvantage of not using network effects and creating a competing syntax. ... BR, Julian
Re: [whatwg] External document subset support
Henri Sivonen wrote: On May 18, 2009, at 09:36, Brett Zamir wrote: Section 10.1, "Writing XHTML documents" observes: "According to the XML specification, XML processors are not guaranteed to process the external DTD subset referenced in the DOCTYPE." While this is true, since no doubt the majority of web browsers are already able to process external stylesheets or scripts, might the very useful feature of external entity files, be employed by XHTML 5 as a stricter subset of XML (similar to how XML Namespaces re-annexed the colon character) in order to allow this useful feature to work for XHTML (to have access to HTML entities or other useful entities for one, as well as enable a poor man's localization, etc.)? See http://hsivonen.iki.fi/no-dtd/ which explains why DTDs don't work for the Web in the general case. While that is a thoughtful and helpful article, your arguments there mostly relate to validation from a central spec. Also, as far as heavy server loads for frequent DTDs, entities could be deliberately not defined at a resolvable URL. The same problems of denial-of-service could exist with stylesheet requests, script requests, etc. Even some sites, like Yahoo, have encouraged referring to their frequently accessed external files to take advantage of caching. The spec could even insist on same-domain, though I don't see any need for that. If I give my website out to Slashdot, I shouldn't be surprised when I get "slashdotted", and if I do, that's my fault, not the web's fault. A DTD doesn't need to reference a central location, nor would it be likely that major browsers would fail to use the PUBLIC identifier to avoid checking for the SYSTEM file. I also disagree with throwing our hands up in the air about character entities (or thinking that the (English-based) HTML ones are sufficient). 
As I said, just because the original spec defined it as optional, does not mean we must perpetually remain stuck in the past, especially in the case of XML-on-the-web which is not going to break a whole lot of browsing uses at all if external DTDs are suddenly made possible. Moreover, the browser with the largest market share offers such support already, and those who depend on it may already view other browsers not supporting the standard as "broken". Loading same-origin DTDs for the purpose of localization is a semi-defensible case, but it's a lot of complexity for a use case that is way on the wrong side of 80/20 on the Web scale. How so? And besides localization, there are many other uses such as providing a convenient tool for editors to avoid finding a copyright symbol, etc. Not everyone uses an IDE which makes these available or knows how to use it. I'm assisting such a project which has this issue. And I really don't buy the web/non-web dichotomy which some people make. If there's an offline use, there's an online use, pure and simple. And a client-side-only use as well--to be able to read my own documents, I'd like to do so in a browser--many others besides me like to "live in" their browsers. Even if it is a niche group which uses TEI, Docbook, etc. or who wants to be able to build say a browser extension which can take advantage of their rich semantics, this is still a use for citizens of the web. If people can push forward with backwards-incompatible technologies like the video element, 3d-animation, or whatever, it seems not much to ask to support the humble external entity file... :) Besides, if the use case for DTDs is localization within an origin, the server can perform the XML parse and reserialize into DTDless XML. (That's how I've implemented this pattern in the past without client-side support.) That is assuming people are aware of scripting and have access to such resources. 
Wasn't it one of the aims of the likes of XSL, XQuery, and XForms to use a syntax which doesn't require knowledge of an unrelated scripting language (and those are pretty complex examples unlike entities)? (Btw, you and I discussed this before, though I didn't get a response from you to my last post: https://bugzilla.mozilla.org/show_bug.cgi?id=22942#c109 ; I don't mean to go off-topic but you might wish to consider or respond to some of its points as well...) best wishes, Brett
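For readers unfamiliar with the server-side alternative discussed above (expanding entities before the document reaches the browser), the following is a deliberately simplified sketch. It is not a real XML parse and reserialize: it uses a regular expression, ignores CDATA sections and comments, and does not escape the replacement text. The helper name `expandEntities` and the sample entity table are hypothetical.

```javascript
// The five entities predefined by XML itself; these must be left untouched.
const XML_PREDEFINED = new Set(['amp', 'lt', 'gt', 'quot', 'apos']);

// Hypothetical helper: replace &name; references with values from `entities`,
// e.g. a per-locale table that would otherwise live in an external DTD subset.
function expandEntities(xmlText, entities) {
  return xmlText.replace(/&([A-Za-z_][\w.-]*);/g, (match, name) => {
    if (XML_PREDEFINED.has(name)) return match;
    if (Object.prototype.hasOwnProperty.call(entities, name)) return entities[name];
    return match; // unknown entity: leave it for the consumer to report
  });
}

// Illustrative locale table and input; both are made up for this example.
const fiEntities = { greeting: 'Hei', copy: '\u00A9' };
const localized = expandEntities('<p>&greeting; &amp; &copy; &#169;</p>', fiEntities);
```

The result no longer needs a DTD: the named entities are gone, while `&amp;` and the numeric reference `&#169;` pass through unchanged because any XML processor understands them.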
Re: [whatwg] Annotating structured data that HTML has no semantics for
On May 14, 2009, at 23:52, Eduard Pascual wrote: On Thu, May 14, 2009 at 3:54 PM, Philip Taylor > wrote: It doesn't matter one syntax or another. But if a syntax already exists (RDFa), building a new syntax should be properly justified. It was at the start of this thread: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-May/019681.html As of now, the only supposed benefit I have heard of for this syntax is that it avoids CURIEs... yet it replaces them with reversed domains?? Is that a benefit? There's no indirection. A decade of Namespaces in XML shows that both authors and implementors have trouble getting prefix-based indirection right. (If we were limited to reasoning about something that we don't have experience with yet, I might believe that people can't be too inept to use prefix-based indirection. However, a decade of actual evidence shows that actual behavior defies reasoning here and prefix-based indirection is something that both authors and implementors get wrong over and over again.) I have been a Java programmer for some years, and still find that convention absurd, horrible, and annoying. I'll agree that CURIEs are ugly, and maybe hard to understand, but reversed domains are equally ugly and hard to understand. Problems shared by CURIEs, URIs and reverse DNS names: * Long. * Identifiers outlive organization charts. Problems that reverse DNS names don't have but CURIEs and URIs do have: * "http://"; 7 characters of even extra length. * Affordance of dereferencability when mere identifier semantics are meant. Problems that reverse DNS names and URIs don't have but CURIEs have: * Prefix-based indirection. * Violation of the DOM Consistency Design Principle if xmlns:foo used. (I understand that if the microdata syntax offered no advantages over RDFa, then it would be a wasted effort to diverge. Which are the advantages it offers? The syntax is simpler for the use cases it was designed for. 
It uses a simpler conceptual model (trees as opposed to graphs). It allows short token identifiers. It doesn't use prefix-based indirection. It doesn't violate the DOM Consistency Design Principle. On May 15, 2009, at 14:11, Eduard Pascual wrote: On Thu, May 14, 2009 at 10:17 PM, Maciej Stachowiak wrote: [...] From my cursory study, I think microdata could subsume many of the use cases of both microformats and RDFa. Maybe. But microformats and RDFa can handle *all* of these cases. Again, which are the benefits of creating something entirely new to replace what already exists while it can't even handle all the cases of what it is replacing? Compared to microformats, microdata defines the processing model and conformance criteria. The microformats community has failed to provide processing model and conformance criteria on similar level of detail. The processing model side is perceived to be such a serious issue that the lack of a unified microformats parsing spec is cited as a motivation to use RDFa instead of microformats. It seems to me that it avoids much of what microformats advocates find objectionable Could you specify, please? Do you mean anything else than WHATWG's almost irrational hate toward CURIEs and everything that involves prefixes? RDFa uses a data model that is an overkill for the use cases. but at the same time it seems it can represent a full RDF data model. No, it *can't* represent a full RDF model: it has already been shown several times on this thread. That's a feature. Wait. Are you referring to microdata as an incremental improvement over RDFa?? IMO, it's rather a decremental enworsement. That depends on the point of view. I'm sensing two major points of view: 1) Graphs are more general than trees. Hence, being able to serialize graphs is better. 2) Graphs are more general than trees. Hence, graphs are harder to design UIs for, harder to traverse and harder for authors to grasp. 
Hence, if trees are enough to address use cases, we should only enable trees to be serialized. I subscribe to view #2, and it seems that trees are indeed enough for the use cases (that were stipulated by the pro-graph people!). - Microdata can't represent the full RDF data model (while RDFa can): some complex structures are just not expressable with microdata. That's not a use case. That's "theoretical purity". - Microdata relies on reversed domains. While some people argue these to be better than CURIEs, they are equally horrendous for the average user, and have the additional disadvantage that they don't map to anything useful (if they map to something at all), while CURIEs map to the descriptions and/or definitions of what they represent. I consider it an advantage that reverse domains don't suggest that you should try dereferencing identifiers as if they were addresses. -- Henri Sivonen hsivo...@iki.fi http://hsivonen.iki.fi/
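
The point about prefix-based indirection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `resolve_curie` helper, the two prefix maps, and the `org.example.license` name are hypothetical and come from neither RDFa nor the microdata draft. The sketch shows that the text of a CURIE does not fix its meaning, because the meaning depends on a prefix binding declared somewhere else, whereas a reverse DNS name is compared as written.

```python
def resolve_curie(curie, prefix_map):
    """Expand 'prefix:reference' using bindings declared elsewhere.

    Hypothetical helper: real RDFa processors also have to track
    which bindings are in scope at each element.
    """
    prefix, _, reference = curie.partition(":")
    if prefix not in prefix_map:
        raise KeyError(f"no binding in scope for prefix {prefix!r}")
    return prefix_map[prefix] + reference


# Two documents bind the same prefix to different vocabularies.
doc_a_prefixes = {"cc": "http://creativecommons.org/ns#"}
doc_b_prefixes = {"cc": "http://web.resource.org/cc/"}

expanded_a = resolve_curie("cc:license", doc_a_prefixes)
expanded_b = resolve_curie("cc:license", doc_b_prefixes)

# Identical text, different identifiers once the indirection is resolved.
same_identifier = expanded_a == expanded_b

# A reverse DNS name (hypothetical) needs no lookup step at all.
reverse_dns_name = "org.example.license"
names_match = reverse_dns_name == "org.example.license"
```

Copying the string `cc:license` from one document to another without also copying the binding silently changes or loses its meaning, which is the class of mistake the message above attributes to authors and implementors.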
Re: [whatwg] Link rot is not dangerous
On May 15, 2009, at 19:20, Manu Sporny wrote:
> There have been a number of people now that have gone to great lengths to outline how awful link rot is for CURIEs and the semantic web in general. This is a flawed conclusion, based on the assumption that there must be a single vocabulary document in existence, for all time, at one location.

The "flawed" conclusion flows out of "Follow Your Nose" advocacy, and is not flawed if one takes "Follow Your Nose" seriously.

It seems to me that the positions that RDF applications should "Follow Their Nose" and that link rot is not dangerous (to RDF) are contradictory positions.

That link rot hasn't been a practical problem to the Semantic Web community suggests that applications don't really Follow Their Nose in practice. Can anyone point me to a deployed end user application that uses RDF internally and Follows Its Nose?

(For clarity: I'm not saying that link rot is dangerous to RDF apps. I'm saying that taking the position that it is not dangerous contradicts Follow Your Nose advocacy. I think "Follow Your Nose" is impractical on the Web scale and is alien to naming schemes used in technologies that have been successfully deployed on the Web scale [e.g. HTML, CSS, JavaScript, DOM and Unicode].)

> - RDFa parsers can be given an override list of legacy vocabularies that will be loaded from disk (from a cached copy).

"Cache" means that you can still go find the original and the cache is just nearer.

> If a cached copy of the vocabulary cannot be found, it can be re-created from scratch if necessary.

Do any end user applications that use RDF internally provide a UI for installing local re-creations?

On May 15, 2009, at 20:25, Shelley Powers wrote:
> Also don't lose sight that this is really no more serious an issue than, say, a company originating "com.sun.*" being purchased by another company, named "com.oracle.*". And you can't say, "Well that's not the same", because it is.

It's not the same. A Java classloader doesn't "Follow Its Nose". A classloader will find classes in my classpath even if there weren't a server at sun.com. Likewise, http://sun.com/foo RDF predicates would continue to work in applications that don't "Follow Their Nose" even if the server at sun.com disappeared.

However, if the com.sun.* classes were renamed to com.oracle.* and the com.sun.* copies withdrawn in a new release of a library, other classes that have been compiled against com.sun.* classes would cease to load. This is analogous to applications programmed to recognize http://web.resource.org/cc/* predicates not recognizing http://creativecommons.org/ns#* predicates. (You can't Follow Your Nose from the former to the latter, BTW.)

--
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/
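
The analogy in the last paragraph can be made concrete with a small Python sketch. Everything in it is an assumption for illustration: the `find_licenses` function, the triple layout, and the sample data are hypothetical, and the `license` local name is used only as an example. It models an application that recognizes predicates by comparing strings and never dereferences them.

```python
# Predicates the (hypothetical) application was programmed to recognize.
KNOWN_LICENSE_PREDICATES = frozenset({
    "http://web.resource.org/cc/license",
})


def find_licenses(triples, known=KNOWN_LICENSE_PREDICATES):
    """Return objects of triples whose predicate is a known license predicate.

    Recognition is plain string comparison; no network access happens,
    so a vanished server cannot affect the result.
    """
    return [obj for (_subject, predicate, obj) in triples if predicate in known]


old_vocab_data = [("page1", "http://web.resource.org/cc/license", "by-sa")]
new_vocab_data = [("page2", "http://creativecommons.org/ns#license", "by-sa")]

found_old = find_licenses(old_vocab_data)  # recognized, server or no server
found_new = find_licenses(new_vocab_data)  # renamed vocabulary: not recognized
```

Link rot at web.resource.org would not change `found_old`, while the rename to the newer namespace breaks recognition until the application itself is updated, much like code compiled against a withdrawn package name.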
Re: [whatwg] External document subset support
On May 18, 2009, at 09:36, Brett Zamir wrote:
> Section 10.1, "Writing XHTML documents" observes: "According to the XML specification, XML processors are not guaranteed to process the external DTD subset referenced in the DOCTYPE." While this is true, since no doubt the majority of web browsers are already able to process external stylesheets or scripts, might the very useful feature of external entity files be employed by XHTML 5 as a stricter subset of XML (similar to how XML Namespaces re-annexed the colon character) in order to allow this useful feature to work for XHTML (to have access to HTML entities or other useful entities for one, as well as enable a poor man's localization, etc.)?

http://hsivonen.iki.fi/no-dtd/ explains why DTDs don't work for the Web in the general case.

Loading same-origin DTDs for the purpose of localization is a semi-defensible case, but it's a lot of complexity for a use case that is way on the wrong side of 80/20 on the Web scale.

Besides, if the use case for DTDs is localization within an origin, the server can perform the XML parse and reserialize into DTDless XML. (That's how I've implemented this pattern in the past without client-side support.)

--
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/
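
A rough Python sketch of the server-side pattern described above, under clearly stated assumptions: the message does not say how the original implementation worked, so this is not that implementation. The `reserialize_without_dtd` function and the sample document are hypothetical, and the localized entity is declared in the internal DTD subset as a stand-in for a same-origin DTD file, because Python's `xml.etree.ElementTree` does not load external subsets.

```python
import xml.etree.ElementTree as ET

# Hypothetical localized source document. The entity lives in the
# internal subset here; a real setup would select a per-language DTD.
SOURCE = """<!DOCTYPE p [
  <!ENTITY greeting "Bonjour">
]>
<p>&greeting;, whatwg</p>"""


def reserialize_without_dtd(xml_text):
    """Parse on the server and emit XML with entities already expanded.

    The underlying expat parser expands internal entities during the
    parse, and ElementTree does not write a DOCTYPE back out, so the
    client receives DTD-less XML.
    """
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


output = reserialize_without_dtd(SOURCE)
```

The client then never sees an entity reference or a DOCTYPE with a subset, so it does not matter whether its XML processor would have fetched a DTD.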
Re: [whatwg] DOMTokenList is unordered but yet requires sorting
On Thu, 14 May 2009 22:58:20 +0200, Erik Arvidsson wrote:
> On Thu, May 14, 2009 at 00:30, Kristof Zelechovski wrote:
>> If a token list represented an ordered set, it could not be sorted to get an item because the host would have to preserve the original (document) order of tokens.
>
> The question is why does the set need to be ordered at all as long as it is stable between changes? It would make implementations more efficient with little to no loss in functionality.

Imagine if it is specified that the order is not relevant and implementations can use any order (so long as it's stable). So one UA uses one order and another uses another. Then one of those UAs becomes very popular. Web pages start to depend on the order of the popular UA (e.g. they use the first item and expect it to be the "right" one). Now those pages don't work in the less popular UA and that UA vendor has to reverse engineer the popular UA and implement the same order.

The above has happened with the DOM Core .attributes attribute, IIRC.

--
Simon Pieters
Opera Software
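
The scenario above can be modelled with a short Python sketch. The two classes, the `page_script` function, and the sample class attribute are all hypothetical; neither class corresponds to a real browser's DOMTokenList. Both orders are stable, yet a page that reads the first item behaves differently in each.

```python
class DocumentOrderTokens:
    """Hypothetical UA A: unique tokens kept in document order."""

    def __init__(self, value):
        # dict.fromkeys removes duplicates while preserving first-seen order.
        self._tokens = list(dict.fromkeys(value.split()))

    def item(self, index):
        return self._tokens[index]


class SortedOrderTokens:
    """Hypothetical UA B: unique tokens kept in sorted order."""

    def __init__(self, value):
        self._tokens = sorted(set(value.split()))

    def item(self, index):
        return self._tokens[index]


def page_script(token_list):
    """A page written against UA A that assumes item(0) is the main class."""
    return token_list.item(0) == "widget"


class_attribute = "widget active"
works_in_ua_a = page_script(DocumentOrderTokens(class_attribute))
works_in_ua_b = page_script(SortedOrderTokens(class_attribute))
```

Neither implementation violates an "any stable order" rule, but only one of them runs the page as its author intended, which is the interoperability cost the message describes.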