Re: [whatwg] microdata questions
On Mon, 10 Feb 2014, Eric Devine wrote:
> 1. Section 5.5.1 of the Microdata spec prescribes how microdata should be represented as JSON, but it does not provide a MIME type. I'm writing a REST API that I would like to be able to return JSON in microdata format, but I need the client to explicitly request this via the HTTP Accept header. The main concern is to know when to return plain properties as an array with one element.

As a general rule I would recommend against using Accept headers to do anything. You're better off making the JSON data its own resource, IMHO.

Having said that, as you noted in a later e-mail, the MIME type suggested by the HTML spec is application/microdata+json.

http://whatwg.org/html#application/microdata+json

> 2. Section 5.2.4 does not provide a way to apply a property value to the value attribute of an option element. Is this an oversight, or is there simply not a convincing enough use case for the need?

There's not any way currently to make form controls map to microdata. It's not clear exactly what it would mean.

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
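A rough sketch of the "array with one element" point from question 1 may help here. It assumes the microdata JSON shape in which every property value is a list, even when there is only one value. The helper names (`convertItem`, `toMicrodataJson`, `wantsMicrodataJson`) and the input shape are hypothetical and only for illustration, and the Accept check shows what Eric asked about rather than what Ian recommends (a separate resource for the JSON).

```javascript
// Media type named in the thread.
const MICRODATA_JSON = "application/microdata+json";

// Hypothetical input: { type, id, properties: { name: value | [values] } }.
// Every property value is emitted as an array, even a single plain value.
function convertItem(item) {
  const out = {};
  if (item.type) out.type = [].concat(item.type);
  if (item.id) out.id = item.id;
  out.properties = {};
  for (const [name, value] of Object.entries(item.properties || {})) {
    out.properties[name] = [].concat(value).map((v) =>
      typeof v === "object" && v !== null ? convertItem(v) : String(v)
    );
  }
  return out;
}

function toMicrodataJson(items) {
  return { items: items.map(convertItem) };
}

// Deliberately naive Accept check: ignores q-values and wildcards.
function wantsMicrodataJson(acceptHeader) {
  return (acceptHeader || "")
    .split(",")
    .map((part) => part.split(";")[0].trim().toLowerCase())
    .includes(MICRODATA_JSON);
}

const doc = toMicrodataJson([
  { type: "http://example.com/Person", properties: { name: "Eric", tel: ["1", "2"] } },
]);
console.log(JSON.stringify(doc, null, 2));
```

With this shape a client never has to guess whether `name` is a scalar or a list, which is the concern raised in the question.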
Re: [whatwg] microdata questions
I found the answer to my first question (application/microdata+json, from W3C), but I would still appreciate feedback on my second question below.

Thanks,
Eric

On Mon, Feb 10, 2014 at 11:16 AM, Eric Devine devin...@gmail.com wrote:
> 1. Section 5.5.1 of the Microdata spec prescribes how microdata should be represented as JSON, but it does not provide a MIME type. I'm writing a REST API that I would like to be able to return JSON in microdata format, but I need the client to explicitly request this via the HTTP Accept header. The main concern is to know when to return plain properties as an array with one element.
>
> 2. Section 5.2.4 does not provide a way to apply a property value to the value attribute of an option element. Is this an oversight, or is there simply not a convincing enough use case for the need?
>
> Thanks for any feedback,
> Eric Devine
Re: [whatwg] Microdata status
On Wed, May 29, 2013 at 9:39 PM, Michael[tm] Smith m...@w3.org wrote:
> +Ojan, +Alex
>
> Jirka Kosek ji...@kosek.cz, 2013-05-14 17:22 +0200:
>> Hi,
>>
>> are there any plans to change Microdata API? From the following conversation between Chromium developers it's not clear to me whether they consider API itself bad or only their implementation.
>>
>> https://groups.google.com/a/chromium.org/forum/m/#!topic/blink-dev/b54nW_mGSVU
>>
>> Any insight welcomed.
>
> Not claiming to speak for anybody on the Chrome/Blink team but as far as that conversation among the Chromium developers, looking at it from the outside at least, my read is that they consider the current API spec to be bad -- not just their implementation.
>
> That said, it doesn't seem like anybody in the discussion other than Ojan mentioned anything bad in particular about the API spec. Ojan's comment:
>
>   I have one concern with the feature as specced is that getItems and the various Collection returning properties/methods all return live NodeLists/Collections. [...] Live NodeLists/Collections impose a large cost on the rest of the codebase and fundamentally make regular DOM operations slower.

This concern could be addressed without much of a change to the current API by returning static NodeLists and/or Collections. Hixie, consider this feedback on the API. :)

We're very unlikely to implement any new APIs that return live NodeLists/Collections. Whether addressing that would be enough that we'd want to ship Microdata is unclear to me.

> Then there's a general comment from Alex:
>
>   The current micro data API is...poor. I think we should write it off and try again. No opinions in what that means for our impl in the meantime, though (other than it shouldn't ship, of course). I'm happy to put work into a better API if someone will collaborate on impl.
>
> So anyway, it looks like the gist from the overall discussion is: They've completely removed the Microdata API implementation from Blink, and unless Alex or somebody else writes up an alternative API proposal they can be happier with, it seems unlikely they're going to be re-implementing anything based on the current Microdata API spec.
>
>   --Mike
>
> --
> Michael[tm] Smith http://people.w3.org/mike
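To make the live-versus-static distinction concrete, here is a toy model. It is not a DOM implementation: the "document" is just an array of plain objects, and `LiveList` and `staticList` are names invented for this sketch. Real engines cache and invalidate rather than re-filter on every access, and that bookkeeping on each DOM mutation is the cost the comment above refers to.

```javascript
// A live list re-evaluates its filter on every access, so it always
// reflects the current state of the underlying node array.
class LiveList {
  constructor(nodes, predicate) {
    this.nodes = nodes;
    this.predicate = predicate;
  }
  get length() {
    return this.nodes.filter(this.predicate).length;
  }
  item(index) {
    return this.nodes.filter(this.predicate)[index];
  }
}

// A static list is a one-time snapshot; later mutations do not affect it.
function staticList(nodes, predicate) {
  return Object.freeze(nodes.filter(predicate));
}

const nodes = [{ itemscope: true }, { itemscope: false }];
const isItem = (node) => node.itemscope;

const live = new LiveList(nodes, isItem);
const snapshot = staticList(nodes, isItem);

nodes.push({ itemscope: true }); // "mutate the document"

console.log(live.length);     // 2
console.log(snapshot.length); // 1
```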
Re: [whatwg] Microdata status
On 30 May 2013, at 12:39, Michael[tm] Smith wrote:
> Alex or somebody else writes up an alternative API proposal they can be happier with, it seems unlikely they're going to be re-implementing anything based on the current Microdata API spec.

In the process, if it ever happens, I would love to see something more or less common between RDFa Lite, data-* and microdata. When I explored [1] different ways of expressing the same information, the JS code needed to access the data was quite different in each case, which makes it not very user-friendly in the end.

[1]: http://dev.opera.com/articles/view/geolocation-html-api/

--
Karl Dubost
http://www.la-grange.net/karl/
Re: [whatwg] Microdata status
+Ojan, +Alex

Jirka Kosek ji...@kosek.cz, 2013-05-14 17:22 +0200:
> Hi,
>
> are there any plans to change Microdata API? From the following conversation between Chromium developers it's not clear to me whether they consider API itself bad or only their implementation.
>
> https://groups.google.com/a/chromium.org/forum/m/#!topic/blink-dev/b54nW_mGSVU
>
> Any insight welcomed.

Not claiming to speak for anybody on the Chrome/Blink team but as far as that conversation among the Chromium developers, looking at it from the outside at least, my read is that they consider the current API spec to be bad -- not just their implementation.

That said, it doesn't seem like anybody in the discussion other than Ojan mentioned anything bad in particular about the API spec. Ojan's comment:

  I have one concern with the feature as specced is that getItems and the various Collection returning properties/methods all return live NodeLists/Collections. [...] Live NodeLists/Collections impose a large cost on the rest of the codebase and fundamentally make regular DOM operations slower.

Then there's a general comment from Alex:

  The current micro data API is...poor. I think we should write it off and try again. No opinions in what that means for our impl in the meantime, though (other than it shouldn't ship, of course). I'm happy to put work into a better API if someone will collaborate on impl.

So anyway, it looks like the gist from the overall discussion is: They've completely removed the Microdata API implementation from Blink, and unless Alex or somebody else writes up an alternative API proposal they can be happier with, it seems unlikely they're going to be re-implementing anything based on the current Microdata API spec.

  --Mike

--
Michael[tm] Smith http://people.w3.org/mike
Re: [whatwg] Microdata feedback
On Thu, 08 Dec 2011 22:04:41 +0100, Ian Hickson i...@hixie.ch wrote:
> I changed the spec as you suggest.

Thanks!

--
Philip Jägenstedt
Core Developer
Opera Software
Re: [whatwg] Microdata feedback
On Sat, 9 Jul 2011, Philip Jägenstedt wrote:
> On Sat, 09 Jul 2011 01:19:02 +0200, Ian Hickson i...@hixie.ch wrote:
>> On Sat, 9 Jul 2011, Philip Jägenstedt wrote:
>>> Step 11 is "If current has an itemprop attribute specified, add it to results." but should be "If current has one or more property names, add it to results."
>>>
>>> Property names are defined in http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#property-names
>>>
>>> Why? If you start with <div itemprop="foo">, then div.itemProp.remove("foo") would give you <div itemprop="">. It'd be weird if the element still showed up in the properties collection after removing the only property name.
>>
>> The .properties attribute must return an HTMLPropertiesCollection rooted at the Document node, whose filter matches only elements that have property names, which further filters the results of the algorithm. Similarly, everything that uses the algorithm here does things for each property name, so if itemprop="" doesn't have any tokens, nothing happens and it doesn't matter that the algorithm returns it.
>
> Ah, I see my misunderstanding. Purely editorial: It would, IMO, be more clear if that check were in the algorithm itself. That's the way it's going to be (has been) implemented since there's no reason to do the filtering as a separate step. Do as you wish.

I changed the spec as you suggest. I agree that it's cleaner. I checked and I don't think it'll have any negative side-effects, though it does change the precise number of conformance errors in some invalid documents (not a truly practical concern since conformance checkers are only required to report zero errors if there are none and at least one error if there are any).

--
Ian Hickson
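The distinction discussed above (an itemprop attribute being present versus an element actually having property names) can be sketched as follows. This is only an illustration, assuming property names are the unique tokens obtained by splitting the attribute value on ASCII whitespace; `propertyNames` and `hasPropertyNames` are hypothetical helper names, not spec or DOM APIs.

```javascript
// Split an itemprop attribute value on ASCII whitespace and drop duplicates.
// A missing attribute (null/undefined) yields no property names.
function propertyNames(itempropValue) {
  if (itempropValue == null) return [];
  const tokens = itempropValue.split(/[ \t\n\f\r]+/).filter((t) => t !== "");
  return [...new Set(tokens)];
}

// The revised check: "has one or more property names",
// rather than "has an itemprop attribute specified".
function hasPropertyNames(itempropValue) {
  return propertyNames(itempropValue).length > 0;
}

console.log(hasPropertyNames("foo"));     // true
console.log(hasPropertyNames(""));        // false: attribute present, no tokens
console.log(propertyNames("a b  a"));     // [ 'a', 'b' ]
```

Under this check, an element left with itemprop="" after removing its only token no longer counts as a property.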
Re: [whatwg] Microdata - Handling the case where a string is upgraded to an object
On Thu, 14 Jul 2011, Tab Atkins Jr. wrote:
> It seems that this may be a useful problem to solve in Microdata. We can expose either an attribute or a privileged property name for the object's name/title/string representation. Then, when using the .items accessor, objects can be returned with a custom .toString that returns that value, so they can be used as strings in legacy code.

So complex properties would need to state the data in two forms, or pick one of subproperties and anoint it as being the special fallback?

On Mon, 18 Jul 2011, Philip Jägenstedt wrote:
> I take it the problem is with code like this:
>
>   <div itemscope itemtype="person"><span itemprop="name">Foo Barsson</span></div>
>   <script>
>   var p = document.getItems("person")[0];
>   alert(p.properties.namedItem("name")[0].itemValue);
>   </script>
>
> If the HTML changes to
>
>   <div itemscope itemtype="person"><span itemprop="name" itemscope><span itemprop="givenName">Foo</span> <span itemprop="familyName">Barsson</span></span></div>
>
> then the script would be alerting "[object HTMLElement]" instead of "Foo Barsson".

Indeed. It's not clear to me what else we would return, especially considering itemref="".

On Mon, 18 Jul 2011, Tab Atkins Jr. wrote:
> Yeah. I suspect this kind of API change is relatively common, and it's the sort of thing that would *always* be painful.

In some of the sample vocabularies, there are properties that can either take a string or a structured item as a value. In the latter cases, there's no trivial way to provide a string alternative.

>> As for the solution, are you suggesting that .itemValue return a special object which is like HTMLElement in all regards except for how it toString()s?
>
> Yes.

Some HTMLElement objects already have a custom toString().

On Tue, 19 Jul 2011, Philip Jägenstedt wrote:
> Currently, it's spec'd as returning the element itself. This isn't terribly useful, at least I've just checked e.itemScope and then accessed e.properties directly rather than going through e.itemValue.properties.

Yeah, it's mostly just so that people can take the itemValue into a local variable, and then manipulate it without having to worry about what type it is until later.

> Given this, a simpler fix would be to let .itemValue act like .textContent when an itemscope attribute is present.

.textContent doesn't necessarily have anything to do with the modelled data. I'm not sure that really makes sense.

> Still, I'm not sure if it's a good idea. It makes the Microdata model kind of odd if a property is both an item and has a fallback text representation. It will also mask the fact that a text property has been upgraded to an item, somewhat decreasing the chance that the consuming code will be updated.

Yeah. And authors would have to make sure the textContent is usable as fallback, which isn't at all a given, IMHO.

--
Ian Hickson
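Since no fallback was adopted in this discussion, a consumer that wants to survive a string-to-item upgrade has to check the value type itself. The sketch below assumes values in the microdata JSON shape (a value is either a string or an object with a `properties` map of arrays). The `displayName` helper and the givenName/familyName convention are taken from the example in the thread and are otherwise hypothetical.

```javascript
// Return a printable name whether the "name" value is a plain string
// or has been upgraded to a structured item.
function displayName(value) {
  if (typeof value === "string") return value;
  const props = (value && value.properties) || {};
  const parts = [...(props.givenName || []), ...(props.familyName || [])]
    .filter((part) => typeof part === "string");
  return parts.length > 0 ? parts.join(" ") : null;
}

// Before the upgrade: a plain string value.
console.log(displayName("Foo Barsson"));

// After the upgrade: a nested item with sub-properties.
console.log(displayName({
  properties: { givenName: ["Foo"], familyName: ["Barsson"] },
}));
```

Both calls yield "Foo Barsson". Returning null for an item with neither sub-property is a choice made for this sketch, so the caller notices the change instead of printing "[object Object]".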
Re: [whatwg] microdata: itemprop in col tag
On Sun, 16 Oct 2011, David Karger wrote:
> One natural way to represent a collection of structured items is in an HTML table. This can coexist with microdata, by using <tr itemscope> and <td itemprop> tags. But by ignoring the structure of the table, this creates a lot of redundant attribute specification.
>
> It would yield cleaner markup if it were possible to use <col itemprop="foo"> to indicate an item property that should be inherited by all cells in the given column. In other words, to assert that any <td> associated with a <col> should inherit the itemprop associated with that <col>.
>
> It would yield even cleaner markup if there were a way to indicate that every <tr> was a distinct itemscope (the common case). For example, to use <table itemtype="bar"> to indicate that each row of the table scopes an item of type bar. Or perhaps <table itemscope> could be interpreted as asserting a distinct itemscope for each row without specifying a type.
>
> But even using just the <col> inheritance rule, while still placing itemscope in <tr> tags, would save a quadratic quantity of markup.

Yeah, microdata doesn't handle tables well. I'm a little reluctant to add magic to handle tables, because it can make it quite hard to work out what's going on, and it's not clear how common the problem really is. If it turns out to be a common issue, then it's something we should definitely consider, though.

On Sun, 16 Oct 2011, Tab Atkins Jr. wrote:
> Just put an @itemref on each col, pointing to the tds that are part of that column. It's more verbose, but it doesn't rely on special HTML-only rules.

That's a possible workaround for now, true.

--
Ian Hickson
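Nobody in the thread proposes this, but absent column inheritance, one way to avoid hand-writing the repeated attributes is to generate them. The sketch below stamps the row's itemscope/itemtype and each column's itemprop onto every cell, which also shows where the rows-times-columns repetition comes from. `renderRows` and `escapeHtml` are hypothetical helpers written for this example.

```javascript
// Minimal HTML escaping for text and attribute values.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// One <tr itemscope> per row, one <td itemprop> per column:
// the attribute is repeated rows x columns times in the output.
function renderRows(itemtype, columns, rows) {
  return rows
    .map((row) =>
      `<tr itemscope itemtype="${escapeHtml(itemtype)}">` +
      columns
        .map((col) => `<td itemprop="${escapeHtml(col)}">${escapeHtml(row[col] ?? "")}</td>`)
        .join("") +
      "</tr>"
    )
    .join("\n");
}

console.log(renderRows("http://example.com/bar", ["name", "age"], [
  { name: "Ada", age: 36 },
  { name: "Bob", age: 41 },
]));
```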
Re: [whatwg] Microdata getItems()
On 09/08/11 20:48, Ian Hickson wrote:
> On Tue, 9 Aug 2011, Rob Crowther wrote:
>
> Correct. Browsers aren't expected to know about the vocabularies, let alone validate them.

Thanks. I think this could be made more clear in the spec.

>> However if I remove itemscope from the element the Opera beta implementation still returns it as a top level microdata item even though it is now invalid. Is this expected behaviour?
>
> No.

Looks like this was me doing something stupid, Opera is indeed only returning the items with both itemscope and itemtype.

Rob
Re: [whatwg] Microdata getItems()
On Tue, 9 Aug 2011, Rob Crowther wrote:
> I just want to confirm that my understanding of this is correct: getItems() will return a NodeList of top level microdata items and this is irrespective of whether or not the items are actually valid in terms of their type? That is, it is the developer's responsibility to confirm that the vCard has an fn and an n before further processing?

Correct. Browsers aren't expected to know about the vocabularies, let alone validate them.

> One further question - if an itemtype attribute is present there must also be an itemscope. However if I remove itemscope from the element the Opera beta implementation still returns it as a top level microdata item even though it is now invalid. Is this expected behaviour?

No.

--
Ian Hickson
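Because vocabulary validation is left to the developer, consuming code typically checks for required properties itself. A minimal sketch, assuming an item in the microdata JSON shape (each property maps to an array of values); `missingProperties` is a hypothetical helper and the type URL is a placeholder.

```javascript
// Return the names of required properties that are absent or empty.
function missingProperties(item, required) {
  const props = item.properties || {};
  return required.filter(
    (name) => !(Array.isArray(props[name]) && props[name].length > 0)
  );
}

// A vCard-like item that has fn but no n.
const card = {
  type: ["http://example.com/vcard"], // placeholder type URL
  properties: { fn: ["Rob Crowther"] },
};

console.log(missingProperties(card, ["fn", "n"])); // [ 'n' ]
```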
Re: [whatwg] Microdata - Handling the case where a string is upgraded to an object
On Mon, 18 Jul 2011 22:01:37 +0200, Tab Atkins Jr. jackalm...@gmail.com wrote:
> On Mon, Jul 18, 2011 at 4:20 AM, Philip Jägenstedt phil...@opera.com
>> As for the solution, are you suggesting that .itemValue return a special object which is like HTMLElement in all regards except for how it toString()s?
>
> Yes.

Currently, it's spec'd as returning the element itself. This isn't terribly useful, at least I've just checked e.itemScope and then accessed e.properties directly rather than going through e.itemValue.properties.

Given this, a simpler fix would be to let .itemValue act like .textContent when an itemscope attribute is present.

Still, I'm not sure if it's a good idea. It makes the Microdata model kind of odd if a property is both an item and has a fallback text representation. It will also mask the fact that a text property has been upgraded to an item, somewhat decreasing the chance that the consuming code will be updated.

--
Philip Jägenstedt
Core Developer
Opera Software
Re: [whatwg] Microdata - Handling the case where a string is upgraded to an object
On Thu, 14 Jul 2011 20:49:44 +0200, Tab Atkins Jr. jackalm...@gmail.com wrote:
> Some IRC discussion this morning concerned the scenario where an API starts by exposing a property as a string, but later wants to change it to be a complex object.
>
> This appears to be a reasonably common scenario. For example, a vocabulary with a name property may start with it being a string, and then later change to an object exposing firstname/lastname/etc properties. A vocabulary for a music library may start by having track as a string, then later expand it to expose the track title, the individual artist, the running time, etc.
>
> In a very similar vein, the CSSOM is currently defined to always return property values as strings. We want to instead return complex objects that expose useful information and interfaces specialized on the value's type, however. For compat reasons, we have to use an entirely different accessor in order to expose this type of thing.
>
> It seems that this may be a useful problem to solve in Microdata. We can expose either an attribute or a privileged property name for the object's name/title/string representation. Then, when using the .items accessor, objects can be returned with a custom .toString that returns that value, so they can be used as strings in legacy code.
>
> Thoughts?

There is no items IDL attribute, do you mean getItems() or .itemValue perhaps?

I take it the problem is with code like this:

  <div itemscope itemtype="person"><span itemprop="name">Foo Barsson</span></div>
  <script>
  var p = document.getItems("person")[0];
  alert(p.properties.namedItem("name")[0].itemValue);
  </script>

If the HTML changes to

  <div itemscope itemtype="person"><span itemprop="name" itemscope><span itemprop="givenName">Foo</span> <span itemprop="familyName">Barsson</span></span></div>

then the script would be alerting "[object HTMLElement]" instead of "Foo Barsson".

I'm not sure why this would be a problem. If someone changes the page, then can't they adjust the script to match? Is it extensions and libraries that you're worried about?

As for the solution, are you suggesting that .itemValue return a special object which is like HTMLElement in all regards except for how it toString()s?

--
Philip Jägenstedt
Core Developer
Opera Software
Re: [whatwg] Microdata - Handling the case where a string is upgraded to an object
On Mon, Jul 18, 2011 at 4:20 AM, Philip Jägenstedt phil...@opera.com wrote:
> There is no items IDL attribute, do you mean getItems() or .itemValue perhaps?

Yes, sorry.

> I take it the problem is with code like this:
>
>   <div itemscope itemtype="person"><span itemprop="name">Foo Barsson</span></div>
>   <script>
>   var p = document.getItems("person")[0];
>   alert(p.properties.namedItem("name")[0].itemValue);
>   </script>
>
> If the HTML changes to
>
>   <div itemscope itemtype="person"><span itemprop="name" itemscope><span itemprop="givenName">Foo</span> <span itemprop="familyName">Barsson</span></span></div>
>
> then the script would be alerting "[object HTMLElement]" instead of "Foo Barsson".
>
> I'm not sure why this would be a problem. If someone changes the page, then can't they adjust the script to match?

That only works if the page is using its own Microdata, not if someone else is consuming the Microdata.

> Is it extensions and libraries that you're worried about?

Yeah. I suspect this kind of API change is relatively common, and it's the sort of thing that would *always* be painful.

> As for the solution, are you suggesting that .itemValue return a special object which is like HTMLElement in all regards except for how it toString()s?

Yes.

~TJ
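The custom-toString idea confirmed above can be illustrated with a plain object standing in for the element. This is a sketch of the proposal only, not of any shipped API: `itemWithFallback` is a hypothetical helper, and its `fallback` argument stands in for the "privileged property" that would supply the string form.

```javascript
// Build an item-like object whose string conversion yields a fallback value.
function itemWithFallback(properties, fallback) {
  const item = { properties };
  Object.defineProperty(item, "toString", {
    value: () => fallback,
    enumerable: false, // keep it out of property enumeration
  });
  return item;
}

const name = itemWithFallback(
  { givenName: ["Foo"], familyName: ["Barsson"] },
  "Foo Barsson"
);

// Legacy code that treats the value as a string keeps working...
console.log("Hello, " + name); // Hello, Foo Barsson

// ...while updated code can reach the structured data.
console.log(name.properties.givenName[0]); // Foo

// Limitation: it is still an object, so strict comparisons fail.
console.log(name === "Foo Barsson"); // false
```

The last line shows one way legacy code can still break even with the fallback in place.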
Re: [whatwg] Microdata feedback
On Thu, 2011-07-07 at 22:33 +, Ian Hickson wrote:
> The JSON algorithm now ends the crawl when it hits a loop, and replaces the offending duplicate item with the string "ERROR".
>
> The RDF algorithm preserves the loops, since doing so is possible with RDF. Turns out the algorithm almost did this already, looks like it was an oversight.

It seems to me that this approach creates an incentive for people who want to do RDFesque things to publish deliberately non-conforming microdata content that works the way they want for RDF-based consumers but breaks for non-RDF consumers. If such content abounds and non-RDF consumers are forced to support loopiness by extending the JSON conversion algorithm in ad hoc ways, part of the benefit of microdata over RDFa (treeness) is destroyed and the benefit of being well-defined would be destroyed, too, for non-RDF consumption cases.

--
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/
Re: [whatwg] Microdata feedback
On Tue, 12 Jul 2011 09:41:18 +0200, Henri Sivonen hsivo...@iki.fi wrote:
> On Thu, 2011-07-07 at 22:33 +, Ian Hickson wrote:
>> The JSON algorithm now ends the crawl when it hits a loop, and replaces the offending duplicate item with the string "ERROR".
>>
>> The RDF algorithm preserves the loops, since doing so is possible with RDF. Turns out the algorithm almost did this already, looks like it was an oversight.
>
> It seems to me that this approach creates an incentive for people who want to do RDFesque things to publish deliberately non-conforming microdata content that works the way they want for RDF-based consumers but breaks for non-RDF consumers. If such content abounds and non-RDF consumers are forced to support loopiness by extending the JSON conversion algorithm in ad hoc ways, part of the benefit of microdata over RDFa (treeness) is destroyed and the benefit of being well-defined would be destroyed, too, for non-RDF consumption cases.

I don't have a strong opinion, but note that even before this change the algorithm produced a non-tree for the Avenue Q example [1] where the adr property is shared between two items using itemref. (In JSON, it is flattened.) If we want to ensure that RDF consumers don't depend on non-treeness, then this should change as well.

[1] http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#examples-4

--
Philip Jägenstedt
Core Developer
Opera Software
Re: [whatwg] Microdata feedback
On Tue, 12 Jul 2011, Henri Sivonen wrote:
> On Thu, 2011-07-07 at 22:33 +, Ian Hickson wrote:
>> The JSON algorithm now ends the crawl when it hits a loop, and replaces the offending duplicate item with the string "ERROR".
>>
>> The RDF algorithm preserves the loops, since doing so is possible with RDF. Turns out the algorithm almost did this already, looks like it was an oversight.
>
> It seems to me that this approach creates an incentive for people who want to do RDFesque things to publish deliberately non-conforming microdata content that works the way they want for RDF-based consumers but breaks for non-RDF consumers. If such content abounds and non-RDF consumers are forced to support loopiness by extending the JSON conversion algorithm in ad hoc ways, part of the benefit of microdata over RDFa (treeness) is destroyed and the benefit of being well-defined would be destroyed, too, for non-RDF consumption cases.

The problem here is that RDF and microdata have different data models, and RDF cannot represent microdata's data model with fidelity. For example, consider how this converts to RDF and compare it to the microdata equivalent:

  <div itemscope itemtype="http://example.com/" itemid="http://example.com/1">
    <span itemprop="a">x</span>
  </div>
  <div itemscope itemtype="http://example.com/" itemid="http://example.com/1">
    <span itemprop="b">x</span>
  </div>

There are other things RDF can't represent easily, e.g. it cannot easily represent the order of the values in this item:

  <div itemscope itemtype="http://example.com/">
    <span itemprop="a">1</span>
    <span itemprop="a">2</span>
  </div>

As such, I suggest we not worry about the itemref="" loop case, or that we try to fix all these cases together (not sure how we'd fix them).

--
Ian Hickson
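The JSON behaviour described in this thread (stop at a loop and emit the string "ERROR" in place of the repeated item) can be sketched on a plain object graph. This is a simplified illustration, not the spec text: items are modelled as `{ type, id, properties }` objects whose property values are arrays of strings or other items, and `getObject` is a name chosen for the sketch.

```javascript
// Convert an item to a JSON-safe tree. `memory` holds the items on the
// current path from the root; meeting one of them again means a loop.
function getObject(item, memory = []) {
  const seen = memory.concat([item]); // copy, so sibling branches are independent
  const result = {};
  if (item.type) result.type = item.type;
  if (item.id) result.id = item.id;
  result.properties = {};
  for (const [name, values] of Object.entries(item.properties)) {
    // Array order is preserved, so value order survives the conversion.
    result.properties[name] = values.map((value) => {
      if (typeof value === "string") return value;
      return seen.includes(value) ? "ERROR" : getObject(value, seen);
    });
  }
  return result;
}

// Build a two-item loop: a -> b -> a.
const a = { properties: { name: ["A"] } };
const b = { properties: { parent: [a] } };
a.properties.child = [b];

console.log(JSON.stringify(getObject(a)));
// {"properties":{"name":["A"],"child":[{"properties":{"parent":["ERROR"]}}]}}
```

Because `memory` is copied per path, an item that is merely shared between two branches (as in the itemref example mentioned earlier in the thread) is duplicated in the output rather than treated as a loop.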
Re: [whatwg] Microdata feedback
On Sat, 09 Jul 2011 01:19:02 +0200, Ian Hickson i...@hixie.ch wrote:
> On Sat, 9 Jul 2011, Philip Jägenstedt wrote:
>> Step 11 is "If current has an itemprop attribute specified, add it to results." but should be "If current has one or more property names, add it to results."
>>
>> Property names are defined in http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#property-names
>>
>> Why? If you start with <div itemprop="foo">, then div.itemProp.remove("foo") would give you <div itemprop="">. It'd be weird if the element still showed up in the properties collection after removing the only property name.
>
> The .properties attribute must return an HTMLPropertiesCollection rooted at the Document node, whose filter matches only elements that have property names, which further filters the results of the algorithm. Similarly, everything that uses the algorithm here does things for each property name, so if itemprop="" doesn't have any tokens, nothing happens and it doesn't matter that the algorithm returns it.

Ah, I see my misunderstanding. Purely editorial: It would, IMO, be more clear if that check were in the algorithm itself. That's the way it's going to be (has been) implemented since there's no reason to do the filtering as a separate step. Do as you wish.

--
Philip Jägenstedt
Core Developer
Opera Software
Re: [whatwg] Microdata feedback
On Fri, 08 Jul 2011 00:33:14 +0200, Ian Hickson i...@hixie.ch wrote:
> On Wed, 8 Jun 2011, Tomasz Jamroszczak wrote:
>> I've been looking into Microdata specification and it struck me, that crawling algorithm is so complex, when it comes to expressing simple ideas. I think that foremost the algorithm should be described in the specification with explanation what it's supposed to do, before steps of what exactly is to be done are written.
>
> Yeah. Turns out the algorithms involved here are quite badly broken. It was intended to expose the microdata graph as completely as possible while dropping anything that would introduce a loop, at the point where the first repetition would start (so A-B-C=A would break at the =), in the API, in the JSON, and in the conformance rules. I didn't do a good job speccing that, though! I've fixed the algorithms to make sense (I hope).
>
> http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#the-properties-of-an-item

I had a look at this to verify that it is black-box-equivalent to what Opera has implemented, and only discovered one issue: <div itemprop=""> should not be added to the .properties collection, because it has no properties. My bad for suggesting that the criteria should be the presence of an itemprop attribute, it should be an itemprop attribute containing at least one token. Can you update the spec to match?

(I implemented the spec'd algorithm pedantically in https://gitorious.org/microdatajs/microdatajs/commit/217cc34e7e679e2e4ea3e670a0dcdd155a7b9800 for verification, it passes the unit tests with said modification.)

> On Wed, 29 Jun 2011, Philip Jägenstedt wrote:
>> Note also that other algorithms defined in terms of items and their properties need to handle loopiness in some way. That's currently RDF, vCard and iCal conversion. Perhaps something like "loopy item" could be defined and those algorithms could skip loopy items wherever they occur? Simply failing is also an acceptable solution, IMO.
>
> I fixed vCard with a patch that just outputs AGENT;TYPE=VCARD:ERROR in the case of a loop. (Can only happen if the input is non-conforming, so it doesn't matter if the output is non-conforming.)

WFM

> The vEvent stuff was already loop-safe.
>
> The JSON algorithm now ends the crawl when it hits a loop, and replaces the offending duplicate item with the string "ERROR".

WFM

> The RDF algorithm preserves the loops, since doing so is possible with RDF. Turns out the algorithm almost did this already, looks like it was an oversight.

WFM, but note step 3: "Add a mapping from the item item to the subject subject in memory, if there isn't one already." Step 1 guarantees that there is no entry for item, so step 3 can be unconditional.

> On Wed, 29 Jun 2011, Philip Jägenstedt wrote:
>> Indeed, multiple types doesn't work at all if you want to mix different types. I was assuming that the use case was to extend types, kind of like http://schema.org/Person/Governor. However, it doesn't work all that well even in that case, since there's no way to know which type is the extension of the other and which properties exist only on the extended type.
>
> I don't really understand this use case. Can you elaborate on the problem that needs solving here?

It's whatever problem http://schema.org/docs/extension.html is trying to solve, which is something like "allow people to geek out with more specific vocabularies without interfering with search results". I whined a bit in http://groups.google.com/group/schemaorg-discussion/browse_thread/thread/6de3a1761b115271, the short story being:

* extensibility encoded with a microsyntax in the URL, making it not-so-opaque
* such URLs make the DOM API less useful

Perhaps bending Microdata to accommodate for this is not the best idea. If I were schema.org, I would just encourage people to do this:

  <div itemscope itemtype="http://schema.org/Person">
    <div id="wrapper">
      <div itemprop="name">Arnold</div>
      <div itemscope itemtype="http://example.com/Governor" itemref="wrapper">
        <div itemprop="state">California</div>
      </div>
    </div>
  </div>

Making extensions unsightly is probably a good thing, to discourage people from going too crazy with it. This way it's also clear which properties only apply to the extended type.

--
Philip Jägenstedt
Core Developer
Opera Software
Re: [whatwg] Microdata feedback
On Fri, 08 Jul 2011 21:31:49 +0200, Ian Hickson i...@hixie.ch wrote:
> On Fri, 8 Jul 2011, Philip Jägenstedt wrote:
>> On Fri, 08 Jul 2011 00:33:14 +0200, Ian Hickson i...@hixie.ch wrote:
>>> On Wed, 8 Jun 2011, Tomasz Jamroszczak wrote:
>>>> I've been looking into Microdata specification and it struck me, that crawling algorithm is so complex, when it comes to expressing simple ideas. I think that foremost the algorithm should be described in the specification with explanation what it's supposed to do, before steps of what exactly is to be done are written.
>>>
>>> Yeah. Turns out the algorithms involved here are quite badly broken. It was intended to expose the microdata graph as completely as possible while dropping anything that would introduce a loop, at the point where the first repetition would start (so A-B-C=A would break at the =), in the API, in the JSON, and in the conformance rules. I didn't do a good job speccing that, though! I've fixed the algorithms to make sense (I hope).
>>>
>>> http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#the-properties-of-an-item
>>
>> I had a look at this to verify that it is black-box-equivalent to what Opera has implemented, and only discovered one issue: <div itemprop=""> should not be added to the .properties collection, because it has no properties. My bad for suggesting that the criteria should be the presence of an itemprop attribute, it should be an itemprop attribute containing at least one token. Can you update the spec to match?
>
> What needs updating? As far as I can tell, what you describe is what the spec requires.

Step 11 is "If current has an itemprop attribute specified, add it to results." but should be "If current has one or more property names, add it to results."

Property names are defined in http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#property-names

Why? If you start with <div itemprop="foo">, then div.itemProp.remove("foo") would give you <div itemprop="">. It'd be weird if the element still showed up in the properties collection after removing the only property name.

>>> On Wed, 29 Jun 2011, Philip Jägenstedt wrote:
>>>> Indeed, multiple types doesn't work at all if you want to mix different types. I was assuming that the use case was to extend types, kind of like http://schema.org/Person/Governor. However, it doesn't work all that well even in that case, since there's no way to know which type is the extension of the other and which properties exist only on the extended type.
>>>
>>> I don't really understand this use case. Can you elaborate on the problem that needs solving here?
>>
>> It's whatever problem http://schema.org/docs/extension.html is trying to solve, which is something like "allow people to geek out with more specific vocabularies without interfering with search results".
>
> That doesn't seem to be a problem. I don't really understand what problem this is solving.

Neither do I.

> If the problem is just "I want to annotate data that isn't defined in this vocabulary", that's already possible using URL property names.
>
>> If I were schema.org, I would just encourage people to do this:
>>
>>   <div itemscope itemtype="http://schema.org/Person">
>>     <div id="wrapper">
>>       <div itemprop="name">Arnold</div>
>>       <div itemscope itemtype="http://example.com/Governor" itemref="wrapper">
>>         <div itemprop="state">California</div>
>>       </div>
>>     </div>
>>   </div>
>
> That's a bit weird. Why not just:
>
>   <div itemscope itemtype="http://schema.org/Person">
>     <div itemprop="name">Arnold</div>
>     <div itemprop="http://example.com/Governor/state">California</div>
>   </div>

Yeah, that's better, at least when the number of additional attributes is small.

> It's hard to know without knowing what concrete user problem we're trying to solve here.

I'll leave this discussion to the schema.org sponsors and just hope that the method in http://schema.org/docs/extension.html doesn't catch on.

--
Philip Jägenstedt
Core Developer
Opera Software
Re: [whatwg] Microdata feedback
On Mon, 18 Jan 2010 16:24:46 +0100, Jeremy Keith jer...@adactio.com wrote: Hixie wrote: Finally on vCard, the final part of the extraction algorithm goes to great trouble to guess what is the family name and what is the given name. This guess will be broken for transliterated east Asian names (CJKV that I know of, maybe others too). Just saying. Also, why is it important to explicitly add N: for organizations? This is intended to be compatible with Microformats vCard, which has these weird rules. If you think we should remove them, please at least first speak to Tantek and see why he thinks. The fn optimisation pattern isn't intended to catch 100% of cases, just the situation Firstname Lastname or Firstname Middlename Lastname. So if you just use fn (formatted name) and don't use n (name), the name will be extracted/guessed using the optimisation pattern. In cases where the pattern doesn't work (e.g. Anne van Kesteren, or east Asian names) you can still explicitly specify the family name and given name, over-riding the fn optimisation pattern. If you do this, you need to explicitly state this is the name (n) as well as the formatted name (fn). This is going to break badly whenever a template uses vCard microdata and its author either doesn't know the family name and given name (because the data was never collected) or doesn't even consider that the vcard conversion does this funny guesswork. If a social network site or similar does this, then Anne van Kesteren and Zhang Min (fictional name) will have their names messed up with no way of fixing it. At least I haven't seen a site which asks users to both fill in their full name and each component, which is what you need to get this right. Similarly, for organisations, you don't have to explicitly set n (name) if you apply both fn (formatted name) and org (organisation name) to a string. This time, the optimisation pattern assumes that the fn is the name of the organisation. 
Technically, the n property is *always* required but if you use either of those two optimisation patterns, the n is inferred from fn. If this is just a technical problem with some software requiring N to be present, would it be OK to just output an empty N like for organizations? -- Philip Jägenstedt Core Developer Opera Software
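To make the concern concrete, here is a rough sketch of the kind of guess the fn optimisation pattern makes, based only on the description in this thread ("Firstname Lastname" or "Firstname Middlename Lastname"). It is not the extraction algorithm from the spec or from the Microformats documentation, which has more cases; the function name guess_n_from_fn is invented for this example, and the property names given-name, additional-name and family-name are assumed to be the relevant n components.

```python
def guess_n_from_fn(fn):
    """Guess name components from a formatted name.

    Simplified sketch: only the two patterns mentioned in the thread
    are handled. Returns None when no guess is made.
    """
    words = fn.split()
    if len(words) == 2:
        # "Firstname Lastname"
        return {"given-name": words[0], "family-name": words[1]}
    if len(words) == 3:
        # "Firstname Middlename Lastname"
        return {
            "given-name": words[0],
            "additional-name": words[1],
            "family-name": words[2],
        }
    return None


# "van" is treated as a middle name and "Kesteren" alone as the family name.
anne = guess_n_from_fn("Anne van Kesteren")

# If "Zhang" is the family name (family-name-first order), the two
# components come out swapped.
zhang = guess_n_from_fn("Zhang Min")
```

A template author who only has the full name has no way to correct either result, which is the failure mode described above.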
Re: [whatwg] Microdata feedback
On Mon, 18 Jan 2010 13:58:16 +0100, Ian Hickson i...@hixie.ch wrote: I'd like at some point to introduce some sort of semantic textContent that handles br, pre, bdo, dir=, img alt, del, space- collapsing, and newline elimination, but there hasn't been much enthusiasm around the idea, and it's not clear what else it would be good for. I've changed the example, at least, to have it work ok, and added a comment in the example about it. OK. Won't hold my breath for semantic textContent, but it sounds like a good solution. On Thu, 19 Nov 2009, Philip Jägenstedt wrote: In a (slightly edited) Jack Bauer example [1], Chrome, Firefox and presumably Safari has the meta elements moved to head. This will severely break script-based implementation of microdata, which are likely to be used for the time being until the DOM API is implemented natively. I can't see any workaround for this, so I suggest that meta simply not be used for microdata, preferably by making it non-conforming and removing it from the definitions/algorithms. This is a short-term problem that only affects scripted implementations that are shipped with the pages, so the workaround is simple: don't use meta and link. Any implementations outside of the page can just fix their parser to be HTML5-compatible. OK, fair enough. Thanks for all the other fixes, still reviewing the algorithm change... -- Philip Jägenstedt Core Developer Opera Software
Re: [whatwg] Microdata feedback
On Mon, 18 Jan 2010, Aryeh Gregor wrote: On Mon, Jan 18, 2010 at 7:58 AM, Ian Hickson i...@hixie.ch wrote: I've made it redirect to the spec. Could you say that the URL *should* provide human-readable information about the vocabulary? We all know the problems with having centrally-stored machine-readable data about your specs, but encouraging the URL to provide human-readable info seems helpful. (If they aren't supposed to be dereferenced, why use HTTP?) Why indeed. Is there something else we could use instead? Graphs are intended to be supported in v2, using a mechanism You seem to have left this sentence unfinished. ...using a mechanism intended for that purpose. Nothing to see here. :-) On Mon, 18 Jan 2010, Julian Reschke wrote: SHOULD return human-readable information is good, if you also add SHOULD NOT automatically dereference. I've added something akin to that SHOULD NOT, but the spec doesn't have a specification conformance class, so there's nothing to apply the SHOULD to. So I haven't added it. (I don't generally think specifications being conformance classes really makes much sense.) -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Microdata feedback
Hixie wrote: Finally on vCard, the final part of the extraction algorithm goes to great trouble to guess what is the family name and what is the given name. This guess will be broken for transliterated east Asian names (CJKV that I know of, maybe others too). Just saying. Also, why is it important to explicitly add N: for organizations? This is intended to be compatible with Microformats vCard, which has these weird rules. If you think we should remove them, please at least first speak to Tantek and see why he thinks. The fn optimisation pattern isn't intended to catch 100% of cases, just the situation Firstname Lastname or Firstname Middlename Lastname. So if you just use fn (formatted name) and don't use n (name), the name will be extracted/guessed using the optimisation pattern. In cases where the pattern doesn't work (e.g. Anne van Kesteren, or east Asian names) you can still explicitly specify the family name and given name, over-riding the fn optimisation pattern. If you do this, you need to explicitly state this is the name (n) as well as the formatted name (fn). Similarly, for organisations, you don't have to explicitly set n (name) if you apply both fn (formatted name) and org (organisation name) to a string. This time, the optimisation pattern assumes that the fn is the name of the organisation. Technically, the n property is *always* required but if you use either of those two optimisation patterns, the n is inferred from fn. HTH, Jeremy -- Jeremy Keith a d a c t i o http://adactio.com/
Re: [whatwg] Microdata feedback
On Mon, Jan 18, 2010 at 7:58 AM, Ian Hickson i...@hixie.ch wrote: I've made it redirect to the spec. Could you say that the URL *should* provide human-readable information about the vocabulary? We all know the problems with having centrally-stored machine-readable data about your specs, but encouraging the URL to provide human-readable info seems helpful. (If they aren't supposed to be dereferenced, why use HTTP?) Graphs are intended to be supported in v2, using a mechanism You seem to have left this sentence unfinished.
Re: [whatwg] Microdata feedback
Aryeh Gregor wrote: On Mon, Jan 18, 2010 at 7:58 AM, Ian Hickson i...@hixie.ch wrote: I've made it redirect to the spec. Could you say that the URL *should* provide human-readable information about the vocabulary? We all know the problems with having centrally-stored machine-readable data about your specs, but encouraging the URL to provide human-readable info seems helpful. (If they aren't supposed to be dereferenced, why use HTTP?) ... SHOULD return human-readable information is good, if you also add SHOULD NOT automatically dereference. BR, Julian
Re: [whatwg] Microdata DOM API issues
On Sat, 14 Nov 2009 00:34:12 +0100, Tab Atkins Jr. jackalm...@gmail.com wrote: On Fri, Nov 13, 2009 at 5:14 PM, Philip Jägenstedt phil...@opera.com wrote: The itemref mechanism allows creating arbitrary graphs of items, rather than the tree of items that is the intended microdata model (right?). Even though my default reaction to graphs is oh cool, for microdata when the domain model is a graph you should probably just represent it with a level of indirection (RDF). Options: 1. patch the algorithms which can go into recursion 2. patch http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#associating-names-with-items to first check if an itemref'd property creates a loop before adding it to candidates 3. ? I think I prefer 2. Looping in data-graphs is often useful, so I'm not sure I want to throw it out generally. Your statement in the first paragraph I'm quoting, though, says that you'd rather leave loops to be defined in the vocabulary itself? So loops would be done by, frex, itemprop'ing a link to the other element rather than itemref'ing the other element directly? Yes, that's basically what I'm saying. One option is to simply use microdata such that the RDF you extract is the graph you want (it will probably look quite ugly though). Another is always referencing subitems by a mechanism other than refid. For example, in the MusicBrainz XML webservice when an artist contains a release which itself references artists (e.g. as the producer), a stub item is used with only artist name and id, rather than including all information recursively. 
In microdata I would do:

  <section itemscope itemtype="http://musicbrainz.org/artist/" itemid="http://musicbrainz.org/artist/4d5447d7-c61c-4120-ba1b-d7f471d385b9">
    <h1 itemprop="name">John Lennon</h1>
    <section>
      <h1>Releases</h1>
      <section itemprop="release" itemscope itemtype="http://musicbrainz.org/release/" itemid="http://musicbrainz.org/release/f237e6a0-4b0e-4722-8172-66f4930198bc">
        <h1>Imagine</h1>
        Producer: <span itemprop="producer" itemscope itemtype="http://musicbrainz.org/artist/" itemid="http://musicbrainz.org/artist/e7b587f7-e678-47c1-81dd-e7bb7855b0f9"> <span itemprop="name">Phil Spector</span></span>
      </section>
    </section>
  </section>

Even if John Lennon were the producer here, you don't get any looping in the microdata itself. If you want to know everything about the producer, you should just follow the itemid... I haven't looked that much at the RDF extraction algorithm yet, but I think this example might even create the proper graph with loops if the producer were John Lennon. That would probably be fine, and is compatible with a tree-based data model like JSON. Vocabs should know when loops are permissible/desirable for themselves. I agree, I don't see that we have a problem here. -- Philip Jägenstedt Opera Software
Re: [whatwg] Microdata DOM API issues
On Thu, 12 Nov 2009 03:23:54 +0100, Philip Jägenstedt phil...@opera.com wrote: Why are the algorithms for extracting RDF gone? All that's left is the book example with the equivalent Turtle, but it would be nice if it were actually defined how to extract RDF. The same for the JSON stuff, was that no good? D'oh! I've been reading the multipage version and missed that it's on another page: http://www.whatwg.org/specs/web-apps/current-work/multipage/converting-html-to-other-formats.html I'll have to try implementing that and see if there are any more issues. -- Philip Jägenstedt Opera Software
Re: [whatwg] Microdata DOM API issues
On Fri, 13 Nov 2009 19:27:39 +0100, Philip Jägenstedt phil...@opera.com wrote: On Thu, 12 Nov 2009 03:23:54 +0100, Philip Jägenstedt phil...@opera.com wrote: Why are the algorithms for extracting RDF gone? All that's left is the book example with the equivalent Turtle, but it would be nice if it were actually defined how to extract RDF. The same for the JSON stuff, was that no good? D'oh! I've been reading the multipage version and missed that it's on another page: http://www.whatwg.org/specs/web-apps/current-work/multipage/converting-html-to-other-formats.html I'll have to try implementing that and see if there are any more issues. http://www.whatwg.org/specs/web-apps/current-work/multipage/converting-html-to-other-formats.html#json This was easy to implement, but the algorithm isn't guaranteed to terminate.

  <div itemscope>
    <div itemprop="foo" itemscope itemref="oops" id="oops"></div>
  </div>

This simple input causes the algorithm to recurse as the item references itself. I went back to the vCard algorithm and found that it too will fail to terminate with this input:

  <span itemscope itemtype="http://microformats.org/profile/hcard">
    <span itemprop="agent" itemscope id="oops" itemref="oops" itemtype="http://microformats.org/profile/hcard"></span>
  </span>

vEvent is safe as the algorithm never recurses, but the RDF conversion algorithm would hit the same problem. It's certainly possible to create loops which are less easy to spot:

  <div itemscope>
    <div itemprop="prop1" itemscope itemref="id2" id="id1"></div>
    <div itemprop="prop2" itemscope itemref="id3" id="id2"></div>
    ...
    <div itemprop="propn" itemscope itemref="id1" id="idn"></div>
  </div>

Or this:

  <div itemscope>
    <div itemprop="foo" itemscope id="a">
      <div itemprop="bar" itemscope itemref="a"></div>
    </div>
  </div>

The itemref mechanism allows creating arbitrary graphs of items, rather than the tree of items that is the intended microdata model (right?). 
Even though my default reaction to graphs is oh cool, for microdata when the domain model is a graph you should probably just represent it with a level of indirection (RDF). Options: 1. patch the algorithms which can go into recursion 2. patch http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#associating-names-with-items to first check if an itemref'd property creates a loop before adding it to candidates 3. ? I think I prefer 2. -- Philip Jägenstedt
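As an illustration of option 1 (patching a recursive conversion so it terminates), here is a minimal sketch. It is not the spec's JSON algorithm: items are modelled as plain Python dicts of the form {"properties": {name: [values]}}, where a value is either a string or another item dict, and the function name item_to_json is made up for this example. The idea is simply to remember which items are on the current path and to drop a value that would revisit one of them.

```python
def item_to_json(item, path=()):
    """Convert an item to a JSON-like tree, dropping values that loop.

    `path` holds the identities of the items currently being converted
    (the chain from the top-level item down to this one). A nested item
    that is already on that chain is skipped instead of recursed into.
    """
    path = path + (id(item),)
    result = {"properties": {}}
    for name, values in item["properties"].items():
        converted = []
        for value in values:
            if isinstance(value, dict):
                if id(value) in path:
                    # Following this value would start a repetition.
                    continue
                converted.append(item_to_json(value, path))
            else:
                converted.append(value)
        result["properties"][name] = converted
    return result


# The self-referencing "oops" item from the first example above:
# it ends up as the value of its own "foo" property.
oops = {"properties": {}}
oops["properties"]["foo"] = [oops]
root = {"properties": {"foo": [oops]}}
tree = item_to_json(root)
```

In this sketch a property whose only value was dropped is kept with an empty list; whether it should be omitted instead is a separate design choice. Option 2 would move the same kind of check earlier, into the step that associates names with items.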
Re: [whatwg] Microdata DOM API issues
On Fri, Nov 13, 2009 at 5:14 PM, Philip Jägenstedt phil...@opera.com wrote: The itemref mechanism allows creating arbitrary graphs of items, rather than the tree of items that is the intended microdata model (right?). Even though my default reaction to graphs is oh cool, for microdata when the domain model is a graph you should probably just represent it with a level of indirection (RDF). Options: 1. patch the algorithms which can go into recursion 2. patch http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#associating-names-with-items to first check if an itemref'd property creates a loop before adding it to candidates 3. ? I think I prefer 2. Looping in data-graphs is often useful, so I'm not sure I want to throw it out generally. Your statement in the first paragraph I'm quoting, though, says that you'd rather leave loops to be defined in the vocabulary itself? So loops would be done by, frex, itemprop'ing a link to the other element rather than itemref'ing the other element directly? That would probably be fine, and is compatible with a tree-based data model like JSON. Vocabs should know when loops are permissible/desirable for themselves. ~TJ
Re: [whatwg] Microdata feedback
On Wed, 14 Oct 2009 13:53:46 +0200, Ian Hickson i...@hixie.ch wrote: On Fri, 21 Aug 2009, Philip Jägenstedt wrote: Shouldn't namedItem [6] be namedItems? Code like .namedItem().item(0) would be quite confusing. [6] http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#dom-htmlpropertycollection-nameditem I don't understand what this is referring to. I was incorrectly under the impressions that .namedItem on other collections always returned a single element and arguing that since HTMLPropertyCollection.namedItem always returns a PropertyNodeList namedItems in plural would make more sense. Now I see that some other namedItem methods aren't as simple as I'd thought, so I'm not sure what to make of it. Is there a reason why HTMLPropertyCollection.namedItem unlike some other collections' .namedItem don't return an element if there is only 1 element in the collection at the time the method is called? Perhaps this is legacy quirks that we don't want to replicate? On Tue, 25 Aug 2009, Philip Jägenstedt wrote: There's something like an inverse relationship between simplicity of the syntax and complexity of the resulting markup, the best balance point isn't clear (to me at least). Perhaps option 3 is better, never allowing item+itemprop on the same element. That would preclude being able to make trees. Given that flat items like vcard/vevent are likely to be the most common use case I think we should optimize for that. Child items can be created by using a predefined item property: itemprop=com.example.childtype item. The value of that property would then be the first item in tree-order (or all items in the subtree, not sure). This way, items would have better copy-paste resilience as the whole item element could be made into a top-level item simply by moving it, without meddling with the itemprop. That sounds kinda confusing... More confusing than item+itemprop on the same element? 
In many cases the property value is the contained text, having it be the contained item node(s) doesn't seem much stranger. Based on the studies Google did, I'm not convinced that people will find the nesting that complicated. IMHO the proposal above is more confusing, too. I'm not sure this is solving a problem that needs solving. If the parent-item (com.example.blog) doesn't know what the child-items are, it would simply use itemprop="item". I don't understand this at all. This was an attempt to have anonymous sub-items. Re-thinking this, perhaps a better solution would be to have each item behave in much the same way that the document itself does. That is, simply add items in the subtree without using itemprop and access them with .getItems(itemType) on the outer item. How would you do things like agent in the vEvent vocabulary? Comparing the current model with a DOM tree, it seems odd in that a property could be an item. It would be like an element attribute being another element: <outer foo="<inner/>"/>. That kind of thing could just as well be <outer><foo><inner/></foo></outer>, <outer><inner type="foo"/></outer> or even <outer><inner/></outer> if the relationship between the elements is clear just from the fact that they have a parent-child relationship (usually the case). Microdata's datamodel is more similar to JSON's than XML's. It's only in the case where both itemprop and item have a type that an extra level of nesting will be needed and I expect that to be the exception. Changing the model to something more DOM-tree-like is probably going to be easier to understand for many web developers. I dunno. People didn't seem to have much trouble getting it once we used itemscope= rather than just item=. People understand the JSON datamodel pretty well, why would this be different? After http://blog.whatwg.org/usability-testing-html5, the recent syntax changes, the improved DOM API and the passage of time I'm not very worried about the things I was worrying about above. 
If there's any specific point that seems valid after another review I'll send separate feedback on it. Thanks for all the other fixes! -- Philip Jägenstedt Opera Software
Re: [whatwg] Microdata
On Aug 22, 2009, at 5:51 PM, Ian Hickson wrote: Based on some of the feedback on Microdata recently, e.g.: http://www.jenitennison.com/blog/node/124 ...and a number of e-mails sent to this list and the W3C lists, I am going to try some tweaks to the Microdata syntax. Google has kindly offered to provide usability testing resources so that we can try a variety of different syntaxes and see which one is easiest for authors to understand. If anyone has any concrete syntax ideas that they would like me to consider, please let me know. There's a (pretty low) limit to how many syntaxes we can perform usability tests on, though, so I won't be able to test every idea. Here's an idea I've been mulling around. I think it would simplify the syntax and semantic model considerably. Why do we need separate items and item properties? They seem to confuse people, when something can be both an item and an itemprop at the same time. They also seem to duplicate a certain amount of information; items can have types, while itemprops can have names, but they both seem to serve about the same role, which is to indicate how to interpret them in the context of page or larger item. What if we just had item, filling both of the roles? The value of the item would be either an associative array of the descendent items (or ones associated using about) if those exists, or the text content of the item (or URL, depending on the tag) if it has no items within it. 
Here's an example used elsewhere in the thread, marked up as I suggest:

  <section id="bt200x" item="com.example.product">
   <link item="about" href="http://example.com/products/bt200x">
   <h1 item="name">GPS Receiver BT 200X</h1>
   <p>Rating: &#x22C6;&#x22C6;&#x22C6;&#x2729;&#x2729; <meta item="rating" content="2"></p>
   <p>Release Date: <time item="reldate" datetime="2009-01-22">January 22</time></p>
   <p item="review"><a item="reviewer" href="http://ln.hixie.ch/">Ian</a>: <span item="text">Lots of memory, not much battery, very little accuracy.</span></p>
  </section>
  <figure item="work">
   <img item="about" src="image.jpeg">
   <legend>
    <p><cite item="title">My Pond</cite></p>
    <p><small>Licensed under the <a item="license" href="http://www.opensource.org/licenses/mit-license.php">MIT license</a>.</small>
   </legend>
  </figure>
  <p><img subject="bt200x" item="image" src="bt200x.jpeg" alt="..."></p>

This would translate into the following JSON. Note that this is a simpler structure than the existing one proposed for microdata; it is a lot closer to how people generally use JSON natively, rather than using an extra level of nesting to distinguish types and properties:

  // JSON DESCRIPTION OF MARKED UP DATA
  // document URL: http://www.example.org/sample/test.html
  {
    "com.example.product": [
      {
        "about": [ "http://example.com/products/bt200x" ],
        "image": [ "http://www.example.org/sample/bt200x.jpeg" ],
        "name": [ "GPS Receiver BT 200X" ],
        "reldate": [ "2009-01-22" ],
        "review": [
          {
            "reviewer": [ "http://ln.hixie.ch/" ],
            "text": [ "Lots of memory, not much battery, very little accuracy." ]
          }
        ]
      }
    ],
    "work": [
      {
        "about": [ "http://www.example.org/sample/image.jpeg" ],
        "license": [ "http://www.opensource.org/licenses/mit-license.php" ],
        "title": [ "My Pond" ]
      }
    ]
  }

This has the slightly surprising property of making something like this:

  <section item="foo">Some text. <a href="somewhere">A link</a>. Some more text</section>

Result in:

  // http://example.org/sample/test
  { "foo": [ "Some text. A link. Some more text" ] }

While simply making the link an item:

  <section item="foo">Some text <a item="link" href="somewhere">A link</a>. Some more text</section>

Gives you:

  // http://example.org/sample/test
  { "foo": [ { "link": [ "http://example.org/sample/somewhere" ] } ] }

However, I think that people will generally expect item to be used for its text/URL content only on leaf nodes or nodes without much nested within them, while they would expect item to return structured, nested data when the DOM is nested deeply with items inside it, so I don't think people would be surprised by this behavior very often. I haven't yet looked at every use case proposed so far to see how well this idea works for them, nor have I worked out the API differences (which should be simpler than the existing API). If there seem to be no serious problems with this idea, I can write up a more detailed justification and examples. -- Brian
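The value rule proposed above (an item's value is an associative array of its nested items if it has any, otherwise its URL or text content) can be sketched as follows. This is only an illustration under stated assumptions: elements are modelled as Python dicts with a "children" list of strings and element dicts, an optional "item" name, and an optional already-resolved "url"; the subject/about association is not handled; and the function names text_content, collect_items and value_of are invented for this example.

```python
def text_content(node):
    """Concatenate all text inside a node, in document order."""
    return "".join(
        child if isinstance(child, str) else text_content(child)
        for child in node.get("children", [])
    )


def collect_items(node, out):
    """Gather the nearest descendant items of `node` into dict `out`."""
    for child in node.get("children", []):
        if isinstance(child, str):
            continue
        if child.get("item"):
            out.setdefault(child["item"], []).append(value_of(child))
        else:
            # Not an item itself; keep looking below it.
            collect_items(child, out)
    return out


def value_of(node):
    """Value of an item under the single-attribute proposal."""
    nested = collect_items(node, {})
    if nested:
        return nested            # structured value
    if "url" in node:
        return node["url"]       # URL-valued element such as <a> or <img>
    return text_content(node)    # plain text value


link_url = "http://example.org/sample/somewhere"

# <section item="foo">Some text. <a href="somewhere">A link</a>. Some more text</section>
plain = {"item": "foo", "children": [
    "Some text. ",
    {"url": link_url, "children": ["A link"]},
    ". Some more text",
]}

# Same markup, but with item="link" on the <a>.
with_item = {"item": "foo", "children": [
    "Some text ",
    {"item": "link", "url": link_url, "children": ["A link"]},
    ". Some more text",
]}
```

Evaluating value_of on the two sample nodes reproduces the switch from a text value to a nested structure that the two small examples above describe.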
Re: [whatwg] Microdata
On Tue, 25 Aug 2009 00:29:06 +0200, Ian Hickson i...@hixie.ch wrote: On Mon, 24 Aug 2009, Philip Jägenstedt wrote: I've found two related things that are a bit problematic. First, because itemprops are only associated with ancestor item elements or via the subject attribute, it's always necessary to find or create a separate element for the item. This leads to more convoluted markup for small items, so it would be nice if the first item and itemprop could be on the same element when it makes sense:

  <p item="vevent" itemprop="description">
    Concert at <span itemprop="dtstart">19:00</span> at <span itemprop="location">the beach</span>.
  </p>

rather than

  <p item="vevent">
    <span itemprop="description">
      Concert at <span itemprop="dtstart">19:00</span> at <span itemprop="location">the beach</span>.
    </span>
  </p>

As specced now, having itemprop= and item= on the same element implies that the value of the property is an item rooted at this element. Not supporting the above was intentional, to keep the mental model of the markup very simple, rather than having shortcuts. (RDFa has lots of shortcuts and it ended up being very difficult to keep the mental model straight.) There's something like an inverse relationship between simplicity of the syntax and complexity of the resulting markup, the best balance point isn't clear (to me at least). Perhaps option 3 is better, never allowing item+itemprop on the same element. Second, because composite items can only be made by adding item and itemprop to the same element, the embedded item has to know that it has a parent and what itemprop it should use to describe itself. James gave the example of something like planet where each article could be a com.example.blog item and within each article there could be any arbitrary author-supplied microdata [1]. I also feel that the item+itemprop syntax for composite items is one of the least intuitive parts of the current spec. 
It's easy to get confused about what the type of the item vs the itemprop should be and which item the itemprop actually belongs to. Fair points. Given that flat items like vcard/vevent are likely to be the most common use case I think we should optimize for that. Child items can be created by using a predefined item property: itemprop="com.example.childtype item". Ok... The value of that property would then be the first item in tree-order (or all items in the subtree, not sure). This way, items would have better copy-paste resilience as the whole item element could be made into a top-level item simply by moving it, without meddling with the itemprop. That sounds kinda confusing... More confusing than item+itemprop on the same element? In many cases the property value is the contained text, having it be the contained item node(s) doesn't seem much stranger. If the parent-item (com.example.blog) doesn't know what the child-items are, it would simply use itemprop="item". I don't understand this at all. This was an attempt to have anonymous sub-items. Re-thinking this, perhaps a better solution would be to have each item behave in much the same way that the document itself does. That is, simply add items in the subtree without using itemprop and access them with .getItems(itemType) on the outer item. Comparing the current model with a DOM tree, it seems odd in that a property could be an item. It would be like an element attribute being another element: <outer foo="<inner/>"/>. That kind of thing could just as well be <outer><foo><inner/></foo></outer>, <outer><inner type="foo"/></outer> or even <outer><inner/></outer> if the relationship between the elements is clear just from the fact that they have a parent-child relationship (usually the case). 
All examples of nested items in the spec are on the form <p itemprop="subtype" item> These would be replaced with <p item="subtype"> It's only in the case where both itemprop and item have a type that an extra level of nesting will be needed and I expect that to be the exception. Changing the model to something more DOM-tree-like is probably going to be easier to understand for many web developers. It would also fix the problem in my other mail where it's a bit tricky to determine via the DOM API whether a property is a string or an item. When on the topic of the DOM API, document.getItems("outer")[0].getItems("inner")[0] would be so much clearer than what we currently have. Example:

  <p item="vcard" itemprop="n item">
    My name is <span itemprop="given-name">Philip</span> <span itemprop="family-name">Jägenstedt</span>.
  </p>

I don't understand what this maps to at all. The same as

  <p item="vcard">
    <span itemprop="n item">
      My name is <span itemprop="given-name">Philip</span> <span itemprop="family-name">Jägenstedt</span>.
    </span>
  </p>

Unless I've misunderstood the n in vcard (there's no example in the spec). But let's move on. I'll admit that my examples are a bit simple, but the main point in my opinion is to make item+itemprop less confusing. There are basically only 3
Re: [whatwg] Microdata
On Sat, 22 Aug 2009 23:51:48 +0200, Ian Hickson i...@hixie.ch wrote:

> Based on some of the feedback on Microdata recently, e.g.:
>
>   http://www.jenitennison.com/blog/node/124
>
> ...and a number of e-mails sent to this list and the W3C lists, I am going to try some tweaks to the Microdata syntax. Google has kindly offered to provide usability testing resources so that we can try a variety of different syntaxes and see which one is easiest for authors to understand.
>
> If anyone has any concrete syntax ideas that they would like me to consider, please let me know. There's a (pretty low) limit to how many syntaxes we can perform usability tests on, though, so I won't be able to test every idea.

I've found two related things that are a bit problematic.

First, because itemprops are only associated with ancestor item elements or via the subject attribute, it's always necessary to find or create a separate element for the item. This leads to more convoluted markup for small items, so it would be nice if the first item and itemprop could be on the same element when it makes sense:

  <p item=vevent itemprop=description>
    Concert at <span itemprop=dtstart>19:00</span> at <span itemprop=location>the beach</span>.
  </p>

rather than

  <p item=vevent>
    <span itemprop=description>
      Concert at <span itemprop=dtstart>19:00</span> at <span itemprop=location>the beach</span>.
    </span>
  </p>

Second, because composite items can only be made by adding item and itemprop to the same element, the embedded item has to know that it has a parent and what itemprop it should use to describe itself. James gave the example of something like planet where each article could be a com.example.blog item and within each article there could be any arbitrary author-supplied microdata [1]. I also feel that the item+itemprop syntax for composite items is one of the least intuitive parts of the current spec. It's easy to get confused about what the type of the item vs the itemprop should be and which item the itemprop actually belongs to.

Given that flat items like vcard/vevent are likely to be the most common use case I think we should optimize for that. Child items can be created by using a predefined item property: itemprop=com.example.childtype item. The value of that property would then be the first item in tree-order (or all items in the subtree, not sure). This way, items would have better copy-paste resilience as the whole item element could be made into a top-level item simply by moving it, without meddling with the itemprop. If the parent-item (com.example.blog) doesn't know what the child-items are, it would simply use itemprop=item.

Example:

  <p item=vcard itemprop=n item>
    My name is <span itemprop=given-name>Philip</span> <span itemprop=family-name>Jägenstedt</span>.
  </p>

I'll admit that my examples are a bit simple, but the main point in my opinion is to make item+itemprop less confusing. There are basically only 3 options:

1. for compositing items (like now)
2. as shorthand on the top-level item (my suggestion)
3. disallow

I'd primarily like for 1 and 2 to be tested, but 3 is a real option too.

[1] http://krijnhoetmer.nl/irc-logs/whatwg/20090824#l-375

-- 
Philip Jägenstedt
Opera Software
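For readers following the thread, the difference between options 1 and 2 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the spec's extraction algorithm: the Node class, the extract and collect functions, and the shorthand flag are hypothetical names invented here, and the tree is a simplified stand-in for the DOM.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Node:
    """Simplified element (hypothetical): item/itemprop attributes plus mixed content."""
    item: Optional[str] = None      # item type such as "vevent"; None means no item attribute
    itemprop: Optional[str] = None  # property name, or None
    children: List[Union["Node", str]] = field(default_factory=list)


def text_of(node: Node) -> str:
    """Concatenate all text inside a node, in tree order."""
    return "".join(c if isinstance(c, str) else text_of(c) for c in node.children)


def collect(node: Node, props: dict) -> None:
    """Gather properties from descendants, stopping at nested items."""
    for child in node.children:
        if isinstance(child, str):
            continue
        if child.item is not None:
            # Option 1 reading: item + itemprop on a descendant is a nested item value.
            if child.itemprop:
                nested = extract(child, shorthand=False)
                props.setdefault(child.itemprop, []).append(nested)
            continue  # anything deeper belongs to the nested item
        if child.itemprop:
            props.setdefault(child.itemprop, []).append(text_of(child))
        collect(child, props)


def extract(root: Node, shorthand: bool) -> dict:
    """Extract the item rooted at `root`.

    With shorthand=True (option 2), an itemprop on the root element itself is
    treated as a property of that same item, valued by its text content.
    """
    props: dict = {}
    if shorthand and root.itemprop:
        props[root.itemprop] = [text_of(root)]
    collect(root, props)
    return {"type": root.item, "properties": props}


# The vevent paragraph from the message, with item and itemprop on one element.
event = Node(item="vevent", itemprop="description", children=[
    "Concert at ", Node(itemprop="dtstart", children=["19:00"]),
    " at ", Node(itemprop="location", children=["the beach"]), ".",
])
result = extract(event, shorthand=True)
```

Under the shorthand reading the single p element yields description, dtstart and location on one vevent item; with shorthand=False the description property is not attached to the item, which is why the current syntax needs the extra span.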
Re: [whatwg] Microdata
On Mon, 24 Aug 2009, Philip Jägenstedt wrote:

> I've found two related things that are a bit problematic.
>
> First, because itemprops are only associated with ancestor item elements or via the subject attribute, it's always necessary to find or create a separate element for the item. This leads to more convoluted markup for small items, so it would be nice if the first item and itemprop could be on the same element when it makes sense:
>
>   <p item=vevent itemprop=description>
>     Concert at <span itemprop=dtstart>19:00</span> at <span itemprop=location>the beach</span>.
>   </p>
>
> rather than
>
>   <p item=vevent>
>     <span itemprop=description>
>       Concert at <span itemprop=dtstart>19:00</span> at <span itemprop=location>the beach</span>.
>     </span>
>   </p>

As specced now, having itemprop= and item= on the same element implies that the value of the property is an item rooted at this element.

Not supporting the above was intentional, to keep the mental model of the markup very simple, rather than having shortcuts. (RDFa has lots of shortcuts and it ended up being very difficult to keep the mental model straight.)

> Second, because composite items can only be made by adding item and itemprop to the same element, the embedded item has to know that it has a parent and what itemprop it should use to describe itself. James gave the example of something like planet where each article could be a com.example.blog item and within each article there could be any arbitrary author-supplied microdata [1]. I also feel that the item+itemprop syntax for composite items is one of the least intuitive parts of the current spec. It's easy to get confused about what the type of the item vs the itemprop should be and which item the itemprop actually belongs to.

Fair points.

> Given that flat items like vcard/vevent are likely to be the most common use case I think we should optimize for that. Child items can be created by using a predefined item property: itemprop=com.example.childtype item.

Ok...

> The value of that property would then be the first item in tree-order (or all items in the subtree, not sure). This way, items would have better copy-paste resilience as the whole item element could be made into a top-level item simply by moving it, without meddling with the itemprop.

That sounds kinda confusing...

> If the parent-item (com.example.blog) doesn't know what the child-items are, it would simply use itemprop=item.

I don't understand this at all.

> Example:
>
>   <p item=vcard itemprop=n item>
>     My name is <span itemprop=given-name>Philip</span> <span itemprop=family-name>Jägenstedt</span>.
>   </p>

I don't understand what this maps to at all.

> I'll admit that my examples are a bit simple, but the main point in my opinion is to make item+itemprop less confusing. There are basically only 3 options:
>
> 1. for compositing items (like now)
> 2. as shorthand on the top-level item (my suggestion)
> 3. disallow
>
> I'd primarily like for 1 and 2 to be tested, but 3 is a real option too.
>
> [1] http://krijnhoetmer.nl/irc-logs/whatwg/20090824#l-375

We can't disallow nesting items as values of properties; there are a whole bunch of use cases that depend on it.

Could you show how your syntax proposals would look when marking up the following data?

   // JSON DESCRIPTION OF MARKED UP DATA
   // document URL: http://www.example.org/sample/test.html
   {
     "items": [
       {
         "type": "com.example.product",
         "properties": {
           "about": [ "http://example.com/products/bt200x" ],
           "image": [ "http://www.example.org/sample/bt200x.jpeg" ], // please keep this one outside the item in the DOM
           "name": [ "GPS Receiver BT 200X" ],
           "reldate": [ "2009-01-22" ],
           "review": [
             {
               "type": "",
               "properties": {
                 "reviewer": [ "http://ln.hixie.ch/" ],
                 "text": [ "Lots of memory, not much battery, very little accuracy." ]
               }
             }
           ],
         }
       },
       {
         "type": "work",
         "properties": {
           "about": [ "http://www.example.org/sample/image.jpeg" ],
           "license": [ "http://www.opensource.org/licenses/mit-license.php" ],
           "title": [ "My Pond" ],
         }
       }
     ]
   }

Here's how it would be marked up today:

   <section id="bt200x" item="com.example.product">
    <link itemprop="about" href="http://example.com/products/bt200x">
    <h1 itemprop="name">GPS Receiver BT 200X</h1>
    <p>Rating: &#x22C6;&#x22C6;&#x22C6;&#x2729;&#x2729; <meta itemprop="rating" content="2"></p>
    <p>Release Date: <time itemprop="reldate" datetime="2009-01-22">January 22</time></p>
    <p itemprop="review" item><a itemprop="reviewer" href="http://ln.hixie.ch/">Ian</a>:
     <span itemprop="text">Lots of memory, not much battery, very little accuracy.</span></p>
   </section>
   <figure item="work">
    <img itemprop="about" src="image.jpeg">
    <legend>
     <p><cite itemprop="title">My Pond</cite></p>
     <p><small>Licensed under the <a itemprop="license" href="http://www.opensource.org/licenses/mit-license.php">MIT license</a>.</small>
    </legend>
   </figure>
   <p><img subject="bt200x" itemprop="image"
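The JSON description in the message above has a regular shape: each item has a type and a properties map, every property value is an array even when there is only one value, and an item can itself appear as a property value. A minimal Python sketch of that shape follows; the make_item helper is a hypothetical name introduced here for illustration and is not part of any spec or library.

```python
import json


def make_item(item_type, **properties):
    """Build an item dict; every property value is wrapped in a list (hypothetical helper)."""
    return {
        "type": item_type,
        "properties": {
            name: value if isinstance(value, list) else [value]
            for name, value in properties.items()
        },
    }


# The nested review item has an empty type in the message.
review = make_item(
    "",
    reviewer="http://ln.hixie.ch/",
    text="Lots of memory, not much battery, very little accuracy.",
)
product = make_item(
    "com.example.product",
    about="http://example.com/products/bt200x",
    image="http://www.example.org/sample/bt200x.jpeg",
    name="GPS Receiver BT 200X",
    reldate="2009-01-22",
    review=review,  # an item used as a property value
)
work = make_item(
    "work",
    about="http://www.example.org/sample/image.jpeg",
    license="http://www.opensource.org/licenses/mit-license.php",
    title="My Pond",
)
document = {"items": [product, work]}
print(json.dumps(document, indent=2))
```

Wrapping single values in a list keeps consumers from having to special-case properties that happen to occur once, which is the same concern raised about returning "plain properties as an array with one element".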
Re: [whatwg] Microdata
On Sat, Aug 22, 2009 at 11:51 PM, Ian Hickson i...@hixie.ch wrote:

> Based on some of the feedback on Microdata recently, e.g.:
>
>   http://www.jenitennison.com/blog/node/124
>
> ...and a number of e-mails sent to this list and the W3C lists, I am going to try some tweaks to the Microdata syntax. Google has kindly offered to provide usability testing resources so that we can try a variety of different syntaxes and see which one is easiest for authors to understand.
>
> If anyone has any concrete syntax ideas that they would like me to consider, please let me know. There's a (pretty low) limit to how many syntaxes we can perform usability tests on, though, so I won't be able to test every idea.

This would be more than just tweaking the syntax, but I think it appropriate to bring forth my CRDF proposal as a suggestion for an alternative to Microdata. For reference, the latest version of the document can be found at [1], and the discussion that has happened about it can be found at [2].

Rather than just saying "use that syntax", I'm including here what IMO are the most prominent advantages (and potential issues) of that proposal, in no particular order:

+ Optional use of selectors: while the ability to use selectors seems quite useful, especially to handle list or collection cases, it has been argued that users may have problems with elaborate selectors. Since the last update of the CRDF document, this is addressed with the expanded inline content model: it should be possible to express with only inline CRDF, and without using selectors at all, any semantics that can be represented with RDFa, Microdata, EASE, or eRDF. In other words: while CRDF can take full benefit of selectors to make better and/or clearer documents, it can still handle most cases (those actually handled by existing solutions) without them.

+ Microformats mapping: for good data (specifically, all content that doesn't duplicate any singular property), CRDF allows trivially mapping Microformat-marked data to an arbitrary RDF vocabulary (or even to multiple, overlapping vocabularies), thus allowing its re-use with RDF-related tools and/or combining it with RDF data from other sources and/or marked with other syntaxes. In order to achieve 100% compatibility with Microformats.org's processing model (including any form of bad data), a minor addition to Selectors is suggested in the document, although no substantial feedback has been given on it (neither against nor in favor).

+ Microformats-like but decentralized: the main issue with Microformats, at least with non-widespread vocabularies, is centralization: it requires a critical mass of use-cases to get the Microformats community to engage in the process of creating a new vocabulary. With CRDF, any author may build their own vocabulary (implementing it as a CRDF mapping to RDF) and use it on their pages. If a vocabulary later gains momentum and is adopted by a wide enough set of authors, it'd be up to the Microformats community to decide whether to standardize it or not.

+ Prefix declarations go out of HTML: After so many discussions, namespace prefixes have been the main source of criticism against RDFa. One of these criticisms is the range of technical issues that arise from the xmlns: syntax for defining namespace prefixes (in tag-soup syntax). CRDF handles this case by taking away the responsibility of prefix declarations from HTML: having a CSS-based syntax, CRDF takes the obvious step and uses CSS's own syntax for namespace declarations.

+ Entirely RDF based: while this might seem a purely theoretical advantage, there is also a practical benefit: once extracted from the webpage, CRDF data can be easily combined with any already existing RDF data, and can be used with RDF-related tools.

- Copy-paste brittleness: IMO, the only serious drawback of CRDF; but there are some points worth making:
  1) When used inline, CRDF can achieve the same resilience as RDFa, which is quite close to Microdata's.
  2) I have noticed that some browsers can manage to copy-paste CSS-styled content preserving (most of) the format. It shouldn't be hard for implementors to extend such functionality to CRDF. Of course, the support for this is not consistent among browsers, and also seems to vary for different paste targets. If there is some real interest, I might do some testing with multiple browsers and paste targets (for now, I have noticed that both IE and FF preserve most CSS formatting (but not layout) when pasting to Word, but pasting to OOo Writer gets rendered with the default formatting for the tags). It would be interesting, on this aspect, to hear from browser vendors: would they be willing to extend the CSS copy-paste capabilities to CRDF if it got adopted?

- Prefix-based indirection: I'd bet that there are people on this list ready to argue that namespace prefixes are a good thing; but it seems that it raises some issues, so I'll include them and share my PoV on the topic:
  1) For those who care
Re: [whatwg] Microdata
On Saturday, August 22, 2009, Eduard Pascual herenva...@gmail.com wrote: [...]
Re: [whatwg] Microdata Revisited
On Mon, Aug 3, 2009 at 2:58 AM, Martin McEvoy mar...@weborganics.co.uk wrote:

> Hello All
>
> I have been working on a new proposal for HTML 5 Microdata. I thought you might all like to take a look at what I have come up with so far. Please visit http://weborganics.co.uk/test/microdata.html
>
> Any feedback would be nice ;)

I'm in general wary of the use of prefixes here. Maciej summarized things very nicely in [1].

/ Jonas

[1] http://lists.w3.org/Archives/Public/public-html/2009Jul/0919.html
Re: [whatwg] Microdata and Linked Data
(I trimmed public-html from the CC list to avoid cross-posting, and because the whatwg list has had most of the traffic on this topic so far; please feel free to forward this to public-html if you would rather discuss that there instead.)

On Fri, 24 Jul 2009, Peter Mika wrote:

> The use of a URI as the value of the id attribute. It seems to me there is actually nothing in the spec that would stop this:
>
> "Identifiers are opaque strings. Particular meanings should not be derived from the value of the id attribute."
>
> This is great because in principle I could do something like:
>
>   <section id="http://john.example.com#hedral" item="org.example.animal.cat com.example.feline">
>    <h1 itemprop="org.example.name com.example.fn">Hedral</h1>
>   </section>
>
> I assume you can achieve something similar with the about property but that would require me to write:
>
>   <section item="org.example.animal.cat com.example.feline">
>    <h1 itemprop="org.example.name com.example.fn">Hedral</h1>
>    <a itemprop="about" href="http://john.example.com#hedral"/>
>   </section>
>
> This is longer by itself, and if I want an internal identifier as well, then I have to write:
>
>   <section id="hedral" item="org.example.animal.cat com.example.feline">
>    <h1 itemprop="org.example.name com.example.fn">Hedral</h1>
>    <a itemprop="about" href="http://john.example.com#hedral"/>
>   </section>

In practice, all the use cases that were brought up that needed to identify the item were cases where there was a URL already in the page, e.g. in a link or an img or a video element, such that it actually ends up better if we use itemprop="about" rather than having a dedicated attribute (like id="" or about="") for identifying types.

Are there use cases where this is not the case? For example, when would you need to have an internal identifier?

> The other area that could be possibly improved is the connection of type identifiers with ontologies on the web. I would actually like the notion of reverse domain names if
>
> -- there would be an explicit agreement that they are of the form xxx.yyy.zzz.classname
> -- there would be a registry for mappings from xxx.yyy.zzz to URIs.
>
> For example, org.foaf-project.Person could be linked to http://xmlns.com/foaf/0.1/Person by having the mapping from org.foaf-project to http://xmlns.com/foaf/0.1/.
>
> It wouldn't be perfect; the FOAF ontology, as you see, is not at org.foaf-project but at com.xmlns. However, it would be a step in the right direction.

What problem is this solving?

> I would consider adding the sameAs property as part of the standard vocabulary. This is a term from the OWL vocabulary that is widely used in the Linked Data world for connecting entities that are deemed to be equivalent. Alternatively, we could add the entire RDFS and OWL vocabulary to the spec.

Could you elaborate on this? What are the use cases that this is intended to address? What do you mean by adding the sameAs property?

> I don't expect that writing full URIs for property names will be appealing to users, but of course I'm not a big fan either of defining prefixes individually as done in RDFa with the CURIE mechanism. Still, prefixes would be useful, e.g. foaf:Person is much shorter to write than com.foaf-project.Person and also easier to remember. So would there be a way to reintroduce the notion of prefixes, with possibly pointing to a registry that defines the mapping from prefixes to namespaces?
>
>   <section id="hedral" namespaces="http://www.w3c.org/registry/" item="animal:cat">
>    <h1 itemprop="animal:name">Hedral</h1>
>   </section>
>
> Here the registry would define a number of prefixes. However, the mechanism would be open in that other organizations or even individuals could maintain registries.

I'm definitely against any in-page indirection mechanism, because we have seen with XML Namespaces (and with RDFa) that prefixes are simply a huge source of problems.

However, there actually already is a registry for registering strings that start with a keyword and a colon: the scheme registry. So if animals become important enough that they need their own scheme, I guess people could register them that way.

Alternatively, a short domain followed by a keyword seems like a reasonable option: instead of "animal:cat", have "org.animal.cat": it's only four more characters. (Actually, with ICANN considering opening up TLDs, people could just register those: "animal.cat" is a valid reverse DNS label if "animal" is a TLD!)

-- 
Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
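As an illustration of the reverse-domain registry idea quoted in the message above (types of the form xxx.yyy.zzz.classname, plus a registry from xxx.yyy.zzz to URIs), here is a minimal Python sketch. The REGISTRY dict and the type_to_uri function are hypothetical names made up for this example; no such registry exists in the spec, and the only mapping shown is the FOAF one given in the thread.

```python
from typing import Dict, Optional

# Hypothetical registry: reversed-domain prefix -> namespace URI.
# The single entry is the example mapping from the thread.
REGISTRY: Dict[str, str] = {
    "org.foaf-project": "http://xmlns.com/foaf/0.1/",
}


def type_to_uri(type_name: str, registry: Dict[str, str] = REGISTRY) -> Optional[str]:
    """Map a type of the form xxx.yyy.zzz.classname to a URI.

    The part before the last dot is looked up in the registry and the class
    name is appended. Returns None when the name has no dot or the prefix is
    not registered.
    """
    prefix, sep, class_name = type_name.rpartition(".")
    if not sep:
        return None  # not of the agreed xxx.yyy.zzz.classname form
    namespace = registry.get(prefix)
    if namespace is None:
        return None  # unknown vocabulary: nothing to map to
    return namespace + class_name
```

For example, type_to_uri("org.foaf-project.Person") gives http://xmlns.com/foaf/0.1/Person, while an unregistered type such as com.example.feline maps to nothing, which is the gap the registry question turns on.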
Re: [whatwg] Microdata and Linked Data
Hello Ian

Ian Hickson wrote:

> I'm definitely against any in-page indirection mechanism, because we have seen with XML Namespaces (and with RDFa) that prefixes are simply a huge source of problems.

They are indeed. XML namespaces fixed one problem, calling different things by the same name, but they created another: calling the same thing by different names.

Prefixes are not themselves bad, misunderstood, or any kind of indirection mechanism; they are just shorthand URLs, and they are actually quite intuitive if used correctly. RDFa is currently trying to solve its problems with xmlns, which is just a minor design flaw: xmlns is used for structure, not content, and they realize that issue.

Best wishes
-- 
Martin McEvoy
http://weborganics.co.uk/
Re: [whatwg] Microdata and Linked Data
On Fri, Jul 24, 2009 at 1:07 PM, Peter Mika pm...@yahoo-inc.com wrote:

> [...]
> #2 The other area that could be possibly improved is the connection of type identifiers with ontologies on the web. I would actually like the notion of reverse domain names if
>
> -- there would be an explicit agreement that they are of the form xxx.yyy.zzz.classname
> -- there would be a registry for mappings from xxx.yyy.zzz to URIs.
>
> For example, org.foaf-project.Person could be linked to http://xmlns.com/foaf/0.1/Person by having the mapping from org.foaf-project to http://xmlns.com/foaf/0.1/.
>
> It wouldn't be perfect; the FOAF ontology, as you see, is not at org.foaf-project but at com.xmlns. However, it would be a step in the right direction.
> [...]
> #4 I don't expect that writing full URIs for property names will be appealing to users, but of course I'm not a big fan either of defining prefixes individually as done in RDFa with the CURIE mechanism. Still, prefixes would be useful, e.g. foaf:Person is much shorter to write than com.foaf-project.Person and also easier to remember. So would there be a way to reintroduce the notion of prefixes, with possibly pointing to a registry that defines the mapping from prefixes to namespaces?
>
>   <section id="hedral" namespaces="http://www.w3c.org/registry/" item="animal:cat">
>    <h1 itemprop="animal:name">Hedral</h1>
>   </section>
>
> Here the registry would define a number of prefixes. However, the mechanism would be open in that other organizations or even individuals could maintain registries.

IMO, these two proposals are closely related. However, you introduced substantial differences between them that I can't really understand.

For #2 you suggest having a sort of centralized registry of mappings between the reversed domains and the vocabularies they refer to. What happens if next year I have to use an unusual vocabulary for my site that is not included in the registry? Would I have to get the vocabulary included in the registry before my pages' microdata can be mapped to the appropriate RDF graph?

On the other hand, in #4, you are opening the gate for independent entities (be they organizations or individuals) to define the prefixes they would be using for their pages' metadata: why not apply this to #2 as well? IMO, it would be more important for #2 than for #4, since #4 only provides syntax sugar while #2 enables something that would be undoable without it (mapping Microdata to arbitrary RDF).

About #1, I'm not sure what you are exactly proposing, so I can't provide much feedback on it. Maybe you could make it a bit clearer: are you proposing any specific change to the spec? If so, what would be the change? If not, what are you proposing then?

Finally, about #3: I'm not familiar with the OWL vocabulary, so I can't say too much about it. But if your second proposal gets into the spec, then this would become just syntax sugar, since any property from any existing RDF vocabulary could be expressed; and if #4 also got in, the benefit of built-in properties would be minimal compared to using a reasonably short prefix (such as owl:).

Just my two cents.

Regards,
Eduard Pascual
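The question raised in the message above (what happens to a vocabulary that is not in the central registry) can be made concrete with a small Python sketch of prefix expansion over several registries consulted in order. Everything here is an assumption made for illustration: the expand function, the shared and site registries, and the http://example.org/animal# namespace are hypothetical, and nothing in the Microdata draft defines such a mechanism.

```python
from typing import Dict, List, Optional


def expand(name: str, registries: List[Dict[str, str]]) -> Optional[str]:
    """Expand a prefixed name such as foaf:Person using the first registry that knows the prefix.

    Returns None if the name has no prefix or no registry defines it.
    """
    prefix, sep, local = name.partition(":")
    if not sep:
        return None
    for registry in registries:
        if prefix in registry:
            return registry[prefix] + local
    return None


# A hypothetical shared registry, and one maintained by a single site.
shared = {"foaf": "http://xmlns.com/foaf/0.1/"}
site = {"animal": "http://example.org/animal#"}

known = expand("foaf:Person", [shared])
unknown = expand("animal:cat", [shared])         # not in the central registry
resolved = expand("animal:cat", [site, shared])  # an independent registry fills the gap
```

With only the shared registry, animal:cat cannot be mapped at all; allowing a site-maintained registry alongside it resolves the name without waiting for central registration, which is the point made about applying the openness of #4 to #2.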
Re: [whatwg] Microdata and Linked Data
Yes, #2 and #4 are quite related in that they both concern the abbreviation mechanism for URIs and might be considered alternative proposals.

> On the other hand, in #4, you are opening the gate for independent entities (be they organizations or individuals) to define the prefixes they would be using for their pages' metadata: why not apply this to #2 as well? IMO, it would be more important for #2 than for #4, since #4 only provides syntax sugar while #2 enables something that would be undoable without it (mapping Microdata to arbitrary RDF).

Yes, the idea of distributing the registration could be applied to #2.

> About #1, I'm not sure what you are exactly proposing, so I can't provide much feedback on it. Maybe you could make it a bit clearer: are you proposing any specific change to the spec? If so, what would be the change? If not, what are you proposing then?

Removing the about property, showing how id can be used in this way, and changing the description of how you transform an HTML5 document to RDF.

> Finally, about #3: I'm not familiar with the OWL vocabulary, so I can't say too much about it. But if your second proposal gets into the spec, then this would become just syntax sugar, since any property from any existing RDF vocabulary could be expressed; and if #4 also got in, the benefit of built-in properties would be minimal compared to using a reasonably short prefix (such as owl:).

I agree... I'm personally not so attached to reverse domain names, but I might have missed a lot of the previous discussions on why they are good to have.

In any case, my intention was to get the discussion restarted around these issues: it seems to me there was a lot of discussion at the very beginning on microdata vs. RDFa when microdata was first proposed, but then the discussion died without necessarily finding the best solution (for my taste).

Cheers,
Peter
Re: [whatwg] Microdata and Linked Data
Fair point. Just brainstorming here: how about making "about" an attribute?

  <div item id="amanda" about="http://"></div>
  <p>Name: <span subject="amanda" itemprop="name">Amanda</span></p>

We still have two identifiers, but at least giving the URI is simplified.

Best,
Peter

Julian Reschke wrote:

> Peter Mika wrote:
> > Hi All,
> >
> > I've been taking a closer look at microdata. While I like the proposal in general, in particular the chance to unite microformat style annotations with some of the Semantic Web formalism (such as URIs for objects), there are still a number of points that I feel could be improved. So here are my proposals for discussion:
> >
> > #1 The use of a URI as the value of the id attribute. It seems to me there is actually nothing in the spec that would stop this:
> > ...
>
> IDs like that would be very hard to use as fragment identifier...
>
> ...
>
> BR, Julian
Re: [whatwg] microdata use cases and Getting data out of poorly written Web pages
On Fri, 8 May 2009, Shelley Powers wrote:

> It's difficult to tell where one should comment on the so-called microdata use cases. I'm forced to send to multiple mailing lists.

Please don't cross-post to the WHATWG list and other lists -- you may pick either one, I read all of them. (Cross-posting results in a lot of confusion because some of the lists only allow members to post, while others allow anyone to post, so we end up with fragmented threads.)

> Ian, I would like to see the original request that went into this particular use case. In particular, I'd like to know who originated it, so that we can ensure that the person has read your follow-up, as well as how you condensed the use case down (to check if your interpretation is proper or not).

I did not keep track of where the use cases came from (I generally ignore the source of requests so as to avoid any possible bias). However, I can probably figure out some of the sources of a particular scenario if you have a specific one in mind. Could you clarify which scenario or requirement you are particularly interested in?

> In addition, from my reading of this posting of yours titled "[whatwg] Getting data out of poorly written Web pages", is this open for any discussion?

Naturally, all input is always welcome.

> It seems to me that you received the original data, generated a use case document from the data, unilaterally, and now you're making unilateral decisions as to whether the use case requires a change in HTML5 or not. Is this what we can expect from all of the use cases?

Yes. If my proposals don't actually address the use cases, then please do point out how that is the case. Similarly, if there are missing use cases, please bring them up. All input is always welcome (whether on the lists, or by direct e-mail, on blogs, or wherever). None of the text in the HTML5 spec is frozen; it's merely a proposal. If there are use cases that should be addressed that are not addressed then we should address them.

(Regarding microdata, note that I've so far only sent proposals for three of the 20 use cases that I collected. I've still got a lot to go through.)

-- 
Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] microdata use cases and Getting data out of poorly written Web pages
Ian Hickson wrote: On Fri, 8 May 2009, Shelley Powers wrote: It's difficult to tell where one should comment on the so-called microdata use cases. I'm forced to send to multiple mailing lists. Please don't cross-post to the WHATWG list and other lists -- you may pick either one, I read all of them. (Cross-posting results in a lot of confusion because some of the lists only allow members to posts, which others allow anyone to post, so we end up with fragmented threads.) But different people respond to the mailings in different ways, depending on the list. This isn't just you, Ian. How can I ensure that the W3C people have access to the same concerns? Ian, I would like to see the original request that went into this particular use case. In particular, I'd like to know who originated it, so that we can ensure that the person has read your follow-up, as well as how you condensed the use case down (to check if your interpretation is proper or not). I did not keep track of where the use cases came from (I generally ignore the source of requests so as to avoid any possible bias). Documenting the originator of a use case is introducing bias? In what universe? If anything, documenting where the use cases come from, and providing access to the original, raw data helps to ensure that bias has not been introduced. More importantly, it gives your teammates a chance to verify your interpretation of the use cases, and provide correction, if needed. However, I can probably figure out some of the sources of a particular scenario if you have a specific one in mind. Could you clarify which scenario or requirement you are particularly interested in? Ian, I think its important that you provide a place documenting the original raw data. This provides a historical perspective on the decisions going into HTML5 if nothing else. If you need help, I'm willing to help you. You'll need to forward me the emails you received, and send me links to the other locations. 
I'll then put all these into a document and we can work to map to your condensed document. That way there's accountability at all steps in the decision process, as well as transparency. Once I put the document together, we can put it with other documents that also provide history of the decision processes.

>> In addition, from my reading of this posting of yours titled [whatwg] Getting data out of poorly written Web pages, is this open for any discussion?
>
> Naturally, all input is always welcome.

No, I didn't ask if input was welcome. I asked if this was still open for discussion, or if you have made up your mind, and further discussion will just be wasting everyone's time.

>> It seems to me that you received the original data, generated a use case document from the data, unilaterally, and now you're making unilateral decisions as to whether the use case requires a change in HTML5 or not. Is this what we can expect from all of the use cases?
>
> Yes.

That's not appropriate for a team environment.

> If my proposals don't actually address the use cases, then please do point out how that is the case. Similarly, if there are missing use cases, please bring them up. All input is always welcome (whether on the lists, or direct e-mail, on blogs, or wherever). None of the text in the HTML5 spec is frozen, it's merely a proposal. If there are use cases that should be addressed that are not addressed then we should address them.

Again, how can I? I don't have the original data.

> (Regarding microdata note that I've so far only sent proposals for three of the 20 use cases that I collected. I've still got a lot to go through.)

After digging, I found another one, at http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-May/019620.html

Again, though, the writing style indicates the item is closed, and discussion is not welcome. I have to assume that this is how you mentally perceive the item, and therefore though we may respond, the response will make no difference.
And I can't find the third one. Perhaps you can provide a direct link.

I'm concerned, too, about the fact that the discussion for these is happening on the WHATWG list, but not on the HTML WG email list. I've never understood having two different email lists, and have felt that having both is confusing, and potentially misleading. Regardless, shouldn't this discussion be taking place in the HTML WG, too? Isn't the specification the W3C HTML5 specification, also?

I'm just concerned because, from what I can see of both groups, interests and concerns differ between them. That means addressing issues in only one group would leave out potentially important discussions in the other group.

Shelley