On Sun, 29 Nov 2009 14:28:05 +0100, Philip Jägenstedt <phil...@opera.com> wrote:

Now, back to the problem of one property, multiple items. The algorithm for finding the properties of an item [2] is an attempt at optimizing the search for properties starting at an item element. I think we should replace this algorithm with an algorithm for finding the item of a property. This is how the spec worked before the itemref mechanism was introduced. I would suggest something along these lines:

1. let current be the element with the itemprop attribute
2. if current has an ID, for each element e in document order:
2.1. if e has an itemref attribute:
2.1.1. split the value of that itemref attribute on spaces. for each resulting token, ID:
2.1.1.1. if ID equals the ID of current, return e
3. reaching this step indicates that the item wasn't found via itemref on this element
4. let parent be the parent element of current
5. if parent is null, return null
6. if parent has the itemscope attribute, return parent
7. otherwise, let current be parent and jump to step 2.
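
The steps above can be sketched roughly as follows. This is only an illustration: it uses plain objects as a stand-in for DOM elements so that it runs outside a browser, and el, documentOrder and findItemOfProperty are names made up for this sketch, not anything from the spec or MicrodataJS.

```javascript
// Hypothetical stand-in for an element: attributes, a parent link and children.
function el(attrs, children) {
  var node = { attrs: attrs || {}, parent: null, children: children || [] };
  node.children.forEach(function (c) { c.parent = node; });
  return node;
}

// All elements under root, in document (pre-order) order.
function documentOrder(root) {
  var out = [root];
  root.children.forEach(function (c) { out = out.concat(documentOrder(c)); });
  return out;
}

// Literal transcription of steps 1-7 (no self-reference check).
function findItemOfProperty(root, prop) {
  var all = documentOrder(root);
  var current = prop;                                   // step 1
  while (true) {
    var id = current.attrs.id;
    if (id) {                                           // step 2
      for (var i = 0; i < all.length; i++) {
        var ref = all[i].attrs.itemref;
        if (ref === undefined) continue;                // step 2.1
        var tokens = ref.split(/\s+/).filter(Boolean);  // step 2.1.1
        if (tokens.indexOf(id) !== -1) return all[i];   // step 2.1.1.1
      }
    }
    var parent = current.parent;                        // step 4
    if (parent === null) return null;                   // step 5
    if ('itemscope' in parent.attrs) return parent;     // step 6
    current = parent;                                   // step 7
  }
}

// The property is nested inside one item but itemref'ed by another.
var nameProp = el({ id: 'n', itemprop: 'name' });
var nestedItem = el({ itemscope: '' }, [nameProp]);
var referringItem = el({ itemscope: '', itemref: 'n' });
var doc = el({}, [referringItem, nestedItem]);
var owner = findItemOfProperty(doc, nameProp); // the referring item, not the parent
```

In this sketch the itemref match is found before the parent is ever examined, which is what gives itemref'ing precedence over parent-child linking.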

This algorithm will find the parent item of a property, if there is one. itemref'ing takes precedence over "parent-child linking", so in Tim's example the properties of Shanghai would be applied to only the Shanghai sub-item. I'm not convinced writing markup like that is a good idea, but at least this way it has sane processing. HTMLPropertiesCollection on any given element would simply match all elements in the document for which the algorithm returns that very element. It should be invalid for there to be any elements in the document with itemprop where this algorithm returns null or the element itself.

I will try implementing this algorithm in MicrodataJS [3] and see if it works OK. While it may look less efficient than the current algorithm, consider that a browser won't implement either algorithm as written, only act as if it did. The expensive step of going through all elements with itemref attributes is actually no more expensive than e.g. document.querySelector('.classname') if implemented natively.

[1] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-November/024095.html
[2] http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#the-properties-of-an-item
[3] http://gitorious.org/microdatajs


With an added check to ignore self-referencing itemrefs, my algorithm seems to work. The only test cases I have where the result (as seen through HTMLPropertiesCollection) isn't the same are one similar to Tim's "1 property 2 items" and one involving self-reference. Incidentally, the cases which caused the JSON and vCard extraction algorithms to recurse infinitely now terminate with sane results.

A consequence of this change is that when two elements add the same property by itemref, only one will get it (the first in document order). This means that it isn't possible to share properties between items, which is precisely the point: it avoids loops. If there is a use case that requires property sharing, this will need some more tinkering. I'm inclined to say that when such sharing is wanted, one should add a level of indirection, e.g. with an ID. This way the microdata model is kept strictly tree-like.
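
To illustrate the "first in document order wins" rule in isolation, here is a small sketch. The input shape (an array of { name, itemref } objects already in document order) and the function name ownersByItemref are assumptions made for this example only.

```javascript
// referrers: hypothetical list of items in document order, each { name, itemref }.
// Returns a map from referenced ID to the name of the item that ends up owning it.
function ownersByItemref(referrers) {
  var owner = Object.create(null); // no inherited keys, so any ID is safe as a key
  referrers.forEach(function (item) {
    item.itemref.split(/\s+/).filter(Boolean).forEach(function (id) {
      // Only the first referrer in document order gets the property.
      if (!(id in owner)) owner[id] = item.name;
    });
  });
  return owner;
}

var owners = ownersByItemref([
  { name: 'first', itemref: 'shared a' },
  { name: 'second', itemref: 'shared b' }
]);
// owners.shared is 'first'; 'second' only gets 'b'.
```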

To make the limitations clear to authors, an element with itemprop for which the algorithm returns null should be invalid. For elements with itemref, it should be invalid for any of the referenced elements to either not exist or to have another item as their "owner". In short, itemref'ing must be consistent.
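
A conformance checker could report inconsistent itemrefs along these lines. This is a sketch under stated assumptions, not a proposed spec algorithm: the input shape, the function name checkItemrefs and the message wording are all invented here, and only conflicts between itemref'ing items are considered.

```javascript
// referrers: hypothetical list of items in document order, each { name, itemref }.
// existingIds: IDs that actually exist in the document.
// Returns a list of human-readable error strings; an empty list means consistent.
function checkItemrefs(referrers, existingIds) {
  var errors = [];
  var claimedBy = Object.create(null);
  referrers.forEach(function (r) {
    r.itemref.split(/\s+/).filter(Boolean).forEach(function (id) {
      if (existingIds.indexOf(id) === -1) {
        // The referenced element does not exist.
        errors.push(r.name + ': "' + id + '" does not exist');
      } else if (id in claimedBy) {
        // An earlier item already owns the referenced element.
        errors.push(r.name + ': "' + id + '" already belongs to ' + claimedBy[id]);
      } else {
        claimedBy[id] = r.name;
      }
    });
  });
  return errors;
}

var problems = checkItemrefs(
  [{ name: 'a', itemref: 'p q' }, { name: 'b', itemref: 'p missing' }],
  ['p', 'q']
);
// problems has two entries: one conflicting owner and one missing target.
```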

For the curious, from the (not so optimized) JavaScript implementation:

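function getCorrespondingItem(node) {
  var current = node;
  while (current) {
    if (current.id) {
      // Find the first element (in document order) whose itemref lists this ID.
      // Quote the ID so that values that are not valid CSS identifiers
      // (e.g. starting with a digit) still form a valid selector.
      var referrer = document.querySelector('[itemref~="' + current.id + '"]');
      // Ignore self-referencing itemrefs.
      if (referrer && referrer !== node)
        return referrer;
    }
    // No itemref match: fall back to parent-child linking.
    current = current.parentNode;
    if (current && current.itemScope)
      return current;
  }
  return null;
}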

--
Philip Jägenstedt
