Tim van Oostrom wrote:
Philip Jägenstedt wrote:
On Sun, 29 Nov 2009 12:46:16 +0100, Tim van Oostrom <t...@depulz.nl>
wrote:
Philip Jägenstedt wrote:
On Thu, 26 Nov 2009 22:30:41 +0100, Tim van Oostrom <t...@depulz.nl>
wrote:
Hi, I made a forum post:
http://forums.whatwg.org/viewtopic.php?t=4176, concerning a
possible "microdata specification bug" and a bug in the
james.html5.org microdata extractor.
It comes down to <link/> and <meta/> elements possibly being unfit
for use with the itemscope attribute.
I made an example in the forum post with some nice UBB formatting.
There are some other issues with <link> and <meta> you might want
to review first: [1]
Ok
Your second example was:
<div itemtype="http://url.to/geoVocab#country" itemscope>
<span itemprop="http://xmlns.com/foaf/spec/index.rdf#name"
lang="zh">中華人民共和國</span>
<span itemprop="http://xmlns.com/foaf/spec/index.rdf#name"
lang="en">China</span>
<link itemprop="http://url.to/city"
href="http://url.to/shanghai" itemscope itemref="city-shanghai" />
<div id="city-shanghai">
<span
itemprop="http://xmlns.com/foaf/spec/index.rdf#name">Shanghai</span>
<span itemprop="http://url.to/demoVocab#population">14.61
million people</span>
<span itemprop="http://url.to/physicsVocab#time"
datetime="2009-11-26 11:43">11:43 pm (CT)</span>
</div>
</div>
<link>, <meta> and any other void elements are usually the wrong
choice for itemprop+itemscope because they don't have child
elements, so itemref is the only way to add properties.
Yes, see the forum post. Shouldn't this be noted in the spec then?
Yes, the spec certainly needs some notes on how to use <link> and
<meta>.
And other void elements such as: area, base, br, col, command, embed,
hr, img, input, link, meta, param, source
(http://dev.w3.org/html5/markup/syntax.html)
Basically, microdata can't really be used on all elements as stated in:
HTML5 spec, 5.2.2 Items
According to this an "itemref" attribute can never be added to an
"item" within an itemscope of another "item" without the crawled
prop/val pairs also applying to the ancestors itemscope.
Ah, I think you've found the root of the problem. By allowing a
property to be part of several items at once, we get different kinds
of strange problems. Apart from messing up your example, it seems to
be the real cause of the infinite recursion bug I wrote about in
[1]. At the time I was so focused on the recursion that I suggested a
rather complex solution to detect loops in the microdata, when it seems
it could be solved simply by making sure that a property belongs to only
one item. Detailed suggestion below.
Now, back to the problem of one property, multiple items. The
algorithm for finding the properties of an item [2] is an attempt at
optimizing the search for properties starting at an item element. I
think we should replace this algorithm with an algorithm for finding
the item of a property. This was previously the case with the spec
before the itemref mechanism. I would suggest something along these
lines:
1. let current be the element with the itemprop attribute
2. if current has an ID, for each element e in document order:
2.1. if e has an itemref attribute:
2.1.1. split the value of that itemref attribute on spaces. for each
resulting token, ID:
2.1.1.1. if ID equals the ID of current, return e
3. reaching this step indicates that the item wasn't found via
itemref on this element
4. let parent be the parent element of current
5. if parent is null, return null
6. if parent has the itemscope attribute, return parent
7. otherwise, let current be parent and jump to step 2.
This algorithm will find the parent item of a property, if there is
one. itemref'ing takes precedence over "parent-child linking", so in
Tim's example the properties of Shanghai would be applied to only the
Shanghai sub-item. I'm not convinced writing markup like that is a
good idea, but at least this way it has sane processing.
Which is important in the markup-souped web of non-linked-data :-)
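The numbered steps above can be sketched in JavaScript roughly as
follows. This is only an illustration under stated assumptions: elements
are modelled as plain objects built by a hypothetical el() helper rather
than real DOM nodes, the names findItem and inDocumentOrder are made up
for this sketch, itemref is split on any whitespace, and the itemprop
names are shortened. It is not the MicrodataJS code.

```javascript
// Plain-object stand-in for a DOM element (hypothetical helper, not a DOM API).
function el(tag, attrs = {}, children = []) {
  const node = { tag, attrs, children, parent: null };
  for (const child of children) child.parent = node;
  return node;
}

// Pre-order traversal, which corresponds to document order.
function* inDocumentOrder(root) {
  yield root;
  for (const child of root.children) yield* inDocumentOrder(child);
}

// Find the item that owns `property`, following the steps in the message.
function findItem(property, root) {
  let current = property;                              // step 1
  while (true) {
    const id = current.attrs.id;
    if (id) {                                          // step 2
      for (const e of inDocumentOrder(root)) {
        const itemref = e.attrs.itemref;
        if (itemref !== undefined && itemref.split(/\s+/).includes(id)) {
          return e;                                    // step 2.1.1.1
        }
      }
    }
    const parent = current.parent;                     // step 4
    if (parent === null) return null;                  // step 5
    if ('itemscope' in parent.attrs) return parent;    // step 6
    current = parent;                                  // step 7
  }
}

// A tree shaped like Tim's example.
const shanghaiName = el('span', { itemprop: 'name' });
const population = el('span', { itemprop: 'population' });
const cityBox = el('div', { id: 'city-shanghai' }, [shanghaiName, population]);
const city = el('link', { itemprop: 'city', itemscope: '', itemref: 'city-shanghai' });
const countryName = el('span', { itemprop: 'name' });
const country = el('div', { itemscope: '' }, [countryName, city, cityBox]);

console.log(findItem(shanghaiName, country) === city);   // true
console.log(findItem(city, country) === country);        // true
```

In this sketch the Shanghai spans resolve to the <link> item through
itemref, while the <link> itself resolves to the country item through
its parent, so no property ends up belonging to two items.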
HTMLPropertiesCollection on any given element would simply match all
elements in the document for which the algorithm returns that
very element. It should be invalid for there to be any elements in
the document with itemprop where this algorithm returns null or the
element itself.
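A rough sketch of how the properties of an item and the proposed
validity rule could be derived from such an item-finding function. The
names propertiesOf, invalidProperties, allElements and el are
hypothetical, elements are plain objects rather than DOM nodes, and the
item-finding steps are repeated in compact form only so that the block
runs on its own.

```javascript
// Plain-object stand-in for a DOM element (hypothetical helper, not a DOM API).
function el(tag, attrs = {}, children = []) {
  const node = { tag, attrs, children, parent: null };
  for (const child of children) child.parent = node;
  return node;
}

// All nodes under root, in document (pre-order) order.
function allElements(root) {
  return [root, ...root.children.flatMap(allElements)];
}

// Compact form of the item-finding steps: itemref first, then the parent.
function findItem(property, all) {
  for (let current = property; current !== null; current = current.parent) {
    const id = current.attrs.id;
    const referrer = id &&
      all.find(e => (e.attrs.itemref || '').split(/\s+/).includes(id));
    if (referrer) return referrer;
    if (current.parent && 'itemscope' in current.parent.attrs) return current.parent;
  }
  return null;
}

// Properties of an item: every itemprop element whose item is that element.
function propertiesOf(item, root) {
  const all = allElements(root);
  return all.filter(e => 'itemprop' in e.attrs && findItem(e, all) === item);
}

// itemprop elements whose item is null or the element itself.
function invalidProperties(root) {
  const all = allElements(root);
  return all.filter(e => {
    if (!('itemprop' in e.attrs)) return false;
    const item = findItem(e, all);
    return item === null || item === e;
  });
}

// Tim's example plus two deliberately broken properties.
const shanghaiName = el('span', { itemprop: 'name' });
const population = el('span', { itemprop: 'population' });
const cityBox = el('div', { id: 'city-shanghai' }, [shanghaiName, population]);
const city = el('link', { itemprop: 'city', itemscope: '', itemref: 'city-shanghai' });
const countryName = el('span', { itemprop: 'name' });
const country = el('div', { itemscope: '' }, [countryName, city, cityBox]);
const orphan = el('span', { itemprop: 'stray' });                      // no item at all
const selfRef = el('span', { id: 'loop', itemprop: 'x', itemref: 'loop' }); // refers to itself
const root = el('body', {}, [country, orphan, selfRef]);

console.log(propertiesOf(country, root).length);   // 2
console.log(propertiesOf(city, root).length);      // 2
console.log(invalidProperties(root).length);       // 2
```

Scanning the whole document for each lookup is the naive form; as noted
in the message, a native implementation only has to behave as if it did
this.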
I will try implementing this algorithm in MicrodataJS [3] and see if
it works OK. While it may look less efficient than the current
algorithm, consider that a browser won't implement either algorithm
as written, only act as if it did. The expensive step of going
through all elements with itemref attributes is actually no more
expensive than e.g. document.querySelector('.classname') if
implemented natively.
I did something like this in my experimental/unfinished/test/learn
microdata extractor based on jQuery, which is here:
http://www.depulz.nl/microdata/ (works at least in Firefox 3.5 and Opera 10.10).
[1] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-November/024095.html
[2] http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#the-properties-of-an-item
[3] http://gitorious.org/microdatajs