Ian Hickson wrote:
On Tue, 7 Dec 2010, Nathan wrote:
I've used dce: and dct:, since now the example has both.
A general comment: microdata appears to be incredibly verbose for authors
when using multiple vocabularies to describe things. The example at
http://dev.w3.org/html5/md/#examples is almost painful to read, let alone
write.
Is there no way to reduce the repetition of long URIs for properties and
types as illustrated by the Turtle equivalent in the referred to example?
Does HTML or Microdata cater for this in any way?
When we did the usability studies for this we found that in practice (and
much to my surprise) the verbosity had no impact on the usability of the
language, so we didn't do anything to reduce it.
I'd love to see those results, any chance of a link to them?
I blogged about it here at the time:
http://blog.whatwg.org/usability-testing-html5
For privacy reasons I'm not able to make the actual raw videos available,
but if you have any specific questions then I can try to answer them. In
general I would encourage people to try to reproduce these results as that
is the best way to check them.
I'm glad to see you did some usability testing, although a little
surprised at the number of people and lack of variety in the tests.
However, I'm here looking towards the future and genuinely concerned
about data-in-html...
Furthermore, in practice, most use cases for microdata don't involve
multiple vocabularies but a single vocabulary explicitly named using
itemtype="", for which the vocabulary's short names are used.
If I understand correctly, that's because microformats constrain
vocabularies to only describing a single type of thing, and this has
spilled over into microdata, thus constraining descriptions of things
to only use a single vocabulary.
No, I'm talking about use cases here, not syntax. When designing
microdata, I collected a long list of use cases, for which it was
subsequently designed. The vast majority of those use cases only involve
one vocabulary at a time.
It may be that microdata is not designed for the same use cases that you
are interested in, in which case it would make sense that you would have a
different point of view on this.
Great, and hopefully my point of view and use-cases for using open, well
defined vocabularies, such as Dublin Core and the various vocabs on
w3.org, will be just as valid as your own and those previously tested?
Also, as far as I can tell in your initial usability tests, it was never
assessed whether using some form of URI compaction made Microdata more
usable, so it would probably be wise to consider that too, especially
since millions already use it in countless other web-centric technologies.
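
As a rough illustration of what "URI compacting" means here, the sketch below shows prefix-based expansion and compaction of the kind Turtle and RDFa use. The `PREFIXES` table and the `expand` and `compact` helpers are hypothetical names chosen for this example, and the `dce:` and `dct:` bindings are assumed to be the usual Dublin Core namespaces. Microdata itself defines no such mechanism.

```python
# Hypothetical prefix table. The bindings are an assumption (the common
# Dublin Core namespaces), not something taken from the spec example.
PREFIXES = {
    "dce": "http://purl.org/dc/elements/1.1/",
    "dct": "http://purl.org/dc/terms/",
}


def expand(name, prefixes=PREFIXES):
    """Expand 'prefix:local' to a full URI; leave anything else unchanged."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return name


def compact(uri, prefixes=PREFIXES):
    """Shorten a full URI to 'prefix:local' when a known namespace matches."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace) and len(uri) > len(namespace):
            return prefix + ":" + uri[len(namespace):]
    return uri
```

With a table like this an author would write `dct:title` once per property instead of repeating the full namespace URI each time.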
Furthermore, I'm quite concerned that:
- Vocabularies are encouraged not to be dereferenceable, as opposed to
being encouraged to dereference to a vocab which is both human and
machine readable (for instance published with microdata annotations).
- The process for creating URI identifiers for microformat properties
is so complex ( uri + "microdata#" + urlencode(itemtype + "#:" +
property) ) that this process is hidden in specs and not well known, and
that the description of those properties is only available in the spec,
in plain text, and has to be hard coded. For example:
http://www.whatwg.org/specs/web-apps/current-work/#licensing-works
- There's no clear path between microdata and full linked data
annotations in, say, RDFa. Indeed, microdata uses entirely different
properties in an entirely different way; if anything it should be a
subset of RDFa, or RDFa a superset of it. There should be a single unified
story on how to publish machine readable data in HTML.
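
For illustration, the construction paraphrased in the second point above can be sketched as follows. This implements the formula exactly as written in this email, not the normative algorithm in the spec. The function name `property_uri` is hypothetical, and the base is left as a parameter because the email only calls it "uri".

```python
from urllib.parse import quote


def property_uri(base, itemtype, prop):
    """Build base + "microdata#" + urlencode(itemtype + "#:" + prop).

    This follows the formula as paraphrased in the email; the spec text
    may differ in detail.
    """
    # safe="" percent-encodes every reserved character, including "/",
    # ":" and "#", so the result carries only one literal "#".
    return base + "microdata#" + quote(itemtype + "#:" + prop, safe="")
```

Even for a short type URI and a one-word property the result is a long, percent-encoded string that nobody would want to write by hand, which is the concern raised above.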
I'm sure that there are countless people, including myself, who would be
more than happy to look at the use cases and design requirements for
microdata, and come up with a proposal that addressed all of these
concerns, such that microdata+microformats complemented linkeddata+rdfa.
I feel it's very important to take the lessons learned within the
general web development community, and the semantic web community, and
apply them to data in HTML in order to best serve all potential
audiences. Rather than competing, or one precluding the other, they should
complement each other, whilst recognising that there are different
use-cases and audiences, and also that audiences will need to transition
between both depending on the use case, changing requirements, and levels
of understanding over time.
Best,
Nathan