Re: [whatwg] microdata questions

2014-04-01 Thread Ian Hickson
On Mon, 10 Feb 2014, Eric Devine wrote:
> 
> 1. Section 5.5.1 of the Microdata spec prescribes how microdata should 
> be represented as JSON, but it does not provide a MIME type. I'm writing a 
> REST API that I would like to be able to return JSON in microdata 
> format, but I need the client to explicitly request this via the HTTP 
> Accept header. The main concern is to know when to return plain 
> properties as an array with one element.

As a general rule I would recommend against using Accept headers to do 
anything. You're better off making the JSON data its own resource, IMHO.

Having said that, as you noted in a later e-mail, the MIME type suggested 
by the HTML spec is "application/microdata+json".

   http://whatwg.org/html#application/microdata+json
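
On the question of when a plain property becomes a one-element array: in the microdata JSON shape, every property value is an array, whether it has one value or many. A minimal sketch, assuming a hypothetical helper named toMicrodataItem and made-up example data (neither comes from the spec or from this thread):

```javascript
// Hypothetical helper (not from the spec or this thread): builds one item in
// the microdata JSON shape from a plain record.
function toMicrodataItem(type, record) {
  const properties = {};
  for (const [name, value] of Object.entries(record)) {
    // Every property maps to an array, even when there is only one value.
    properties[name] = Array.isArray(value) ? value : [value];
  }
  return { type: [type], properties };
}

// The top-level object carries the items under a single "items" key.
function toMicrodataJson(items) {
  return { items };
}

// Example data is invented for illustration.
const body = toMicrodataJson([
  toMicrodataItem("http://example.org/person", {
    name: "Foo Barsson",
    tel: ["555-0100", "555-0101"],
  }),
]);

console.log(JSON.stringify(body, null, 2));
```

A client that always receives arrays never has to branch on "string or array", which is presumably why the format does not special-case single values.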


> 2. Section 5.2.4 does not provide a way to apply a property value to the 
> value attribute of an  element. Is this an oversight, or is 
> there simply not a convincing enough use case for the need?

There's not any way currently to make form controls map to microdata. It's 
not clear exactly what it would mean.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] microdata questions

2014-02-10 Thread Eric Devine
I found the answer to my first question "application/microdata+json" from
W3C, but I would still appreciate feedback on my second question below.

Thanks,
Eric


On Mon, Feb 10, 2014 at 11:16 AM, Eric Devine  wrote:

> 1. Section 5.5.1 of the Microdata spec prescribes how microdata should be
> represented as JSON, but it does not provide a MIME type. I'm writing a REST
> API that I would like to be able to return JSON in microdata format, but I
> need the client to explicitly request this via the HTTP Accept header. The
> main concern is to know when to return plain properties as an array with
> one element.
>
> 2. Section 5.2.4 does not provide a way to apply a property value to the
> value attribute of an  element. Is this an oversight, or is there
> simply not a convincing enough use case for the need?
>
> Thanks for any feedback,
> Eric Devine
>


[whatwg] Microdata feedback

2013-08-06 Thread Ian Hickson
On Wed, 13 Feb 2013, Ed Summers wrote:
> 
> I am looking for some guidance about the use of multiple itemtypes in 
> microdata [1], specifically the phrase "defined to use the same 
> vocabulary" in:
> 
> """
> The item types must all be types defined in applicable specifications
> and must all be defined to use the same vocabulary.
> """
> 
> For example, does this mean that I can't say:
> 
>   itemtype="http://acme.com/Foo http://zenith.com/Bar"
> 

It depends on what http://acme.com/Foo and http://zenith.com/Bar are. If 
they use the same vocabulary, then you can do it. If they're separate 
vocabularies, then no.


> The reason I ask is that there is some desire over in the schema.org 
> community [2] to provide a mechanism for schema.org to be specialized. 
> For example, in the case of an audiobook:
> 
>   itemtype="http://schema.org/Book
>             http://www.productontology.org/id/Audiobook"
> 
> The idea being not to overload schema.org with more vocabulary, and to 
> let vocabularies grow a bit more organically.

If they're the same vocabulary -- that is, the properties on this .../Book 
vocabulary and this .../Audiobook vocabulary don't clash -- properties 
mean the same thing in both -- then it's fine.


> This schema.org group is currently thinking of using a one off property 
> additionalType that would be used like so:
> 
>   itemtype="http://schema.org/Book"
>     itemprop="additionalType" href="http://www.productontology.org/id/Audiobook"
>     ...
> 
> 
> I personally find this to be kind of distasteful since it replicates the 
> mechanics that microdata's itemtype already offers.

It's essentially equivalent, yes.


> So, my question: is it the case that itemtype cannot reference types in 
> different vocabularies like the example above? If so, I'm curious to 
> know what the rationale was, and if perhaps it could be relaxed.

If they're different vocabularies (i.e. the same terms are used to mean 
different things), then you wouldn't know which was meant, so it would be 
ambiguous. There's an open bug about this topic with an open question:

   https://www.w3.org/Bugs/Public/show_bug.cgi?id=13527


On Thu, 14 Feb 2013, Ed Summers wrote:
> 
> In John's email [1] he proposed limiting multiple types to being from 
> the same origin domain, not the same vocabulary as is stated in the 
> Microdata spec. It sounds like an obvious question, but is there a 
> precise definition of what is meant by "same vocabulary"? Or is it just 
> a hand wavy way of talking about what humans understand when putting the 
> itemtype URLs in their browsers, reading, and understanding that they 
> are types that are part of some larger coherent whole?

"Vocabulary" means the set of properties that are defined. There's some 
non-normative text in the HTML spec that talks about this:

# The type gives the context for the properties, thus selecting a
# vocabulary: a property named "class" given for an item with the type
# "http://census.example/person" might refer to the economic class of
# an individual, while a property named "class" given for an item with
# the type "http://example.com/school/teacher" might refer to the
# classroom a teacher has been assigned. Several types can share a
# vocabulary. For example, the types
# "http://example.org/people/teacher" and
# "http://example.org/people/engineer" could be defined to use the
# same vocabulary (though maybe some properties would not be
# especially useful in both cases, e.g. maybe the
# "http://example.org/people/engineer" type might not typically be
# used with the "classroom" property). Multiple types defined to use
# the same vocabulary can be given for a single item by listing the
# URLs as a space-separated list in the attribute's value. An item
# cannot be given two types if they do not use the same vocabulary,
# however.


On Tue, 19 Feb 2013, Judson Lester wrote:
>
> There was an email from last year suggesting that the values of input 
> elements be derived from their value attributes - the purpose there 
> being to be able to control the form via the microdata interface.  I've 
> only been able to read it in the archives - the brief exchange was 
> between Igor Nikolev and Ian Hickson, who was curious about use cases.
> 
> Conversely, it would be useful to be able to use input elements to 
> contain item values, and at the moment, since their values would be 
> derived from their textContent, they're useless for that.  
> Specifically, it's often reasonable to present a representation as the 
> default values in a form and allow for updates simply by posting the 
> changed values.  It seems unwieldy to need to replicate that information 
> in e.g. data elements.
> 
> While it would be simple to treat the defaultValue as the item property 
> value for elements (and for radio inputs, let the representation mark 
> the selected input as the itemprop), it seems counter to the spirit of 
> the proposal.  The alternative would be to do something like excluding 
> unsuccessful input elements during the proper

Re: [whatwg] Microdata status

2013-05-30 Thread Karl Dubost

On 30 May 2013 at 12:39, Michael[tm] Smith wrote:
> Alex or somebody else writes up an alternative API proposal they can be
> happier with, it seems unlikely they're going to be re-implementing
> anything based on the current Microdata API spec.


In the process, if it ever happens, I would love to see something more or less 
common in between RDFaLite, data-* and microdata. When I explored [1] different 
ways of expressing the same information, the JS code to access the data is 
quite different and makes it not very user friendly in the end.

[1]: http://dev.opera.com/articles/view/geolocation-html-api/

-- 
Karl Dubost
http://www.la-grange.net/karl/



Re: [whatwg] Microdata status

2013-05-29 Thread Ojan Vafai
On Wed, May 29, 2013 at 9:39 PM, Michael[tm] Smith  wrote:

> +Ojan, +Alex
>
> Jirka Kosek , 2013-05-14 17:22 +0200:
>
> > Hi,
> >
> > are there any plans to change Microdata API? From the following
> > conversation between Chromium developers it's not clear to me whether
> > they consider API itself bad or only their implementation.
> >
> >
> https://groups.google.com/a/chromium.org/forum/m/#!topic/blink-dev/b54nW_mGSVU
> >
> > Any insight welcomed.
>
> Not claiming to speak for anybody on the Chrome/Blink team but as far as
> that conversation among the Chromium developers, looking at it from the
> outside at least, my read is that they consider the current API spec to be
> bad -- not just their implementation.
>
> That said, it doesn't seem like anybody in the discussion other than Ojan
> mentioned anything bad in particular about the API spec. Ojan's comment:
>
>   "I have one concern with the feature as specced is that getItems and the
>   various Collection returning properties/methods all return live
>   NodeLists/Collections. [...] Live NodeLists/Collections impose a large
>   cost on the rest of the codebase and fundamentally make regular DOM
>   operations slower."
>

This concern could be addressed without much of a change to the current API
by returning static NodeLists and/or Collections. Hixie, consider this
feedback on the API. :) We're very unlikely to implement any new APIs that
return live NodeLists/Collections.

Whether addressing that would be enough that we'd want to ship Microdata
is unclear to me.
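
The difference between live and static collections can be illustrated without the DOM. The following is only a toy model, assuming a plain array stands in for the document; it is not the Microdata API and the names are invented for the example:

```javascript
// Toy stand-in for a document: a list of item names that changes over time.
const store = ["item-a", "item-b"];

// "Live" view: every read goes back to the store, so later changes show up.
const live = {
  get length() {
    return store.length;
  },
  item(index) {
    return store[index];
  },
};

// "Static" view: a snapshot taken once; later changes are not reflected.
const snapshot = Object.freeze([...store]);

// Mutate the underlying store after both views were created.
store.push("item-c");

console.log(live.length);     // reflects the mutation
console.log(snapshot.length); // still the size at snapshot time
```

In a real engine the cost of the live variant comes from having to keep such views consistent across every DOM mutation, which is the concern raised above.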

Then there's a general comment from Alex:
>
>   "The current micro data API is...poor. I think we should write it off and
>   try again. No opinions in what that means for our impl in the meantime,
>   though (other than it shouldn't ship, of course). I'm happy to put work
>   into a better API if someone will collaborate on impl."
>
> So anyway, it looks like the gist from the overall discussion is: They've
> completely removed the Microdata API implementation from Blink, and unless
> Alex or somebody else writes up an alternative API proposal they can be
> happier with, it seems unlikely they're going to be re-implementing
> anything based on the current Microdata API spec.
>
>   --Mike
>
> --
> Michael[tm] Smith http://people.w3.org/mike
>


Re: [whatwg] Microdata status

2013-05-29 Thread Michael[tm] Smith
+Ojan, +Alex

Jirka Kosek , 2013-05-14 17:22 +0200:

> Hi,
> 
> are there any plans to change Microdata API? From the following
> conversation between Chromium developers it's not clear to me whether
> they consider API itself bad or only their implementation.
> 
> https://groups.google.com/a/chromium.org/forum/m/#!topic/blink-dev/b54nW_mGSVU
> 
> Any insight welcomed.

Not claiming to speak for anybody on the Chrome/Blink team but as far as
that conversation among the Chromium developers, looking at it from the
outside at least, my read is that they consider the current API spec to be
bad -- not just their implementation.

That said, it doesn't seem like anybody in the discussion other than Ojan
mentioned anything bad in particular about the API spec. Ojan's comment:

  "I have one concern with the feature as specced is that getItems and the
  various Collection returning properties/methods all return live
  NodeLists/Collections. [...] Live NodeLists/Collections impose a large
  cost on the rest of the codebase and fundamentally make regular DOM
  operations slower."

Then there's a general comment from Alex:

  "The current micro data API is...poor. I think we should write it off and
  try again. No opinions in what that means for our impl in the meantime,
  though (other than it shouldn't ship, of course). I'm happy to put work
  into a better API if someone will collaborate on impl."

So anyway, it looks like the gist from the overall discussion is: They've
completely removed the Microdata API implementation from Blink, and unless
Alex or somebody else writes up an alternative API proposal they can be
happier with, it seems unlikely they're going to be re-implementing
anything based on the current Microdata API spec.

  --Mike

-- 
Michael[tm] Smith http://people.w3.org/mike


Re: [whatwg] Microdata status

2013-05-28 Thread Ian Hickson
On Tue, 14 May 2013, Jirka Kosek wrote:
> 
> are there any plans to change Microdata API? From the following 
> conversation between Chromium developers it's not clear to me whether 
> they consider API itself bad or only their implementation.
> 
> https://groups.google.com/a/chromium.org/forum/m/#!topic/blink-dev/b54nW_mGSVU
> 
> Any insight welcomed.

I don't think there's any pending feedback on the API in the spec, so 
there are no current plans to change it.



[whatwg] Microdata status

2013-05-14 Thread Jirka Kosek
Hi,

are there any plans to change Microdata API? From the following
conversation between Chromium developers it's not clear to me whether
they consider API itself bad or only their implementation.

https://groups.google.com/a/chromium.org/forum/m/#!topic/blink-dev/b54nW_mGSVU

Any insight welcomed.

Jirka

-- 
--
  Jirka Kosek  e-mail: ji...@kosek.cz  http://xmlguru.cz
--
   Professional XML consulting and training services
  DocBook customization, custom XSLT/XSL-FO document processing
--
 OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 rep.
--
Bringing you XML Prague conference    http://xmlprague.cz
--


[whatwg] microdata in stone spec vs the living std RDFa

2012-12-25 Thread Vipul S. Chawathe
XHTML5 microdata itemprop needs to clarify how experimental REST IRI is
changed to release IRI. To relate with living standard unlike W3 model,
within CURIE, if prefix lic is used for http://localhost/license, the prefix
could be redefined using one central profile page, like RDFa does define
common vocab in initialization context. Online hints are tied to 6
candidates who write Hello World web-page for which using hard-coded URIs
that's simple and not just feasible, but if real web-pages would be only
about Hello World photocopies, Google would be into anything but search.
However, as computer professional who does self-help for occasional tasks
such as working on provenance, using persistent URL within debug pages is
ugly. Let's complete the reinvention of wheel (or if preferred living
standard)-reinvention to credit Google inventing repackaged somebody's RDFa
into plagiarist's microdata & rip-off the CURIEs concept as well, to make it
realistically advanced rather than aiming at newbies who are at most wannabe
script kiddies, silly, common sense lacking use-case. After all RDFa is
"living", and microdata is the kind of stuff WHATWG blames W3 for, so to
beat the evil, WHATWG wants to be the worse amongst evils.



Re: [whatwg] Microdata feedback

2011-12-09 Thread Philip Jägenstedt

On Thu, 08 Dec 2011 22:04:41 +0100, Ian Hickson  wrote:


> I changed the spec as you suggest.


Thanks!

--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] microdata: itemprop in tag

2011-12-08 Thread Ian Hickson
On Sun, 16 Oct 2011, David Karger wrote:
>
> One natural way to represent a collection of structured items is in an 
> html table.  this can coexist with microdata, by using  
> and  tags.  But by ignoring the structure of the table, 
> this creates a lot of redundant attribute specification.
> 
> It would yield cleaner markup if it were possible to use  itemprop="foo"> to indicate an item property that should be inherited by 
> all cells in the given column.  In other words, to assert that any  
> associated with a  should inherit the itemprop associated with that 
>  .
> 
> It would yield even cleaner markup if there were a way to indicate that 
> every  was a distinct itemscope (the common case).  For example, to 
> use  to indicate that each row of the table scopes 
> an item of type bar.  Or perhaps  could be interpreted 
> as asserting a distinct itemscope for each row without specifying a 
> type.
> 
> But even using just the  inheritance rule, while still placing 
> itemscope in  tags, would save a quadratic quantity of markup.

Yeah, microdata doesn't handle tables well.

I'm a little reluctant to add magic to handle tables, because it can make 
it quite hard to work out what's going on, and it's not clear how common 
the problem really is. If it turns out to be a common issue, then it's 
something we should definitely consider, though.


On Sun, 16 Oct 2011, Tab Atkins Jr. wrote:
> 
> Just put an @itemref on each , pointing to the s that are part 
> of that column.  It's more verbose, but it doesn't rely on special 
> HTML-only rules.

That's a possible workaround for now, true.



Re: [whatwg] Microdata - Handling the case where a string is upgraded to an object

2011-12-08 Thread Ian Hickson
On Thu, 14 Jul 2011, Tab Atkins Jr. wrote:
>
> It seems that this may be a useful problem to solve in Microdata.  We 
> can expose either an attribute or a privileged property name for the 
> object's "name"/"title"/"string representation".  Then, when using the 
> .items accessor, objects can be returned with a custom .toString that 
> returns that value, so they can be used as strings in legacy code.

So "complex" properties would need to state the data in two forms, or pick 
one of the subproperties and anoint it as being the special fallback?


On Mon, 18 Jul 2011, Philip Jägenstedt wrote:
> 
> I take it the problem is with code like this:
> 
> Foo
> Barsson
> 
> var p = document.getItems("person")[0];
> alert(p.properties.namedItem("name")[0].itemValue);
> 
> 
> If the HTML changes to
> 
>  itemprop="givenName">Foo  itemprop="familyName">Barsson
> 
> then the script would be alerting "[object HTMLElement]" instead of "Foo 
> Barsson".

Indeed. It's not clear to me what else we would return, especially 
considering itemref="".


On Mon, 18 Jul 2011, Tab Atkins Jr. wrote:
> 
> Yeah.  I suspect this kind of API change is relatively common, and it's 
> the sort of thing that would *always* be painful.

In some of the sample vocabularies, there are properties that can either 
take a string or a structured item as a value. In the latter cases, 
there's no trivial way to provide a string alternative.


> > As for the solution, are you suggesting that .itemValue return a 
> > special object which is like HTMLElement in all regards except for how 
> > it toString()s?
> 
> Yes.

Some HTMLElement objects already have a custom toString().
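
The breakage and the proposed toString() fallback can be shown with plain objects. This is a sketch only: the objects below stand in for whatever .itemValue would return, and the greet function is a hypothetical legacy consumer, not code from this thread.

```javascript
// Hypothetical legacy consumer, written when "name" was a plain string.
function greet(nameValue) {
  return "Hello, " + nameValue;
}

// Original vocabulary: the value is a string.
const asString = "Foo Barsson";

// Upgraded vocabulary: the value is a structured object with no fallback.
const upgraded = { givenName: "Foo", familyName: "Barsson" };

// Proposed idea: the structured value carries a custom toString().
const withFallback = {
  givenName: "Foo",
  familyName: "Barsson",
  toString() {
    return `${this.givenName} ${this.familyName}`;
  },
};

console.log(greet(asString));     // "Hello, Foo Barsson"
console.log(greet(upgraded));     // "Hello, [object Object]"
console.log(greet(withFallback)); // "Hello, Foo Barsson"
```

The fallback keeps old string-based code working, at the price of hiding the upgrade from it.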


On Tue, 19 Jul 2011, Philip Jägenstedt wrote:
> 
> Currently, it's spec'd as returning the element itself. This isn't 
> terribly useful, at least I've just checked e.itemScope and then 
> accessed e.properties directly rather than going through 
> e.itemValue.properties.

Yeah, it's mostly just so that people can take the itemValue into a local 
variable, and then manipulate it without having to worry about what type 
it is until later.


> Given this, a simpler fix would be to let .itemValue act like 
> .textContent when an itemscope attribute is present.

.textContent doesn't necessarily have anything to do with the modelled 
data. I'm not sure that really makes sense.


> Still, I'm not sure if it's a good idea. It makes the Microdata model 
> kind of odd if a property is both an item and has a fallback text 
> representation. It will also mask the fact that a text property has been 
> upgraded to an item, somewhat decreasing the chance that the consuming 
> code will be updated.

Yeah. And authors would have to make sure the textContent is usable as 
fallback, which isn't at all a given, IMHO.


Re: [whatwg] Microdata feedback

2011-12-08 Thread Ian Hickson
On Sat, 9 Jul 2011, Philip Jägenstedt wrote:
> On Sat, 09 Jul 2011 01:19:02 +0200, Ian Hickson  wrote:
> > On Sat, 9 Jul 2011, Philip Jägenstedt wrote:
> > > 
> > > Step 11 is "If current has an itemprop attribute specified, add it 
> > > to results." but should be "If current has one or more property 
> > > names, add it to results." Property names are defined in 
> > > http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#property-names
> > > 
> > > Why? If you start with , then 
> > > div.itemProp.remove("foo") would give you . It'd be 
> > > weird if the element still showed up in the properties collection 
> > > after removing the only property name.
> > 
> > The .properties attribute "must return an HTMLPropertiesCollection 
> > rooted at the Document node, whose filter matches only elements that 
> > have property names", which further filters the results of the 
> > algorithm. Similarly, everything that uses the algorithm here does 
> > things "for each property name", so if itemprop="" doesn't have any 
> > tokens, nothing happens and it doesn't matter that the algorithm 
> > returns it.
> 
> Ah, I see my misunderstanding.
> 
> Purely editorial: It would, IMO, be more clear if that check were in the 
> algorithm itself. That's the way it's going to be (has been) implemented 
> since there's no reason to do the filtering as a separate step. Do as 
> you wish.

I changed the spec as you suggest. I agree that it's cleaner. I checked 
and I don't think it'll have any negative side-effects, though it does 
change the precise number of conformance errors in some invalid documents 
(not a truly practical concern since conformance checkers are only 
required to report zero errors if there are none and at least one error if 
there are any).
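
The check under discussion (an element only counts if its itemprop attribute yields at least one property name) might be sketched as follows. The function names are hypothetical, and splitting on ASCII whitespace with duplicates dropped is an assumption about how property names are derived, not a quotation of the spec algorithm:

```javascript
// Returns the distinct tokens of an itemprop attribute value.
// A missing attribute (null/undefined) yields no property names.
function propertyNames(itemprop) {
  if (itemprop == null) {
    return [];
  }
  const tokens = itemprop.split(/[ \t\n\f\r]+/).filter((t) => t !== "");
  return [...new Set(tokens)];
}

// An element is a candidate for the results only if it has property names.
function hasPropertyNames(itemprop) {
  return propertyNames(itemprop).length > 0;
}

console.log(propertyNames("foo  bar foo")); // [ 'foo', 'bar' ]
console.log(hasPropertyNames(""));          // false
console.log(hasPropertyNames(null));        // false
```

With the check inside the crawl itself, an element left with itemprop="" after its only name is removed simply never enters the results.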


Re: [whatwg] microdata: itemprop in tag

2011-10-16 Thread Tab Atkins Jr.
On Sun, Oct 16, 2011 at 7:47 PM, David Karger  wrote:
> One natural way to represent a collection of structured items is in an html
> table.  this can coexist with microdata, by using  and  itemprop> tags.  But by ignoring the structure of the table, this creates a
> lot of redundant attribute specification.
>
> It would yield cleaner markup if it were possible to use  itemprop="foo"> to indicate an item property that should be inherited by all
> cells in the given column.  In other words, to assert that any 
> associated with a  should inherit the itemprop associated with that
>  .

Just put an @itemref on each , pointing to the s that are
part of that column.  It's more verbose, but it doesn't rely on
special HTML-only rules.

> It would yield even cleaner markup if there were a way to indicate that
> every  was a distinct itemscope (the common case).  For example, to use
>  to indicate that each row of the table scopes an item
> of type bar.    Or perhaps  could be interpreted as
> asserting a distinct itemscope for each row without specifying a type.

I'm not sure I understand.  Are you trying to mark up one item per
row, and just trying to save putting an @itemscope attribute on the
rows?  That's a fairly insignificant amount of savings for the
confusion it can cause (because now the itemscopes aren't obvious).

~TJ


[whatwg] microdata: itemprop in tag

2011-10-16 Thread David Karger
One natural way to represent a collection of structured items is in an 
html table.  this can coexist with microdata, by using  
and  tags.  But by ignoring the structure of the table, 
this creates a lot of redundant attribute specification.


It would yield cleaner markup if it were possible to use itemprop="foo"> to indicate an item property that should be inherited by 
all cells in the given column.  In other words, to assert that any  
associated with a  should inherit the itemprop associated with that 
 .


It would yield even cleaner markup if there were a way to indicate that 
every  was a distinct itemscope (the common case).  For example, to 
use  to indicate that each row of the table scopes 
an item of type bar.Or perhaps  could be 
interpreted as asserting a distinct itemscope for each row without 
specifying a type.


But even using just the  inheritance rule, while still placing 
itemscope in  tags, would save a quadratic quantity of markup.


Re: [whatwg] Microdata getItems()

2011-08-10 Thread Rob Crowther

On 09/08/11 20:48, Ian Hickson wrote:
> On Tue, 9 Aug 2011, Rob Crowther wrote:
>
> Correct. Browsers aren't expected to know about the vocabularies, let
> alone validate them.

Thanks.  I think this could be made more clear in the spec.

> > However if I remove itemscope from the element
> > the Opera beta implementation still returns it as a top level microdata
> > item even though it is now invalid.  Is this expected behaviour?
>
> No.

Looks like this was me doing something stupid, Opera is indeed only 
returning the items with both itemscope and itemtype.


Rob


Re: [whatwg] Microdata getItems()

2011-08-09 Thread Ian Hickson
On Tue, 9 Aug 2011, Rob Crowther wrote:
>
> I just want to confirm that my understanding of this is correct: 
> getItems() will return a NodeList of top level microdata items and this 
> is irrespective of whether or not the items are actually valid in terms 
> of their type?  That is, it is the developer's responsibility to confirm 
> that the vCard has an fn and an n before further processing?

Correct. Browsers aren't expected to know about the vocabularies, let 
alone validate them.
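
Since validation is left to the consumer, a check of the kind described (does the vCard have an fn and an n?) could look like the sketch below. It assumes the item has already been converted to a plain object whose "properties" member maps each name to an array of values; the function name and sample data are hypothetical:

```javascript
// Returns the required property names that the item does not provide.
function missingProperties(item, required) {
  const props = item.properties || {};
  return required.filter(
    (name) => !Array.isArray(props[name]) || props[name].length === 0
  );
}

// Sample item invented for illustration: it has an fn but no n.
const card = {
  type: ["http://microformats.org/profile/hcard"],
  properties: { fn: ["Foo Barsson"] },
};

const missing = missingProperties(card, ["fn", "n"]);
if (missing.length > 0) {
  console.log("Skipping incomplete vCard, missing: " + missing.join(", "));
}
```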


> One further question - if an itemtype attribute is present there must 
> also be an itemscope.  However if I remove itemscope from the element 
> the Opera beta implementation still returns it as a top level microdata 
> item even though it is now invalid.  Is this expected behaviour?

No.



[whatwg] Microdata getItems()

2011-08-09 Thread Rob Crowther
I just want to confirm that my understanding of this is correct: 
getItems() will return a NodeList of top level microdata items and this 
is irrespective of whether or not the items are actually valid in terms 
of their type?  That is, it is the developer's responsibility to confirm 
that the vCard has an fn and an n before further processing?


It makes sense to me because I don't expect the browser to be 
downloading random vocabularies off the internet to check these things, 
but it doesn't seem to be explicitly referenced in the spec.  There is a 
section on dereferencing which says that the browser can 
dereference the URL to provide item specific processing, but only if the 
applicable specification allows it.


One further question - if an itemtype attribute is present there must 
also be an itemscope.  However if I remove itemscope from the element 
the Opera beta implementation still returns it as a top level microdata 
item even though it is now invalid.  Is this expected behaviour?


Rob


Re: [whatwg] Microdata - Handling the case where a string is upgraded to an object

2011-07-19 Thread Philip Jägenstedt
On Mon, 18 Jul 2011 22:01:37 +0200, Tab Atkins Jr. wrote:

> On Mon, Jul 18, 2011 at 4:20 AM, Philip Jägenstedt wrote:
>
> > As for the solution, are you suggesting that .itemValue return a special
> > object which is like HTMLElement in all regards except for how it
> > toString()s?
>
> Yes.


Currently, it's spec'd as returning the element itself. This isn't  
terribly useful, at least I've just checked e.itemScope and then accessed  
e.properties directly rather than going through e.itemValue.properties.  
Given this, a simpler fix would be to let .itemValue act like .textContent  
when an itemscope attribute is present.


Still, I'm not sure if it's a good idea. It makes the Microdata model kind  
of odd if a property is both an item and has a fallback text  
representation. It will also mask the fact that a text property has been  
upgraded to an item, somewhat decreasing the chance that the consuming  
code will be updated.


--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] Microdata - Handling the case where a string is upgraded to an object

2011-07-18 Thread Tab Atkins Jr.
On Mon, Jul 18, 2011 at 4:20 AM, Philip Jägenstedt  wrote:
> There is no items IDL attribute, do you mean getItems() or .itemValue
> perhaps?

Yes, sorry.


> I take it the problem is with code like this:
>
> Foo
> Barsson
> 
> var p = document.getItems("person")[0];
> alert(p.properties.namedItem("name")[0].itemValue);
> 
>
> If the HTML changes to
>
>  itemprop="givenName">Foo  itemprop="familyName">Barsson
>
> then the script would be alerting "[object HTMLElement]" instead of "Foo
> Barsson".
>
> I'm not sure why this would be a problem. If someone changes the page, then
> can't they adjust the script to match?

That only works if the page is using its own Microdata, not if someone
else is consuming the Microdata.

> Is it extensions and libraries that
> you're worried about?

Yeah.  I suspect this kind of API change is relatively common, and
it's the sort of thing that would *always* be painful.

> As for the solution, are you suggesting that .itemValue return a special
> object which is like HTMLElement in all regards except for how it
> toString()s?

Yes.

~TJ


Re: [whatwg] Microdata - Handling the case where a string is upgraded to an object

2011-07-18 Thread Philip Jägenstedt
On Thu, 14 Jul 2011 20:49:44 +0200, Tab Atkins Jr. wrote:

> Some IRC discussion this morning concerned the scenario where an API
> starts by exposing a property as a string, but later wants to change
> it to be a complex object.
>
> This appears to be a reasonably common scenario.  For example, a
> vocabulary with a "name" property may start with it being a string,
> and then later change to an object exposing "firstname"/"lastname"/etc
> properties.  A vocabulary for a music library may start by having
> "track" as a string, then later expanding it to expose the track
> title, the individual artist, the running time, etc.
>
> In a very similar vein, the CSSOM is currently defined to always
> return property values as strings.  We want to instead return complex
> objects that expose useful information and interfaces specialized on
> the value's type, however.  For compat reasons, we have to use an
> entirely different accessor in order to expose this type of thing.
>
> It seems that this may be a useful problem to solve in Microdata.  We
> can expose either an attribute or a privileged property name for the
> object's "name"/"title"/"string representation".  Then, when using the
> .items accessor, objects can be returned with a custom .toString that
> returns that value, so they can be used as strings in legacy code.
>
> Thoughts?


There is no items IDL attribute, do you mean getItems() or .itemValue  
perhaps?


I take it the problem is with code like this:

Foo  
Barsson


var p = document.getItems("person")[0];
alert(p.properties.namedItem("name")[0].itemValue);


If the HTML changes to

itemprop="givenName">Foo itemprop="familyName">Barsson


then the script would be alerting "[object HTMLElement]" instead of "Foo  
Barsson".


I'm not sure why this would be a problem. If someone changes the page,  
then can't they adjust the script to match? Is it extensions and libraries  
that you're worried about?


As for the solution, are you suggesting that .itemValue return a special  
object which is like HTMLElement in all regards except for how it  
toString()s?


--
Philip Jägenstedt
Core Developer
Opera Software


[whatwg] Microdata - Handling the case where a string is upgraded to an object

2011-07-14 Thread Tab Atkins Jr.
Some IRC discussion this morning concerned the scenario where an API
starts by exposing a property as a string, but later wants to change
it to be a complex object.

This appears to be a reasonably common scenario.  For example, a
vocabulary with a "name" property may start with it being a string,
and then later change to an object exposing "firstname"/"lastname"/etc
properties.  A vocabulary for a music library may start by having
"track" as a string, then later expanding it to expose the track
title, the individual artist, the running time, etc.

In a very similar vein, the CSSOM is currently defined to always
return property values as strings.  We want to instead return complex
objects that expose useful information and interfaces specialized on
the value's type, however.  For compat reasons, we have to use an
entirely different accessor in order to expose this type of thing.

It seems that this may be a useful problem to solve in Microdata.  We
can expose either an attribute or a privileged property name for the
object's "name"/"title"/"string representation".  Then, when using the
.items accessor, objects can be returned with a custom .toString that
returns that value, so they can be used as strings in legacy code.
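A minimal JavaScript sketch of the idea, using plain objects rather than any real Microdata API. The helper name makeUpgradedValue and the sample property names are hypothetical, chosen only for illustration:

```javascript
// Hypothetical helper: wraps an object-valued property so that legacy code
// expecting a string still gets something readable.
function makeUpgradedValue(properties, stringRepresentation) {
  const value = Object.assign({}, properties);
  // Non-enumerable, so it does not show up when iterating the properties.
  Object.defineProperty(value, "toString", {
    value: () => stringRepresentation,
    enumerable: false,
  });
  return value;
}

const name = makeUpgradedValue(
  { firstname: "Foo", lastname: "Barsson" },
  "Foo Barsson"
);

// Legacy code that treats the value as a string keeps working...
const legacyGreeting = "Hello, " + name;
// ...while newer code can read the structured parts.
const newGreeting = "Hello, " + name.firstname;
```

Note that this only covers string coercion. Legacy code that compares the value with === or calls string methods on it directly would still break, so it is a partial compatibility story at best.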

Thoughts?

~TJ


Re: [whatwg] Microdata feedback

2011-07-12 Thread Ian Hickson
On Tue, 12 Jul 2011, Henri Sivonen wrote:
> On Thu, 2011-07-07 at 22:33 +, Ian Hickson wrote:
> > The JSON algorithm now ends the crawl when it hits a loop, and 
> > replaces the offending duplicate item with the string "ERROR".
> > 
> > The RDF algorithm preserves the loops, since doing so is possible with 
> > RDF. Turns out the algorithm almost did this already, looks like it 
> > was an oversight.
> 
> It seems to me that this approach creates an incentive for people who 
> want to do RDFesque things to publish deliberately non-conforming 
> microdata content that works the way they want for RDF-based consumers 
> but breaks for non-RDF consumers. If such content abounds and non-RDF 
> consumers are forced to support loopiness but extending the JSON 
> conversion algorithm in ad hoc ways, part of the benefit of microdata 
> over RDFa (treeness) is destroyed and the benefit of being well-defined 
> would be destroyed, too, for non-RDF consumption cases.

The "problem" here is that RDF and microdata have different data models, 
and RDF cannot represent microdata's data model with fidelity.

For example, consider how this converts to RDF and compare it to the 
microdata equivalent:

   http://example.com/"; itemid="http://example.com/1";>
x
   
   http://example.com/"; itemid="http://example.com/1";>
x
   

There are other things RDF can't represent easily, e.g. it cannot easily 
represent the order of the values in this item:

   http://example.com/";>
1
2
   

As such, I suggest we not worry about the itemref="" loop case, or that we 
try to fix all these cases together (not sure how we'd fix them).

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Microdata feedback

2011-07-12 Thread Philip Jägenstedt

On Tue, 12 Jul 2011 09:41:18 +0200, Henri Sivonen  wrote:


On Thu, 2011-07-07 at 22:33 +, Ian Hickson wrote:

The JSON algorithm now ends the crawl when it hits a loop, and replaces
the offending duplicate item with the string "ERROR".

The RDF algorithm preserves the loops, since doing so is possible with
RDF. Turns out the algorithm almost did this already, looks like it was an
oversight.


It seems to me that this approach creates an incentive for people who
want to do RDFesque things to publish deliberately non-conforming
microdata content that works the way they want for RDF-based consumers
but breaks for non-RDF consumers. If such content abounds and non-RDF
consumers are forced to support loopiness by extending the JSON
conversion algorithm in ad hoc ways, part of the benefit of microdata
over RDFa (treeness) is destroyed and the benefit of being well-defined
would be destroyed, too, for non-RDF consumption cases.


I don't have a strong opinion, but note that even before this change the  
algorithm produced a non-tree for the "Avenue Q" example [1] where the  
"adr" property is shared between two items using itemref. (In JSON, it is  
flattened.) If we want to ensure that RDF consumers don't depend on  
non-treeness, then this should change as well.


[1]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#examples-4


--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] Microdata feedback

2011-07-12 Thread Henri Sivonen
On Thu, 2011-07-07 at 22:33 +, Ian Hickson wrote:
> The JSON algorithm now ends the crawl when it hits a loop, and replaces 
> the offending duplicate item with the string "ERROR".
> 
> The RDF algorithm preserves the loops, since doing so is possible with 
> RDF. Turns out the algorithm almost did this already, looks like it was an 
> oversight.

It seems to me that this approach creates an incentive for people who
want to do RDFesque things to publish deliberately non-conforming
microdata content that works the way they want for RDF-based consumers
but breaks for non-RDF consumers. If such content abounds and non-RDF
consumers are forced to support loopiness by extending the JSON
conversion algorithm in ad hoc ways, part of the benefit of microdata
over RDFa (treeness) is destroyed and the benefit of being well-defined
would be destroyed, too, for non-RDF consumption cases.
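
For reference, the JSON loop handling quoted above can be sketched roughly as follows. This is an illustration under assumptions, not the spec's algorithm: items are modeled as plain objects of the form { properties: { name: [values] } }, and the function name itemToJson is made up.

```javascript
// Convert an item to a JSON-friendly tree. If an item is reached again
// while it is still being converted (a loop), the duplicate is replaced
// with the string "ERROR" and the crawl does not go any deeper.
function itemToJson(item, path = []) {
  if (path.includes(item)) return "ERROR";
  const nextPath = path.concat(item);
  const properties = {};
  for (const [name, values] of Object.entries(item.properties)) {
    properties[name] = values.map((v) =>
      typeof v === "string" ? v : itemToJson(v, nextPath)
    );
  }
  return { properties };
}

// "John knows Amy, Amy knows John"
const john = { properties: { name: ["John"], knows: [] } };
const amy = { properties: { name: ["Amy"], knows: [john] } };
john.properties.knows.push(amy);

const johnJson = JSON.stringify(itemToJson(john));
```

Because the check is made against the current path only, an item that is merely shared by two parents (no loop) is still serialized in both places.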

-- 
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/



Re: [whatwg] Microdata feedback

2011-07-08 Thread Philip Jägenstedt

On Sat, 09 Jul 2011 01:19:02 +0200, Ian Hickson  wrote:


On Sat, 9 Jul 2011, Philip Jägenstedt wrote:


Step 11 is "If current has an itemprop attribute specified, add it to
results." but should be "If current has one or more property names, add
it to results." Property names are defined in
http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#property-names

Why? If you start with , then
div.itemProp.remove("foo") would give you . It'd be
weird if the element still showed up in the properties collection after
removing the only property name.


The .properties attribute "must return an HTMLPropertiesCollection rooted
at the Document node, whose filter matches only elements that have
property names", which further filters the results of the algorithm.
Similarly, everything that uses the algorithm here does things "for each
property name", so if itemprop="" doesn't have any tokens, nothing happens
and it doesn't matter that the algorithm returns it.


Ah, I see my misunderstanding.

Purely editorial: It would, IMO, be more clear if that check were in the  
algorithm itself. That's the way it's going to be (has been) implemented  
since there's no reason to do the filtering as a separate step. Do as you  
wish.


--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] Microdata feedback

2011-07-08 Thread Ian Hickson
On Sat, 9 Jul 2011, Philip Jägenstedt wrote:
> 
> Step 11 is "If current has an itemprop attribute specified, add it to 
> results." but should be "If current has one or more property names, add 
> it to results." Property names are defined in 
> http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#property-names
> 
> Why? If you start with , then 
> div.itemProp.remove("foo") would give you . It'd be 
> weird if the element still showed up in the properties collection after 
> removing the only property name.

The .properties attribute "must return an HTMLPropertiesCollection rooted 
at the Document node, whose filter matches only elements that have 
property names", which further filters the results of the algorithm. 
Similarly, everything that uses the algorithm here does things "for each 
property name", so if itemprop="" doesn't have any tokens, nothing happens 
and it doesn't matter that the algorithm returns it.
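
A small sketch of the "has property names" filter described above. Elements are modeled as plain objects and the function names are hypothetical; only the behavior (split on space characters, ignore an itemprop with no tokens) comes from the discussion.

```javascript
// Split an itemprop attribute value on HTML space characters and drop
// duplicates. An absent attribute is modeled as null.
function propertyNames(itemprop) {
  if (itemprop === null) return [];
  const tokens = itemprop.split(/[ \t\n\f\r]+/).filter((t) => t !== "");
  return [...new Set(tokens)];
}

// Keep only the candidate elements that have at least one property name.
function filterProperties(candidates) {
  return candidates.filter((el) => propertyNames(el.itemprop).length > 0);
}

const candidates = [
  { id: "one", itemprop: "foo bar" },
  { id: "two", itemprop: "" },      // attribute present, but no tokens
  { id: "three", itemprop: " \n" }, // whitespace only
  { id: "four", itemprop: null },   // attribute absent
];
const kept = filterProperties(candidates).map((el) => el.id);
```

Whether the check lives inside the crawl algorithm or in the collection's filter, the observable result is the same: only the first element ends up in the collection.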

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] Microdata feedback

2011-07-08 Thread Philip Jägenstedt

On Fri, 08 Jul 2011 21:31:49 +0200, Ian Hickson  wrote:


On Fri, 8 Jul 2011, Philip Jägenstedt wrote:

On Fri, 08 Jul 2011 00:33:14 +0200, Ian Hickson  wrote:
> On Wed, 8 Jun 2011, Tomasz Jamroszczak wrote:
> >
> > I've been looking into Microdata specification and it struck me,
> > that crawling algorithm is so complex, when it comes to expressing
> > simple ideas.  I think that foremost the algorithm should be
> > described in the specification with explanation what it's supposed
> > to do, before steps of what exactly is to be done are written.
>
> Yeah. Turns out the algorithms involved here are quite badly broken.
>
> It was intended to expose the microdata graph as completely as
> possible while dropping anything that would introduce a loop, at the
> point where the first repetition would start (so A->B->C=>A would
> break at the =), in the API, in the JSON, and in the conformance
> rules. I didn't do a good job speccing that, though!
>
> I've fixed the algorithms to make sense (I hope).

http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#the-properties-of-an-item

I had a look at this to verify that it is black-box-equivalent to what
Opera has implemented, and only discovered one issue:

 should not be added to the .properties collection,
because it has no properties. My bad for suggesting that the criteria
should be the presence of an itemprop attribute, it should be an
itemprop attribute containing at least one token. Can you update the
spec to match?


What needs updating? As far as I can tell, what you describe is what the
spec requires.


Step 11 is "If current has an itemprop attribute specified, add it to  
results." but should be "If current has one or more property names, add it  
to results." Property names are defined in  
http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#property-names


Why? If you start with , then  
div.itemProp.remove("foo") would give you . It'd be weird  
if the element still showed up in the properties collection after removing  
the only property name.





> On Wed, 29 Jun 2011, Philip Jägenstedt wrote:
> >
> > Indeed, multiple types doesn't work at all if you want to mix
> > different types. I was assuming that the use case was to extend
> > types, kind of like http://schema.org/Person/Governor. However, it
> > doesn't work all that well even in that case, since there's no way
> > to know which type is the extension of the other and which
> > properties exist only on the extended type.
>
> I don't really understand this use case. Can you elaborate on the
> problem that needs solving here?

It's whatever problem  is trying
to solve, which is something like "allow people to geek out with more
specific vocabularies without interfering with search results".


That doesn't seem to be a problem. I don't really understand what problem
this is solving.


Neither do I.

If the problem is just "I want to annotate data that isn't defined in this
vocabulary", that's already possible using URL property names.



If I were schema.org, I would just encourage people to do this:

http://schema.org/Person">
 
   Arnold
   http://example.com/Governor" itemref="wrapper">

 California
   
 



That's a bit weird. Why not just:

 http://schema.org/Person">
  Arnold
  http://example.com/Governor/state">California
 


Yeah, that's better, at least when the number of additional attributes is  
small.



It's hard to know without knowing what concrete user problem we're trying
to solve here.


I'll leave this discussion to the schema.org sponsors and just hope that  
the method in  doesn't catch on.


--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] Microdata feedback

2011-07-08 Thread Ian Hickson
On Fri, 8 Jul 2011, Philip Jägenstedt wrote:
> On Fri, 08 Jul 2011 00:33:14 +0200, Ian Hickson  wrote:
> > On Wed, 8 Jun 2011, Tomasz Jamroszczak wrote:
> > > 
> > > I've been looking into Microdata specification and it struck me, 
> > > that crawling algorithm is so complex, when it comes to expressing 
> > > simple ideas.  I think that foremost the algorithm should be 
> > > described in the specification with explanation what it's supposed 
> > > to do, before steps of what exactly is to be done are written.
> > 
> > Yeah. Turns out the algorithms involved here are quite badly broken.
> > 
> > It was intended to expose the microdata graph as completely as 
> > possible while dropping anything that would introduce a loop, at the 
> > point where the first repetition would start (so A->B->C=>A would 
> > break at the =), in the API, in the JSON, and in the conformance 
> > rules. I didn't do a good job speccing that, though!
> > 
> > I've fixed the algorithms to make sense (I hope).
> 
> http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#the-properties-of-an-item
> 
> I had a look at this to verify that it is black-box-equivalent to what 
> Opera has implemented, and only discovered one issue:
> 
>  should not be added to the .properties collection, 
> because it has no properties. My bad for suggesting that the criteria 
> should be the presence of an itemprop attribute, it should be an 
> itemprop attribute containing at least one token. Can you update the 
> spec to match?

What needs updating? As far as I can tell, what you describe is what the 
spec requires.


> > The RDF algorithm preserves the loops, since doing so is possible with 
> > RDF. Turns out the algorithm almost did this already, looks like it 
> > was an oversight.
> 
> WFM, but note step 3: "Add a mapping from the item item to the subject 
> subject in memory, if there isn't one already." Step 1 guarantees that 
> there is no entry for item, so step 3 can be unconditional.

Good point. Fixed.


> > On Wed, 29 Jun 2011, Philip Jägenstedt wrote:
> > > 
> > > Indeed, multiple types doesn't work at all if you want to mix 
> > > different types. I was assuming that the use case was to extend 
> > > types, kind of like http://schema.org/Person/Governor. However, it 
> > > doesn't work all that well even in that case, since there's no way 
> > > to know which type is the extension of the other and which 
> > > properties exist only on the extended type.
> > 
> > I don't really understand this use case. Can you elaborate on the 
> > problem that needs solving here?
> 
> It's whatever problem  is trying 
> to solve, which is something like "allow people to geek out with more 
> specific vocabularies without interfering with search results".

That doesn't seem to be a problem. I don't really understand what problem 
this is solving.

If the problem is just "I want to annotate data that isn't defined in this 
vocabulary", that's already possible using URL property names.


> If I were schema.org, I would just encourage people to do this:
> 
> http://schema.org/Person";>
>  
>Arnold
>http://example.com/Governor"; itemref="wrapper">
>  California
>
>  
> 

That's a bit weird. Why not just:

 http://schema.org/Person">
  Arnold
  http://example.com/Governor/state">California
 

It's hard to know without knowing what concrete user problem we're trying 
to solve here.
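
To illustrate the URL-property-name approach from a consumer's point of view, here is a rough sketch. The item shape and the "name" property are assumptions (the markup above lost its tags in the archive), and isUrlPropertyName is a made-up helper that relies on the standard URL constructor rejecting strings that are not absolute URLs.

```javascript
// A property name that parses as an absolute URL is treated as an
// extension; anything else is assumed to be defined by the item's type.
function isUrlPropertyName(name) {
  try {
    new URL(name); // throws TypeError when name is not an absolute URL
    return true;
  } catch (e) {
    return false;
  }
}

// Assumed shape of the extracted item, for illustration only.
const person = {
  type: "http://schema.org/Person",
  properties: {
    name: ["Arnold"],
    "http://example.com/Governor/state": ["California"],
  },
};

const allNames = Object.keys(person.properties);
const extensionNames = allNames.filter(isUrlPropertyName);
const vocabularyNames = allNames.filter((n) => !isUrlPropertyName(n));
```

A consumer that only knows the base vocabulary can skip extensionNames without having to understand a second item type.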

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] Microdata feedback

2011-07-08 Thread Philip Jägenstedt

On Fri, 08 Jul 2011 00:33:14 +0200, Ian Hickson  wrote:


On Wed, 8 Jun 2011, Tomasz Jamroszczak wrote:


I've been looking into Microdata specification and it struck me, that
crawling algorithm is so complex, when it comes to expressing simple
ideas.  I think that foremost the algorithm should be described in the
specification with explanation what it's supposed to do, before steps of
what exactly is to be done are written.


Yeah. Turns out the algorithms involved here are quite badly broken.

It was intended to expose the microdata graph as completely as possible
while dropping anything that would introduce a loop, at the point where
the first repetition would start (so A->B->C=>A would break at the =),
in the API, in the JSON, and in the conformance rules. I didn't do a good
job speccing that, though!

I've fixed the algorithms to make sense (I hope).


http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#the-properties-of-an-item

I had a look at this to verify that it is black-box-equivalent to what  
Opera has implemented, and only discovered one issue:


 should not be added to the .properties collection,  
because it has no properties. My bad for suggesting that the criteria  
should be the presence of an itemprop attribute, it should be an itemprop  
attribute containing at least one token. Can you update the spec to match?


(I implemented the spec'd algorithm pedantically in  
  
for verification, it passes the unit tests with said modification.)





On Wed, 29 Jun 2011, Philip Jägenstedt wrote:


Note also that other algorithms defined in terms of items and their
properties need to handle loopiness in some way. That's currently RDF,
vCard and iCal conversion. Perhaps something like "loopy item" could be
defined and those algorithms could skip loopy items wherever they occur?
Simply failing is also an acceptable solution, IMO.


I fixed vCard with a patch that just outputs "AGENT;TYPE=VCARD:ERROR" in
the case of a loop. (Can only happen if the input is non-conforming, so it
doesn't matter if the output is non-conforming.)


WFM


The vEvent stuff was already loop-safe.

The JSON algorithm now ends the crawl when it hits a loop, and replaces
the offending duplicate item with the string "ERROR".


WFM


The RDF algorithm preserves the loops, since doing so is possible with
RDF. Turns out the algorithm almost did this already, looks like it was an
oversight.


WFM, but note step 3: "Add a mapping from the item item to the subject  
subject in memory, if there isn't one already." Step 1 guarantees that  
there is no entry for item, so step 3 can be unconditional.





On Wed, 29 Jun 2011, Philip Jägenstedt wrote:


Indeed, multiple types doesn't work at all if you want to mix different
types. I was assuming that the use case was to extend types, kind of
like http://schema.org/Person/Governor. However, it doesn't work all
that well even in that case, since there's no way to know which type is
the extension of the other and which properties exist only on the
extended type.


I don't really understand this use case. Can you elaborate on the problem
that needs solving here?


It's whatever problem  is trying to  
solve, which is something like "allow people to geek out with more  
specific vocabularies without interfering with search results". I whined a  
bit in  
,  
the short story being:


 * extensibility encoded with a microsyntax in the URL, making it  
not-so-opaque

 * such URLs make the DOM API less useful

Perhaps bending Microdata to accommodate for this is not the best idea. If  
I were schema.org, I would just encourage people to do this:


http://schema.org/Person">
  
Arnold
http://example.com/Governor" itemref="wrapper">

  California

  


Making extensions unsightly is probably a good thing, to discourage people  
from going too crazy with it. This way it's also clear which properties  
only apply to the extended type.


--
Philip Jägenstedt
Core Developer
Opera Software


[whatwg] Microdata feedback

2011-07-07 Thread Ian Hickson
On Wed, 8 Jun 2011, Tomasz Jamroszczak wrote:
> 
> I've been looking into Microdata specification and it struck me, that 
> crawling algorithm is so complex, when it comes to expressing simple 
> ideas.  I think that foremost the algorithm should be described in the 
> specification with explanation what it's supposed to do, before steps of 
> what exactly is to be done are written.

Yeah. Turns out the algorithms involved here are quite badly broken.

It was intended to expose the microdata graph as completely as possible 
while dropping anything that would introduce a loop, at the point where 
the first repetition would start (so A->B->C=>A would break at the =), 
in the API, in the JSON, and in the conformance rules. I didn't do a good 
job speccing that, though!

I've fixed the algorithms to make sense (I hope).


> Let's see, what are the properties of Microdata item from HTML element 
> with id=up from following HTML:
> 
> 
>   
> 

The element id=up has one property, prop1, whose value is an item on the 
element id=down. The element id=down has one property, prop0, whose value 
is the item on the element with id=up. If you crawl from id=up, my intent 
was to have the prop0 be dropped from the graph. If you crawl from 
id=down, my intent was to have prop1 be dropped from the graph. In 
addition, the document is intended to be non-conforming. If you serialise 
it for JSON, my intent was for the item on id=up to be the "top" one, and 
for it to have one property whose value is the item on id=down, which 
would itself have no values.

Note that the above would be non-conforming on its own because there are 
no top-level microdata items in the above snippet.
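
The intent described above (drop whatever would introduce a loop, at the point where the first repetition would start) can be sketched like this. It is only an illustration: items are plain objects, and the names crawl, up and down mirror the ids in the example rather than anything in the spec.

```javascript
// Copy the graph reachable from `item`, dropping any property value that
// would lead back to an item already on the current path.
function crawl(item, ancestors = []) {
  const path = ancestors.concat(item);
  const result = { id: item.id, properties: {} };
  for (const [name, values] of Object.entries(item.properties)) {
    const kept = [];
    for (const v of values) {
      if (typeof v === "string") {
        kept.push(v);
      } else if (!path.includes(v)) {
        kept.push(crawl(v, path));
      }
      // Otherwise the value would start a repetition, so it is dropped.
    }
    if (kept.length > 0) result.properties[name] = kept;
  }
  return result;
}

// id=up has prop1 -> id=down; id=down has prop0 -> id=up.
const up = { id: "up", properties: {} };
const down = { id: "down", properties: { prop0: [up] } };
up.properties.prop1 = [down];

const fromUp = JSON.stringify(crawl(up));
const fromDown = JSON.stringify(crawl(down));
```

Crawling from id=up yields one property (prop1) whose value is the id=down item with no properties of its own; crawling from id=down gives the mirror image.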


> I can imagine good usages of loops of Microdata items, for example "John 
> knows Amy, Amy knows John":
> 
> 
>   
> 
>
> 
>   
> 
> 
> There's loop:  jonh->amy1->john->... .

itemref="" doesn't reference items for property values. It just references 
an element to get a list of properties for an item.

The example above is non-conforming because itemref="" can only be 
specified on an itemscope="" element, itemprop="" is not valid without a 
value, and there are no top-level items.

The right way to do what you describe above is (provided the vocabulary 
is defined in a way that supports this):

 http://example.com/john" itemtype="...">
   http://example.com/fred1 http://example.com/jenny2 
http://example.com/amy1">
 

 http://example.com/amy1" itemtype="...">
   http://example.com/john">
 


> If the loop is to be excluded, and thus recursion, the same data could 
> be written as:
> 
> 
>   1
>   John
>   2
> 
>
> 
>   2
>   Amy
>   1
> .

That's another way to do it, yes.


> maybe with some  instead of  or more verbosely:
> 
> John knows  itemprop="http://xmlns.com/foaf/0.1/knows"; href="#amy">Amy.
>
> Amy knows  itemprop="http://xmlns.com/foaf/0.1/knows"; href="#john">John.

That works too.


> The problem I'm addressing revolves around meaning of link between 
> itemref and id attributes.  Is it meant to be a part of Microdata data 
> model?

No, it's just syntactic sugar to allow pages to use microdata without 
having to twist their markup into a pretzel to make it work.


> Or maybe it is introduced to cope with the fact that Microdata graph is 
> defined on top of existing data, which is something completely 
> different, and is meant to be rendered to the user (that is on top of 
> HTML tree)?

Right.


> So the meaning of itemref attribute should also hint interpretation of 
> it inside the specification.

Done.


On Fri, 10 Jun 2011, Philip Jägenstedt wrote:
> 
> I don't think the spec needs to be giving suggestions for efficient 
> implementation for live collections, because we inevitable won't 
> implement exactly that algorithm anyway.

The aim wasn't to give suggestions for efficient implementations. The aim 
was to give algorithms for which an efficient implementation existed, 
rather than requiring something nigh on impossible to implement 
efficiently. The aim wasn't reached, though, in that the algorithm in the 
spec was just completely bogus. Sorry about that.


On Tue, 28 Jun 2011, Tomasz Jamroszczak wrote:
> 
> For sure itemRef attribute of Microdata have to stay, because it makes 
> possible separation of data (the Microdata item properties, the 
> semantics) and view (where contents of those properties should be laid 
> out for browser user). Without itemRef, Microdata becomes "Picodata".

That may not be all bad. :-)

You know something is done not when there's nothing new to add, but when 
there's nothing left to remove.


> But then, what to do when translating Microdata to other format, such as 
> stringification to JSON in Drag'n'drop?  The JSON itself is quite 
> primitive when it comes to stringification loops - it just throws an 
> exception.  We thought we'll be more flexible.  We'll make 
> stringification "as best as possible", and cutting only the last 
> offending link of a cycle.  See 
> http://people.ope

[whatwg] Microdata property sharing with itemref

2011-06-14 Thread Philip Jägenstedt

A question came up in the Schema.org discussion group today:

http://groups.google.com/group/schemaorg-discussion/browse_thread/thread/69b733066ae7?pli=1

The question was how to fix http://www.2gc.co.uk/a2gc-people to link  
together properties that were in different parts of the document into a  
single item. The answer is of course to use itemref, here simplified even  
further to illustrate:


http://schema.org/Organization">
  2GC Active Management
  
itemtype="http://schema.org/Person" itemref="GL">
  

  Gavin Lawrie
  Founder & Managing Director


  

  

  Gavin Lawrie: Founder & Managing Director
  Gavin is ...


  



The ugly: . That itemscope is there only  
to prevent the description property of the Person from applying to the  
organization, and does so because the algorithm to crawl the properties of  
an item stops at itemscope. This is a silly hack, because it is not an  
item, and I don't expect many people would find this solution even if they  
knew about the problem.
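
To make the problem concrete, here is a simplified sketch of the crawl, with elements modeled as plain objects. It is not the spec algorithm (which also sorts results in tree order and reports errors), and the property names "employee" and "jobTitle" are guesses for illustration; only "description" is named in this mail.

```javascript
// Elements: { id, itemscope, itemprop, itemref: [ids], children: [...] }.

function findById(node, id) {
  if (node.id === id) return node;
  for (const child of node.children || []) {
    const hit = findById(child, id);
    if (hit) return hit;
  }
  return null;
}

// Collect the property elements of `item`: start from its children and
// from every element named in itemref, and never descend past an element
// that has itemscope.
function itemProperties(doc, item) {
  const results = [];
  const visited = new Set([item]);
  const pending = [...(item.children || [])];
  for (const id of item.itemref || []) {
    const target = findById(doc, id);
    if (target) pending.push(target);
  }
  while (pending.length > 0) {
    const current = pending.shift();
    if (visited.has(current)) continue;
    visited.add(current);
    if (current.itemprop) results.push(current);
    if (!current.itemscope) pending.unshift(...(current.children || []));
  }
  return results;
}

function buildDoc(withHack) {
  const bio = { id: "GL", children: [{ itemprop: "description" }] };
  const person = {
    itemprop: "employee",
    itemscope: true,
    itemref: ["GL"],
    children: [{ itemprop: "name" }, { itemprop: "jobTitle" }],
  };
  // The hack: a wrapper that carries itemscope only to stop the crawl.
  const wrapper = { itemscope: withHack, children: [bio] };
  const org = {
    itemscope: true,
    children: [{ itemprop: "name" }, person, wrapper],
  };
  return { doc: { children: [org] }, org, person };
}

const names = (els) => els.map((el) => el.itemprop);

const plain = buildDoc(false);
const hacked = buildDoc(true);
const orgWithoutHack = names(itemProperties(plain.doc, plain.org));
const orgWithHack = names(itemProperties(hacked.doc, hacked.org));
const personProps = names(itemProperties(hacked.doc, hacked.person));
```

Without the wrapper's itemscope, the organization picks up the person's description; with it, the organization's crawl stops at the wrapper while the person still reaches the description through itemref.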


Have others encountered this problem and how did you deal with it?

Should we have yet another property like "itemunscope" that stops the  
crawl algorithm but does not create a new item?


Could we tweak the validity definitions so that this kind of thing would  
cause validators to complain, or should we leave it completely to  
vocabulary-specific validators to spot this kind of thing? (They can't if  
they operate on the microdata level and not DOM level, which I think they  
should.)


Sorry for the lack of solutions in this mail, but I thought it was worth  
raising anyway.


(One idea of mine that was discussed some time ago was to only let each  
property belong to one item, letting items steal each others properties by  
use of itemref. Even though it would solve this particular problem, it is  
quite weird and was rightfully rejected.)


--
Philip Jägenstedt
Core Developer
Opera Software


[whatwg] Microdata feedback: please state that property value ordering is in the data model, and give usage guidelines

2011-06-08 Thread Dan Brickley
Hello,

Reading 
http://www.whatwg.org/specs/web-apps/current-work/multipage/links.html#microdata

Section '5.2.3 Names: the itemprop attribute' states something
important about Microdata's data model,

"Within an item, the properties are unordered with respect to each
other, except for properties with the same name, which are ordered in
the order they are given by the algorithm that defines the properties
of an item."

... and gives an example "In the following example, the "a" property
has the values "1" and "2", in that order,  ...

 test
 2


 1
"

However '5.2.1 The microdata model' does not mention anything of this
data model feature. If property values are ordered (for some specific
property/item context), this should be mentioned when introducing the
data model; if only by copying or linking the above sentence ("Within
an item, ...").
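
As a sketch of what this ordering rule means for a consumer: values that share a name keep the order in which the crawl produced them, while nothing is implied about the order of different names. The input shape and the function name groupByName are assumptions for illustration.

```javascript
// `propertyList` is assumed to be the output of the property crawl,
// already in the order the algorithm defines.
function groupByName(propertyList) {
  const byName = new Map();
  for (const { name, value } of propertyList) {
    if (!byName.has(name)) byName.set(name, []);
    byName.get(name).push(value);
  }
  return byName;
}

const crawled = [
  { name: "a", value: "1" },
  { name: "b", value: "test" },
  { name: "a", value: "2" },
];
const grouped = groupByName(crawled);
const valuesOfA = grouped.get("a");
```

Here valuesOfA holds "1" then "2". A consumer should not attach meaning to "b" having appeared between them.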

Is the expectation that Microdata vocabulary authors can decide
whether such ordering is meaningful, when they define / describe their
properties?

For example, in academic publishing where they care about being first
named author, the ordering of 'itemprop="author"' might seem to
matter. 5.2.3 suggests that the ordering information is at least
preserved in Microdata's data model. If someone creates an 'author'
property for Microdata, should they state that property ordering is
meaningful, or is that not their decision?

Thanks,

Dan


[whatwg] Microdata feedback

2011-05-31 Thread Ian Hickson

On Tue, 15 Mar 2011, Hay (Husky) wrote:
> 
> Consider, for example, a list that contains custom data that needs to be 
> displayed using Javascript. In most cases, the data-* attributes are a 
> nice way to embed non-visual data to be read out later, but that doesn't 
> work for hierarchical structures.

You can use nested HTML elements with data-* attributes. For example, the 
JSON value {a:{b:'x',y:'z'}} could be represented as:

   


   


> 1) Microdata. This could work, but only if the data should be displayed 
> as well. If the data should be processed (and for example, be shown in 
> another part of the page) this doesn't work really well. You could the 
> hide the parent element with CSS, but that's pretty clunky.

You can use microdata with  and  if you don't want to actually 
show the information.

Hiding information in a page (whether it's in microdata, data-*, 

[whatwg] Microdata Feedback: A Server Side implementation of a Microdata Consumer library.

2011-02-11 Thread Emiliano Martinez Luque
Hi everybody, I originally intended to send this message to the
implementors list but seeing in the archives that there hasn't been
much activity there for the last couple of months, I'm sending this to
the general list. Well, basically I just wanted to announce that I've
just released ( http://github.com/emluque/MD_Extract ) a library for
server side Microdata consuming. There are some known issues (
particularly with non-ASCII-extending character encodings, also the
text extraction mechanism from a tree of nodes is very basic, etc. )
but I still felt it was sensible to release it to showcase the
possibilities of the Microdata specification.

I based the implementation on the Algorithm provided by the WhatWG but
there are some variations, the most notable one being that I'm
constructing an intermediate results data structure while traversing
the Html tree rather than storing them in a list and then sorting them
later in tree order as the spec says. I did take Tab's suggestion of
doing a first pass through the Html tree and storing a list of
references to elements with ids ( which was a great suggestion, it
makes the code way clearer and it completely changed the way I was
thinking about the problem ).
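
The first-pass idea can be sketched as follows. This is JavaScript for illustration only, not code from the library (which is written in PHP); the node shape and the name buildIdIndex are assumptions.

```javascript
// Walk the tree once, in document order, and remember the first element
// seen for each id, so later itemref lookups are simple map reads.
function buildIdIndex(root) {
  const index = new Map();
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    if (node.id !== undefined && !index.has(node.id)) {
      index.set(node.id, node);
    }
    // Push children in reverse so they are popped in document order.
    const children = node.children || [];
    for (let i = children.length - 1; i >= 0; i--) stack.push(children[i]);
  }
  return index;
}

const firstA = { id: "a", children: [{ id: "b" }] };
const secondA = { id: "a" };
const tree = { id: "root", children: [firstA, secondA] };
const idIndex = buildIdIndex(tree);
```

With duplicate ids, the first element in document order wins, which matches how id lookups usually behave in browsers.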

To test this:

1. Make sure you have PHP 5 with Tidy (
http://www.php.net/manual/en/tidy.installation.php ) and MB_String (
http://ar.php.net/manual/en/mbstring.installation.php ) support.
2. Download the folder, uncompress it and move it to an apache dir. (
or clone it from github: git clone
https://github.com/emluque/MD_Extract.git )
3. Access the /examples folder with your browser.

Other than that, it reports most common errors ( like an element
marked up with itemscope not having child nodes, or a img element
marked with itemprop and not having an src attribute ). I believe that
apart from the known issues, and thinking just about microdata syntax,
it's 100% compliant with the latest microdata spec (Though there might
be some edge cases I might not be considering).

I'm hoping that it gets tested, this time I made it so that all it
takes (other than having the appropriate configuration of PHP) is
downloading and uncompressing the folder, please do, you will like it.
And please file any bug reports through the github interface or
through the contact form at my personal page at
http://www.metonymie.com .

Again thank you for a great spec,

-- 
Emiliano Martínez Luque
http://www.metonymie.com


Re: [whatwg] Microdata feedback

2010-01-20 Thread Philip Jägenstedt

On Mon, 18 Jan 2010 13:58:16 +0100, Ian Hickson  wrote:


I'd like at some point to introduce some sort of "semantic" textContent
that handles , , , dir="", , , space-
collapsing, and newline elimination, but there hasn't been much enthusiasm
around the idea, and it's not clear what else it would be good for.

I've changed the example, at least, to have it work ok, and added a
comment in the example about it.


OK. Won't hold my breath for semantic textContent, but it sounds like a  
good solution.



On Thu, 19 Nov 2009, Philip Jägenstedt wrote:


In a (slightly edited) Jack Bauer example [1], Chrome, Firefox and
presumably Safari have the meta elements moved to head. This will
severely break script-based implementations of microdata, which are
likely to be used for the time being until the DOM API is implemented
natively. I can't see any workaround for this, so I suggest that 
simply not be used for microdata, preferably by making it non-conforming
and removing it from the definitions/algorithms.


This is a short-term problem that only affects scripted implementations
that are shipped with the pages, so the workaround is simple: don't use
 and . Any implementations outside of the page can just fix
their parser to be HTML5-compatible.


OK, fair enough.

Thanks for all the other fixes, still reviewing the algorithm change...

--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] Microdata feedback

2010-01-20 Thread Philip Jägenstedt
On Mon, 18 Jan 2010 16:24:46 +0100, Jeremy Keith   
wrote:



Hixie wrote:

Finally on vCard, the final part of the extraction algorithm goes to
great trouble to guess what is the family name and what is the given
name. This guess will be broken for transliterated east Asian names
(CJKV that I know of, maybe others too). Just saying. Also, why is it
important to explicitly add N: for organizations?


This is intended to be compatible with Microformats vCard, which has
these weird rules. If you think we should remove them, please at least
first speak to Tantek and see what he thinks.


The fn optimisation pattern isn't intended to catch 100% of cases, just  
the situation "Firstname Lastname" or "Firstname Middlename Lastname".  
So if you just use fn (formatted name) and don't use n (name), the name  
will be extracted/guessed using the optimisation pattern.


In cases where the pattern doesn't work (e.g. "Anne van Kesteren", or  
east Asian names) you can still explicitly specify the family name and  
given name, over-riding the fn optimisation pattern. If you do this, you  
need to explicitly state this is the name (n) as well as the formatted  
name (fn).


This is going to break badly whenever a template uses vCard microdata and  
its author either doesn't know the family name and given name (because the  
data was never collected) or doesn't even consider that the vcard  
conversion does this funny guesswork. If a social network site or similar  
does this, then Anne van Kesteren and Zhang Min (fictional name) will have  
their names messed up with no way of fixing it. At least I haven't seen a  
site which asks users to both fill in their full name and each component,  
which is what you need to get this right.


Similarly, for organisations, you don't have to explicitly set n (name)  
if you apply both fn (formatted name) and org (organisation name) to a  
string. This time, the optimisation pattern assumes that the fn is the  
name of the organisation.


Technically, the n property is *always* required but if you use either  
of those two optimisation patterns, the n is inferred from fn.


If this is just a technical problem with some software requiring N to be  
present, would it be OK to just output an empty N like for organizations?


--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] Microdata feedback

2010-01-19 Thread Ian Hickson

On Mon, 18 Jan 2010, Aryeh Gregor wrote:
> On Mon, Jan 18, 2010 at 7:58 AM, Ian Hickson  wrote:
> > I've made it redirect to the spec.
> 
> Could you say that the URL *should* provide human-readable information 
> about the vocabulary?  We all know the problems with having 
> centrally-stored machine-readable data about your specs, but encouraging 
> the URL to provide human-readable info seems helpful.  (If they aren't 
> supposed to be dereferenced, why use HTTP?)

Why indeed. Is there something else we could use instead?


> > Graphs are intended to be supported in v2, using a mechanism
> 
> You seem to have left this sentence unfinished.

...using a mechanism intended for that purpose. Nothing to see here. :-)


On Mon, 18 Jan 2010, Julian Reschke wrote:
> 
> SHOULD return human-readable information is good, if you also add SHOULD 
> NOT automatically dereference.

I've added something akin to that SHOULD NOT, but the spec doesn't have a 
"specification" conformance class, so there's nothing to apply the SHOULD 
to. So I haven't added it. (I don't generally think specifications being 
conformance classes really makes much sense.)

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Microdata feedback

2010-01-18 Thread Julian Reschke

Aryeh Gregor wrote:

On Mon, Jan 18, 2010 at 7:58 AM, Ian Hickson  wrote:

I've made it redirect to the spec.


Could you say that the URL *should* provide human-readable information
about the vocabulary?  We all know the problems with having
centrally-stored machine-readable data about your specs, but
encouraging the URL to provide human-readable info seems helpful.  (If
they aren't supposed to be dereferenced, why use HTTP?)
...


SHOULD return human-readable information is good, if you also add SHOULD 
NOT automatically dereference.


BR, Julian


Re: [whatwg] Microdata feedback

2010-01-18 Thread Aryeh Gregor
On Mon, Jan 18, 2010 at 7:58 AM, Ian Hickson  wrote:
> I've made it redirect to the spec.

Could you say that the URL *should* provide human-readable information
about the vocabulary?  We all know the problems with having
centrally-stored machine-readable data about your specs, but
encouraging the URL to provide human-readable info seems helpful.  (If
they aren't supposed to be dereferenced, why use HTTP?)

> Graphs are intended to be supported in v2, using a mechanism

You seem to have left this sentence unfinished.


Re: [whatwg] Microdata feedback

2010-01-18 Thread Jeremy Keith

Hixie wrote:

Finally on vCard, the final part of the extraction algorithm goes to
great trouble to guess what is the family name and what is the given
name. This guess will be broken for transliterated east Asian names
(CJKV that I know of, maybe others too). Just saying. Also, why is it
important to explicitly add N: for organizations?


This is intended to be compatible with Microformats vCard, which has
these weird rules. If you think we should remove them, please at least
first speak to Tantek and see what he thinks.


The fn optimisation pattern isn't intended to catch 100% of cases,  
just the situation "Firstname Lastname" or "Firstname Middlename  
Lastname". So if you just use fn (formatted name) and don't use n  
(name), the name will be extracted/guessed using the optimisation  
pattern.


In cases where the pattern doesn't work (e.g. "Anne van Kesteren", or  
east Asian names) you can still explicitly specify the family name and  
given name, over-riding the fn optimisation pattern. If you do this,  
you need to explicitly state this is the name (n) as well as the  
formatted name (fn).


Similarly, for organisations, you don't have to explicitly set n  
(name) if you apply both fn (formatted name) and org (organisation  
name) to a string. This time, the optimisation pattern assumes that  
the fn is the name of the organisation.


Technically, the n property is *always* required but if you use either  
of those two optimisation patterns, the n is inferred from fn.
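
A minimal sketch of the two optimisation patterns as described above,
in Python. It follows only the wording of this message ("Firstname
Lastname", "Firstname Middlename Lastname", and fn equal to org); it is
not the exact Microformats or HTML spec algorithm, the function name
infer_n is hypothetical, and vCard escaping is left out:

```python
def infer_n(fn, org=None):
    """Guess a vCard N value (family;given;additional;prefix;suffix) from fn.

    Returns None when neither optimisation pattern applies, in which case
    n has to be marked up explicitly.
    """
    if org is not None and fn == org:
        # fn is taken to be the organisation's name, so N is left empty.
        return ";;;;"
    parts = fn.split()
    if len(parts) == 2:
        given, family = parts
        return f"{family};{given};;;"
    if len(parts) == 3:
        given, middle, family = parts
        return f"{family};{given};{middle};;"
    return None


print(infer_n("Jack Bauer"))         # fits the "Firstname Lastname" shape
print(infer_n("Anne van Kesteren"))  # "van" is wrongly treated as a middle name
```

The second call shows why the pattern is only a guess: the string has
three words, so it is split as given/middle/family even though that is
not how the name is structured.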


HTH,

Jeremy

--
Jeremy Keith

a d a c t i o

http://adactio.com/




[whatwg] Microdata feedback

2010-01-18 Thread Ian Hickson
On Thu, 12 Nov 2009, Philip Jägenstedt wrote:
>
> I've been playing with the microdata DOM APIs again, continuing the 
> JavaScript experimental implementation 
> . It's not small or elegant, but at 
> least some spec issues have come up in the process.
> 
> What is the http://www.w3.org/1999/xhtml/microdata# URI?

It provides a way to map microdata property names to URLs in an 
unambiguous way.



> http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#associating-names-with-items
> 
> "Otherwise, if one of the other elements in pending is an ancestor 
> element of candidate, and that element is scope, then remove candidate 
> from pending."
> 
> "Otherwise, if one of the other elements in pending is an ancestor 
> element of candidate, and that element also has scope as its nearest 
> ancestor element with an itemscope attribute specified, then remove 
> candidate from pending."
> 
> The intention of these requirements seems to be to eliminate redundant 
> elements in pending, but a comment on the intention of each in the spec 
> would be helpful as it's quite cryptic right now.

Added some brief explanations.



> http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#microdata-dom-api
> 
> itemtype and itemid are both URL attributes and therefore when getting
> itemType and itemId relative URLs should be resolved (even if only absolute
> URLs are valid). Correct?

That was a correct interpretation of the spec, but was only intended to 
be the case for itemid. I've corrected the spec to say that itemType is 
just a regular DOMString with no resolution.
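
To illustrate the difference (a sketch only, in Python rather than the
DOM): itemId is resolved against the document's base URL, while itemType
is handed back untouched. The function names and the base URL are
hypothetical, and urljoin only approximates the URL resolution rules a
browser would follow:

```python
from urllib.parse import urljoin


def reflect_item_id(base_url, attribute_value):
    """itemId: a URL attribute, so relative values are resolved."""
    return urljoin(base_url, attribute_value)


def reflect_item_type(attribute_value):
    """itemType: a plain string, returned exactly as written."""
    return attribute_value


base = "http://example.com/people/"  # hypothetical document base URL
print(reflect_item_id(base, "jack"))
print(reflect_item_type("jack"))
```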


> itemprop and itemref are both "unordered set of unique space-separated
> tokens", but in HTMLElement only itemProp is a DOMSettableTokenList while
> itemRef is a DOMString. This doesn't really make sense, so make itemRef a
> DOMSettableTokenList too?

Fixed. That was an oversight.


> From reading the spec it's not obvious (without following cross- 
> references) that itemProp isn't just a plain string. An example using 
> .itemProp.contains(name) or similar would make this more difficult to 
> miss.

Done.



> http://www.whatwg.org/specs/vocabs/current-work/#vcard
> 
> Having clickable cross-references in this spec would help a lot when
> reviewing!

I've put them back in the HTML5 spec, which makes this a moot point.


> Grammar: Let value *be* the result of collecting the first vCard 
> subproperty named value in subitem.

Fixed.


> "Let n1 be the value of the first property named family-name in subitem, or
> the empty string if there is no such property or the property's value is
> itself an item." Why not use "collecting the first vCard subproperty" here?
> Not doing so had me trying to find how the two were different, but I couldn't
> find any differences given that the values are later escaped.

Oops. Fixed.


> There's also the issue of how newlines from textContent values are escaped.
> Applying the vCard extraction algorithm to the spec example gives:
> 
> BEGIN:VCARD
> PROFILE:VCARD
> VERSION:3.0
> SOURCE:http://foolip.org/microdatajs/demo/vcard.html
> NAME:vCard demo
> FN:Jack Bauer
> PHOTO;VALUE=URI:http://foolip.org/microdatajs/demo/jack-bauer.jpg
> ORG:Counter-Terrorist Unit;Los Angeles Division
> ADR:;;10201 W. Pico Blvd.;Los Angeles;CA;90064;United States
> GEO:34.052339;-118.410623
> TEL;TYPE=work:+1 (310)\n  597 3781
> URL;VALUE=URI:http://en.wikipedia.org/wiki/Jack_Bauer
> URL;VALUE=URI:http://www.jackbauerfacts.com/
> EMAIL:j.ba...@la.ctu.gov.invalid
> TEL;TYPE=cell:+1 (310) 555\n  3781
> NOTE:If I'm out in the field\, you may be better off\n contacting Chloe O'B
> rian if it's about\n work\, or ask Tony Almeida if\n you're interested in
> the CTU five-a-side football team we're trying\n to get going.
> AGENT;VALUE=VCARD:BEGIN:VCARD\nPROFILE:VCARD\nVERSION:3.0\nSOURCE:http://fo
> olip.org/microdatajs/demo/vcard.html\nNAME:vCard demo\nEMAIL\;VALUE=URI:ma
> ilto:c.obr...@la.ctu.gov.invalid\nfn:Chloe O'Brian\nN:O'Brian\;Chloe\;\;\;
> \nEND:VCARD\n
> AGENT:Tony Almeida
> REV:2008-07-20T21:00:00+0100
> TEL;TYPE=home:01632 960 123
> N:Bauer;Jack;;;
> END:VCARD
> 
> TEL and NOTE have line breaks that are just because of how the HTML source is
> formatted. Importing this into Gmail preserves these linebreaks which looks
> quite broken. Unless we expect text fields to contain meaningful formatting,
> perhaps simply collapsing all whitespace into a single space is OK? In the
> best of worlds  would be converted to \n, but I'm not sure if it's worth
> the trouble.

We're screwed either way. If we convert newlines to " ", then we lose 
formatting from . If we don't convert newlines, we gain spurious 
linebreaks (and spaces). The latter is less destructive, which is why I 
picked it, but it's not ideal, I agree.

I'd like at some point to introduce some sort of "semantic" textContent 
that handles , , , dir="", , , space- 
collapsing, and newline elimination, but there

[whatwg] Microdata example validation

2009-11-17 Thread Philip Jägenstedt

http://www.whatwg.org/specs/vocabs/current-work/#examples

The Jack Bauer example has validation issues (using http://validator.nu/)

My fix:

--- jack.html.orig  2009-11-17 11:03:03.0 +0100
+++ jack.html   2009-11-17 11:03:19.0 +0100
@@ -41,12 +41,12 @@
  you're interested in the CTU five-a-side football team we're trying
  to get going.
- 
+ 
   

-   
+   
   
   Update!
   My new home phone number is
-  01632 960 123.
+  01632 960 123.
  
 

This is just a convenient format, not a patch...

Apart from this I get validation errors in this and many other examples  
because the meta and link elements aren't allowed in the contexts where  
they are used, but I suspect this is just validator.nu being in need of  
fixing (waiting for new account email so I can report).


--
Philip Jägenstedt


Re: [whatwg] Microdata DOM API issues

2009-11-14 Thread Philip Jägenstedt
On Sat, 14 Nov 2009 00:34:12 +0100, Tab Atkins Jr.   
wrote:


On Fri, Nov 13, 2009 at 5:14 PM, Philip Jägenstedt   
wrote:
The itemref mechanism allows creating arbitrary graphs of items, rather
than the tree of items that is the intended microdata model (right?). Even
though my default reaction to graphs is "oh cool", for microdata when the
domain model is a graph you should probably just represent it with a level
of indirection (RDF).

Options:
1. patch the algorithms which can go into recursion
2. patch

to first check if an itemref'd property creates a loop before adding it to
candidates
3. ?

I think I prefer 2.


Looping in data-graphs is often useful, so I'm not sure I want to
throw it out generally.  Your statement in the first paragraph I'm
quoting, though, says that you'd rather leave loops to be defined in
the vocabulary itself?  So loops would be done by, frex, itemprop'ing
a link to the other element rather than itemref'ing the other element
directly?


Yes, that's basically what I'm saying. One option is to simply use  
microdata such that the RDF you extract is the graph you want (it will  
probably look quite ugly though). Another is always referencing subitems  
by a mechanism other than itemref. For example, in the MusicBrainz XML  
webservice when an artist contains a release which itself references  
artists (e.g. as the producer), a stub item is used with only artist name  
and id, rather than including all information recursively. In microdata I  
would do:


http://musicbrainz.org/artist/"
 itemid="http://musicbrainz.org/artist/4d5447d7-c61c-4120-ba1b-d7f471d385b9">
 John Lennon
 
  Releases
  http://musicbrainz.org/release/"
   itemid="http://musicbrainz.org/release/f237e6a0-4b0e-4722-8172-66f4930198bc">
   Imagine
   Producer:
   http://musicbrainz.org/artist/"
itemid="http://musicbrainz.org/artist/e7b587f7-e678-47c1-81dd-e7bb7855b0f9"
>Phil Spector
  
 


Even if John Lennon were the producer here, you don't get any looping in  
the microdata itself. If you want to know everything about the producer,  
you should just follow the itemid... I haven't looked that much at the RDF  
extraction algorithm yet, but I think this example might even create the  
proper graph with loops if the producer were John Lennon.



That would probably be fine, and is compatible with a tree-based data
model like JSON.  Vocabs should know when loops are
permissible/desirable for themselves.


I agree, I don't see that we have a problem here.

--
Philip Jägenstedt
Opera Software


Re: [whatwg] Microdata DOM API issues

2009-11-13 Thread Tab Atkins Jr.
On Fri, Nov 13, 2009 at 5:14 PM, Philip Jägenstedt  wrote:
> The itemref mechanism allows creating arbitrary graphs of items, rather than
> the tree of items that is the intended microdata model (right?). Even though
> my default reaction to graphs is "oh cool", for microdata when the domain
> model is a graph you should probably just represent it with a level of
> indirection (RDF).
>
> Options:
> 1. patch the algorithms which can go into recursion
> 2. patch
> 
> to first check if an itemref'd property creates a loop before adding it to
> candidates
> 3. ?
>
> I think I prefer 2.

Looping in data-graphs is often useful, so I'm not sure I want to
throw it out generally.  Your statement in the first paragraph I'm
quoting, though, says that you'd rather leave loops to be defined in
the vocabulary itself?  So loops would be done by, frex, itemprop'ing
a link to the other element rather than itemref'ing the other element
directly?

That would probably be fine, and is compatible with a tree-based data
model like JSON.  Vocabs should know when loops are
permissible/desirable for themselves.

~TJ


Re: [whatwg] Microdata DOM API issues

2009-11-13 Thread Philip Jägenstedt
On Fri, 13 Nov 2009 19:27:39 +0100, Philip Jägenstedt   
wrote:


On Thu, 12 Nov 2009 03:23:54 +0100, Philip Jägenstedt  
 wrote:


Why are the algorithms for extracting RDF gone? All that's left is the  
book example with the equivalent Turtle, but it would be nice if it  
were actually defined how to extract RDF. The same for the JSON stuff,  
was that no good?


D'oh! I've been reading the multipage version and missed that it's on  
another page:


http://www.whatwg.org/specs/web-apps/current-work/multipage/converting-html-to-other-formats.html

I'll have to try implementing that and see if there are any more issues.



http://www.whatwg.org/specs/web-apps/current-work/multipage/converting-html-to-other-formats.html#json

This was easy to implement, but the algorithm isn't guaranteed to  
terminate.



  


This simple input causes the algorithm to recurse as the item references  
itself. I went back to the vCard algorithm and found that it too will fail  
to terminate with this input:


http://microformats.org/profile/hcard">
  http://microformats.org/profile/hcard">


vEvent is safe as the algorithm never recurses, but the RDF conversion  
algorithm would hit the same problem.


It's certainly possible to create loops which are less easy to spot:


  
  
  ...
  


Or this:


  

  


The itemref mechanism allows creating arbitrary graphs of items, rather  
than the tree of items that is the intended microdata model (right?). Even  
though my default reaction to graphs is "oh cool", for microdata when the  
domain model is a graph you should probably just represent it with a level  
of indirection (RDF).


Options:
1. patch the algorithms which can go into recursion
2. patch  
  
to first check if an itemref'd property creates a loop before adding it to  
candidates

3. ?

I think I prefer 2.
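
A rough sketch of what the check in option 2 could look like, in Python.
The data model here is an assumption for illustration only: items are
plain string keys and "children" maps each item to the items already
reachable as its property values. The function name is hypothetical and
this is not spec text:

```python
def would_create_loop(children, parent, candidate):
    """Return True if adding candidate under parent would make parent
    reachable from itself (including the direct self-reference case)."""
    stack = [candidate]
    seen = set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True
        if node in seen:
            continue
        seen.add(node)
        # Follow the items already attached below this node.
        stack.extend(children.get(node, []))
    return False


# Hypothetical graph: item "a" already has item "b" as a property value.
children = {"a": ["b"], "b": []}
print(would_create_loop(children, "b", "a"))  # adding "a" under "b" closes a loop
print(would_create_loop(children, "a", "c"))  # "c" is unrelated, so no loop
```

A candidate for which this returns True would simply not be added, so
the later conversion algorithms only ever see a tree.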

--
Philip Jägenstedt


Re: [whatwg] Microdata DOM API issues

2009-11-13 Thread Philip Jägenstedt
On Thu, 12 Nov 2009 03:23:54 +0100, Philip Jägenstedt   
wrote:


Why are the algorithms for extracting RDF gone? All that's left is the  
book example with the equivalent Turtle, but it would be nice if it were  
actually defined how to extract RDF. The same for the JSON stuff, was  
that no good?


D'oh! I've been reading the multipage version and missed that it's on  
another page:


http://www.whatwg.org/specs/web-apps/current-work/multipage/converting-html-to-other-formats.html

I'll have to try implementing that and see if there are any more issues.

--
Philip Jägenstedt
Opera Software


[whatwg] Microdata DOM API issues

2009-11-11 Thread Philip Jägenstedt
I've been playing with the microdata DOM APIs again, continuing the  
JavaScript experimental implementation .  
It's not small or elegant, but at least some spec issues have come up in  
the process.


What is the http://www.w3.org/1999/xhtml/microdata# URI? Just leftovers  
from earlier revisions to the spec?


Why are the algorithms for extracting RDF gone? All that's left is the  
book example with the equivalent Turtle, but it would be nice if it were  
actually defined how to extract RDF. The same for the JSON stuff, was that  
no good?



http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#associating-names-with-items

"Otherwise, if one of the other elements in pending is an ancestor element  
of candidate, and that element is scope, then remove candidate from  
pending."


"Otherwise, if one of the other elements in pending is an ancestor element  
of candidate, and that element also has scope as its nearest ancestor  
element with an itemscope attribute specified, then remove candidate from  
pending."


The intention of these requirements seems to be to eliminate redundant  
elements in pending, but a comment on the intention of each in the spec  
would be helpful as it's quite cryptic right now.



http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#microdata-dom-api

itemtype and itemid are both URL attributes and therefore when getting  
itemType and itemId relative URLs should be resolved (even if only  
absolute URLs are valid). Correct?
itemprop and itemref are both "unordered set of unique space-separated  
tokens", but in HTMLElement only itemProp is a DOMSettableTokenList while  
itemRef is a DOMString. This doesn't really make sense, so make itemRef a  
DOMSettableTokenList too? From reading the spec it's not obvious (without  
following cross-references) that itemProp isn't just a plain string. An  
example using .itemProp.contains(name) or similar would make this more  
difficult to miss.



http://www.whatwg.org/specs/vocabs/current-work/#vcard

Having clickable cross-references in this spec would help a lot when  
reviewing!


Grammar: Let value *be* the result of collecting the first vCard  
subproperty named value in subitem.


"Let n1 be the value of the first property named family-name in subitem,  
or the empty string if there is no such property or the property's value  
is itself an item." Why not use "collecting the first vCard subproperty"  
here? Not doing so had me trying to find how the two were different, but I  
couldn't find any differences given that the values are later escaped.


There's also the issue of how newlines from textContent values are  
escaped. Applying the vCard extraction algorithm to the spec example gives:


BEGIN:VCARD
PROFILE:VCARD
VERSION:3.0
SOURCE:http://foolip.org/microdatajs/demo/vcard.html
NAME:vCard demo
FN:Jack Bauer
PHOTO;VALUE=URI:http://foolip.org/microdatajs/demo/jack-bauer.jpg
ORG:Counter-Terrorist Unit;Los Angeles Division
ADR:;;10201 W. Pico Blvd.;Los Angeles;CA;90064;United States
GEO:34.052339;-118.410623
TEL;TYPE=work:+1 (310)\n  597 3781
URL;VALUE=URI:http://en.wikipedia.org/wiki/Jack_Bauer
URL;VALUE=URI:http://www.jackbauerfacts.com/
EMAIL:j.ba...@la.ctu.gov.invalid
TEL;TYPE=cell:+1 (310) 555\n  3781
NOTE:If I'm out in the field\, you may be better off\n contacting Chloe O'B
 rian if it's about\n work\, or ask Tony Almeida if\n you're interested in
 the CTU five-a-side football team we're trying\n to get going.
AGENT;VALUE=VCARD:BEGIN:VCARD\nPROFILE:VCARD\nVERSION:3.0\nSOURCE:http://fo
 olip.org/microdatajs/demo/vcard.html\nNAME:vCard demo\nEMAIL\;VALUE=URI:ma
 ilto:c.obr...@la.ctu.gov.invalid\nfn:Chloe O'Brian\nN:O'Brian\;Chloe\;\;\;
 \nEND:VCARD\n
AGENT:Tony Almeida
REV:2008-07-20T21:00:00+0100
TEL;TYPE=home:01632 960 123
N:Bauer;Jack;;;
END:VCARD

TEL and NOTE have line breaks that are just because of how the HTML source  
is formatted. Importing this into Gmail preserves these linebreaks which  
looks quite broken. Unless we expect text fields to contain meaningful  
formatting, perhaps simply collapsing all whitespace into a single space  
is OK? In the best of worlds  would be converted to \n, but I'm not  
sure if it's worth the trouble.
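
For reference, the whitespace collapsing suggested here could be
sketched as follows (Python; the function name is hypothetical, and
treating space, tab, LF, FF and CR as the whitespace set is an
assumption based on HTML's space characters). It would run on the
textContent value before any vCard escaping:

```python
import re

# Runs of space, tab, line feed, form feed and carriage return.
_WHITESPACE_RUN = re.compile(r"[ \t\n\f\r]+")


def collapse_whitespace(text):
    """Replace each run of whitespace with one space and trim the ends."""
    return _WHITESPACE_RUN.sub(" ", text).strip(" ")


print(collapse_whitespace("+1 (310)\n  597 3781"))
```

Applied to the work TEL value above, the source line break and
indentation become a single space, at the cost of also flattening any
line break that was meant to be there.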


Finally on vCard, the final part of the extraction algorithm goes to great  
trouble to guess what is the family name and what is the given name. This  
guess will be broken for transliterated east Asian names (CJKV that I know  
of, maybe others too). Just saying. Also, why is it important to  
explicitly add N: for organizations?



http://www.whatwg.org/specs/vocabs/current-work/#vevent

"Add an iCalendar line with the type name and the value value to output."

At this point value is undefined.

Given the algorithm for extracting iCal, it seems that dtstart and dtend  
must be specified using , as it's only for time elements  
that the time stamps will be properly formatted (stripping - and :)
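
A small sketch of the stripping mentioned above (Python, hypothetical
function name). It only covers UTC or floating values; removing every
"-" would also damage a negative time zone offset, which this sketch
does not try to handle:

```python
def to_icalendar_datetime(value):
    """Turn an HTML-style timestamp into the compact iCalendar form by
    dropping the "-" and ":" separators."""
    return value.replace("-", "").replace(":", "")


print(to_icalendar_datetime("2009-11-11T19:00:00Z"))
```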


Ther

Re: [whatwg] Microdata feedback

2009-10-17 Thread Ian Hickson
On Thu, 15 Oct 2009, Philip Jägenstedt wrote:
>
> Is there a reason why HTMLPropertyCollection.namedItem, unlike some other 
> collections' .namedItem, doesn't return an element if there is only 1 
> element in the collection at the time the method is called? Perhaps this 
> is a legacy quirk that we don't want to replicate?

Exactly.


> > > It's only in the case where both itemprop and item have a type that 
> > > an extra level of nesting will be needed and I expect that to be the 
> > > exception. Changing the model to something more DOM-tree-like is 
> > > probably going to be easier to understand for many web developers.
> > 
> > I dunno. People didn't seem to have much trouble getting it once we 
> > used itemscope="" rather than just item="". People understand the JSON 
> > datamodel pretty well, why would this be different?
> 
> After , the recent 
> syntax changes, the improved DOM API and the passage of time I'm not 
> very worried about the things I was worrying about above. If there's any 
> specific point that seems valid after another review I'll send separate 
> feedback on it. Thanks for all the other fixes!

Great! Thanks for the feedback so far. Please do send more if you have 
any!

Cheers,
-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] Microdata feedback

2009-10-15 Thread Philip Jägenstedt

On Wed, 14 Oct 2009 13:53:46 +0200, Ian Hickson  wrote:


On Fri, 21 Aug 2009, Philip Jägenstedt wrote:


Shouldn't namedItem [6] be namedItems? Code like .namedItem().item(0)
would be quite confusing.
[6]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#dom-htmlpropertycollection-nameditem


I don't understand what this is referring to.


I was incorrectly under the impression that .namedItem on other
collections always returned a single element, and was arguing that since
HTMLPropertyCollection.namedItem always returns a PropertyNodeList,
namedItems in plural would make more sense. Now I see that some other
namedItem methods aren't as simple as I'd thought, so I'm not sure what to
make of it. Is there a reason why HTMLPropertyCollection.namedItem, unlike
some other collections' .namedItem, doesn't return an element if there is
only 1 element in the collection at the time the method is called? Perhaps
this is a legacy quirk that we don't want to replicate?



On Tue, 25 Aug 2009, Philip Jägenstedt wrote:


There's something like an inverse relationship between simplicity of the
syntax and complexity of the resulting markup, the best balance point
isn't clear (to me at least). Perhaps option 3 is better, never allowing
item+itemprop on the same element.


That would preclude being able to make trees.



> > Given that flat items like vcard/vevent are likely to be the most
> > common use case I think we should optimize for that. Child items can
> > be created by using a predefined item property:
> > itemprop="com.example.childtype item". The value of that property
> > would then be the first item in tree-order (or all items in the
> > subtree, not sure). This way, items would have better copy-paste
> > resilience as the whole item element could be made into a top-level
> > item simply by moving it, without meddling with the itemprop.
>
> That sounds kinda confusing...

More confusing than item+itemprop on the same element? In many cases the
property value is the contained text, having it be the contained item
node(s) doesn't seem much stranger.


Based on the studies Google did, I'm not convinced that people will find
the nesting that complicated. IMHO the proposal above is more confusing,
too. I'm not sure this is solving a problem that needs solving.



> > If the parent-item (com.example.blog) doesn't know what the
> > child-items are, it would simply use itemprop="item".
>
> I don't understand this at all.

This was an attempt to have anonymous sub-items. Re-thinking this,
perhaps a better solution would be to have each item behave in much the
same way that the document itself does. That is, simply add items in the
subtree without using itemprop and access them with .getItems(itemType)
on the outer item.


How would you do things like "agent" in the vEvent vocabulary?



Comparing the current model with a DOM tree, it seems odd in that a
property could be an item. It would be like an element attribute being
another element: . That kind of thing could just
as well be ,  or even  if the relationship
between the elements is clear just from the fact that they have a
parent-child relationship (usually the case).


Microdata's datamodel is more similar to JSON's than XML's.



It's only in the case where both itemprop and item have a type that an
extra level of nesting will be needed and I expect that to be the
exception. Changing the model to something more DOM-tree-like is
probably going to be easier to understand for many web developers.


I dunno. People didn't seem to have much trouble getting it once we used
itemscope="" rather than just item="". People understand the JSON
datamodel pretty well, why would this be different?


After , the recent syntax  
changes, the improved DOM API and the passage of time I'm not very worried  
about the things I was worrying about above. If there's any specific point  
that seems valid after another review I'll send separate feedback on it.  
Thanks for all the other fixes!


--
Philip Jägenstedt
Opera Software


[whatwg] Microdata feedback

2009-10-14 Thread Ian Hickson
On Fri, 21 Aug 2009, Philip Jägenstedt wrote:
> 
> The spec says that "properties can also themselves be groups of 
> name-value pairs", but this isn't exposed in a very convenient way in 
> the DOM API. The 'properties' DOM-property is a HTMLPropertyCollection 
> of all associated elements. Discovering if the item-property value is a 
> plain string or an item seems to require item.hasAttribute('item'), 
> which seems out of place when everything else has been so neatly 
> reflected.

This is now reflected on item.itemScope.


> Also, the 'contents' DOM-property is always the item-property value 
> except in the case where the item-property is another item -- in that 
> case it is something random like .href or .textContent depending on the 
> element type. I think it would be better if the DOM-property were simply 
> called 'value' (the spec does talk about name-value pairs after all) and 
> corresponded more exactly to 'property value' [3]. Elements that have no 
> 'property names' [4] should return null and otherwise elements with an 
> 'item' attribute should return itself, although I don't think it should 
> be writable in that case. One might also/otherwise consider adding a 
> valueType DOM-property which could be 'string', 'item' or something 
> similar.

Interesting idea. I've renamed 'content' to 'itemValue', and made it 
return null if there's no itemprop="", and the element itself if there's 
an itemscope="".
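
As a minimal sketch of the rule just described, written in Python rather than against the real DOM API: the Element class, its fields and item_value() are hypothetical names used only for illustration, and the fallback for ordinary properties is reduced to "href if present, otherwise the text", which is a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class Element:
    """Hypothetical stand-in for a DOM element; not the real DOM API."""
    attrs: dict = field(default_factory=dict)
    text: str = ""


def item_value(el: Element) -> Union[None, str, Element]:
    """Model of the itemValue behaviour described above."""
    # No itemprop="" attribute: the element is not a property, so no value.
    if "itemprop" not in el.attrs:
        return None
    # itemscope="" present: the value is the item, i.e. the element itself.
    if "itemscope" in el.attrs:
        return el
    # Assumption: URL-bearing elements expose href, everything else its text.
    return el.attrs.get("href", el.text)


plain = Element(attrs={"itemprop": "name"}, text="GPS Receiver BT 200X")
nested = Element(attrs={"itemprop": "review", "itemscope": ""})
unrelated = Element(text="not a property")
```

With this model a caller can tell a string property from an item by checking whether item_value() returned the element itself.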


> One example [5] uses document.items[item].names but document.items isn't 
> defined anywhere. I assume this is an oversight and that it is 
> equivalent to document.getItems() Further, names is a member of 
> HTMLPropertyCollection, so document.items[item].properties.names is 
> probably intended instead of document.items[item].names. Assuming this 
> the example actually produces the output it claims to.

Fixed.


> Shouldn't namedItem [6] be namedItems? Code like .namedItem().item(0) 
> would be quite confusing.
> [6] 
> http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#dom-htmlpropertycollection-nameditem

I don't understand what this is referring to.


> Also, RadioNodeList should be PropertyNodeList.

Fixed.


> I think many will wonder why item and itemprop can't be given on a 
> single element for compactness:
> 
> Apples and
> Oranges
> don't compare well.

Modulo the changes to the syntax (s/item=/itemscope itemtype=/g), this is 
allowed -- but it means the same as this:

   ...

...which is to say, it's giving a property whose value is itself an item.


On Sun, 23 Aug 2009, Eduard Pascual wrote:
> On Sat, Aug 22, 2009 at 11:51 PM, Ian Hickson wrote:
> >
> > Based on some of the feedback on Microdata recently, e.g.:
> >
> >   http://www.jenitennison.com/blog/node/124
> >
> > ...and a number of e-mails sent to this list and the W3C lists, I am 
> > going to try some tweaks to the Microdata syntax. Google has kindly 
> > offered to provide usability testing resources so that we can try a 
> > variety of different syntaxes and see which one is easiest for authors 
> > to understand.
> >
> > If anyone has any concrete syntax ideas that they would like me to 
> > consider, please let me know. There's a (pretty low) limit to how many 
> > syntaxes we can perform usability tests on, though, so I won't be able 
> > to test every idea.
> 
> This would be more than just tweaking the syntax, but I think 
> appropriate to bring forth my CRDF proposal as a suggestion for an 
> alternative to Microdata.

I considered testing this, as well as RDFa, but due to time constraints we 
ended up only being able to test a few changes, so I concentrated 
specifically on microdata variants.


On Tue, 25 Aug 2009, Philip Jägenstedt wrote:
> 
> There's something like an inverse relationship between simplicity of the 
> syntax and complexity of the resulting markup, the best balance point 
> isn't clear (to me at least). Perhaps option 3 is better, never allowing 
> item+itemprop on the same element.

That would preclude being able to make trees.


> > > Given that flat items like vcard/vevent are likely to be the most 
> > > common use case I think we should optimize for that. Child items can 
> > > be created by using a predefined item property: 
> > > itemprop="com.example.childtype item". The value of that property 
> > > would then be the first item in tree-order (or all items in the 
> > > subtree, not sure). This way, items would have better copy-paste 
> > > resilience as the whole item element could be made into a top-level 
> > > item simply by moving it, without meddling with the itemprop.
> > 
> > That sounds kinda confusing...
> 
> More confusing than item+itemprop on the same element? In many cases the 
> property value is the contained text, having it be the contained item 
> node(s) doesn't seem much stranger.

Based on the studies Google did, I'm not convinced that people will find 
the nesting that complicated. IMHO the proposal above is more confusing, 
t

Re: [whatwg] Microdata

2009-08-26 Thread Brian Campbell

On Aug 22, 2009, at 5:51 PM, Ian Hickson wrote:


Based on some of the feedback on Microdata recently, e.g.:

  http://www.jenitennison.com/blog/node/124

...and a number of e-mails sent to this list and the W3C lists, I am going
to try some tweaks to the Microdata syntax. Google has kindly offered to
provide usability testing resources so that we can try a variety of
different syntaxes and see which one is easiest for authors to understand.

If anyone has any concrete syntax ideas that they would like me to
consider, please let me know. There's a (pretty low) limit to how many
syntaxes we can perform usability tests on, though, so I won't be able to
test every idea.


Here's an idea I've been mulling around. I think it would simplify the  
syntax and semantic model considerably.


Why do we need separate items and item properties? They seem to  
confuse people, when something can be both an item and an itemprop at  
the same time. They also seem to duplicate a certain amount of  
information; items can have "types", while itemprops can have "names",  
but they both seem to serve about the same role, which is to indicate  
how to interpret them in the context of page or larger item.


What if we just had "item", filling both of the roles? The value of  
the item would be either an associative array of the descendent items  
(or ones associated using "about") if those exists, or the text  
content of the item (or URL, depending on the tag) if it has no items  
within it.


Here's an example used elsewhere in the thread, marked up as I suggest:


  http://example.com/products/bt200x";>
  GPS Receiver BT 200X
  Rating: ⋆⋆⋆✩✩ item=rating content="2">

  Release Date:
January 22
  http://ln.hixie.ch/";>Ian
:
"Lots of memory, not much battery, very little
   accuracy."


  
  
My Pond
Licensed under the http://www.opensource.org/licenses/mit-license.php";>MIT
  license.
  



This would translate into the following JSON. Note that this is a  
simpler structure than the existing one proposed for microdata; it is  
a lot closer to how people generally use JSON natively, rather than  
using an extra level of nesting to distinguish types and properties:


// JSON DESCRIPTION OF MARKED UP DATA
// document URL: http://www.example.org/sample/test.html
{
  "com.example.product": [
    {
      "about": [ "http://example.com/products/bt200x" ],
      "image": [ "http://www.example.org/sample/bt200x.jpeg" ],
      "name": [ "GPS Receiver BT 200X" ],
      "reldate": [ "2009-01-22" ],
      "review": [
        {
          "reviewer": [ "http://ln.hixie.ch/" ],
          "text": [ "Lots of memory, not much battery, very little accuracy." ]
        }
      ]
    }
  ],
  "work": [
    {
      "about": [ "http://www.example.org/sample/image.jpeg" ],
      "license": [ "http://www.opensource.org/licenses/mit-license.php" ],
      "title": [ "My Pond" ]
    }
  ]
}
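
To make the difference between the two shapes concrete, here is a rough Python sketch that converts data in the existing shape into the simpler one. It assumes the existing shape is a top-level "items" list whose entries carry "type" and "properties"; the function names flatten_items() and flatten_properties() are made up for this illustration, and the type of a nested item is simply dropped because the property name takes over that role.

```python
def flatten_properties(properties: dict) -> dict:
    """Replace nested {"type", "properties"} items with their bare properties."""
    flat = {}
    for name, values in properties.items():
        flat[name] = [
            flatten_properties(v["properties"]) if isinstance(v, dict) else v
            for v in values
        ]
    return flat


def flatten_items(doc: dict) -> dict:
    """Group top-level items by type, keeping only their flattened properties."""
    result = {}
    for item in doc["items"]:
        result.setdefault(item["type"], []).append(
            flatten_properties(item["properties"])
        )
    return result


existing = {
    "items": [
        {
            "type": "work",
            "properties": {
                "about": ["http://www.example.org/sample/image.jpeg"],
                "title": ["My Pond"],
            },
        },
        {
            "type": "com.example.product",
            "properties": {
                "name": ["GPS Receiver BT 200X"],
                "review": [
                    {"type": "", "properties": {"reviewer": ["http://ln.hixie.ch/"]}}
                ],
            },
        },
    ]
}

flat = flatten_items(existing)
```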

This has the slightly surprising property of making something like this:

  Some text. A link. Some more text


Result in:

  // http://example.org/sample/test
  { "foo": [ "Some text. A link. Some more text" ] }

While simply changing the link into an item:

  Some text. A link. Some more text


Gives you:

  // http://example.org/sample/test
  { "foo": [ { "link": [ "http://example.org/sample/somewhere" ] } ] }

However, I think that people will generally expect "item" to be used  
for its text/URL content only on leaf nodes or nodes without much  
nested within them, while they would expect "item" to return  
structured, nested data when the DOM is nested deeply with items  
inside it, so I don't think people would be surprised by this behavior  
very often.
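
A minimal Python sketch of the value rule, for illustration only. The Node class and the functions below are hypothetical, the "about" association is left out, and "descendant items" is assumed to mean the nearest items below a node, not items nested inside other items.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Node:
    """Hypothetical element: optional item name, optional URL, mixed children."""
    item: Optional[str] = None
    href: Optional[str] = None
    children: List[Union[str, "Node"]] = field(default_factory=list)


def text_content(node: Node) -> str:
    """Concatenate all text in the subtree, in tree order."""
    return "".join(
        c if isinstance(c, str) else text_content(c) for c in node.children
    )


def nearest_items(node: Node) -> List[Node]:
    """Descendant items that are not nested inside another descendant item."""
    found = []
    for child in node.children:
        if isinstance(child, str):
            continue
        if child.item is not None:
            found.append(child)
        else:
            found.extend(nearest_items(child))
    return found


def collect(node: Node) -> dict:
    """Associative array of the items found below node."""
    result = {}
    for sub in nearest_items(node):
        result.setdefault(sub.item, []).append(value(sub))
    return result


def value(node: Node):
    """Nested items if there are any, else the URL or the text content."""
    nested = collect(node)
    if nested:
        return nested
    return node.href if node.href is not None else text_content(node)


url = "http://example.org/sample/somewhere"
plain = Node(children=[Node(item="foo", children=[
    "Some text. ", Node(href=url, children=["A link"]), ". Some more text"])])
linked = Node(children=[Node(item="foo", children=[
    "Some text. ", Node(item="link", href=url, children=["A link"]),
    ". Some more text"])])
```

Here collect(plain) yields the text of the whole "foo" element, while collect(linked) yields a nested structure holding only the link URL, which is the switch in behavior discussed above.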


I haven't yet looked at every use case proposed so far to see how well  
this idea works for them, nor have I worked out the API differences  
(which should be simpler than the existing API). If there seem to be  
no serious problems with this idea, I can write up a more detailed  
justification and examples.


-- Brian


Re: [whatwg] Microdata

2009-08-25 Thread Philip Jägenstedt
On Tue, 25 Aug 2009 09:43:58 +0200, Philip Jägenstedt   
wrote:



On Tue, 25 Aug 2009 00:29:06 +0200, Ian Hickson  wrote:


On Mon, 24 Aug 2009, Philip Jägenstedt wrote:


I've found two related things that are a bit problematic. First, because
itemprops are only associated with ancestor item elements or via the
subject attribute, it's always necessary to find or create a separate
element for the item. This leads to more convoluted markup for small
items, so it would be nice if the first item and itemprop could be on
the same element when it makes sense:


  Concert at 19:00 at the beach.


rather than


  
Concert at 19:00 at the beach.
  



As specced now, having itemprop="" and item="" on the same element implies
that the value of the property is an item rooted at this element.

Not supporting the above was intentional, to keep the mental model of the
markup very simple, rather than having shortcuts. (RDFa has lots of
shortcuts and it ended up being very difficult to keep the mental model
straight.)


There's something like an inverse relationship between simplicity of the  
syntax and complexity of the resulting markup, the best balance point  
isn't clear (to me at least). Perhaps option 3 is better, never allowing  
item+itemprop on the same element.



Second, because composite items can only be made by adding item and
itemprop to the same element, the embedded item has to know that it has
a parent and what itemprop it should use to describe itself. James gave
the example of "something like planet where each article could be a
com.example.blog item and within each article there could be any
arbitrary author-supplied microdata" [1]. I also feel that the
item+itemprop syntax for composite items is one of the least intuitive
parts of the current spec. It's easy to get confused about what the type
of the item vs the itemprop should be and which item the itemprop
actually belongs to.


Fair points.


Given that flat items like vcard/vevent are likely to be the most common
use case I think we should optimize for that. Child items can be created
by using a predefined item property: itemprop="com.example.childtype
item".


Ok...



The value of that property would then be the first item in tree-order
(or all items in the subtree, not sure). This way, items would have
better copy-paste resilience as the whole item element could be made
into a top-level item simply by moving it, without meddling with the
itemprop.


That sounds kinda confusing...


More confusing than item+itemprop on the same element? In many cases the  
property value is the contained text, having it be the contained item  
node(s) doesn't seem much stranger.



If the parent-item (com.example.blog) doesn't know what the child-items
are, it would simply use itemprop="item".


I don't understand this at all.


This was an attempt to have anonymous sub-items. Re-thinking this,  
perhaps a better solution would be to have each item behave in much the  
same way that the document itself does. That is, simply add items in the  
subtree without using itemprop and access them with .getItems(itemType)  
on the outer item.


Comparing the current model with a DOM tree, it seems odd that a
property could be an item. It would be like an element attribute being  
another element: . That kind of thing could just  
as well be , type="foo"/> or even  if the relationship  
between the elements is clear just from the fact that they have a  
parent-child relationship (usually the case).


All examples of nested items in the spec are on the form



These would be replaced with



It's only in the case where both itemprop and item have a type that an  
extra level of nesting will be needed and I expect that to be the  
exception. Changing the model to something more DOM-tree-like is  
probably going to be easier to understand for many web developers. It  
would also fix the problem in my other mail where it's a bit tricky to  
determine via the DOM API whether a property is a string or an item.  
When on the topic of the DOM API,  
document.getItems("outer")[0].getItems("inner")[0] would be so much  
clearer than what we currently have.



Example:


  My name is Philip
  Jägenstedt.



I don't understand what this maps to at all.


The same as


   
 My name is Philip
 Jägenstedt.
   


Unless I've misunderstood the "n" in vcard (there's no example in the  
spec). But let's move on.



I'll admit that my examples are a bit simple, but the main point in my
opinion is to make item+itemprop less confusing. There are basically
only 3 options:

1. for compositing items (like now)
2. as shorthand on the top-level item (my suggestion)
3. disallow

I'd primarily like for 1 and 2 to be tested, but 3 is a real option too.


[1] http://krijnhoetmer.nl/irc-logs/whatwg/20090824#l-375


We can't disallow nesting items as values of properties, there are a whole
bunch of use cases that depend on it.


3 is not a suggestion to disallow 

Re: [whatwg] Microdata

2009-08-25 Thread Philip Jägenstedt

On Tue, 25 Aug 2009 00:29:06 +0200, Ian Hickson  wrote:


On Mon, 24 Aug 2009, Philip Jägenstedt wrote:


I've found two related things that are a bit problematic. First, because
itemprops are only associated with ancestor item elements or via the
subject attribute, it's always necessary to find or create a separate
element for the item. This leads to more convoluted markup for small
items, so it would be nice if the first item and itemprop could be on
the same element when it makes sense:


  Concert at 19:00 at the beach.


rather than


  
Concert at 19:00 at the beach.
  



As specced now, having itemprop="" and item="" on the same element implies
that the value of the property is an item rooted at this element.

Not supporting the above was intentional, to keep the mental model of the
markup very simple, rather than having shortcuts. (RDFa has lots of
shortcuts and it ended up being very difficult to keep the mental model
straight.)


There's something like an inverse relationship between simplicity of the  
syntax and complexity of the resulting markup, the best balance point  
isn't clear (to me at least). Perhaps option 3 is better, never allowing  
item+itemprop on the same element.



Second, because composite items can only be made by adding item and
itemprop to the same element, the embedded item has to know that it has
a parent and what itemprop it should use to describe itself. James gave
the example of "something like planet where each article could be a
com.example.blog item and within each article there could be any
arbitrary author-supplied microdata" [1]. I also feel that the
item+itemprop syntax for composite items is one of the least intuitive
parts of the current spec. It's easy to get confused about what the type
of the item vs the itemprop should be and which item the itemprop
actually belongs to.


Fair points.



Given that flat items like vcard/vevent are likely to be the most common
use case I think we should optimize for that. Child items can be created
by using a predefined item property: itemprop="com.example.childtype
item".


Ok...



The value of that property would then be the first item in tree-order
(or all items in the subtree, not sure). This way, items would have
better copy-paste resilience as the whole item element could be made
into a top-level item simply by moving it, without meddling with the
itemprop.


That sounds kinda confusing...


More confusing than item+itemprop on the same element? In many cases the  
property value is the contained text, having it be the contained item  
node(s) doesn't seem much stranger.



If the parent-item (com.example.blog) doesn't know what the child-items
are, it would simply use itemprop="item".


I don't understand this at all.


This was an attempt to have anonymous sub-items. Re-thinking this, perhaps  
a better solution would be to have each item behave in much the same way  
that the document itself does. That is, simply add items in the subtree  
without using itemprop and access them with .getItems(itemType) on the  
outer item.


Comparing the current model with a DOM tree, it seems odd that a
property could be an item. It would be like an element attribute being  
another element: . That kind of thing could just as  
well be , type="foo"/> or even  if the relationship  
between the elements is clear just from the fact that they have a  
parent-child relationship (usually the case).


All examples of nested items in the spec are on the form



These would be replaced with



It's only in the case where both itemprop and item have a type that an  
extra level of nesting will be needed and I expect that to be the  
exception. Changing the model to something more DOM-tree-like is probably  
going to be easier to understand for many web developers. It would also  
fix the problem in my other mail where it's a bit tricky to determine via  
the DOM API whether a property is a string or an item. When on the topic  
of the DOM API, document.getItems("outer")[0].getItems("inner")[0] would  
be so much clearer than what we currently have.
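
A rough Python sketch of that idea, where an item answers get_items() the same way the document does. The Node class and get_items() are hypothetical names for this illustration, and it assumes the lookup stops at the first item boundary, so nested items are reachable only through their enclosing item.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node; item_type is set when the element is an item."""
    item_type: Optional[str] = None
    label: str = ""
    children: List["Node"] = field(default_factory=list)

    def get_items(self, item_type: str) -> List["Node"]:
        """Items of the given type below this node, not looking inside items."""
        found = []
        for child in self.children:
            if child.item_type is not None:
                # An item boundary: report it if it matches, never descend.
                if child.item_type == item_type:
                    found.append(child)
            else:
                found.extend(child.get_items(item_type))
        return found


inner = Node(item_type="inner", label="child")
outer = Node(item_type="outer", label="parent", children=[Node(children=[inner])])
document = Node(children=[outer])

first_inner = document.get_items("outer")[0].get_items("inner")[0]
```

Because the document and the outer item share one method, moving an item to the top level would not require touching any itemprop.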



Example:


  My name is Philip
  Jägenstedt.



I don't understand what this maps to at all.


The same as


  
My name is Philip
Jägenstedt.
  


Unless I've misunderstood the "n" in vcard (there's no example in the  
spec). But let's move on.



I'll admit that my examples are a bit simple, but the main point in my
opinion is to make item+itemprop less confusing. There are basically
only 3 options:

1. for compositing items (like now)
2. as shorthand on the top-level item (my suggestion)
3. disallow

I'd primarily like for 1 and 2 to be tested, but 3 is a real option too.

[1] http://krijnhoetmer.nl/irc-logs/whatwg/20090824#l-375


We can't disallow nesting items as values of properties, there are a whole
bunch of use cases that depend on it.


3 is not a suggestion to disallow nesting, but to change the syntax for it.


Could you show how your syntax proposals wou

Re: [whatwg] Microdata

2009-08-24 Thread Ian Hickson
On Mon, 24 Aug 2009, Philip Jägenstedt wrote:
> 
> I've found two related things that are a bit problematic. First, because 
> itemprops are only associated with ancestor item elements or via the 
> subject attribute, it's always necessary to find or create a separate 
> element for the item. This leads to more convoluted markup for small 
> items, so it would be nice if the first item and itemprop could be on 
> the same element when it makes sense:
> 
> 
>   Concert at 19:00 at  itemprop="location">the beach.
> 
> 
> rather than
> 
> 
>   
> Concert at 19:00 at  itemprop="location">the beach.
>   
> 

As specced now, having itemprop="" and item="" on the same element implies 
that the value of the property is an item rooted at this element.

Not supporting the above was intentional, to keep the mental model of the 
markup very simple, rather than having shortcuts. (RDFa has lots of 
shortcuts and it ended up being very difficult to keep the mental model 
straight.)


> Second, because composite items can only be made by adding item and 
> itemprop to the same element, the embedded item has to know that it has 
> a parent and what itemprop it should use to describe itself. James gave 
> the example of "something like planet where each article could be a 
> com.example.blog item and within each article there could be any 
> arbitrary author-supplied microdata" [1]. I also feel that the 
> item+itemprop syntax for composite items is one of the least intuitive 
> parts of the current spec. It's easy to get confused about what the type 
> of the item vs the itemprop should be and which item the itemprop 
> actually belongs to.

Fair points.


> Given that flat items like vcard/vevent are likely to be the most common 
> use case I think we should optimize for that. Child items can be created 
> by using a predefined item property: itemprop="com.example.childtype 
> item".

Ok...


> The value of that property would then be the first item in tree-order 
> (or all items in the subtree, not sure). This way, items would have 
> better copy-paste resilience as the whole item element could be made 
> into a top-level item simply by moving it, without meddling with the 
> itemprop.

That sounds kinda confusing...


> If the parent-item (com.example.blog) doesn't know what the child-items 
> are, it would simply use itemprop="item".

I don't understand this at all.


> Example:
> 
> 
>   My name is Philip
>   Jägenstedt.
> 

I don't understand what this maps to at all.


> I'll admit that my examples are a bit simple, but the main point in my 
> opinion is to make item+itemprop less confusing. There are basically 
> only 3 options:
> 
> 1. for compositing items (like now)
> 2. as shorthand on the top-level item (my suggestion)
> 3. disallow
> 
> I'd primarily like for 1 and 2 to be tested, but 3 is a real option too.
> 
> [1] http://krijnhoetmer.nl/irc-logs/whatwg/20090824#l-375

We can't disallow nesting items as values of properties, there are a whole 
bunch of use cases that depend on it.

Could you show how your syntax proposals would look when marking up the 
following data?

// JSON DESCRIPTION OF MARKED UP DATA
// document URL: http://www.example.org/sample/test.html
{
  "items": [
    {
      "type": "com.example.product",
      "properties": {
        "about": [ "http://example.com/products/bt200x" ],
        "image": [ "http://www.example.org/sample/bt200x.jpeg" ], // please keep this one outside the item in the DOM
        "name": [ "GPS Receiver BT 200X" ],
        "reldate": [ "2009-01-22" ],
        "review": [
          {
            "type": "",
            "properties": {
              "reviewer": [ "http://ln.hixie.ch/" ],
              "text": [ "Lots of memory, not much battery, very little accuracy." ]
            }
          }
        ]
      }
    },
    {
      "type": "work",
      "properties": {
        "about": [ "http://www.example.org/sample/image.jpeg" ],
        "license": [ "http://www.opensource.org/licenses/mit-license.php" ],
        "title": [ "My Pond" ]
      }
    }
  ]
}


Here's how it would be marked up today:


 http://example.com/products/bt200x";>
 GPS Receiver BT 200X
 Rating: ⋆⋆⋆✩✩ 
 Release Date: January 
22
 http://ln.hixie.ch/";>Ian:
 "Lots of memory, not much battery, very little 
accuracy."


 
 
  My Pond
  Licensed under the http://www.opensource.org/licenses/mit-license.php";>MIT
  license.
 




-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] Microdata

2009-08-24 Thread Philip Jägenstedt

On Sat, 22 Aug 2009 23:51:48 +0200, Ian Hickson  wrote:



Based on some of the feedback on Microdata recently, e.g.:

   http://www.jenitennison.com/blog/node/124

...and a number of e-mails sent to this list and the W3C lists, I am going
to try some tweaks to the Microdata syntax. Google has kindly offered to
provide usability testing resources so that we can try a variety of
different syntaxes and see which one is easiest for authors to understand.

If anyone has any concrete syntax ideas that they would like me to
consider, please let me know. There's a (pretty low) limit to how many
syntaxes we can perform usability tests on, though, so I won't be able to
test every idea.



I've found two related things that are a bit problematic. First, because  
itemprops are only associated with ancestor item elements or via the  
subject attribute, it's always necessary to find or create a separate  
element for the item. This leads to more convoluted markup for small  
items, so it would be nice if the first item and itemprop could be on the  
same element when it makes sense:



  Concert at 19:00 at itemprop="location">the beach.



rather than


  
Concert at 19:00 at itemprop="location">the beach.

  


Second, because composite items can only be made by adding item and  
itemprop to the same element, the embedded item has to know that it has a  
parent and what itemprop it should use to describe itself. James gave the  
example of "something like planet where each article could be a  
com.example.blog item and within each article there could be any arbitrary  
author-supplied microdata" [1]. I also feel that the item+itemprop syntax  
for composite items is one of the least intuitive parts of the current  
spec. It's easy to get confused about what the type of the item vs the  
itemprop should be and which item the itemprop actually belongs to.


Given that flat items like vcard/vevent are likely to be the most common  
use case I think we should optimize for that. Child items can be created  
by using a predefined item property: itemprop="com.example.childtype  
item". The value of that property would then be the first item in  
tree-order (or all items in the subtree, not sure). This way, items would  
have better copy-paste resilience as the whole item element could be made  
into a top-level item simply by moving it, without meddling with the  
itemprop. If the parent-item (com.example.blog) doesn't know what the  
child-items are, it would simply use itemprop="item".


Example:


  My name is Philip
  Jägenstedt.


I'll admit that my examples are a bit simple, but the main point in my  
opinion is to make item+itemprop less confusing. There are basically only  
3 options:


1. for compositing items (like now)
2. as shorthand on the top-level item (my suggestion)
3. disallow

I'd primarily like for 1 and 2 to be tested, but 3 is a real option too.

[1] http://krijnhoetmer.nl/irc-logs/whatwg/20090824#l-375

--
Philip Jägenstedt
Opera Software


Re: [whatwg] Microdata

2009-08-22 Thread Edward O'Connor
On Saturday, August 22, 2009, Eduard Pascual  wrote:
> On Sat, Aug 22, 2009 at 11:51 PM, Ian Hickson wrote:
>>
>> Based on some of the feedback on Microdata recently, e.g.:
>>
>>   http://www.jenitennison.com/blog/node/124
>>
>> ...and a number of e-mails sent to this list and the W3C lists, I am going
>> to try some tweaks to the Microdata syntax. Google has kindly offered to
>> provide usability testing resources so that we can try a variety of
>> different syntaxes and see which one is easiest for authors to understand.
>>
>> If anyone has any concrete syntax ideas that they would like me to
>> consider, please let me know. There's a (pretty low) limit to how many
>> syntaxes we can perform usability tests on, though, so I won't be able to
>> test every idea.
>>
>
> This would be more than just tweaking the syntax, but I think
> appropriate to bring forth my CRDF proposal as a suggestion for an
> alternative to Microdata. For reference, the latest version of the
> document can be found at [1], and the discussion that has happened
> about it can be found at [2].
>
> Rather than just saying "use that syntax", I'm including here what IMO
> are the most prominent advantages (and potential issues) of that
> proposal, in no particular order:
>
> + Optional use of selectors: while the ability to use selectors seems
> quite useful, especially to handle "list" or "collection" cases, it has
> been argued that users may have problems with elaborate selectors.
> Since the last update of the CRDF document, this is addressed with the
> expanded inline content model: it should be possible to express with only
> inline CRDF, and without using selectors at all, any semantics that
> can be represented with RDFa, Microdata, EASE, or eRDF. In other
> words: while CRDF can take full benefit of selectors to make better
> and/or clearer documents, it can still handle most cases (those
> actually handled by existing solutions) without them.
>
> + Microformats mapping: for good data (specifically, all content that
> doesn't duplicate any "singular" property), CRDF allows trivially
> mapping Microformat-marked data to an arbitrary RDF vocabulary (or
> even to multiple, overlapping vocabularies), thus allowing its re-use
> with RDF-related tools and/or combining it with RDF data from other
> sources and/or marked with other syntaxes. In order to achieve 100%
> compatibility with Microformats.org's processing model (including any
> form of bad data), a minor addition to Selectors is suggested in the
> document, although no substantial feedback has been given on it
> (neither against nor in favor).
>
> + Microformats-like but decentralized: the main issue with
> Microformats, at least with non-widespread vocabularies, is
> centralization: it requires a critical mass of use-cases to get the
> Microformats community to engage in the process of creating a new
> vocabulary. With CRDF, any author may build their own vocabulary
> (implementing it as a CRDF mapping to RDF) and use it on their pages.
> If a vocabulary later gains momentum and is adopted by a wide enough
> set of authors, it'd be up to the Microformats community to decide
> whether to "standardize" it or not.
>
> + Prefix declarations go out of HTML: After so many discussions,
> namespace prefixes have been the main source of criticism against RDFa.
> One of these criticisms is the range of technical issues that arise
> from the "xmlns:" syntax for defining namespace prefixes (in
> "tag-soup" syntax). CRDF handles this case by taking away the
> responsibility of prefix declarations from HTML: having a CSS-based
> syntax, CRDF takes the obvious step and uses CSS's own syntax for
> namespace declarations.
>
> + Entirely RDF based: while this might seem a purely theoretical
> advantage, there is also a practical benefit: once extracted from the
> webpage, CRDF data can be easily combined with any already existing
> RDF data; and can be used with RDF-related tools.
>
> - Copy-paste brittleness: IMO, the only serious drawback from CRDF;
> but there are some points worth making:
>   1) When used inline, CRDF can achieve the same resilience as RDFa,
> which is quite close to Microdata's.
>   2) I have noticed that some browsers can manage to copy-paste
> CSS-styled content preserving (most of) format. It shouldn't be hard
> for implementors to extend such functionality to CRDF. Of course, the
> support for this is not consistent among browsers, and also seems to
> vary for different paste targets. If there is some real interest, I
> might do some testing with multiple browsers and paste targets (for
> now, I have noticed that both IE and FF preserve most CSS formatting
> (but not layout) when pasting to Word, but pasting to OOo Writer gets
> rendered with the "default" formatting for the tags). It would be
> interesting, on this aspect, to hear about browser vendors: would they
> be willing to extend the CSS copy-paste capabilities to CRDF if it got
> adopted?
>
> - Prefix-based indir

Re: [whatwg] Microdata

2009-08-22 Thread Eduard Pascual
On Sat, Aug 22, 2009 at 11:51 PM, Ian Hickson wrote:
>
> Based on some of the feedback on Microdata recently, e.g.:
>
>   http://www.jenitennison.com/blog/node/124
>
> ...and a number of e-mails sent to this list and the W3C lists, I am going
> to try some tweaks to the Microdata syntax. Google has kindly offered to
> provide usability testing resources so that we can try a variety of
> different syntaxes and see which one is easiest for authors to understand.
>
> If anyone has any concrete syntax ideas that they would like me to
> consider, please let me know. There's a (pretty low) limit to how many
> syntaxes we can perform usability tests on, though, so I won't be able to
> test every idea.
>

This would be more than just tweaking the syntax, but I think
appropriate to bring forth my CRDF proposal as a suggestion for an
alternative to Microdata. For reference, the latest version of the
document can be found at [1], and the discussion that has happened
about it can be found at [2].

Rather than just saying "use that syntax", I'm including here what IMO
are the most prominent advantages (and potential issues) of that
proposal, in no particular order:

+ Optional use of selectors: while the ability to use selectors seems
quite useful, especially to handle "list" or "collection" cases, it has
been argued that users may have problems with elaborate selectors.
Since the last update of the CRDF document, this is addressed with the
expanded inline content model: it should be possible to express with only
inline CRDF, and without using selectors at all, any semantics that
can be represented with RDFa, Microdata, EASE, or eRDF. In other
words: while CRDF can take full benefit of selectors to make better
and/or clearer documents, it can still handle most cases (those
actually handled by existing solutions) without them.

+ Microformats mapping: for good data (specifically, all content that
doesn't duplicate any "singular" property), CRDF allows trivially
mapping Microformat-marked data to an arbitrary RDF vocabulary (or
even to multiple, overlapping vocabularies), thus allowing its re-use
with RDF-related tools and/or combining it with RDF data from other
sources and/or marked with other syntaxes. In order to achieve 100%
compatibility with Microformats.org's processing model (including any
form of bad data), a minor addition to Selectors is suggested in the
document, although no substantial feedback has been given on it
(neither against nor in favor).

+ Microformats-like but decentralized: the main issue with
Microformats, at least with non-widespread vocabularies, is
centralization: it requires a critical mass of use cases to get the
Microformats community to engage in the process of creating a new
vocabulary. With CRDF, any author may build their own vocabulary
(implementing it as a CRDF mapping to RDF) and use it on their pages.
If a vocabulary later gains momentum and is adopted by a wide enough
set of authors, it'd be up to the Microformats community to decide
whether to "standardize" it or not.

+ Prefix declarations go out of HTML: After so many discussions,
namespace prefixes have been the main source of criticism against RDFa.
One of these criticisms is the range of technical issues that arise
from the "xmlns:" syntax for defining namespace prefixes (in
"tag-soup" syntax). CRDF handles this case by taking away the
responsibility of prefix declarations from HTML: having a CSS-based
syntax, CRDF takes the obvious step and uses CSS's own syntax for
namespace declarations.

+ Entirely RDF based: while this might seem a purely theoretical
advantage, there is also a practical benefit: once extracted from the
webpage, CRDF data can be easily combined with any already existing
RDF data; and can be used with RDF-related tools.

- Copy-paste brittleness: IMO, the only serious drawback of CRDF;
but there are some points worth making:
  1) When used inline, CRDF can achieve the same resilience as RDFa,
which is quite close to Microdata's.
  2) I have noticed that some browsers can manage to copy-paste
CSS-styled content preserving (most of) the formatting. It shouldn't be
hard for implementors to extend such functionality to CRDF. Of course, the
support for this is not consistent among browsers, and also seems to
vary for different paste targets. If there is some real interest, I
might do some testing with multiple browsers and paste targets (for
now, I have noticed that both IE and FF preserve most CSS formatting
(but not layout) when pasting to Word, but content pasted to OOo Writer
gets rendered with the "default" formatting for the tags). It would be
interesting, on this aspect, to hear from browser vendors: would they
be willing to extend the CSS copy-paste capabilities to CRDF if it got
adopted?

- Prefix-based indirection: I'd bet that there are people on this list
ready to argue that namespace prefixes are a good thing; but it seems
that it raises some issues, so I'll include them and share my PoV on
the topic:
  1) For t

[whatwg] Microdata

2009-08-22 Thread Ian Hickson

Based on some of the feedback on Microdata recently, e.g.:

   http://www.jenitennison.com/blog/node/124

...and a number of e-mails sent to this list and the W3C lists, I am going 
to try some tweaks to the Microdata syntax. Google has kindly offered to 
provide usability testing resources so that we can try a variety of 
different syntaxes and see which one is easiest for authors to understand.
 
If anyone has any concrete syntax ideas that they would like me to 
consider, please let me know. There's a (pretty low) limit to how many 
syntaxes we can perform usability tests on, though, so I won't be able to 
test every idea.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


[whatwg] Microdata DOM API

2009-08-20 Thread Philip Jägenstedt

Hi,

There are already two demos of converting Microdata to other formats which  
I found quite useful [1]. I've taken a closer look at the Microdata DOM  
API and hacked up a somewhat working JavaScript implementation of it [2].  
A few issues came up in the process:


To avoid total confusion I'll use item-property and DOM-property to  
disambiguate.


The spec says that "properties can also themselves be groups of name-value  
pairs", but this isn't exposed in a very convenient way in the DOM API.  
The 'properties' DOM-property is a HTMLPropertyCollection of all  
associated elements. Discovering if the item-property value is a plain  
string or an item seems to require item.hasAttribute('item'), which seems  
out of place when everything else has been so neatly reflected. (Just  
checking item.item won't work if the item attribute is empty.) Also, the  
'contents' DOM-property is always the item-property value except in the  
case where the item-property is another item -- in that case it is  
something random like .href or .textContent depending on the element type.  
I think it would be better if the DOM-property were simply called 'value'  
(the spec does talk about name-value pairs after all) and corresponded  
more exactly to 'property value' [3]. Elements that have no 'property  
names' [4] should return null and otherwise elements with an 'item'  
attribute should return itself, although I don't think it should be  
writable in that case. One might also/otherwise consider adding a  
valueType DOM-property which could be 'string', 'item' or something  
similar.
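
To make the suggestion above concrete, here is a minimal sketch, in Python
rather than against a real DOM, of how the proposed 'value' and 'valueType'
DOM-properties might behave. The Element class and its fields (itemprop,
item, text) are hypothetical stand-ins for DOM elements and are not part of
the spec; only the three rules (null without property names, the element
itself when it has an item attribute, a plain string otherwise) come from
the paragraph above.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Element:
    """Hypothetical stand-in for a DOM element carrying microdata attributes."""
    itemprop: Optional[str] = None  # space-separated property names; None if absent
    item: Optional[str] = None      # item attribute; "" means present but empty
    text: str = ""                  # stand-in for .textContent, .href and so on

    def _has_property_names(self) -> bool:
        return bool((self.itemprop or "").split())

    @property
    def value(self) -> Union[None, str, "Element"]:
        # No property names: the element is not an item-property at all.
        if not self._has_property_names():
            return None
        # An item attribute, even an empty one, makes the value the item itself.
        if self.item is not None:
            return self
        # Otherwise the item-property value is a plain string.
        return self.text

    @property
    def value_type(self) -> Optional[str]:
        if not self._has_property_names():
            return None
        return "item" if self.item is not None else "string"
```

With this shape, a consumer could branch on value_type instead of calling
hasAttribute('item'), which also covers the empty item attribute case.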


One example [5] uses document.items[item].names but document.items isn't
defined anywhere. I assume this is an oversight and that it is equivalent
to document.getItems(). Further, names is a member of
HTMLPropertyCollection, so document.items[item].properties.names is
probably intended instead of document.items[item].names. Assuming this, the
example actually produces the output it claims to.


Shouldn't namedItem [6] be namedItems? Code like .namedItem().item(0)  
would be quite confusing. Also, RadioNodeList should be PropertyNodeList.


I think many will wonder why item and itemprop can't be given on a single  
element for compactness:


Apples  
and itemprop="org.example.name">Oranges don't compare well.


Allowing this would complicate the definition of 'corresponding item' [7],  
but I think that might be acceptable. I suggest either allowing it or  
adding a note explaining why it isn't allowed and giving examples of  
alternative techniques.


[1] http://philip.html5.org/demos/microdata/demo.html
http://james.html5.org/microdata/
[2]  
http://gitorious.org/microdatajs/microdatajs/blobs/0032bac85ceaf4fd2a6379b357a225f74c89d61f/microdata.js
[3]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#concept-property-value
[4]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#property-names
[5]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#using-the-microdata-dom-api
[6]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#dom-htmlpropertycollection-nameditem
[7]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#concept-item-corresponding


--
Philip Jägenstedt
Doing Microdata just for fun, not for Opera Software.


Re: [whatwg] Microdata Revisited

2009-08-07 Thread Jonas Sicking
On Mon, Aug 3, 2009 at 2:58 AM, Martin McEvoy wrote:
> Hello All
>
> I have been working on a new proposal for HTML 5 Microdata, I thought you
> might all like to take a look at what I have come up with so far.
>
> please visit http://weborganics.co.uk/test/microdata.html
>
> Any feed back would be nice ;)

I'm in general wary of the use of prefixes here. Maciej summarized
things very nicely in [1].

/ Jonas

[1] http://lists.w3.org/Archives/Public/public-html/2009Jul/0919.html


Re: [whatwg] Microdata and Linked Data

2009-08-03 Thread Martin McEvoy

Hello Ian

Ian Hickson wrote:
> I'm definitely against any in-page indirection mechanism, because we have
> seen with XML Namespaces (and with RDFa) that prefixes are simply a huge
> source of problems.

They are indeed. XML namespaces fixed one problem, calling different
things by the same name, but they created another problem of calling
the same thing by different names. Prefixes are not themselves bad,
misunderstood, or any kind of indirection mechanism; they are just
shorthand URLs, and they are actually quite intuitive if used correctly.
RDFa is currently trying to solve its problems with xmlns, which is just
a minor design flaw: xmlns is used for structure, not content, and they
realize that issue.


Best wishes

--
Martin McEvoy
http://weborganics.co.uk/



Re: [whatwg] Microdata and Linked Data

2009-08-03 Thread Ian Hickson

(I trimmed public-html from the CC list to avoid cross-posting, and 
because the whatwg list has had most of the traffic on this topic so far; 
please feel free to forward this to public-html if you would rather 
discuss that there instead.)

On Fri, 24 Jul 2009, Peter Mika wrote:
> 
> The use of a URI as the value of the id attribute. It seems to me there 
> is actually nothing in the spec that would stop this:
> 
> "Identifiers are opaque strings. Particular meanings should not be derived
> from the value of the id  attribute."
> 
> This is great because in principle I could do something like:
> 
> http://john.example.com#hedral" item="org.example.animal.cat
> com.example.feline">
> Hedral
> 
> 
> I assume you can achieve something similar with the "about" property but that
> would require me to write:
> 
> 
> Hedral
> http://john.example.com#hedral"/>
> 
> 
> This is longer by itself, and if I want an internal identifier as well, than I
> have to write:
> 
> 
> Hedral
> http://john.example.com#hedral"/>
> 

In practice, all the use cases that were brought up that needed to 
identify the item were cases where there was a URL already in the page, 
e.g. in a link or an  or a  element, such that it actually 
ends up better if we use itemprop=about rather than having a dedicated 
attribute (like id="" or about="") for identifying types.

Are there use cases where this is not the case? For example, when would 
you need to have an internal identifier?


> The other area that could be possibly improved is the connection of type 
> identifiers with ontologies on the web. I would actually like the notion 
> of reverse domain names if
> 
> -- there would be an explicit agreement that they are of the form
> xxx.yyy.zzz.classname
> -- there would be a registry for mappings from xxx.yyy.zzz to URIs.
> 
> For example, org.foaf-project.Person could be linked to
> http://xmlns.com/foaf/0.1/Person by having the mapping from org.foaf-project
> to http://xmlns.com/foaf/0.1/.
> 
> It wouldn't be perfect, the FOAF ontology as you see is not at 
> org.foaf-project but at com.xmlns. However, it would be a step in the 
> right direction.

What problem is this solving?


> I would consider adding the sameAs property as part of the standard 
> vocabulary. This is a term from the OWL vocabulary that is widely used 
> in the Linked Data world for connecting entities that are deemed to be 
> equivalent. Alternatively, we could add the entire RDFS and OWL 
> vocabulary to the spec.

Could you elaborate on this? What are the use cases that this is intended 
to address? What do you mean by "adding the sameAs property"?


> I don't expect that writing full URIs for property names will be 
> appealing to users, but of course I'm not a big fan either of defining 
> prefixes individually as done in RDFa with the CURIE mechanism. Still, 
> prefixes would be useful, e.g. foaf:Person is much shorter to write than 
> com.foaf-project.Person and also easier to remember. So would there be a 
> way to reintroduce the notion of prefixes, with possibly pointing to a 
> registry that defines the mapping from prefixes to namespaces?
> 
> http://www.w3c.org/registry/"
> item="animal:cat">
> Hedral
> 
> 
> Here the registry would define a number of prefixes. However, the 
> mechanism would be open in that other organizations or even individuals 
> could maintain registries.

I'm definitely against any in-page indirection mechanism, because we have 
seen with XML Namespaces (and with RDFa) that prefixes are simply a huge 
source of problems.

However, there actually already is a registry for registering strings that 
start with a keyword and a colon: the scheme registry. So if animals 
become important enough that they need their own scheme, I guess people 
could register them that way. Alternatively, a short domain followed by a 
keyword seems like a reasonable option: instead of "animal:cat", have 
"org.animal.cat": it's only four more characters. (Actually, with ICANN 
considering opening up TLDs, people could just register those: 
"animal.cat" is a valid reverse DNS label if "animal" is a TLD!)

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


[whatwg] Microdata Revisited

2009-08-03 Thread Martin McEvoy

Hello All

I have been working on a new proposal for HTML 5 Microdata, I thought 
you might all like to take a look at what I have come up with so far.


please visit http://weborganics.co.uk/test/microdata.html

Any feed back would be nice ;)

Best wishes

--
Martin McEvoy
http://weborganics.co.uk/



Re: [whatwg] Microdata and Linked Data

2009-07-24 Thread Peter Mika

Fair point. Just brainstorming here: how about making about an attribute?

http://">
Name: Amanda

We still have two identifiers, but at least giving the URI is simplified.

Best,
Peter

Julian Reschke wrote:
> Peter Mika wrote:
>> Hi All,
>>
>> I've been taking a closer look at microdata. While I like the
>> proposal in general, in particular the chance to unite microformat
>> style annotations with some of the Semantic Web formalism (such as
>> URIs for objects), there are still a number of points that I feel
>> could be improved. So here are my proposals for discussion:
>>
>> #1
>>
>> The use of a URI as the value of the id attribute. It seems to me
>> there is actually nothing in the spec that would stop this:
>>
>> ...
>
> IDs like that would be very hard to use as fragment identifier...
>
>> ...
>
> BR, Julian




Re: [whatwg] Microdata and Linked Data

2009-07-24 Thread Peter Mika
Yes, #2 and #4 are quite related in that they both concern the 
abbreviation mechanism for URIs and might be considered alternative 
proposals.



> On the other hand, on #4, you are opening the gate to independent
> entities (be they organizations or individuals) to define the prefixes
> they would be using for their pages' metadata: why not apply this to
> #2 as well? IMO, it would be more important for #2 than for #4, since
> #4 only provides syntax sugar while #2 enables something that would be
> impossible without it (mapping Microdata to arbitrary RDF).
  

Yes, the idea of distributing the registration could be applied to #2.

> About #1, I'm not sure about what you are exactly proposing, so I can't
> provide much feedback on it. Maybe you could make it a bit clearer:
> are you proposing any specific change to the spec? If so, what would
> be the change? If not, what are you proposing then?
  
Removing the about property, showing how id can be used in this way, and 
changing the description of how you transform an HTML5 document to RDF.



> Finally, about #3 I'm not familiar with the OWL vocabulary, so I can't
> say too much about it. But if your second proposal gets into the spec,
> then this would become just syntax sugar, since any property from any
> existing RDF vocabulary could be expressed; and if #4 also got in, the
> benefit of "built-in" properties would be minimal compared to using a
> reasonably short prefix (such as "owl:").
  
I agree... I'm personally not so attached to reverse domain names, but I 
might have missed a lot of the previous discussions on why they are good 
to have.


In any case, my intention was to get the discussion restarted around 
these issues: it seems to me there was a lot of discussion at the very 
beginning on microdata vs. RDFa when microdata was first proposed, but 
then the discussion died without necessarily finding the best solution 
(for my taste).


Cheers,
Peter






Re: [whatwg] Microdata and Linked Data

2009-07-24 Thread Julian Reschke

Peter Mika wrote:
> Hi All,
>
> I've been taking a closer look at microdata. While I like the proposal
> in general, in particular the chance to unite microformat style
> annotations with some of the Semantic Web formalism (such as URIs for
> objects), there are still a number of points that I feel could be
> improved. So here are my proposals for discussion:
>
> #1
>
> The use of a URI as the value of the id attribute. It seems to me there
> is actually nothing in the spec that would stop this:
>
> ...

IDs like that would be very hard to use as fragment identifiers...

> ...

BR, Julian


Re: [whatwg] Microdata and Linked Data

2009-07-24 Thread Eduard Pascual
On Fri, Jul 24, 2009 at 1:07 PM, Peter Mika wrote:
> [...]
> #2
>
> The other area that could be possibly improved is the connection of type
> identifiers with ontologies on the web. I would actually like the notion of
>  reverse domain names if
>
> -- there would be an explicit agreement that they are of the form
> xxx.yyy.zzz.classname
> -- there would be a registry for mappings from xxx.yyy.zzz to URIs.
>
> For example, org.foaf-project.Person could be linked to
> http://xmlns.com/foaf/0.1/Person by having the mapping from org.foaf-project
> to http://xmlns.com/foaf/0.1/.
>
> It wouldn't be perfect, the FOAF ontology as you see is not at
> org.foaf-project but at com.xmlns. However, it would be a step in the right
> direction.
>
> [...]
> #4
>
> I don't expect that writing full URIs for property names will be appealing
> to users, but of course I'm not a big fan either of defining prefixes
> individually as done in RDFa with the CURIE mechanism. Still, prefixes would
> be useful, e.g. foaf:Person is much shorter to write than
> com.foaf-project.Person and also easier to remember. So would there be a way
> to reintroduce the notion of prefixes, with possibly pointing to a registry
> that defines the mapping from prefixes to namespaces?
>
> http://www.w3c.org/registry/"
> item="animal:cat">
> Hedral
> 
>
> Here the registry would define a number of prefixes. However, the mechanism
> would be open in that other organizations or even individuals could maintain
> registries.
>

IMO, both of these proposals are quite related. However, you added
substantial differences I can't really understand between them.

For #2 you suggest having a sort of centralized registry of mappings
between the reversed domains and the vocabularies they refer to. What
happens if next year I have to use an unusual vocabulary for my site
that is not included in the registry? Would I have to get the
vocabulary included in the registry before my pages' microdata can be
mapped to the appropriate RDF graph?

On the other hand, on #4, you are opening the gate to independent
entities (be they organizations or individuals) to define the prefixes
they would be using for their pages' metadata: why not apply this to
#2 as well? IMO, it would be more important for #2 than for #4, since
#4 only provides syntax sugar while #2 enables something that would be
impossible without it (mapping Microdata to arbitrary RDF).

About #1, I'm not sure about what you are exactly proposing, so I can't
provide much feedback on it. Maybe you could make it a bit clearer:
are you proposing any specific change to the spec? If so, what would
be the change? If not, what are you proposing then?

Finally, about #3 I'm not familiar with the OWL vocabulary, so I can't
say too much about it. But if your second proposal gets into the spec,
then this would become just syntax sugar, since any property from any
existing RDF vocabulary could be expressed; and if #4 also got in, the
benefit of "built-in" properties would be minimal compared to using a
reasonably short prefix (such as "owl:").

Just my two cents.

Regards,
Eduard Pascual


[whatwg] Microdata and Linked Data

2009-07-24 Thread Peter Mika

Hi All,

I've been taking a closer look at microdata. While I like the proposal 
in general, in particular the chance to unite microformat style 
annotations with some of the Semantic Web formalism (such as URIs for 
objects), there are still a number of points that I feel could be 
improved. So here are my proposals for discussion:


#1

The use of a URI as the value of the id attribute. It seems to me there 
is actually nothing in the spec that would stop this:


"Identifiers are opaque strings. Particular meanings should not be 
derived from the value of the id  attribute."


This is great because in principle I could do something like:

http://john.example.com#hedral"
item="org.example.animal.cat com.example.feline">

Hedral


I assume you can achieve something similar with the "about" property but 
that would require me to write:



Hedral
http://john.example.com#hedral"/>


This is longer by itself, and if I want an internal identifier as well,
then I have to write:



Hedral
http://john.example.com#hedral"/>


#2

The other area that could possibly be improved is the connection of type
identifiers with ontologies on the web. I would actually like the notion
of reverse domain names if


-- there would be an explicit agreement that they are of the form 
xxx.yyy.zzz.classname

-- there would be a registry for mappings from xxx.yyy.zzz to URIs.

For example, org.foaf-project.Person could be linked to 
http://xmlns.com/foaf/0.1/Person by having the mapping from 
org.foaf-project to http://xmlns.com/foaf/0.1/.


It wouldn't be perfect, the FOAF ontology as you see is not at 
org.foaf-project but at com.xmlns. However, it would be a step in the 
right direction.
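
As a rough illustration of proposal #2, the lookup could be sketched as
follows. This is only a sketch under stated assumptions: the REGISTRY
dictionary and the type_to_uri helper are hypothetical names, no such
registry exists, and the only mapping shown is the FOAF one from the example
above. It assumes type names have the form xxx.yyy.zzz.classname, so the
class name is everything after the last dot.

```python
from typing import Mapping, Optional

# Hypothetical registry: reverse-domain prefix -> vocabulary base URI.
# The single entry is the FOAF mapping used as the example in proposal #2.
REGISTRY = {
    "org.foaf-project": "http://xmlns.com/foaf/0.1/",
}


def type_to_uri(type_name: str,
                registry: Mapping[str, str] = REGISTRY) -> Optional[str]:
    """Map xxx.yyy.zzz.classname to a URI via the registry, if possible."""
    # Split at the last dot: the left part is the reverse domain,
    # the right part is the class name.
    prefix, sep, classname = type_name.rpartition(".")
    if not sep or prefix not in registry:
        return None  # unregistered vocabulary: no mapping available
    return registry[prefix] + classname
```

Here type_to_uri("org.foaf-project.Person") yields
http://xmlns.com/foaf/0.1/Person, while a type whose reverse domain is not
registered yields no URI at all.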


#3

I would consider adding the sameAs property as part of the standard 
vocabulary. This is a term from the OWL vocabulary that is widely used 
in the Linked Data world for connecting entities that are deemed to be 
equivalent. Alternatively, we could add the entire RDFS and OWL 
vocabulary to the spec.


#4

I don't expect that writing full URIs for property names will be 
appealing to users, but of course I'm not a big fan either of defining 
prefixes individually as done in RDFa with the CURIE mechanism. Still, 
prefixes would be useful, e.g. foaf:Person is much shorter to write than 
com.foaf-project.Person and also easier to remember. So would there be a 
way to reintroduce the notion of prefixes, with possibly pointing to a 
registry that defines the mapping from prefixes to namespaces?


http://www.w3c.org/registry/"
item="animal:cat">

Hedral


Here the registry would define a number of prefixes. However, the 
mechanism would be open in that other organizations or even individuals 
could maintain registries.
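
A minimal sketch of what a consumer might do under proposal #4, assuming the
registries have already been fetched and parsed into prefix-to-namespace
tables. The names SHARED_REGISTRY, OWN_REGISTRY and expand are hypothetical,
and the "animal" namespace URI is made up for illustration; the "foaf"
binding reuses the FOAF namespace mentioned in #2.

```python
from typing import Mapping, Optional

# Prefix -> namespace URI tables. The "foaf" binding reuses the FOAF
# namespace from proposal #2; the "animal" binding is purely illustrative.
SHARED_REGISTRY = {"foaf": "http://xmlns.com/foaf/0.1/"}
OWN_REGISTRY = {"animal": "http://animals.example.org/terms/"}


def expand(name: str, *registries: Mapping[str, str]) -> Optional[str]:
    """Expand prefix:local using the first registry that knows the prefix."""
    prefix, sep, local = name.partition(":")
    if not sep:
        return None  # not a prefixed name
    for registry in registries:
        if prefix in registry:
            return registry[prefix] + local
    return None  # prefix unknown to every registry consulted
```

Passing several registries models the open mechanism described above: a
page could point at a shared registry and at one maintained by an
organization or individual, and the first one that defines the prefix wins.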


Looking forward to your feedback,

Peter





Re: [whatwg] microdata use cases and Getting data out of poorly written Web pages

2009-05-08 Thread Shelley Powers

Ian Hickson wrote:
> On Fri, 8 May 2009, Shelley Powers wrote:
>> It's difficult to tell where one should comment on the so-called
>> microdata use cases. I'm forced to send to multiple mailing lists.
>
> Please don't cross-post to the WHATWG list and other lists -- you may pick
> either one, I read all of them. (Cross-posting results in a lot of
> confusion because some of the lists only allow members to post, while
> others allow anyone to post, so we end up with fragmented threads.)



  
But different people respond to the mailings in different ways, 
depending on the list. This isn't just you, Ian. How can I ensure that 
the W3C people have access to the same concerns?

>> Ian, I would like to see the original request that went into this
>> particular use case. In particular, I'd like to know who originated it,
>> so that we can ensure that the person has read your follow-up, as well
>> as how you condensed the use case down (to check if your interpretation
>> is proper or not).
>
> I did not keep track of where the use cases came from (I generally ignore
> the source of requests so as to avoid any possible bias).


  
Documenting the originator of a use case is introducing bias? In what 
universe?


If anything, documenting where the use cases come from, and providing 
access to the original, raw data helps to ensure that bias has not been 
introduced. More importantly, it gives your teammates a chance to verify 
your interpretation of the use cases, and provide correction, if needed.


> However, I can probably figure out some of the sources of a particular
> scenario if you have a specific one in mind. Could you clarify which
> scenario or requirement you are particularly interested in?



  
Ian, I think it's important that you provide a place documenting the
original raw data. This provides a historical perspective on the
decisions going into HTML5 if nothing else.


If you need help, I'm willing to help you. You'll need to forward me the 
emails you received, and send me links to the other locations. I'll then 
put all these into a document and we can work to map to your condensed 
document. That way there's accountability at all steps in the decision 
process, as well as transparency.


Once I put the document together, we can put with other documents that 
also provide history of the decision processes.

>> In addition, from my reading of this posting of yours titled "[whatwg]
>> Getting data out of poorly written Web pages", is this open for any
>> discussion?
>
> Naturally, all input is always welcome.


  
No, I didn't ask if input was welcome. I asked if this was still open
for discussion, or if you have made up your mind, and further
discussion will just be wasting everyone's time.

>> It seems to me that you received the original data, generated a use case
>> document from the data, unilaterally, and now you're making unilateral
>> decisions as to whether the use case requires a change in HTML5 or not.
>>
>> Is this what we can expect from all of the use cases?
>
> Yes.
  

That's not appropriate for a team environment.

> If my proposals don't actually address the use cases, then please do point
> out how that is the case. Similarly, if there are missing use cases, please
> bring them up. All input is always welcome (whether on the lists, or
> direct e-mail, on blogs, or wherever). None of the text in the HTML5 spec
> is frozen, it's merely a proposal. If there are use cases that should be
> addressed that are not addressed then we should address them.

Again, how can I? I don't have the original data.

> (Regarding microdata, note that I've so far only sent proposals for three
> of the 20 use cases that I collected. I've still got a lot to go through.)


  

After digging, I found another one, at

http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-May/019620.html

Again, though, the writing style indicates the item is closed, and 
discussion is not welcome. I have to assume that this is how you 
mentally perceive the item, and therefore though we may respond, the 
response will make no difference.


And I can't find the third one. Perhaps you can provide a direct link.

I'm concerned, too, about the fact that the discussion for these is 
happening on the WhatWG group, but not in the HTML WG email list. I've 
never understood two different email lists, and have felt having both is 
confusing, and potentially misleading. Regardless, shouldn't this 
discussion be taking place in the HTML WG, too?


Isn't the specification the W3C HTML5 specification, also?

I'm just concerned because, from what I can see of both groups, interests
and concerns differ between the groups. That means addressing issues in
only one group would leave out potentially important discussions
in the other group.


Shelley




Re: [whatwg] microdata use cases and Getting data out of poorly written Web pages

2009-05-08 Thread Ian Hickson
On Fri, 8 May 2009, Shelley Powers wrote:
>
> It's difficult to tell where one should comment on the so-called 
> microdata use cases. I'm forced to send to multiple mailing lists.

Please don't cross-post to the WHATWG list and other lists -- you may pick
either one, I read all of them. (Cross-posting results in a lot of
confusion because some of the lists only allow members to post, while
others allow anyone to post, so we end up with fragmented threads.)


> Ian, I would like to see the original request that went into this 
> particular use case. In particular, I'd like to know who originated it, 
> so that we can ensure that the person has read your follow-up, as well 
> as how you condensed the use case down (to check if your interpretation 
> is proper or not).

I did not keep track of where the use cases came from (I generally ignore 
the source of requests so as to avoid any possible bias).

However, I can probably figure out some of the sources of a particular 
scenario if you have a specific one in mind. Could you clarify which 
scenario or requirement you are particularly interested in?


> In addition, from my reading of this posting of yours titled "[whatwg] 
> Getting data out of poorly written Web pages", is this open for any 
> discussion?

Naturally, all input is always welcome.


> It seems to me that you received the original data, generated a use case 
> document from the data, unilaterally, and now you're making unilateral 
> decisions as to whether the use case requires a change in HTML5 or not.
> 
> Is this what we can expect from all of the use cases?

Yes.

If my proposals don't actually address the use cases, then please do point
out how that is the case. Similarly, if there are missing use cases, please
bring them up. All input is always welcome (whether on the lists, or
direct e-mail, on blogs, or wherever). None of the text in the HTML5 spec
is frozen, it's merely a proposal. If there are use cases that should be
addressed that are not addressed then we should address them.

(Regarding microdata, note that I've so far only sent proposals for three
of the 20 use cases that I collected. I've still got a lot to go through.)

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


[whatwg] microdata use cases and Getting data out of poorly written Web pages

2009-05-08 Thread Shelley Powers
It's difficult to tell where one should comment on the so-called 
microdata use cases. I'm forced to send to multiple mailing lists.


Ian, I would like to see the original request that went into this 
particular use case. In particular, I'd like to know who originated it, 
so that we can ensure that the person has read your follow-up, as well 
as how you condensed the use case down (to check if your interpretation 
is proper or not).


In addition, from my reading of this posting of yours titled "[whatwg] 
Getting data out of poorly written Web pages", is this open for any 
discussion? It seems to me that you received the original data, 
generated a use case document from the data, unilaterally, and now 
you're making unilateral decisions as to whether the use case requires a 
change in HTML5 or not.


Is this what we can expect from all of the use cases?

Shelley