On the other hand, if, as I type this, I get an IntelliSense-like list of my contacts that I can select from, then I can just select Joe from the list and have the microformat markup added for me.

I've been thinking a lot about how a Web browser could help end users author microformatted content in blogs and wikis, and I think we need to consider the user's goals and motivations. I can't imagine people associating a contact in their address book with Joe as they casually mention him in a blog post just because they appreciate the beauty of structured data. However, if their goal is closely aligned with the goal of their readers, then I can see users going to the extra effort. For instance, say you want to review something. Because you want your vote to count, and you want other people to be able to take advantage of your review once it gets aggregated, I can see you filling out a form like the hReview creator (http://microformats.org/code/hreview/creator) to get the information into the structure of an hReview. The same goes for people who want to promote an event: since their motivation is for people to attend, they make it easy for readers to add the event to their calendars. We already see this type of behavior in applications like Outlook or Zimbra, where people create events for other people so that they are easy to accept. Microformats allow us to take that interaction out of closed systems and apply it to HTML emails, blog posts, wikis, etc.

I'm all for building systems that attempt to infer structure from natural language because, as we saw in Apple's 1998 article and now see in Mail.app, these types of systems can be really useful when they work. But I also don't think we should discount situations where the user has a clear motivation for creating structured data by filling out a form.

In case anyone is interested in reading more about Data Detectors, you might find this paper interesting. It catalogs all of the research done throughout the late 90s, and discusses a prototype system that leverages large knowledge bases like Stanford's TAP and MIT's ConceptNet to disambiguate natural language and provide structure to unstructured text:

http://alumni.media.mit.edu/~faaborg/files/thesis/draft/complete/CHI06_goalOrientedWebBrowser.pdf

-Alex



On Feb 8, 2008, at 8:40 AM, Guillaume Lebleu wrote:

Toby A Inkster wrote:
Guillaume Lebleu wrote:


What I have been thinking more and more, and what this tells me again, is
that the same way we talk of POSH and microformats, we could talk of
plain text or plain old english formats, essentially standardizing how people write dates, addresses, etc. on the Web or in their emails. Asking
people to write "Tuesday, February 5, 2008" in this order, with the
commas, etc. is very likely even simpler for normal people than writing <abbr class="foo" title="2008-02-05">Tuesday, February 5, 2008</abbr>.


One problem with that is that it will find matches on people who aren't even intending to use your plain-old-english format. They may happen to be including "Tuesday, February 5, 2008" on their pages with a different intended meaning. 2008 could refer to eight minutes past eight PM in military time -- unlikely, but possible. And as you move away from dates, phone numbers and postcodes which have relatively parseable formats, towards locations, people's names and job titles and so on, the likelihood of false matches increases.
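To make the false-match concern concrete, here is a minimal Python sketch of a detector for the single date layout discussed above. The pattern, the function name mark_up_dates, and the sample sentences are illustrative assumptions rather than part of any microformats tool; the class name "foo" is taken from the quoted example. It also assumes the default English locale for month and weekday names.

```python
import re
from datetime import datetime

# Hypothetical pattern for one "plain old English" date layout:
# weekday, month day, year, e.g. "Tuesday, February 5, 2008".
POE_DATE = re.compile(
    r"\b(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day, "
    r"(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December) \d{1,2}, \d{4}\b"
)


def mark_up_dates(text, css_class="foo"):
    """Wrap every matching date in an <abbr> whose title is the ISO 8601 date."""

    def wrap(match):
        written = match.group(0)
        try:
            parsed = datetime.strptime(written, "%A, %B %d, %Y")
        except ValueError:
            # Not a real calendar date (e.g. "February 30"): leave it alone.
            return written
        return '<abbr class="{}" title="{}">{}</abbr>'.format(
            css_class, parsed.date().isoformat(), written
        )

    return POE_DATE.sub(wrap, text)


# The detector cannot tell an intended date from text that merely looks like one.
intended = mark_up_dates("The meetup is on Tuesday, February 5, 2008.")
unintended = mark_up_dates('Her short story "Tuesday, February 5, 2008" won a prize.')
```

Both strings come back with the date wrapped in markup, including the second one, where the author is naming a story rather than publishing a date. A pattern-based convention has no equivalent of simply leaving the explicit tags out.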

The use of explicit tags to mark up information does make microformats slightly harder to use, yes. But the key is that they also make microformats much easier to explicitly not use.


Toby,
I understand the challenge of disambiguation and the value microformats bring in terms of easier parser implementation and a more reliable information consumption experience. The challenge for average people writing microformats shouldn't be underestimated though. I strongly believe that the time when disambiguation costs are lowest is publishing time, but this is also the time when you are focused on the english content, not the microformats. This is why, in the second part of the post you cited, I suggested the use of Apple Data Detectors-like functionality, not to detect objects in plain old english (POE) in published content, but to detect objects in POE at the time they are written and ask the user for disambiguation at the same time, so that the underlying microformat markup is generated without the user having to know the syntax. I'm thinking of this particularly in the context of writing a blog post: writing an hCard just to say "My friend Joe" is way too much for normal people. On the other hand, if, as I type this, I get an IntelliSense-like list of my contacts that I can select from, then I can just select Joe from the list and have the microformat markup added for me (just like WordPress adds a lot of markup that isn't in the visual editor, or like a wiki converts simplified markup into HTML markup).
Guillaume
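
As a rough illustration of the authoring flow described above, the following Python sketch suggests contacts for a typed prefix and emits a minimal hCard for the one selected. The contact list, the example.com URLs, and the function names suggest and to_hcard are hypothetical; a real editor would read the user's address book and handle many more hCard properties.

```python
from html import escape

# Hypothetical address book; a real editor would query the user's contacts.
CONTACTS = [
    {"name": "Joe Smith", "url": "http://example.com/joe"},
    {"name": "Joanna Doe", "url": "http://example.com/joanna"},
    {"name": "Alice Brown", "url": "http://example.com/alice"},
]


def suggest(prefix, contacts=CONTACTS):
    """Return contacts where any part of the name starts with the typed prefix."""
    needle = prefix.lower()
    return [
        contact
        for contact in contacts
        if any(part.lower().startswith(needle) for part in contact["name"].split())
    ]


def to_hcard(contact):
    """Render a minimal hCard: a vcard root with fn and url on one link."""
    return '<span class="vcard"><a class="fn url" href="{}">{}</a></span>'.format(
        escape(contact["url"]), escape(contact["name"])
    )


# Typing "joe" narrows the list to one contact; selecting it yields the markup.
matches = suggest("joe")
markup = to_hcard(matches[0])
```

The author only ever sees the suggestion list and the rendered name; the class names are generated for them, and declining the suggestion leaves the text as plain prose.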
_______________________________________________
microformats-discuss mailing list
microformats-discuss@microformats.org
http://microformats.org/mailman/listinfo/microformats-discuss
