On 8 Jul 2008, at 06:45, Guillaume Lebleu wrote:
Jim O'Donnell wrote:
The recent discussion here about dates has made me wonder if such
a web service would be useful for microformats parsers. What do
others think?
It seems to me that this type of date extraction might present
risks if used by uf parsers to extract date/time from published
content (and lead to the "people showing up on the wrong date"
error mentioned in earlier posts).
I don't think it's so risky. The inspiration for this particular work
was Dan's experience on the 20th century London site:
http://www.20thcenturylondon.org.uk/ which involved parsing and
normalising text dates across four different collections. Granted
it's tedious to analyse all the different patterns that have been
used, but it isn't impossible to extract accurate ISO dates. The fact
that the archive was created from those four collections is a
testament to that.
Museum catalogue records always have some sort of absolute date,
though, which makes things easier for me. If people are marking up
phrases like 'this Saturday' or '25th June' then I can see that
extracting a date would be tricky - the parser would need the context
within which to place the date, in order to get the year or month.
That said, I don't know how often people use hcalendar to mark up phrases
like 'next weekend' vs, say, 'Saturday 19th July 2008'. If we had
some idea of how microformats are being used to mark up dates in
real, online text, then we could make some meaningful statements
about how risky, or even impossible, it might be to extract ISO dates
automatically.
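
To make the distinction above concrete, here is a rough Python sketch.
It is not the code used for the 20th century London site: the function
name `to_iso`, the three date patterns and the idea of passing in a
context date are all assumptions made up for illustration, and it
relies on English month and day names (the default C locale). Absolute
dates resolve on their own, a partial date like '25th June' only
resolves when the caller supplies a year, and anything else is left
unresolved rather than guessed.

```python
import re
from datetime import date, datetime
from typing import Optional

# Strips ordinal suffixes: '19th' -> '19', '1st' -> '1'.
ORDINAL = re.compile(r"(\d{1,2})(st|nd|rd|th)\b", re.IGNORECASE)

# Hypothetical pattern list; a real collection would need many more.
ABSOLUTE_FORMATS = ("%A %d %B %Y", "%d %B %Y", "%d/%m/%Y")


def to_iso(text: str, context: Optional[date] = None) -> Optional[str]:
    """Return an ISO 8601 date for `text`, or None if it cannot be resolved."""
    cleaned = ORDINAL.sub(r"\1", text.strip())

    # Absolute dates: the year is in the text, so no context is needed.
    for fmt in ABSOLUTE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            pass

    # Partial dates such as '25 June': borrow the year from the context,
    # for example the publication date of the page.
    if context is not None:
        try:
            parsed = datetime.strptime(f"{cleaned} {context.year}", "%d %B %Y")
            return parsed.date().isoformat()
        except ValueError:
            pass

    # Phrases like 'next weekend' are deliberately not guessed at.
    return None
```

For example, `to_iso("Saturday 19th July 2008")` needs no context,
while `to_iso("25th June")` gives `None` unless a context date such as
`date(2008, 7, 8)` is passed in. Returning `None` instead of a best
guess is one way to avoid the "wrong date" problem mentioned earlier.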
On the other hand, it might be great at the time content is
authored, to convert ambiguous natural language dates into
unambiguous microformats, as a way to reduce the pain of
micro-formatting content (especially if it can detect dates in plain
text rather than parsing something it knows is a date). Authors could
confirm the generated microformats before publishing in a way
similar to how the Yahoo! Shortcuts WordPress plugin works [1].
Decent authoring tools would be brilliant. Not just for dates but
locations and possibly other types of microformatted text. For
instance, I can link a UK street address to Google maps and get back
a precise point on a map of the UK. So do I really need to manually
write a lat/long into the HTML to tell a microformats tool how to
place the address on a map? The text contains all the necessary
information to perform this operation already.
I think microformats should be relatively easy for a non-technical
author to add to their text. Decent tools that generate the machine-
readable data would be an enormous aid here.
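
As a small illustration of what such a tool might emit for dates, here
is a Python sketch that wraps the author's own wording in the hCalendar
abbr pattern, with the ISO date in the title attribute. The function
name `dtstart_markup` is made up for this example, and the sketch
assumes the date has already been resolved and confirmed by the author.

```python
from datetime import date
from html import escape


def dtstart_markup(visible_text: str, when: date) -> str:
    """Wrap the author's text in an hCalendar dtstart abbr element.

    The human-readable wording stays visible; the machine-readable
    ISO 8601 date goes in the title attribute.
    """
    return '<abbr class="dtstart" title="{}">{}</abbr>'.format(
        when.isoformat(), escape(visible_text)
    )


# The author types the phrase, the tool proposes the markup to confirm.
print(dtstart_markup("Saturday 19th July 2008", date(2008, 7, 19)))
```

The author never has to type the ISO date by hand; they only check
that the proposed date matches what they meant before publishing.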
Jim
Jim O'Donnell
http://eatyourgreens.org.uk
http://flickr.com/photos/eatyourgreens
___
microformats-discuss mailing list
microformats-discuss@microformats.org
http://microformats.org/mailman/listinfo/microformats-discuss