bill.robe...@planet.nl wrote:
I've been trying to weigh up the pros and cons of these two approaches
to understand more clearly when you might want to use each. I hope
that the list members will be able to provide me with the benefit of
their experience and insight!
So the situation is that I have some information on a topic and I want
to make it available both in machine readable form and in human
readable form, for example a company wanting to publish information on
its products, or a government department wanting to publish some
statistics.
I can either:
1) include 'human' and 'machine' representations in the same web page
using RDFa
2) have an HTML representation and a separate RDF/XML representation
(or N3 or whatever) and decide which to provide via HTTP content
negotiation.
So which should I use? I suppose it depends on how the information
will be produced, maintained and consumed. Some generic
requirements/wishes:
Yes it does.
If you take the RDFa route you are making assumptions about the
existence of RDFa processors (not many at the current time, but this
will change in due course).
- I only want to have one place where the data is managed.
Is this a Triple or Quad store?
- I want people to be able to browse around a nicely formatted
representation of the information, i.e. a regular web page, probably
incorporating all sorts of other stuff as well as the data itself.
You want user agents to negotiate representations of your data, which
brings you back to content negotiation.
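To make the negotiation step concrete, here is a minimal sketch in Python
of how a server might pick a representation from the Accept header. The
function names and the list of available media types are assumptions made
for illustration, not part of any particular server or store. It is also
simplified: it ranks by q-value only and ignores the specificity rules of
full HTTP negotiation.

```python
# Media types this hypothetical server can produce; the first is the default.
AVAILABLE = ["text/html", "application/rdf+xml", "text/n3"]


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        q = 1.0  # q defaults to 1 when the client gives no weight
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so equal weights keep the client's own order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(accept_header, available=AVAILABLE):
    """Pick the media type to serve; fall back to the default (HTML)."""
    for media_type, q in parse_accept(accept_header or "*/*"):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return available[0]
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # e.g. "text/"
            for candidate in available:
                if candidate.startswith(prefix):
                    return candidate
    return available[0]


# A browser-style request gets HTML; an RDF-aware agent gets RDF/XML.
print(negotiate("text/html,application/xhtml+xml,*/*;q=0.8"))
print(negotiate("application/rdf+xml;q=0.9, text/html;q=0.5"))
```

A user agent that asks only for HTML ends up with the HTML page, which is
where any embedded RDFa would be picked up.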
- I don't want to type lots of XHTML or XML.
- I want the data to be found and used by search engines and aggregators.
Simply put RDFa in your HTML representations; via content negotiation
it will be exposed to crawlers (which are user agents too).
The approach presented by Halb, Raimond and Hausenblas
(http://events.linkeddata.org/ldow2008/papers/06-halb-raimond-building-linked-data.pdf)
seems attractive: to summarise crudely, auto-generate some RDFa from
your database, but provide an RDF/XML dump too.
You can auto-generate RDFa as part of your automated HTML generation
pipeline, but you still have the subtle issue of implicit association of
a given entity and its metadata (your choice of URI scheme will
determine if content negotiation is required, assuming all of your data
exists in an RDFa annotated HTML doc).
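As a rough illustration of generating RDFa inside an HTML pipeline, the
Python sketch below renders one record as an XHTML fragment. The function
name, the example.org URIs and the ex: vocabulary are all hypothetical;
a real pipeline would use its own templating engine and a published
vocabulary.

```python
from html import escape


def product_to_rdfa(uri, name, price):
    """Render one record as an XHTML fragment carrying RDFa attributes.

    'about' names the entity being described and each 'property' marks a
    literal value about it. The ex: prefix is a made-up vocabulary.
    """
    return (
        f'<div xmlns:ex="http://example.org/vocab#" '
        f'about="{escape(uri, quote=True)}">\n'
        f'  <h2 property="ex:name">{escape(name)}</h2>\n'
        f'  <p>Price: <span property="ex:price">{escape(price)}</span></p>\n'
        f'</div>'
    )


# Hypothetical record, as it might come out of a database row.
print(product_to_rdfa("http://example.org/products/42", "Widget & Co", "9.99"))
```

The value of the about attribute is the entity's URI, so the choice of URI
scheme shows up directly in the generated markup.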
On the other hand I find that RDFa leads to rather messy markup - I
prefer the 'cleanliness' of the separate representations.
Well, back to Triple / Quad Stores and URL-rewrite rules.
For any non-trivial amount of data, we will need a templating
engine of some sort for either approach. I suppose what may tip the
balance is that Yahoo and Google are starting to make use of RDFa, but
AFAIK they are not (yet) doing anything with "classic"
content-negotiated linked data.
They are, they just don't know it. They always request HTML, so they get
an HTML representation of the metadata :-)
Anyone care to argue for one approach or the other? I suppose the
answer may well be "it depends" :-) But if so, what does it depend on?
Yes, it really just depends.
RDFa and Content Negotiation work better together than apart (note: to
Giovanni).
Kingsley
Thanks in advance
Bill Roberts
--
Regards,
Kingsley Idehen Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software Web: http://www.openlinksw.com