On 1/13/11 12:04 PM, Nathan wrote:
Hi Kingsley,
Kingsley Idehen wrote:
When our engine describes entities it can publish these descriptions
using a variety of structured data formats that include RDF. The same
thing applies on the data consumption side. Basically, RDF formats
are options re. Linked Data (the concept).
A generic problem here, when using non-RDF types with Linked Data over
HTTP, is that there's currently no way to indicate that a resource
is, or has, a set of machine-readable "linked data" variants. In many
cases it is useful to publish and consume linked data in CSV and
related formats (as you rightly note), but without prior out-of-band
knowledge that the representation contains, or is, linked data, the
machines are pretty much screwed. Typically the RDF variants don't have
this problem because the media type sets the expectation: you can conneg
on an RDF type and know you're getting back "linked data". You can't do
this with CSV and related formats with any expectation that you'll get
back "linked data". Thus, if there were some way to mark the set of
representations given upon dereferencing a URI as linked data
(containing RDF, RDF-able 3-tuples, or a view thereof), it'd be a lot
friendlier to the web of data in general.
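
To make the media-type point concrete, here is a minimal Python sketch of
what a client can and cannot conclude from a Content-Type header alone. The
set of RDF media types and the function name signals_linked_data are
illustrative assumptions, not something proposed in this thread.

```python
# Illustrative (not exhaustive) set of media types that, by themselves,
# tell a client it is getting RDF back. This list is an assumption.
RDF_MEDIA_TYPES = {
    "application/rdf+xml",
    "text/turtle",
    "application/n-triples",
}


def signals_linked_data(content_type):
    """Return True if the Content-Type alone sets the expectation of RDF."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in RDF_MEDIA_TYPES
```

A response labelled text/csv falls through this check even if its rows are
perfectly good 3-tuples, which is the gap described above.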
So what happens to RDFa in (X)HTML? Even worse, no DOCTYPE declarations?
What about various JSON dialects for Linked Data graphs?
How about N-Triples? Ditto TriX and others?
In my world view I see realities such as:
1. Spreadsheet and other desktop productivity users opening up a URL
(directly or indirectly via WebDAV mounted to filesystem) -- this is a
massive realm for Linked Data exploitation
2. Starting FYN (follow-your-nose) patterns in ODE, Sponger, etc. that
might start from an RDF resource but eventually encounter resources that
aren't RDF based.
Thus, I believe we have to consider:
1. Client side heuristics on the parts of Linked Data apps that deal
with data format heterogeneity atop underlying S-P-O / E-A-V homogeneity
re. propositions embedded in data.
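
One possible shape for such a client-side heuristic, sketched in Python
under stated assumptions: the CSV has no header row, each row is a
subject-predicate-object 3-tuple, and subject and predicate are HTTP(S)
URIs. The function names and the sample_rows parameter are hypothetical,
and a real client would likely need looser rules.

```python
import csv
import io
from urllib.parse import urlparse


def looks_like_uri(value):
    """Return True for absolute http(s) URIs such as http://example.org/a."""
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def csv_looks_like_triples(text, sample_rows=20):
    """Heuristic: sampled rows are 3 cells wide with URI subject and predicate."""
    # Skip blank lines and inspect only the first few rows.
    rows = [row for row in csv.reader(io.StringIO(text)) if row][:sample_rows]
    if not rows:
        return False
    return all(
        len(row) == 3 and looks_like_uri(row[0]) and looks_like_uri(row[1])
        for row in rows
    )
```

The object column is deliberately unchecked, since it may hold either a URI
or a literal value.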
A typical approach would be to register new mediatypes, +variant
kinds, for instance text/rdf+csv or such like, but these types
wouldn't be well known throughout the internet, served correctly by
default in the likes of Apache, or handed off to the correct consuming
programs by user agents. I'll leave it there, without a proposal, but
some indication to the machine would/will be needed to make this
approach friendlier for the web.
Remember, a Linked Data server can say (via HTTP): all I have is a
CSV-based (or other non-RDF format) representation of the RDF-based
data (per media type) you requested :-)
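
A rough Python sketch of that server-side behaviour, assuming the server
simply falls back to whatever representation it has instead of refusing
the request. The names parse_accept and choose_representation are
hypothetical, and the Accept parsing is simplified (no type/* wildcards).

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def choose_representation(accept, available):
    """Pick the client's preferred available type, else fall back."""
    for media_type, q in parse_accept(accept):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return available[0]
    # Nothing matched: serve what we have rather than refusing.
    return available[0]
```

With only CSV on offer, a request for Turtle still gets text/csv back; the
response's Content-Type then tells the client what it actually received.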
If you look closer, we are revisiting the issue of where a
"resource" stops: is it at the container or content level? In my world
view, the content matters. Yes, mediatypes help, but ultimately we have
to be much more open about the concept of Linked Data. Of course, a
client (e.g. Tabulator) can say: I don't understand what you sent me,
etc., which is fine, but it shouldn't be the basis for defining what
Linked Data (the concept) is all about.
Again, I have no problems with RDF based Linked Data as a variation of
the Linked Data concept. I just want clarity more than anything else.
Being provincial about Linked Data (via RDF format specificity) isn't
going to increase comprehension and adoption momentum.
And as an aside: I do worry a little that there may be some
overloading of terms going on here, Linked Data (the concept) and
Linked Data (the protocol). I'm unsure exactly how to define Linked
Data (the concept), but I assume you're referring to a broad range of
EAV-variant 3-tuple based data with URIs.
The concept of Linked Data is old. Linked Data at InterWeb scale
courtesy of HTTP ubiquity is an immensely valuable (and mega cool!)
contemporary spin on an old concept. What else can I say? I guess
Google's your friend re. historic research on the subject: Linked Data :-)
TimBL (as far as I know) has never claimed to have invented the concept
of Linked Data. He dropped a note (subject: Linked Data) explaining how
you can leverage AWWW as an effective mechanism for producing Linked
Data at InterWeb scale.
Kingsley
Best,
Nathan
--
Regards,
Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen