Re: Comments on Data 3.0 manifesto

Kingsley Idehen Mon, 19 Apr 2010 10:26:06 -0700

Leigh Dodds wrote:

Hi Kingsley,


Thanks for the response. I wanted to clarify a couple of points.
Response edited and comments inline:

On 19 April 2010 16:28, Kingsley Idehen <kide...@openlinksw.com> wrote:

I can replicate every killer ODBC demo I gave circa. 1993 with Linked
Data just by opening up a Descriptor URL. But, I can't really pull that
off today smoothly because my effort will ultimately get hijacked by RDF
issues like:

1. What is this thing you opened via a URL from Access or Excel?
2. Why do those LINKs do the wonderful things we see (e.g., polymorphic
resultsets i.e., the pattern you see in snorql or isparql query results
tables)
3. etc..

The distraction in the scenario above will either come from a confused
user or a Semantic Web aficionado.  Ironically, both are equally
confused, all of the time, but one party doesn't know it :-(


I'm not sure I really follow that.

Might the confusion you're seeing just be genuine questions about
trying to understand what your demos are doing? (You don't always
provide much context)


I think Danbri articulates the problem very well re. "URLs".

Follow the URL thread re. LINKER or leave the "Locator" alone.

Deemphasizing URL and emphasizing URI is a classic example. Neither is actually devoid of confusion.


Nobody hears or conveys the following with clarity:

1. A Structured (EAV model based) Descriptor Document has an Address/Location (URL) on an HTTP Network

2. It has a Subject

3. The Subject is Named via a Generic HTTP based Identifier (a Hybrid where Generic HTTP Identifiers are used for Names)

4. The Subject's Attributes are also Named the same way

5. The Attribute values may also be References to Subjects (via their Names)

6. Names resolve to Structured Descriptions carried (or borne) by Structured Descriptor Documents (accessed via their URLs).


Closed loop.
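
A minimal sketch of points 1-6 in Python, using a hypothetical in-memory store. The example.org URLs, the attribute names, and the helper functions are illustrative assumptions, not anything served by DBpedia or any real system:

```python
from typing import NamedTuple


class Statement(NamedTuple):
    entity: str     # Name of the Subject (HTTP identifier)
    attribute: str  # Name of the Attribute (HTTP identifier)
    value: str      # a literal, or the Name of another Subject


# Hypothetical Descriptor Documents, keyed by their Address (URL).
DOCUMENTS = {
    "http://example.org/data/alice": [
        Statement("http://example.org/id/alice",
                  "http://example.org/schema/name", "Alice"),
        Statement("http://example.org/id/alice",
                  "http://example.org/schema/knows",
                  "http://example.org/id/bob"),
    ],
    "http://example.org/data/bob": [
        Statement("http://example.org/id/bob",
                  "http://example.org/schema/name", "Bob"),
    ],
}

# Hypothetical Name -> Address mapping, standing in for what
# dereferencing a Name over HTTP would yield.
NAME_TO_DOCUMENT = {
    "http://example.org/id/alice": "http://example.org/data/alice",
    "http://example.org/id/bob": "http://example.org/data/bob",
}


def describe(name: str) -> list[Statement]:
    """Resolve a Name to the description carried by its Descriptor Document."""
    url = NAME_TO_DOCUMENT[name]
    return [s for s in DOCUMENTS[url] if s.entity == name]


def follow(name: str) -> list[str]:
    """Return the Names of other Subjects referenced by this Subject's values."""
    return [s.value for s in describe(name) if s.value in NAME_TO_DOCUMENT]


# Starting from one Name, every referenced Name can be described in turn.
for referenced in follow("http://example.org/id/alice"):
    print(describe(referenced))
```

Each Name leads to a document, each document carries statements, and each statement may hand back further Names, which is all the loop amounts to.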

Here is the norm (when talking to the assumed newbie audience) or what the aficionado expects to hear:


1. a Resource Description has a URI
2. Give Resources Names using HTTP
3. Use  RDF S-P-O Triples to Describe Resources
4. Link to other Resources using HTTP URIs.

Basically, Subject is a Resource, and a Resource has a URI. Wow!

Where do I get the Description from? Ah!  A URI.

Hmm I go get <http://xyz.rdf>, what's that? A Resource.

Hmm. What's inside the resource <http://xyz.rdf>? Lots of triples that have: Resource URIs in the Subject slot, Property Resource URIs in the Property slot, and Values or Resources in the Object slot.

That's the deconstruction of the *predictable essence* of a typical Linked Data conversation.

Then to compound the matter, the ".rdf" files carry no relations back to the Subject being described etc. Thus, you can't even explore the Description Graph by starting at the URI of the .rdf resource, because in a way the URI abstraction as used in this context deems the Descriptor Document (or Resource) nonexistent (it isn't important).

Please do not characterize my concerns as being about people not grokking my demos; I haven't made a single reference to my demos. I've made references to applications that already exist in other realms that work, and in many cases are plenty EAV savvy.

Note this example and specific tweaks explicitly made to DBpedia as part of our quest for coherence:


curl -I -H "Accept: text/n3" http://dbpedia.org/data/DBpedia.n3
HTTP/1.1 200 OK
Server: Virtuoso/06.01.3127 (Solaris) x86_64-sun-solaris2.10-64  VDB
Connection: Keep-Alive
Date: Mon, 19 Apr 2010 16:09:15 GMT
Accept-Ranges: bytes

Link: <http://dbpedia.org/data/DBpedia.xml>; rel="alternate"; title="Metadata in RDF/XML format",
 <http://dbpedia.org/data/DBpedia.json>; rel="alternate"; title="Metadata in JSON+RDF format",
 <http://dbpedia.org/page/DBpedia>; rel="alternate"; title="XHTML+RDFa",
 <http://dbpedia.org/resource/DBpedia>; rev="primarytopic",
 <http://dbpedia.org/resource/DBpedia>; rel="describedby",
 <http://mementoarchive.lanl.gov/dbpedia/timegate/http://dbpedia.org/data/DBpedia.n3>; rel="timegate"

X-SPARQL-default-graph: http://dbpedia.org
Content-Type: text/n3; charset=UTF-8
Content-Length: 5919
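
To make the Link header above easier to read as data, here is a rough Python sketch that splits such a header into its targets and parameters. It is a simplified parser that assumes every parameter value is double-quoted (as in the response above); it is not a complete implementation of the Web Linking header syntax, and the function name is hypothetical:

```python
import re

# One link-value: <target> followed by zero or more ;name="value" parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*\w+="[^"]*")*)')
PARAM_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_link_header(value: str) -> list[dict]:
    """Return one dict per link, holding its target and quoted parameters."""
    links = []
    for target, params in LINK_RE.findall(value):
        link = {"target": target}
        link.update(PARAM_RE.findall(params))
        links.append(link)
    return links


# An excerpt of the header value shown above.
header = (
    '<http://dbpedia.org/data/DBpedia.xml>; rel="alternate";'
    'title="Metadata in RDF/XML format", '
    '<http://dbpedia.org/resource/DBpedia>; rev="primarytopic", '
    '<http://dbpedia.org/resource/DBpedia>; rel="describedby"'
)

links = parse_link_header(header)

# The Subject of the description is the target carrying rev="primarytopic".
subjects = [link["target"] for link in links if link.get("rev") == "primarytopic"]
print(subjects)
```

The point of the sketch is only that the Descriptor Document's response already names its Subject and its alternate representations before a single triple is parsed.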

Here's what's in the HTML based Descriptor Document's <head/>:

<head profile="http://www.w3.org/1999/xhtml/vocab">
   <title>About: DBpedia</title>
 ...

   <link rel="alternate" type="application/rdf+xml" href="http://dbpedia.org/data/DBpedia.rdf" title="RDF/XML Representation" />
   <link rel="alternate" type="text/rdf+n3" href="http://dbpedia.org/data/DBpedia.n3" title="RDF N3/Turtle Representation" />
   <link rel="alternate" type="application/json+rdf" href="http://dbpedia.org/data/DBpedia.jrdf" title="RDF/JSON Representation" />
   <link rel="alternate" type="application/json" href="http://dbpedia.org/data/DBpedia.json" title="RDF/JSON Representation" />
   <link rel="timegate" type="text/html" href="http://mementoarchive.lanl.gov/dbpedia/timegate/http://dbpedia.org/page/DBpedia" title="Time Machine" />
   <link rel="foaf:primarytopic" href="http://dbpedia.org/resource/DBpedia"/>

   <link rev="describedby" href="http://dbpedia.org/resource/DBpedia"/>
...
</head>


Goal:

Make it clear that we have:

1. Descriptor Document
2. Subject of the Description carried by the Descriptor Document
3. Descriptor Document carries content in a variety of formats.

This I can explain in very plain language 100% of the time. Even there we have EAV in play re. <link/> and @rel.
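
As an illustration of EAV being in play re. <link/> and @rel, the following Python sketch reads <link/> elements out of a <head/> and emits (Entity, Attribute, Value) tuples. The class name is hypothetical, the markup is an excerpt of the snippet above, and treating @rev as the same relation read in the opposite direction is the sketch's working assumption:

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect <link/> elements as (entity, attribute, value) tuples."""

    def __init__(self, document_url: str):
        super().__init__()
        self.document_url = document_url
        self.tuples = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        if "rel" in attributes:
            # rel: the document is the entity, the link target is the value.
            self.tuples.append((self.document_url, attributes["rel"], href))
        if "rev" in attributes:
            # rev: the link target is the entity, the document is the value.
            self.tuples.append((href, attributes["rev"], self.document_url))


HEAD = """
<head>
   <link rel="alternate" type="application/rdf+xml" href="http://dbpedia.org/data/DBpedia.rdf" title="RDF/XML Representation" />
   <link rel="foaf:primarytopic" href="http://dbpedia.org/resource/DBpedia"/>
   <link rev="describedby" href="http://dbpedia.org/resource/DBpedia"/>
</head>
"""

collector = LinkCollector("http://dbpedia.org/page/DBpedia")
collector.feed(HEAD)
for entity, attribute, value in collector.tuples:
    print(entity, attribute, value)
```

Read this way, the last tuple says that the Subject (http://dbpedia.org/resource/DBpedia) is described by the Descriptor Document, which is the relation back to the Subject that plain ".rdf" files leave out.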

...
RDF inadvertently conflates Data Model and Data Representation Formats.


Do you really mean that, or that *people* have conflated those two
aspects of RDF?

I mean that the initial "RDF/XML is RDF" conflation basically killed all routes to Data Model appreciation. Dropping URLs from Linked Data parlance and focusing solely on URIs (without a modicum of qualification) compounded the matter.

Trying to convince people that there is a Data Model aspect to RDF doesn't wash. No more than trying to convince people there is a Hierarchical Data Model aspect to XML. Models don't get accentuated by Markup languages; I have XML and RDF history as proof!

Thus, if people won't accept RDF's Data Model aspect, do we continue to waste time beating that dead horse? Why not flip it around and open up a chapter for the all-critical data model, the real and coherent basis for meshing heterogeneously shaped data across disparate data sources?

In a nutshell, it's about Data Virtualization; even that moniker is taking on a life of its own without people instinctively correlating it with RDF based Linked Data.

This is an old snafu from the first coming of RDF, and sadly we can't
fix that in 2010. Simply stating: "RDF is based on a Graph Model..."
isn't enough. What Graph Model are we talking about? One that dropped
upon us from Space? Or one that we've used since the start of time?


The one in the RDF Model spec?

I take your point though about context.


It's all about Context.

Context is King!

People like to claim they grok the fact that Resource Description
Framework is: Graph Data Model and a collection of associated Data
Representation Formats, but in the same guise all attention is paid to
the latter. Even worse, RDF/XML is still pitched as the only official
variant of the latter (even in 2010). Look at how long it's taken RDFa to
emerge, and the amount of pressure it's taken to get it this far etc.


I think once people move beyond theory they naturally enough look at
how they're exchanging data, how to parse it, etc. This is when syntax
issues arise, they do get in the way, but problems aren't
insurmountable.

Problems aren't insurmountable when each problem is taken as a genuine learning opportunity.

In my experience, old mistakes repeating themselves remain too prevalent re. RDF in general, and now Linked Data.

Do remember, Linked Data's bootstrap had next to nothing to do with Semantic Web messaging and general mode of operation re. RDF. It happened on the pragmatic basis of simply deriving a Corpus of Names from Wikipedia that showcased the value of Generic HTTP Identifier based Naming above all else. What the likes of BBC, New York Times, Reuters etc. are doing with DBpedia is basically what happens when people grok the value of a powerful lookup table in a DBMS (basic or federated).

Imagine if DBpedia was left to go the traditional Semantic Web route: our grandkids would be lucky to have the 2007 variant of DBpedia, let alone what exists today, and I am darned serious when I say that.

Having some more RDF syntaxes reach Recommendation status would be a
good thing though.

Even if they don't reach recommendation, they should be at the forefront of education oriented communications.

Standards are Retrospective beasts, just like the mythical "Killer App.". The moment you take "Retro" out of standards, you have problems (many of which are playing out repeatedly re. RDF).

Even for technical audiences, beginning from EAV or RDF model doesn't
always help.

Well if the technical audience in question doesn't make the connection
between DBMS realm and Linked Data, of course not. Likewise, if  they
don't make the connection between standards based Data Access and Linked
Data, of course not. The Data 3.0 manifesto or emphasis on the EAV
cannot resonate with said audience, and it's not who I am actually trying
to speak to either.


Actually what I meant was: people have different approaches to
learning (new technologies or otherwise). Some will warm to a theory
first approach, others just want to get their hands dirty. Starting
with a general introduction to modelling isn't useful for the latter
audience.


Hmm.

I don't know about "getting your hands dirty" without some basic context.

I am much more interested in people that already work with data,  via
tools without writing a single line of code.


Yes I'm interested in how Linked Data can put powerful tools into the
hands of non-programmers too.

I am simply saying to the audience above:

1. We have Structured Data
2. Here is how you make Structured Data (i.e. the underlying model)
3. Here is how you share Structured Data (via Descriptor Documents on an
HTTP network).

When people understand 1-3 (in many cases making links to what they
already grok), they can get on with exploiting the kind of
individual/enterprise Agility levels that real Open (standards
compliant) Data Access and Integration accords.


To be clear: are you advocating a broader view of Linked Data that
doesn't use RDF technologies at all?

No, again I am saying: RDF (which is perceived by most as Markup) is not the commencement point re. Linked Data comprehension and appreciation. You will lose more people than you gain -- every time -- when the story starts with a Markup Language that has its Data Model grounding in a vague footnote. As I said, simply saying RDF is Graph Model based doesn't cut it. We have to expand, and more importantly, use what already exists in the minds of broader audiences. EAV is much more widely known, understood, and used than RDF.

RDF based Linked Data builds on EAV by mandating the use of Generic HTTP scheme Identifiers for Names across Entity, Attributes, and Attribute values (optionally).
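
One hedged way to picture that rule is a small Python check over an (Entity, Attribute, Value) tuple. The function names and example.org identifiers are hypothetical, and accepting "https" alongside "http" is an assumption made for the sketch:

```python
from urllib.parse import urlparse


def is_http_name(term: str) -> bool:
    """True if the term is an absolute HTTP(S) identifier usable as a Name."""
    parts = urlparse(term)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def classify(entity: str, attribute: str, value: str) -> str:
    """Classify a tuple according to the naming rule stated above."""
    # Entity and Attribute Names must be HTTP identifiers.
    if not (is_http_name(entity) and is_http_name(attribute)):
        return "plain EAV"
    # The Value may optionally be a Name too, making it a reference.
    if is_http_name(value):
        return "linked, value is a reference"
    return "linked, value is a literal"


print(classify("customer-42", "name", "Alice"))
print(classify("http://example.org/id/alice",
               "http://example.org/schema/knows",
               "http://example.org/id/bob"))
```

The first tuple is ordinary EAV as found in any DBMS; the second differs only in how its Names are minted.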

If so what are you recommending that people use, for e.g. "Descriptor
Documents".

Whatever Data Representation works for them. Ultimately, they will come to comprehend and appreciate RDF, just as they will ultimately come to appreciate OWL (yes, and I absolutely mean that about OWL). There shouldn't be an RDF Tax.

Paul Houle put it nicely in a Tweet: RDF is very easy and powerful, but you only come to realize and appreciate it after attempting to reinvent it.




Kingsley

Cheers,

L.



--

Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com

Weblog: http://www.openlinksw.com/blog/~kidehen

Twitter/Identi.ca: kidehen
