On 11/11/10 4:54 AM, Richard Light wrote:
In message
<aanlktikmg=+augjhlf-88q-6jzd7=zxz2gsj-qda1...@mail.gmail.com>, Harry
Halpin <hhal...@ibiblio.org> writes
The question is how to build Linked Data on top of *only* HTTP 200 -
the case where the data publisher either cannot alter their server
set-up (.htaccess) files or does not care to.
Might it help to look at this problem from the other end of the
telescope? So far, the discussion has all been about what is returned.
How about considering what is requested?
I assume that we're talking about the situation where a user (human or
machine) is faced with a URI to resolve. The implication is that they
have acquired this URI through some Linked Data activity such as a
SPARQL query, or reading a chunk of RDF from their own triple store.
(If we're not - if we're talking about auto-magically inferring Linked
Data-ness from random URLs, then I would agree that sticking RDFa into
said random pages is a way to go, and leave the discussion.)
The Linked Data guidelines make the assumption that said user is
willing and able to indicate what sort of content they want, in this
case via the Accept header mechanism. This makes it reasonable to
further specify that the fallback response, in the absence of a
suitable Accept header, is to deliver a human-readable resource, i.e.
an HTML web page. Thus the web of Linked Data behaves like part of the
web of documents, if users take no special action when dereferencing
URLs.
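
As an illustration only, the fallback behaviour described above could be sketched on the publisher side roughly as follows. The function name choose_representation and the list of available formats are assumptions made for this example, and wildcard ranges such as */* are deliberately left to the HTML fallback rather than handled properly.

```python
def choose_representation(accept_header):
    """Pick a media type for the response, defaulting to HTML."""
    # Hypothetical set of formats this publisher is able to serve.
    available = ["text/html", "application/rdf+xml", "text/turtle"]
    if not accept_header:
        return "text/html"
    # Parse "type;q=0.9" entries into (quality, media type) pairs.
    candidates = []
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        quality = 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        candidates.append((quality, media_type))
    # Highest quality first; the sort is stable, so header order breaks ties.
    candidates.sort(key=lambda c: c[0], reverse=True)
    for quality, media_type in candidates:
        if quality > 0 and media_type in available:
            return media_type
    # Nothing suitable was requested: behave like the web of documents.
    return "text/html"
```

A user agent that sends no Accept header, or one that names only types the publisher cannot serve, gets the human-readable page.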
If we agree that it is reasonable for user agents to take some action
to indicate what type of response they want, then one very simple
solution for the content-negotiation-challenged data publisher would
be to establish a convention that adding '.rdf' to a URL should
deliver an RDF description of the NIR signified by that URL.
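
A minimal sketch of what a client following such a convention might do is shown below. The helper name rdf_description_url and the example.org URLs are hypothetical, and appending the suffix to the path (rather than to the end of the whole URL) is an assumption of this sketch, not something the proposal specifies.

```python
from urllib.parse import urlsplit, urlunsplit

def rdf_description_url(url):
    """Derive the URL of the RDF description by adding '.rdf' to the path."""
    parts = urlsplit(url)
    if parts.path.endswith(".rdf"):
        # Already points at an RDF description; leave it alone.
        return url
    # Assumption: the suffix goes on the path, so any query string survives.
    return urlunsplit(parts._replace(path=parts.path + ".rdf"))

# Example with a placeholder URI:
print(rdf_description_url("http://example.org/id/paris"))
```

No server configuration is needed for this to work: the publisher only has to place a static file at the derived location.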
Richard
Richard,
Yes, we should look at this differently. We should honor the fact that
the burgeoning Web of Linked Data is an evolution of the Web of Linked
Documents. To do this effectively, I believe we need to fix the Document
Web and Data Web false dichotomy.
There is no Linked Data to exploit without Documents at HTTP Addresses
from which content is streamed.
If we put the Web aside for a second, I am hoping we can accept that in
the real world we have Documents with different surface structure e.g.
Blank Paper and Graph Paper.
We can scribble and doodle on blank paper. We can even describe things
in sentences and paragraphs on blank paper, but when it comes to
Observations ("Data") Graph Paper is better i.e., it delivers
high-fidelity expression of Observation by letting us place Subject
Identifier, Subject Attributes, and Attribute values into cells.
In the real-world, we've been able to make References across both types
of paper (Documents):
1. Reference one Document from another
2. Reference a cell in one Document from a cell in another.
Enter the luxury of computers and hypermedia. These innovations allow us
to replicate what I've outlined above using hyperlinks. Some examples:
1. Word processors -- you could reference across Microsoft Word
documents on a computer, but never across Word and WordPerfect
2. Spreadsheets -- you could use Reference values (Names or Addresses)
to connect cell content within a single spreadsheet or across several
spreadsheets and workbooks, but you couldn't reference data across Excel
and Lotus 1-2-3
3. Database Tables -- you could use Unique Keys to identify records, with
Foreign Keys as the Reference mechanism, but in the case of relational
databases (the majority) the tables didn't accept Reference values i.e.,
content was oriented toward typed literals; you couldn't reference a table
in Oracle from a table in Microsoft SQL Server etc.
As you can see from the above:
#1 is still about scribbling on blank paper. References are scoped to
entire documents or fragments.
#2-3 are about graph-paper-oriented observation (data) capture and
reference that leverages the fidelity of cells.
Enter the luxury of computers, hypermedia, and a network protocol (HTTP):
#1 loses its operating system and application specific scope. We have
blank paper, so when we scribble we do so in HTML which leverages HTTP
for referencing other documents.
#2-3 lose their operating system and application specific scope. We
have graph paper, so when we capture observation, leveraging the
fidelity of cell level references, we do so via an EAV/SPO graph.
As you can see, the Document hasn't gone anywhere; its structure has
evolved, with reference scope becoming more granular.
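
To make the graph paper analogy concrete, here is a small sketch of an EAV/SPO graph in which each statement is one "cell" and a cell's value may itself be a reference to another subject. The Triple type, the helper names, and the example.org identifiers are all invented for illustration.

```python
from collections import namedtuple

# One "cell" on the graph paper: subject (entity), predicate (attribute),
# object (value). The value may be a literal or a reference to a subject.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Illustrative data only; the example.org URIs are placeholders.
graph = [
    Triple("http://example.org/id/paris",
           "http://example.org/prop/name", "Paris"),
    Triple("http://example.org/id/paris",
           "http://example.org/prop/country", "http://example.org/id/france"),
    Triple("http://example.org/id/france",
           "http://example.org/prop/name", "France"),
]

def attributes_of(graph, subject):
    """Collect the attribute/value pairs recorded for one subject."""
    return {t.predicate: t.object for t in graph if t.subject == subject}

def follow(graph, subject, predicate):
    """Follow a reference-valued cell to the description it points at."""
    target = attributes_of(graph, subject).get(predicate)
    return attributes_of(graph, target) if target is not None else {}

print(follow(graph, "http://example.org/id/paris",
             "http://example.org/prop/country"))
```

The point of the sketch is the second triple: its value is a reference rather than a typed literal, which is exactly the cell-to-cell link that spreadsheets and relational tables could not make across applications.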
Thus, when you HTTP GET and a server responds with 200 OK, it's safe and
sound to assume that a Document has been located. It is also safe and
sound for a user agent to express what type of Content it would expect
from a Document, and then interpret the Content retrieved at varying
levels of semantic fidelity.
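
From the user agent's side, this could look something like the sketch below, which uses only Python's standard urllib.request.Request to state a preference and then decides how to treat whatever Document comes back. The function names, the labels "graph", "page" and "opaque", and the particular q-value are assumptions of this example; nothing is sent over the network here.

```python
from urllib.request import Request

def build_request(url, wanted=("text/turtle", "application/rdf+xml")):
    """Prepare a GET that asks for RDF first and accepts HTML as a fallback."""
    accept = ", ".join(wanted) + ", text/html;q=0.5"
    return Request(url, headers={"Accept": accept})

def interpret(status, content_type):
    """Decide how much structure to expect from a retrieved Document."""
    if status != 200:
        return "no document"
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in ("text/turtle", "application/rdf+xml"):
        return "graph"    # parse the content as EAV/SPO statements
    if media_type in ("text/html", "application/xhtml+xml"):
        return "page"     # render for a human, or look for embedded RDFa
    return "opaque"       # a Document was located, but not one we can read

req = build_request("http://dbpedia.org/resource/Paris")
print(req.get_header("Accept"))
```

Either way a 200 OK means a Document was located; only the level of fidelity at which its content is interpreted differs.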
Back to the point of looking at this differently re. user interaction.
I've held the position for a while that the Linked Data narrative is
back to front. I say this for the following reasons:
1. Document vs Data false dichotomy
2. Assumption that anytime soon people will think URIs when they are
already used to URLs.
Orderly Linked Data narrative in steps for Humans:
1. Users continue to enter Document URLs into Browsers e.g.
<http://dbpedia.org/page/Paris> instead of
<http://dbpedia.org/resource/Paris>
2. Users will see a human comprehensible document with a clearly
identified subject and all its associated attributes and attribute values
3. They will follow their noses to wherever the links in the
document take them, enjoying the power of serendipitous discovery of
relevant things
4. They will bookmark without confusion i.e. no magical changes in the
Browser address bar
5. They will also discover human limitations as time, data volume, and
data disparity intersect
6. They will be happy and ultimately wiser (i.e., delegate stuff to
smart agents that can exploit these links without human limitations).
To conclude, Ian is suggesting a solution for high-semantic-fidelity
user-agents that doesn't break anything, and actually accentuates the
Document vs Data false dichotomy. HTTP is a document location and
content retrieval protocol :-)
--
Regards,
Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen