Re: A(nother) Guide to Publishing Linked Data Without Redirects
On Thursday 11. November 2010 01:50:36 Harry Halpin wrote:
> The question is how to build Linked Data on top of only HTTP 200 - the case where the data publisher either cannot alter their server set-up (.htaccess) files or does not care to.

I think that's really simple: then they should use hash-URIs. That's all, really.

Basically, what we have now is choice: If you want to publish documents on an old-style web server and not worry about the 303 dance, use hash-URIs. If you have more control and prefer to take it all the way, you can spend two days of coding, or just use some module, and do the 303 dance.

I really don't see a point in deprecating number two, and making a huge fuss about it, as long as they can do number one.

Cheers,

Kjetil
--
Kjetil Kjernsmo
Ph.D. Research Fellow, Semantic Web
kje...@kjernsmo.net
http://www.kjetil.kjernsmo.net/
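[A note on why the hash-URI option Kjetil mentions needs no redirects: the fragment part of a URI is never sent to the server, so the URI that names the thing and the URI of the document describing it can never collide. A minimal sketch, using hypothetical example.org URIs:]

# Why hash-URIs sidestep the 303 dance: the fragment is never sent over
# HTTP, so the URI naming the thing and the URI of the document that
# describes it already differ. Hypothetical example.org URIs throughout.
from urllib.parse import urldefrag

thing_uri = "http://example.org/people#alice"    # names the person
document_uri, fragment = urldefrag(thing_uri)    # what a GET actually targets

assert document_uri == "http://example.org/people"
assert fragment == "alice"

# A plain 200 response for document_uri can carry RDF describing
# thing_uri; no redirect or special status code is required.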
Re: A(nother) Guide to Publishing Linked Data Without Redirects
In message <aanlktikmg=+augjhlf-88q-6jzd7=zxz2gsj-qda1...@mail.gmail.com>, Harry Halpin <hhal...@ibiblio.org> writes:
> The question is how to build Linked Data on top of *only* HTTP 200 - the case where the data publisher either cannot alter their server set-up (.htaccess) files or does not care to.

Might it help to look at this problem from the other end of the telescope? So far, the discussion has all been about what is returned. How about considering what is requested?

I assume that we're talking about the situation where a user (human or machine) is faced with a URI to resolve. The implication is that they have acquired this URI through some Linked Data activity, such as a SPARQL query, or reading a chunk of RDF from their own triple store. (If we're not - if we're talking about auto-magically inferring Linked Data-ness from random URLs - then I would agree that sticking RDFa into said random pages is a way to go, and leave the discussion.)

The Linked Data guidelines make the assumption that said user is willing and able to indicate what sort of content they want, in this case via the Accept header mechanism. This makes it reasonable to further specify that the fallback response, in the absence of a suitable Accept header, is to deliver a human-readable resource, i.e. an HTML web page. Thus the web of Linked Data behaves like part of the web of documents if users take no special action when dereferencing URLs.

If we agree that it is reasonable for user agents to take some action to indicate what type of response they want, then one very simple solution for the content-negotiation-challenged data publisher would be to establish a convention that adding '.rdf' to a URL should deliver an RDF description of the NIR signified by that URL.

Richard
--
Richard Light
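[A request-side sketch of Richard's suggestion: try content negotiation first, then fall back to the proposed '.rdf' suffix. The museum URI is hypothetical and the suffix rule is Richard's suggested convention, not an established standard:]

# Request-side sketch: ask for RDF via the Accept header; if the
# publisher cannot do content negotiation, fall back to the proposed
# convention of appending '.rdf' to the URL. Hypothetical museum URI.
import urllib.request

def fetch_description(uri):
    req = urllib.request.Request(uri, headers={"Accept": "application/rdf+xml"})
    with urllib.request.urlopen(req) as resp:
        if resp.headers.get("Content-Type", "").startswith("application/rdf+xml"):
            return resp.read()
    # Content-negotiation-challenged publisher: a static RDF file is
    # assumed to sit next to the HTML page.
    with urllib.request.urlopen(uri + ".rdf") as resp:
        return resp.read()

rdf = fetch_description("http://example.org/museum/object/1234")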
Re: A(nother) Guide to Publishing Linked Data Without Redirects
On 11/11/10 4:54 AM, Richard Light wrote:
> In message <aanlktikmg=+augjhlf-88q-6jzd7=zxz2gsj-qda1...@mail.gmail.com>, Harry Halpin <hhal...@ibiblio.org> writes:
>> The question is how to build Linked Data on top of *only* HTTP 200 - the case where the data publisher either cannot alter their server set-up (.htaccess) files or does not care to.
>
> Might it help to look at this problem from the other end of the telescope? So far, the discussion has all been about what is returned. How about considering what is requested?
>
> I assume that we're talking about the situation where a user (human or machine) is faced with a URI to resolve. The implication is that they have acquired this URI through some Linked Data activity, such as a SPARQL query, or reading a chunk of RDF from their own triple store. (If we're not - if we're talking about auto-magically inferring Linked Data-ness from random URLs - then I would agree that sticking RDFa into said random pages is a way to go, and leave the discussion.)
>
> The Linked Data guidelines make the assumption that said user is willing and able to indicate what sort of content they want, in this case via the Accept header mechanism. This makes it reasonable to further specify that the fallback response, in the absence of a suitable Accept header, is to deliver a human-readable resource, i.e. an HTML web page. Thus the web of Linked Data behaves like part of the web of documents if users take no special action when dereferencing URLs.
>
> If we agree that it is reasonable for user agents to take some action to indicate what type of response they want, then one very simple solution for the content-negotiation-challenged data publisher would be to establish a convention that adding '.rdf' to a URL should deliver an RDF description of the NIR signified by that URL.
>
> Richard

Richard,

Yes, we should look at this differently. We should honor the fact that the burgeoning Web of Linked Data is an evolution of the Web of Linked Documents. To do this effectively, I believe we need to fix the false dichotomy between the Document Web and the Data Web. There is no Linked Data to exploit without Documents at HTTP Addresses from which content is streamed.

If we put the Web aside for a second, I am hoping we can accept that in the real world we have Documents with different surface structure, e.g. Blank Paper and Graph Paper. We can scribble and doodle on blank paper. We can even describe things in sentences and paragraphs on blank paper, but when it comes to Observations (Data), Graph Paper is better, i.e. it delivers high-fidelity expression of an Observation by letting us place Subject Identifier, Subject Attributes, and Attribute values into cells.

In the real world, we've been able to make References across both types of paper (Documents):

1. Reference one Document from another
2. Reference a cell in one Document from a cell in another.

Enter the luxury of computers and hypermedia. These innovations allow us to replicate what I've outlined above using hyperlinks. Some examples:

1. Word processors -- you could reference across Microsoft Word documents on a computer, but never across Word and WordPerfect
2. Spreadsheets -- you could use Reference values (Names or Addresses) to connect cell content within a single spreadsheet or across several spreadsheets and workbooks, but you couldn't reference data across Excel and Lotus 1-2-3
3. Database Tables -- you could use Unique Keys to identify records, with Foreign Keys as the Reference mechanism, but in the case of relational databases (the majority) the tables didn't accept Reference values, i.e. content was oriented toward typed literals; you couldn't reference a table in Oracle from a Table in Microsoft SQL Server, etc.

As you can see from the above: #1 is still about scribbling on blank paper; References are scoped to entire documents or fragments. #2-3 are about graph-paper-oriented observation (data) capture and reference that leverages the fidelity of cells.

Enter the luxury of computers, hypermedia, and network protocols (HTTP):

#1 loses its operating-system- and application-specific scope. We have blank paper, so when we scribble we do so in HTML, which leverages HTTP for referencing other documents.

#2-3 lose their operating-system- and application-specific scope. We have graph paper, so when we capture observations, leveraging the fidelity of cell-level references, we do so via an EAV/SPO graph.

As you can see, the Document hasn't gone anywhere; its structure has evolved, with reference scope becoming more granular. Thus, when you HTTP GET and a server responds with 200 OK, it's safe and sound to assume that a Document has been located. It is also safe and sound for a user agent to express what type of Content it would expect from a Document, and then interpret the Content retrieved at varying levels of semantic fidelity.

Back to the point of looking at this differently re. user interaction. I've
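[Kingsley's graph-paper analogy translates directly into triples. A minimal sketch of a cell-level, cross-document reference expressed as an EAV/SPO graph - the example.org names are invented, and the rdflib package is assumed to be available:]

# EAV/SPO "graph paper": an observation in document A whose sensor cell
# references a subject described in document B. Invented example.org
# names; requires the rdflib package (pip install rdflib).
from rdflib import Graph

turtle = """
@prefix ex: <http://example.org/vocab/> .

<http://example.org/a#obs1>
    ex:temperature "21.5" ;                      # attribute value in a "cell"
    ex:sensor <http://example.org/b#sensor7> .   # cross-document reference
"""

g = Graph().parse(data=turtle, format="turtle")
for s, p, o in g:
    print(s, p, o)   # each triple is one Entity-Attribute-Value "cell"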
Re: A(nother) Guide to Publishing Linked Data Without Redirects
On Nov 11, 2010, at 07:44, Kingsley Idehen wrote:
> On 11/11/10 4:54 AM, Richard Light wrote:
>> In message <aanlktikmg=+augjhlf-88q-6jzd7=zxz2gsj-qda1...@mail.gmail.com>, Harry Halpin <hhal...@ibiblio.org> writes:
>>> The question is how to build Linked Data on top of *only* HTTP 200 - the case where the data publisher either cannot alter their server set-up (.htaccess) files or does not care to.
>>
>> Might it help to look at this problem from the other end of the telescope? So far, the discussion has all been about what is returned. How about considering what is requested?

Good idea.

>> I assume that we're talking about the situation where a user (human or machine) is faced with a URI to resolve. [...]
>>
>> If we agree that it is reasonable for user agents to take some action to indicate what type of response they want, then one very simple solution for the content-negotiation-challenged data publisher would be to establish a convention that adding '.rdf' to a URL should deliver an RDF description of the NIR signified by that URL.
>>
>> Richard
>
> Richard,
>
> Yes, we should look at this differently. We should honor the fact that the burgeoning Web of Linked Data is an evolution of the Web of Linked Documents. To do this effectively, I believe we need to fix the false dichotomy between the Document Web and the Data Web. There is no Linked Data to exploit without Documents at HTTP Addresses from which content is streamed.

Kingsley, your analysis is solid except for one part: You seem to forget that the issue that brought us to this point was that the address of an information resource describing something is not the same as the address of the thing itself. It is that problem that is still worth solving.

Regards,
Dave
Re: A(nother) Guide to Publishing Linked Data Without Redirects
On 11/11/10 8:07 AM, David Wood wrote:
> On Nov 11, 2010, at 07:44, Kingsley Idehen wrote:
>> On 11/11/10 4:54 AM, Richard Light wrote:
>>> In message <aanlktikmg=+augjhlf-88q-6jzd7=zxz2gsj-qda1...@mail.gmail.com>, Harry Halpin <hhal...@ibiblio.org> writes:
>>>> The question is how to build Linked Data on top of *only* HTTP 200 - the case where the data publisher either cannot alter their server set-up (.htaccess) files or does not care to.
>>>
>>> Might it help to look at this problem from the other end of the telescope? So far, the discussion has all been about what is returned. How about considering what is requested? [...]
>
> Good idea.
>
>> Richard,
>>
>> Yes, we should look at this differently. We should honor the fact that the burgeoning Web of Linked Data is an evolution of the Web of Linked Documents. To do this effectively, I believe we need to fix the false dichotomy between the Document Web and the Data Web. There is no Linked Data to exploit without Documents at HTTP Addresses from which content is streamed.
>
> Kingsley, your analysis is solid except for one part: You seem to forget that the issue that brought us to this point was that the address of an information resource describing something is not the same as the address of the thing itself. It is that problem that is still worth solving.

David,

I do believe Ian's solution solves the matter of Name / Address disambiguation. Using a Document URL (Address) as a Name requires the aforementioned disambiguation. The question is: who has to do the disambiguation? The user agent or the data server?

I believe a user agent should perform Name / Address disambiguation via its semantic-fidelity choice. If high, then Ian's solution works, i.e. the data is self-describing and the user agent should interpret it accordingly. The semantic fidelity of HTTP stops at the Document; the problem at hand takes us into the realm of content interpretation. In a sense, like beauty, this too lies in the eye of the beholder (the user agent).

I don't think a new code is necessary, since HTTP is doing its job as a document location and content access protocol. Thus, if we reference document URLs from browsers and follow links, everything will be fine.

If we go even as far as taking a descriptor document's Subject URI (slash terminated) and then placing that in a browser, we will be sorta fine too, depending on which user agent we use. If today's small pool of Linked Data aware user agents adopts Ian's option, then I'll drop "sorta" from the paragraph above :-)

Hope this helps.

Kingsley
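[A sketch of the user-agent-side disambiguation Kingsley describes, under the assumption that the publisher follows Ian's convention (Content-Location header plus the powder-s#describedby triple) and serves Turtle. The toucan URI is hypothetical and the rdflib package is assumed:]

# User-agent-side Name/Address disambiguation under Ian's convention.
# A 200 body that asserts <thing> wdrs:describedby <doc> is read as
# "this body is the description document, not the thing itself".
# Hypothetical URI; requires the rdflib package.
import urllib.request
from rdflib import Graph, URIRef

DESCRIBEDBY = URIRef("http://www.w3.org/2007/05/powder-s#describedby")

def locate_description(thing_uri):
    req = urllib.request.Request(thing_uri, headers={"Accept": "text/turtle"})
    with urllib.request.urlopen(req) as resp:
        doc_uri = resp.headers.get("Content-Location")  # Ian's rule 1
        g = Graph().parse(data=resp.read().decode("utf-8"), format="turtle")
    for _, _, doc in g.triples((URIRef(thing_uri), DESCRIBEDBY, None)):
        return str(doc)   # Ian's rule 3: the body names the description doc
    return doc_uri        # fall back to the header alone

print(locate_description("http://example.org/toucan"))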
A(nother) Guide to Publishing Linked Data Without Redirects
Hi all,

I've collected my thoughts on The Great 303 Debate of 2010 (as it will be remembered) at:

http://prototypo.blogspot.com/2010/11/another-guide-to-publishing-linked-data.html

Briefly, I propose a new HTTP status code (210 Description Found) to disambiguate between generic information resources and the special class of information resources that provide metadata descriptions about URIs addressed.

My proposal is basically the same as posted earlier to this list, but significantly updated to include a mechanism to allow for the publication of Linked Data using a new HTTP status code on Web hosting services. Several poorly thought out corner cases were also dealt with.

I look forward to feedback from the community. However, if you are about to say something like "the Web is just fine as it is", then I will have little patience. We invent the Web as we go and need not be artificially constrained. The Semantic Web is still young enough to be done right (or more right, or maybe somewhat right).

Regards,
Dave
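[For concreteness, roughly what the proposed exchange might look like. 210 is not a registered HTTP status code - this is purely an illustration of David's proposal, using Python's stdlib server, which will emit an arbitrary code and reason phrase:]

# Illustration only: David's *proposed* 210 Description Found, which is
# not a registered status code. The toucan data is invented.
from http.server import BaseHTTPRequestHandler, HTTPServer

DESCRIPTION = b"""@prefix ex: <http://example.org/vocab/> .
<http://example.org/toucan> ex:species "Ramphastos toco" .
"""

class DescriptionHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # 210 would signal: "this body *describes* the requested URI;
        # it is not a representation of the thing itself".
        self.send_response(210, "Description Found")
        self.send_header("Content-Type", "text/turtle")
        self.send_header("Content-Length", str(len(DESCRIPTION)))
        self.end_headers()
        self.wfile.write(DESCRIPTION)

HTTPServer(("localhost", 8000), DescriptionHandler).serve_forever()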
Re: A(nother) Guide to Publishing Linked Data Without Redirects
On 11/10/10 5:15 PM, David Wood wrote:
> Hi all,
>
> I've collected my thoughts on The Great 303 Debate of 2010 (as it will be remembered) at:
>
> http://prototypo.blogspot.com/2010/11/another-guide-to-publishing-linked-data.html
>
> Briefly, I propose a new HTTP status code (210 Description Found) to disambiguate between generic information resources and the special class of information resources that provide metadata descriptions about URIs addressed. My proposal is basically the same as posted earlier to this list, but significantly updated to include a mechanism to allow for the publication of Linked Data using a new HTTP status code on Web hosting services. Several poorly thought out corner cases were also dealt with.
>
> I look forward to feedback from the community. However, if you are about to say something like "the Web is just fine as it is", then I will have little patience. We invent the Web as we go and need not be artificially constrained. The Semantic Web is still young enough to be done right (or more right, or maybe somewhat right).

Initial comment, before reading the post: How about we get the Web of Linked Data sorted first, then move on to the Web of Semantically Linked Data? :-)

> Regards,
> Dave

--
Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Re: A(nother) Guide to Publishing Linked Data Without Redirects
Bravo, Harry :-)

Let me also add: without adding anything to the header - *keeping HTTP completely outside the picture*. HTTP headers are for pure optimization issues, almost at the networking level - caching, fetching, crawling - nothing to do with semantics.

A conjecture: the right howto document is about 2 pages long. It says something like "simply put RDFa on your pages" and...

a) there is a default interpretation which works 99.99% of the time, e.g. if it has RDFa, it talks about something that's an entity and it's not a page; or
b) you add a triple, but no triple means by default that... or
c) ...

We're almost there, I feel it.

Gio

On Thu, Nov 11, 2010 at 1:50 AM, Harry Halpin <hhal...@ibiblio.org> wrote:
> On Wed, Nov 10, 2010 at 11:15 PM, David Wood <da...@3roundstones.com> wrote:
>> Hi all,
>> I've collected my thoughts on The Great 303 Debate of 2010 (as it will be remembered) at:
>> http://prototypo.blogspot.com/2010/11/another-guide-to-publishing-linked-data.html
>> Briefly, I propose a new HTTP status code (210 Description Found) to disambiguate between generic information resources and the special class of information resources that provide metadata descriptions about URIs addressed. [...]
>
> I don't think this solution cuts it or solves the problem to the extent that Ian Davis was proposing. To recap my opinion, the *entire* problem from many publishers' perspectives is the use of status codes at all - whether it's 303 or 210 doesn't really matter. Most people will just want to publish their linked data in a directory without having to worry about status codes. So, de facto, the only status code that will matter is 200. The question is how to build Linked Data on top of *only* HTTP 200 - the case where the data publisher either cannot alter their server set-up (.htaccess) files or does not care to.
>
>> I look forward to feedback from the community. [...]
>>
>> Regards,
>> Dave
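[The RDFa route Gio sketches needs nothing beyond markup. A minimal, hypothetical page in RDFa 1.1 syntax (foaf is a real vocabulary; the page content is invented), held in a Python string for illustration:]

# Gio's point made concrete: the entity/page distinction lives in the
# markup. about="#me" hangs the triples on a hash URI, so they describe
# the person, not the page - his "default interpretation". Hypothetical
# page, RDFa 1.1 syntax.
RDFA_PAGE = """<!DOCTYPE html>
<html prefix="foaf: http://xmlns.com/foaf/0.1/">
  <body>
    <div about="#me" typeof="foaf:Person">
      <span property="foaf:name">Alice Example</span>
    </div>
  </body>
</html>"""

# Served with a plain 200 OK, an RDFa parser would extract:
#   <page-uri#me> rdf:type foaf:Person .
#   <page-uri#me> foaf:name "Alice Example" .
print(RDFA_PAGE)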
Re: A(nother) Guide to Publishing Linked Data Without Redirects
Hi Harry,

On Nov 10, 2010, at 19:50, Harry Halpin wrote:
> On Wed, Nov 10, 2010 at 11:15 PM, David Wood <da...@3roundstones.com> wrote:
>> Hi all,
>> I've collected my thoughts on The Great 303 Debate of 2010 (as it will be remembered) at:
>> http://prototypo.blogspot.com/2010/11/another-guide-to-publishing-linked-data.html
>> Briefly, I propose a new HTTP status code (210 Description Found) to disambiguate between generic information resources and the special class of information resources that provide metadata descriptions about URIs addressed. [...]
>
> I don't think this solution cuts it or solves the problem to the extent that Ian Davis was proposing. To recap my opinion, the *entire* problem from many publishers' perspectives is the use of status codes at all - whether it's 303 or 210 doesn't really matter. Most people will just want to publish their linked data in a directory without having to worry about status codes. So, de facto, the only status code that will matter is 200. The question is how to build Linked Data on top of *only* HTTP 200 - the case where the data publisher either cannot alter their server set-up (.htaccess) files or does not care to.

Yes, I understand the tendency to think this way. It is easy to implement and understand. However, do *you*, as a more knowledgeable individual, really think we can build a Web of Data (or whatever you want to call it) by overloading both the http:// namespace and the 200 status code? I sure don't. In fact, I think it is silly to try, especially in the absence of any standard way to understand what we got back.

We are already running into serious problems trying to deal with physical and conceptual resources being given http:// URIs but not being resolvable. We are stressing the (very young) Web but not even solving basic problems.

In my opinion, we need a way to tie together (and yet allow to be separate) the Web of Documents and the Web of Data. To me, that means we need a hook to separate an information resource (the 200 status code) from a general metadata description (what I proposed as the 210 status code). My proposal doesn't have to be it, but something does. Without some separation at that level, we will continue to have practical problems. If we forgo any status code separation, then we will have to introspect every 200 result to acquire any information about what we got back. I don't think that is practical.

Regards,
Dave

>> I look forward to feedback from the community. However, if you are about to say something like "the Web is just fine as it is", then I will have little patience. We invent the Web as we go and need not be artificially constrained. The Semantic Web is still young enough to be done right (or more right, or maybe somewhat right).
>>
>> Regards,
>> Dave
Re: A(nother) Guide to Publishing Linked Data Without Redirects
On Wed, Nov 10, 2010 at 11:15 PM, David Wood <da...@3roundstones.com> wrote:
> Hi all,
> I've collected my thoughts on The Great 303 Debate of 2010 (as it will be remembered) at:
> http://prototypo.blogspot.com/2010/11/another-guide-to-publishing-linked-data.html
> Briefly, I propose a new HTTP status code (210 Description Found) to disambiguate between generic information resources and the special class of information resources that provide metadata descriptions about URIs addressed. My proposal is basically the same as posted earlier to this list, but significantly updated to include a mechanism to allow for the publication of Linked Data using a new HTTP status code on Web hosting services. Several poorly thought out corner cases were also dealt with.

I don't think this solution cuts it or solves the problem to the extent that Ian Davis was proposing. To recap my opinion, the *entire* problem from many publishers' perspectives is the use of status codes at all - whether it's 303 or 210 doesn't really matter. Most people will just want to publish their linked data in a directory without having to worry about status codes. So, de facto, the only status code that will matter is 200. The question is how to build Linked Data on top of *only* HTTP 200 - the case where the data publisher either cannot alter their server set-up (.htaccess) files or does not care to.

> I look forward to feedback from the community. However, if you are about to say something like "the Web is just fine as it is", then I will have little patience. We invent the Web as we go and need not be artificially constrained. The Semantic Web is still young enough to be done right (or more right, or maybe somewhat right).
>
> Regards,
> Dave
Publishing Linked Data without Redirects
I wrote up a summary of the current thinking on using 200 instead of 303 to serve up Linked Data:

http://iand.posterous.com/a-guide-to-publishing-linked-data-without-red

The key part is:

When your webserver receives a GET request to your thing’s URI you may respond with a 200 response code and include the content of the description document in the response, provided that you:

1. include the URI of the description document in a Content-Location header, and
2. ensure the body of the response is the same as the body obtained by performing a GET on the description document’s URI, and
3. include a triple in the body of the response whose subject is the URI of your thing, predicate is http://www.w3.org/2007/05/powder-s#describedby and object is the URI of your description document

But read the whole post for an example, some theory background and some FAQ.

Cheers,

Ian
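[Ian's three rules are mechanical enough to sketch directly. A minimal stdlib-only server for one hypothetical thing URI (/toucan) and its description document (/doc/toucan); the example.org names are invented, while the describedby predicate is the one the recipe specifies:]

# Sketch of Ian's three-rule recipe. Both URIs return the *same* body;
# the thing URI adds a Content-Location header naming the document
# (rule 1), and the body carries the describedby triple (rule 3).
# Hypothetical example.org names throughout.
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"""@prefix wdrs: <http://www.w3.org/2007/05/powder-s#> .
@prefix ex:   <http://example.org/vocab/> .

<http://example.org/toucan> wdrs:describedby <http://example.org/doc/toucan> ;
    ex:species "Ramphastos toco" .
"""

class LinkedDataHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path not in ("/toucan", "/doc/toucan"):
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/turtle")
        if self.path == "/toucan":
            # Rule 1: the thing's response names its description document.
            self.send_header("Content-Location", "http://example.org/doc/toucan")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)  # Rule 2: identical body for both URIs

HTTPServer(("localhost", 8000), LinkedDataHandler).serve_forever()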
Re: Publishing Linked Data without Redirects
This makes a lot of sense and I'm feeling a lot more comfortable. Thanks, Ian.

However, like kaeic, who commented on the blog, I have one little concern that I hope can be made to go away. The list of rules to apply takes the first match, and the one we're interested in is no. 4. However, doesn't no. 1 match? That says:

"If the response status code is 200 or 203 and the request method was GET, the response payload is a representation of the target resource."

On the face of it, this first rule applies to the toucan example, since the Content-Location header is not mentioned at all. This seems very odd - surely for rule 4 /ever/ to match, it must be applied ahead of what is now rule 1? What am I missing?

Phil

On 08/11/2010 08:59, Ian Davis wrote:
> I wrote up a summary of the current thinking on using 200 instead of 303 to serve up Linked Data:
>
> http://iand.posterous.com/a-guide-to-publishing-linked-data-without-red
>
> The key part is:
>
> When your webserver receives a GET request to your thing’s URI you may respond with a 200 response code and include the content of the description document in the response, provided that you:
>
> 1. include the URI of the description document in a Content-Location header, and
> 2. ensure the body of the response is the same as the body obtained by performing a GET on the description document’s URI, and
> 3. include a triple in the body of the response whose subject is the URI of your thing, predicate is http://www.w3.org/2007/05/powder-s#describedby and object is the URI of your description document
>
> But read the whole post for an example, some theory background and some FAQ.
>
> Cheers,
>
> Ian

--
Phil Archer
W3C Mobile Web Initiative
http://www.w3.org/Mobile
http://philarcher.org
@philarcher1