Re: RDF Update Feeds + URI time travel on HTTP-level

Herbert Van de Sompel Sun, 22 Nov 2009 13:42:16 -0800

[tried to send this before but somehow did not get through to list]


hi all,

(thanks Chris, Richard, Danny)

In light of the current discussion, I would like to provide some clarifications regarding "Memento: Time Travel for the Web", i.e. the idea of introducing HTTP content negotiation in the datetime dimension:


(*) Some extra pointers:

- For those who prefer browsing slides over reading a paper, there is 
http://www.slideshare.net/hvdsomp/memento-time-travel-for-the-web

- Around mid next week, a video recording of a presentation I gave on Memento should be available at http://www.oclc.org/research/dss/default.htm

- The Memento site is at http://www.mementoweb.org. Of special interest may be the proposed HTTP interactions for (a) web servers with internal archival capabilities such as content management systems, version control systems, etc. (http://www.mementoweb.org/guide/http/local/) and (b) web servers without internal archival capabilities (http://www.mementoweb.org/guide/http/remote/).

(*) The overall motivation for the work is the integration of archived resources into regular web navigation by making them available via their original URIs. The archived resources we have focused on in our experiments so far are those kept by:

(a) Web Archives such as the Internet Archive, WebCite, archive-it.org, and


(b) Content Management Systems such as wikis, CVS, ...

The reason I pinged Chris Bizer about our work is that we thought that our proposed approach could also be of interest in the LoD environment. Specifically, the ability to get to prior descriptions of LoD resources by doing datetime content negotiation on their URI seemed appealing; e.g. what was the DBpedia description for the City of Paris on March 20, 2008? This ability would, for example, allow analysis of (the evolution of) data over time. The requirement that is currently being discussed in this thread (which I interpret to be about approaches to selectively get updates for a certain LoD database) is not one I had considered using Memento for, thinking this was more in the realm of feed technologies such as Atom (as suggested by Ed Summers), or the pre-REST OAI-PMH (http://www.openarchives.org/OAI/openarchivesprotocol.html).
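To make the Paris example concrete, here is a minimal sketch of how a client might build such a request. It assumes the experimental header is named X-Accept-Datetime and carries an HTTP-style date in GMT; the helper name and the Accept value are illustrative only, and no request is actually sent.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Assumption: name of the experimental request header.
EXPERIMENTAL_HEADER = "X-Accept-Datetime"


def datetime_negotiation_headers(when: datetime) -> dict:
    """Build request headers asking for a resource as it was at `when`."""
    if when.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    # Assumption: the value uses the HTTP date format, expressed in GMT.
    http_date = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return {
        "Accept": "application/rdf+xml",  # illustrative media type
        EXPERIMENTAL_HEADER: http_date,
    }


# The description of Paris as it was on March 20, 2008.
uri = "http://dbpedia.org/resource/Paris"
headers = datetime_negotiation_headers(datetime(2008, 3, 20, tzinfo=timezone.utc))
print(uri, headers)
```

The resulting dictionary can be handed to any HTTP client together with the original URI; the point is that the URI itself does not change, only the requested datetime does.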


(*) Regarding some issues that were brought up in the discussion so far:

- We use an X header because that seems to be best practice when doing experimental work. We would very much like to eventually migrate to a real header, e.g. Accept-Datetime.

- We are definitely considering and interested in some way to formalize our proposal in a specification document. We felt that the I-D/RFC path would have been the appropriate one, but are obviously open to other approaches.

- As suggested by Richard, there is a bootstrapping problem, as there is with many new paradigms that are introduced. I trust LoD developers fully understand this problem. Actually, the problem is not only at the browser level but also at the server level. We are currently working on a Firefox plug-in that, when ready, will be available through the regular channels. And we have successfully (and experimentally) modified the Mozilla code itself to be able to demonstrate the approach. We are very interested in getting support in other browsers, natively or via plug-ins. We also have some tools available to help with initial deployment (http://www.mementoweb.org/tools/). One is a plug-in for the MediaWiki platform; when installed, the wiki natively supports datetime content negotiation and redirects a client to the history page that was active at the datetime requested in the X-Accept-Datetime header. We just started a Google group for developers interested in making Memento happen for their web servers, content management systems, etc. (http://groups.google.com/group/memento-dev/).
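The server-side selection described for the wiki plug-in can be sketched as follows. This is not the plug-in's actual code; it is a minimal illustration that assumes the header value is an HTTP-style date and that the server holds a time-sorted list of revisions. The revision URLs and the function name are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import List, Optional, Tuple

# (time the revision went live, hypothetical URL of that revision)
Revision = Tuple[datetime, str]


def active_revision(revisions: List[Revision], header_value: str) -> Optional[str]:
    """Return the URL of the revision that was live at the requested datetime.

    `revisions` must be sorted by timestamp, oldest first.
    Returns None if the resource did not exist yet at that time.
    """
    requested = parsedate_to_datetime(header_value)
    timestamps = [ts for ts, _ in revisions]
    # Number of revisions that went live at or before the requested time.
    index = bisect_right(timestamps, requested)
    if index == 0:
        return None
    return revisions[index - 1][1]


history = [
    (datetime(2008, 1, 5, tzinfo=timezone.utc), "http://wiki.example.org/Paris?oldid=101"),
    (datetime(2008, 3, 1, tzinfo=timezone.utc), "http://wiki.example.org/Paris?oldid=202"),
    (datetime(2008, 6, 15, tzinfo=timezone.utc), "http://wiki.example.org/Paris?oldid=303"),
]

# A server would answer with a redirect to this URL.
target = active_revision(history, "Thu, 20 Mar 2008 00:00:00 GMT")
print(target)
```

Picking the latest revision at or before the requested datetime matches the notion of the page "that was active" at that moment; a request for a time before the first revision has no such page, so the sketch returns None and leaves the choice of response to the server.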

(*) Note that the proposed solution also leverages the OAI-ORE specification (fully compliant with LoD best practice) as a mechanism to support discovery of archived resources.

I hope this helps to get a better understanding of what Memento is about, and what its current status is. Let me end by stating that we would very much like to get these ideas broadly adopted. And we understand we will need a lot of help to make that happen.


Cheers

Herbert



On Nov 22, 2009, at 2:39 AM, Danny Ayers wrote:

2009/11/22 Richard Cyganiak <rich...@cyganiak.de>:

On 20 Nov 2009, at 19:07, Chris Bizer wrote:


[snips]

From a web architecture POV it seems pretty solid to me. Doing stuff via headers is considered bad if you could just as well do it via links and
additional URIs, but you can argue that the time dimension is such a
universal thing that a header-based solution is warranted.


Sounds good to me too, but x-headers are a jump, I think perhaps it's
a question worthy of throwing at the W3C TAG - pretty sure they've
looked at similar stuff in the past, but things are changing fast...

From what I can gather, proper diffs over time are hard (long before
you get to them logics). But Web-like diffs don't have to be - can't
be any less reliable than my online credit card statement. Bit
worrying there are so many different approaches available, sounds like
there could be a lot of coding time wasted.

But then again, might well be one for evolution - and in the virtual
world trying stuff out is usually worth it.

The main drawback IMO is that existing clients, such as all web browsers, will be unable to access the archived versions, because they don't know about the header. If you are archiving web pages or RDF documents, then you could add links that lead clients to the archived versions, but that won't
work for images, PDFs and so forth.


Hmm. For one, browsers are in flux, for two then you probably wouldn't
expect that kind of agent to give you anything but the latest.
If I need last year's version, I follow my nose through URIs (as in svn
etc) - that kind of thing has to be a fallback, imho.

In summary, I think it's pretty cool.


Cool idea, for sure. It is something strong...ok, temporal stuff
should be available down at quite a low level, especially given that
things like xmpp will be bouncing around - but I reckon Richard's
right in suggesting the plain old URI thing will currently serve most
purposes.

Cheers,
Danny.

--
http://danny.ayers.name


==
Herbert Van de Sompel
Digital Library Research & Prototyping
Los Alamos National Laboratory, Research Library
http://public.lanl.gov/herbertv/
tel. +1 505 667 1267
