Re: Conneg representation equivalence
On 11/03/2010 23:55, Nathan wrote:
> Pierre-Antoine Champin wrote:
>> On 11/03/2010 11:04, Toby Inkster wrote:
>>> On Thu, 2010-03-11 at 02:24, Nathan wrote:
>>>> If I have multiple representations of a resource which I consider equal, let's say one of each of the following: RDF+XML, RDF+N3, SVG. Then should all three representations be considered equivalent?
>>> They certainly *could* all represent the same thing. Whether they *do* represent the same thing is a judgement call.
>> Well, if they are accessible via the same URI, using content negotiation, then my reading of the HTTP specification is that they *must be* representations of the same resource. Not sure what Nathan means by equivalent...
> that I consider them semantically equal representations of a resource. for instance the same RDF encoded as N3 and RDF+XML.

I would rather write that they represent the same resource, then. Equivalence is really a matter of what you intend to do with it. An HTML entity and an RDF/XML entity can represent the same resource, but they may not be equivalent to a human reader...

Note that the notion of RDF graph is a tricky one here, because in some cases, this is neither the resource being identified by the URI nor the entity retrieved through HTTP... Take for example http://example.com/john.doe/me

This URI would identify a person (1), represented by an RDF graph (2), serialized in RDF/XML (3). According to HTTP terminology, (1) is the resource and (3) is the entity representing (1), but (2) has no precise status here...

It does not mean that the RDF graph *cannot* be considered as a resource. As a matter of fact, the above URI will probably 303 redirect to something like http://example.com/john.doe/my_foaf_profile which would identify the FOAF profile, and could resolve to RDF/XML or Turtle depending on Conneg, but the common practice is more to jump from the URI of the person to the URI of a *specific* representation (.rdf or .ttl or .html).

>>>> Is it correct that all representations must have consistent fragment identifiers in order to be considered equivalent?
>>> A fragment identifier should not identify different things in different representations. (Though it may be unrepresented in some or all of the representations.)
>> Is that so? If I recall correctly the URI RFC (no internet when writing the mail, sorry), the semantics of fragment identifiers depends on the retrieved content-type. So why would they *have* to identify the same thing? That being said, I agree it sounds like a good practice. Especially if you consider an RDF/XML and a Turtle representation of the same RDF graph... If their fragment identifiers were not consistent, that would be a serious headache... But is this rule written somewhere?
> yeah in awww http://www.w3.org/TR/webarch/#fragid and more precisely in http://www.w3.org/TR/webarch/#frag-coneg

Indeed, I once read that but had forgotten about it (though I wouldn't have dared to behave otherwise ;) Thanks for reminding me :)

pa
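The server-side selection step that this thread calls Conneg can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not a complete implementation of HTTP content negotiation: the names `parse_accept`, `quality`, `negotiate` and the `REPRESENTATIONS` table (including its file names) are hypothetical, and media-type parameters other than `q` are ignored.

```python
# Hypothetical table: one resource, several representations of it.
REPRESENTATIONS = {
    "application/rdf+xml": "my_foaf_profile.rdf",
    "text/turtle": "my_foaf_profile.ttl",
    "text/html": "my_foaf_profile.html",
}


def parse_accept(header):
    """Parse an Accept header into a list of (media_range, q) pairs."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_range, q))
    return prefs


def quality(offered, prefs):
    """Return the q value of the most specific media range matching `offered`."""
    best_specificity, best_q = -1, 0.0
    for media_range, q in prefs:
        if media_range == offered:
            specificity = 2
        elif media_range == "*/*":
            specificity = 0
        elif media_range.endswith("/*") and offered.startswith(media_range[:-1]):
            specificity = 1
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def negotiate(accept_header, available):
    """Pick the offered media type the client prefers, or None if none is acceptable."""
    prefs = parse_accept(accept_header)
    best, best_q = None, 0.0
    for offered in available:
        q = quality(offered, prefs)
        if q > best_q:  # ties go to the type listed first in `available`
            best, best_q = offered, q
    return best
```

Whichever media type is selected, the response is still a representation of the same resource; only the entity sent over the wire differs.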
RDF Serializations
Hi All,

I've been putting some thought into RDF Serializations in the context of linked data; and ever increasingly I'm questioning why I feel the need to offer the same RDF graphs serialized in different formats.

I guess a specific question would be, does anybody operate a linked data consuming library that doesn't support a particular serialization?

I'm starting to see this more as a processing / computation load transfer between server and client, where most clients could easily convert the serialization from one format to another, but prefer to offload this to the server where possible.

What I'm gunning for in the end is to expose all linked data / RDF only as static RDF+XML documents within my application. Would this in any way make the data less linked because some clients don't support RDF+XML, or could I take it for granted that everybody (for instance everybody on this list) could handle this serialization?

Any other comments or thoughts people may have on this topic are more than welcome.

Many Regards,
Nathan
Entity Search @ SEMSEARCH10
(Apologies if you receive multiple copies of this message)

Call for PARTICIPATION: Entity Search @ SEMSEARCH10
==

Fellow Researcher,

for this year's SemSearch workshop to be held at WWW 2010, we are glad to announce a special track for entity search. This is to see where we are and to promote further research on entity retrieval on semantic data. Please refer to the call below for more details on this matter.

As many people were already asking, we would like to make clear that participation in the entity search evaluation is not necessary for SemSearch10. As usual, we accept any papers that address the SEMSEARCH topics.

For news and discussions related to SemSearch and evaluation at SemSearch, please register at http://tech.groups.yahoo.com/group/semsearcheval/.

We are looking forward to seeing you at SemSearch10 in Raleigh, NC!

Cheers,
Marko Grobelnik, Jožef Stefan Institute, Ljubljana, Slovenia
Peter Mika, Yahoo! Research, Barcelona, Spain
Thanh Tran Duc, Institute AIFB, University of Karlsruhe (TH), Germany
Haofen Wang, Apex Lab, Shanghai Jiao Tong University, China.

===
Entity Search @ SEMSEARCH10
Third International Semantic Search Workshop SemSearch10
April 26, 2010, Raleigh, NC, USA
Homepage: http://km.aifb.uni-karlsruhe.de/ws/semsearch10#eva
Submission deadline for descriptions of Entity Search systems results: April 10th, 2010 (12.00 AM, GMT)
===

Our ultimate goal is to develop a benchmark, based on which semantic search systems can be compared and analyzed in a systematic fashion. Clearly, semantics can be used for different tasks (document vs. data retrieval) and can be exploited throughout the search process (for more usable query construction, for better matching and ranking, for richer result presentation etc). Hence, such a benchmark shall enable the study of different aspects of semantic search systems. For this workshop, we will initially focus on the aspects of matching and ranking in the semantic data search scenario.
In particular, we aim to analyze the effectiveness, efficiency and robustness of those features of semantic search systems which are ready to be applied to the Web today: the capability to answer queries related to real world entities. The research questions we aim to tackle are:

- How well do semantic data search engines perform on the task of Entity Search on the Web?
- What are the underlying concepts and techniques that make up the differences?

For answering these questions, we provide the following guidelines and support for evaluating entity search systems:

--- Queries ---

We provide a set of queries that are focused on the task of entity search. Every query is a plain list of keywords which refer to one particular entity. In other words, the queries ask for one particular entity (as opposed to a set of entities). These queries represent a sample extracted from the Yahoo Web search query log. One example of this type is "Semantic Search workshop 2010 WWW", which retrieves resources that are representations of or related to the current Semantic Search workshop. More sample queries can be downloaded from this link: http://km.aifb.uni-karlsruhe.de/ws/semsearch10/Files/samplequeries

Access to the evaluation set of queries, and thus participation in the evaluation, requires the signing of a license agreement: http://km.aifb.uni-karlsruhe.de/ws/semsearch10/Files/agreement

To avoid the effect of ad-hoc optimization, we will make the final queries used for the evaluation available to participants only shortly before the submission deadline.

--- Data ---

We provide a corpus of datasets, which contain entity descriptions in the form of RDF. They represent a sample of Web data crawled from publicly available sources. For this evaluation, we use the Billion Triple Challenge 2009 dataset. Further information and detailed statistics can be found here: http://vmlion25.deri.ie/

The original Billion Triple Challenge 2009 dataset contains blank nodes.
We will not deal with blank nodes in this evaluation and thus require participants to encode blank nodes according to the following rule: BNID maps to http://example.org/URLEncode(BNID), where BNID is the blank node id. Since the blank node ids in that dataset are unique, this convention is sufficient to map blank nodes to distinct URIs. Instead of encoding the blank nodes using this convention, participants can also download the following version of the Billion Triple Challenge 2009 dataset, where blank nodes have already been converted to URIs: http://km.aifb.uni-karlsruhe.de/ws/dataset_semsearch2010/000-CONTENTS

--- Relevance Judgment ---

The search systems produce lists of at
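The blank node rule stated in the call (BNID maps to http://example.org/URLEncode(BNID)) could be sketched in Python as below. The call does not say which URL-encoding variant is meant, nor whether the id includes a `_:` prefix, so both are assumptions here: the hypothetical helper `bnode_to_uri` expects the bare id and percent-encodes every reserved character with `urllib.parse.quote`.

```python
from urllib.parse import quote

# Prefix taken from the rule in the call for participation.
BASE = "http://example.org/"


def bnode_to_uri(bnid):
    """Map a blank node id to a URI by percent-encoding it under BASE.

    Assumption: `bnid` is the bare id without a leading "_:"; safe="" makes
    quote() encode "/" and ":" as well, so distinct ids give distinct URIs.
    """
    return BASE + quote(bnid, safe="")
```

Because percent-encoding is reversible, two different blank node ids can never collide on the same URI, which is the property the call relies on.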
RE: RDF Serializations
> Hi All, I've been putting some thought into RDF Serializations in the context of linked data; and ever increasingly I'm questioning why I feel the need to offer the same RDF graphs serialized in different formats. I guess a specific question would be, does anybody operate a linked data consuming library that doesn't support a particular serialization?

Yes, I expect they do, because unfortunately RDF/XML is still the only officially endorsed W3C standard despite the plethora of other formats. Not every library supports every serialization, and then you have the issue of RDF embedded/implied in other formats (RDFa, microformats, RSS, GRDDL) where support is more patchy.

> I'm starting to see this more as a processing / computation load transfer between server and client, where most clients could easily convert the serialization from one format to another, but prefer to offload this to the server where possible.

A good library should be able to do the transformation efficiently whether at the client or the server end. Unless you're serving massive RDF dumps, where this is infeasible/ill-advised for the server, there's no reason not to offer multiple formats.

> What I'm gunning for in the end, is to only expose all linked data / rdf as static RDF+XML documents within my application - would this in any way make the data less linked because some clients don't support RDF+XML or could I take it for granted that everybody (for instance everybody on this list) could handle this serialization.

Yes, most clients would support RDF/XML as it's the only official W3C standard, but part of the ethos of the whole LOD movement is that the data should be as open as possible. Restricting it to one format limits the openness of the data to some degree.

Personally, from the point of view of someone who both consumes Linked Data and writes parsers and serializers for RDF, I'd prefer to get my RDF in a format other than RDF/XML, such as Turtle, as other formats are typically far easier (and faster) to parse.

Having it available in formats other than RDF/XML also allows for easy scripting. Someone could quite easily write a script to grab RDF in NTriples format from some URI and then dump the triples to the screen without having to use a full-blown RDF library, whereas that's just not possible if you're stuck with getting RDF/XML.

I guess in answer to your question: it doesn't make the data less linked, but it makes it less accessible, i.e. less open.

> Any other comments or thoughts people may have on this topic are more than welcome. Many Regards, Nathan

Rob Vesse
PhD Student
IAM Group
Bay 20, Room 4027, Building 32
Electronics & Computer Science
University of Southampton
SO17 1BJ
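As an illustration of the scripting point above, here is a rough sketch of reading N-Triples with nothing but the Python standard library. It is a simplification under stated assumptions, not a conformant parser: the regular expression covers only the common term forms, `parse_ntriples` and `fetch_ntriples` are hypothetical names, and the `Accept` value assumes the server offers N-Triples under `application/n-triples` (some servers use `text/plain` instead).

```python
import re
import urllib.request

# One N-Triples term: <IRI>, _:blankNode, or "literal" with optional datatype/language tag.
TERM = r'<[^>]*>|_:[A-Za-z0-9_.-]+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?'
# One statement per line: subject, predicate (always an IRI), object, then a full stop.
TRIPLE = re.compile(r'^\s*(' + TERM + r')\s+(<[^>]*>)\s+(' + TERM + r')\s*\.\s*$')


def parse_ntriples(text):
    """Yield (subject, predicate, object) tuples from N-Triples text."""
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE.match(line)
        if match:
            yield match.groups()


def fetch_ntriples(url):
    """Fetch a URI, asking for N-Triples (the media type is an assumption)."""
    request = urllib.request.Request(url, headers={"Accept": "application/n-triples"})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Dumping the triples to the screen is then a loop such as `for s, p, o in parse_ntriples(fetch_ntriples(uri)): print(s, p, o)`. The line-per-statement layout is what makes this feasible; RDF/XML has no comparably simple line-based reading.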
Re: RDF Serializations
Rob Vesse wrote:
> [...]
> I guess in answer to your question it doesn't make the data less linked but it makes it less accessible i.e. open

good answer; thanks :)
Invitation to contribute to DBpedia by improving the infobox mappings + New Scala-based Extraction Framework
Hi all,

in order to extract high quality data from Wikipedia, the DBpedia extraction framework relies on infobox-to-ontology mappings which define how Wikipedia infobox templates are mapped to classes of the DBpedia ontology.

Up to now, these mappings were defined only by the DBpedia team, and as Wikipedia is huge and contains lots of different infobox templates, we were only able to define mappings for a small subset of all Wikipedia infoboxes and also only managed to map a subset of the properties of these infoboxes.

In order to enable the DBpedia user community to contribute to improving the coverage and the quality of the mappings, we have set up a public wiki at http://mappings.dbpedia.org/index.php/Main_Page which contains:

1. all mappings that are currently used by the DBpedia extraction framework
2. the definition of the DBpedia ontology and
3. documentation for the DBpedia mapping language as well as step-by-step guides on how to extend and refine mappings and the ontology.

So if you are using DBpedia data and you were always annoyed that DBpedia did not properly cover the infobox template that is most important to you, you are warmly invited to extend the mappings and the ontology in the wiki. Your edits will be used for the next DBpedia release, expected to be published in the first week of April.

The process of contributing to the ontology and the mappings is as follows:

1. You familiarize yourself with the DBpedia mapping language by reading the documentation in the wiki.
2. In order to prevent random SPAM, the wiki is read-only and new editors need to be confirmed by a member of the DBpedia team (currently Anja Jentzsch does the clearing). Therefore, please create an account in the wiki for yourself. After this, Anja will give you editing rights and you can edit the mappings as well as the ontology.
3. For contributing to the next DBpedia release, you can edit until Sunday, March 21.
After this, we will check the mappings and the ontology definition in the Wiki for consistency and then use both for the next DBpedia release. So, we are starting kind of a social experiment on if the DBpedia user community is willing to contribute to the improvement of DBpedia and on how the DBpedia ontology develops through community contributions :-) Please excuse, that it is currently still rather cumbersome to edit the mappings and the ontology. We are currently working on a visual editor for the mappings as well as a validation service, which will check edits to the mappings and test the new mappings against example pages from Wikipedia. We hope that we will be able to deploy these tools in the next two months, but still wanted to release the wiki as early as possible in order to already allow community contributions to the DBpedia 3.5 release. If you have questions about the wiki and the mapping language, please ask them on the DBpedia mailing list where Anja and Robert will answer them. What else is happening around DBpedia? In order to speed up the data extraction process and to lay a solid foundation for the DBpedia Live extraction, we have ported the DBpedia extraction framework from PHP to Scala/Java. The new framework extracts exactly the same types of data from Wikipedia as the old framework, but processes a single page now in 13 milliseconds instead of the 200 milliseconds. In addition, the new framework can extract data from tables within articles and can handle multiple infobox templates per article. The new framework is available under GPL license in the DBpedia SVN and is documented at http://wiki.dbpedia.org/Documentation. The whole DBpedia team is very thankful to two companies which enabled us to do all this by sponsoring the DBpedia project: 1. Vulcan Inc. as part of its Project Halo (www.projecthalo.com). Vulcan Inc. 
creates and advances a variety of world-class endeavors and high impact initiatives that change and improve the way we live, learn, do business (http://www.vulcan.com/). 2. Neofonie GmbH, a Berlin-based company offering leading technologies in the area of Web search, social media and mobile applications (http://www.neofonie.de/index.jsp). Thank you a lot for your support! I personally would also like to thank: 1. Anja Jentzsch, Robert Isele, and Christopher Sahnwaldt for all their great work on implementing the new extraction framework and for setting up the mapping wiki. 2. Andreas Lange and Sidney Bofah for correcting and extending the mappings in the Wiki. Cheers, Chris -- Prof. Dr. Christian Bizer Web-based Systems Group Freie Universität Berlin +49 30 838 55509 http://www.bizer.de ch...@bizer.de
Re: Conneg representation equivalence
Nathan nat...@webr3.org wrote:
> Is it correct that all representations must have consistent fragment identifiers in order to be considered equivalent?
> A fragment identifier should not identify different things in different representations. (Though it may be unrepresented in some or all of the representations.)

From http://www.w3.org/TR/webarch/#frag-coneg , the fragment identifier must be defined consistently by the representations; the *provider* decides when definitions of fragment identifier semantics are sufficiently consistent...

> If I recall correctly the URI RFC..., the semantics of fragment identifiers depends on the retrieved content-type. So why would they *have* to identify the same thing?

From http://www.w3.org/TR/webarch/#frag-coneg , "Individual data formats may define their own rules for use of the fragment identifier syntax for specifying different types of subsets, views, or external references that are identifiable as secondary resources by that media type."

> That being said, I agree it sounds like a good practice. Especially if you consider an RDF/XML and a Turtle representation of the same RDF graph... If their fragment identifiers were not consistent, that would be a serious headache... But is this rule written somewhere?

The interpretation in http://www.w3.org/TR/webarch/#frag-coneg suggests that it would be correct either for the fragment identifier to be interpreted by the RDF/XML and Turtle representations consistently --- refers to a semantically consistent secondary representation --- or for an interpretation to not be defined at all. It is not acceptable for representations with an interpretation defined to interpret the fragment identifier such that semantically inconsistent secondary representations are returned.

Interpret them consistently, or don't interpret them, but don't do it inconsistently!

So maybe an answer to Nathan's question needs qualification: IF for each representation of a resource an interpretation for a given fragment identifier has been defined, AND we assume that the server is exhibiting correct behavior, THEN we must accept that the secondary representations meet the provider's definition of consistent.

Consistency is not the same as equivalence; two representations might return consistent secondary representations that cannot be considered equal because they are of entirely different content types.

--
John S. Erickson, Ph.D.
http://bitwacker.wordpress.com
olyerick...@gmail.com
Twitter: @olyerickson
Re: Conneg representation equivalence
On 11 Mar 2010, at 20:34, Pierre-Antoine Champin wrote:
>>> Is it correct that all representations must have consistent fragment identifiers in order to be considered equivalent?
>> A fragment identifier should not identify different things in different representations. (Though it may be unrepresented in some or all of the representations.)
> Is that so? If I recall correctly the URI RFC (no internet when writing the mail, sorry), the semantics of fragment identifiers depends on the retrieved content-type.

Correct, see [1]:

“The semantics of a fragment identifier are defined by the set of representations that might result from a retrieval action on the primary resource. The fragment's format and resolution is therefore dependent on the media type [RFC2046] of a potentially retrieved representation, even though such a retrieval is only performed if the URI is dereferenced.”

> So why would they *have* to identify the same thing?

You just have to read one paragraph down:

“If the primary resource has multiple representations, as is often the case for resources whose representation is selected based on attributes of the retrieval request (a.k.a., content negotiation), then whatever is identified by the fragment should be consistent across all of those representations. Each representation should either define the fragment so that it corresponds to the same secondary resource, regardless of how it is represented, or should leave the fragment undefined (i.e., not found).”

Best,
Richard

[1] http://www.ietf.org/rfc/rfc3986.txt

> That being said, I agree it sounds like a good practice. Especially if you consider an RDF/XML and a Turtle representation of the same RDF graph... If their fragment identifiers were not consistent, that would be a serious headache... But is this rule written somewhere? pa
URI Fragments
Hi Again :)

Last question(s) related to fragments.. if I have:

http://example.org/something
http://example.org/something#a

Those are two unique URIs and thus two unique resources (?)

And the semantics of a fragment means that http://example.org/something#a is a secondary resource, where http://example.org/something is the primary resource (?)

Then if I delete a primary resource, the secondary resources must also be deleted, true / false (?).

Here are some examples, which may seem like overkill, but some are interesting, and generally I *feel* rules like this should be either always true or always false, never varying.

examples:
- if I remove a database table, then all its rows also no longer exist.
- if I remove London then the Tower of London also no longer exists.
- if somebody removes me, then my arms also no longer exist.
- if I remove test.html then test.html#whatever no longer exists.
- if I remove test.rdf then test.rdf#this no longer exists.
- if I remove http://www.w3.org/People/Berners-Lee/card then http://www.w3.org/People/Berners-Lee/card#i no longer exists.

conversely:
- if I remove a row, the table still exists.
- if I remove the Tower of London, London still exists.
- if you remove my arms, I still exist and I'll find another way to type.
- if I remove test.html#whatever, test.html still exists.
- if I remove test.rdf#this, test.rdf still exists.
- if I remove http://www.w3.org/People/Berners-Lee/card#i then http://www.w3.org/People/Berners-Lee/card still exists.

If the above is true (secondary resource must also be deleted on removal of primary resource), then I should never use a fragment identifier to refer to a non-virtual object (i.e. me, a Person) - because I can't be deleted by simply removing a resource. (?)

Regards!
Nathan
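For reference, the split between the primary URI and the fragment that these questions turn on can be shown with Python's standard library. This is only a sketch of the syntactic relationship; the helper names `split_fragment` and `same_document` are hypothetical, and the code says nothing about what either URI identifies.

```python
from urllib.parse import urldefrag


def split_fragment(uri):
    """Return (primary_uri, fragment); the fragment is "" when absent."""
    primary, fragment = urldefrag(uri)
    return primary, fragment


def same_document(uri_a, uri_b):
    """True when both URIs share the same primary (fragment-less) URI.

    An HTTP client only ever sends the primary URI to the server; the
    fragment is resolved on the client against the retrieved representation.
    """
    return split_fragment(uri_a)[0] == split_fragment(uri_b)[0]
```

So removing the document at the primary URI makes the hash URI non-dereferenceable as well, but the hash URI remains a distinct identifier that other documents can continue to use.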
Notes on RDFa For Turtles
I’ve gotten some great feedback and spent some time looking at specs, and here are some thoughts.

“RDFa For Turtles” is an RDFa subset (profile?) that makes it easy to specify triples in the HEAD of an HTML document. I intend to distill it into a short “HOWTO” document that any webmaster can understand and correctly apply. Towards that goal, it fixes certain choices to improve interoperability. I'm interested in any feedback that can further improve interoperability.

---

Now, it’s tempting to say that RDFa documents can be ‘duck typed’; that is, try to extract some RDFa triples, and if you get some, it’s an RDFa document. I see one problem with that.

Superficially, it looks like RDFa should be able to extract old-school

    <link rel="{p}" href="{o}" />

elements from the headers of legacy HTML and XHTML documents. (Similarly, people are being told to add <a rel="license" href="{o}"> to today’s HTML documents.)

These constructions have the desired effect when the base of the document is not specified; however, the use of <base href="{base}"> causes the RDFa interpretation of these constructs to be different from that in legacy systems.

To avoid this and other problems, “RDFa For Turtles” documents use the @about attribute of the html element to explicitly specify the URL of the current document. “RDFa For Turtles” documents are not allowed to assert triples about the current document unless the URI of that document is specified explicitly.

It would be nice to have a reliable and simple way to make statements about the current document for documents that are being written by hand, but I don’t think there is any, at least not if the base element is in use.

---

When used in an XHTML document, “RDFa For Turtles” uses the standard DOCTYPE for XHTML+RDFa documents and is completely conformant with the XHTML+RDFa specification.

“RDFa For Turtles” embedding in HTML will be based on the HTML 5 + RDFa specification, http://dev.w3.org/html5/rdfa/rdfa-module.html but will not require the use of a specific DOCTYPE. “RDFa For Turtles” will follow changes in HTML5 + RDFa as the standard matures.

Conformant “RDFa for Turtles” HTML documents have the following characteristics:

(i) One or more RDFa statements can be extracted from the head using the HTML 5 + RDFa rules
(ii) No statements are made without an explicit @about; the head element cannot contain an @about attribute; the html element can contain only an @about attribute that points to the current document URI
(iii) RDFa statements specified in the head must use the restricted “RDFa For Turtles” vocabulary, however
(iv) Arbitrary RDFa statements are allowed elsewhere in the document, subject only to rule (ii)

“RDFa For Turtles” uses xmlns notation to define CURIE prefixes, as do current HTML+RDF specifications, but will track the change if a new mechanism is introduced.

---

Here’s the “RDFa For Turtles” vocabulary: all of these go into the head.

If @about is specified explicitly in the html element, we can write

    <meta property={predicate} content={object}>
    <meta property={predicate} content={object}>

to assert triples about the current document. We can also use @datatype here, @lang in HTML documents, and @xml:lang in XHTML documents.

We can also write

    <link rel="{reserved_value}" href="{object}">
    <link rev="{reserved_value}" href="{subject}">

and

    <link rel="{curie_predicate}" resource="{object}">
    <link rel="{curie_predicate}" resource="{subject}">

I think the differential use of href and resource here maximizes backwards and forwards compatibility. It is non-conformant to use legacy link/@rel elements if @about is not set in the html element.

Also,

    <link about="{any_subject}" rel="{reserved_value}" href="{object}">
    <link about="{any_subject}" rel="{reserved_value}" href="{object}">

is disallowed because legacy clients could misinterpret it.

If we want to assert predicates such as “alternate” or “cite” about documents that are not the present document, we need to add a namespace declaration like

    xmlns:xhv="http://www.w3.org/1999/xhtml/vocab#"

and then write the predicate as a CURIE:

    <link about="{any_subject}" rel="xhv:{reserved_value}" resource="{object}">

Of course, to assert triples about other things, we may add @about to link and meta.

The shorthand form

    <link about="{subject}" typeof="{object}">

is equivalent to

    <link about="{subject}" predicate="rdf:type" value="{object}">

@about is required when using @typeof.

RDFa For Turtles allows (but discourages) the creation of blank nodes with the safe CURIE syntax

    [_:suffix]

RDFa For Turtles supports the full range of possible syntax in @about, @resource, @rel, @rev and @property attributes (other than the compatibility restrictions above).

I’m planning, however, to split the “RDFa For Turtles” spec to have a basic half that leaves out certain features (@typeof, use of Safe CURIEs, and space-separated CURIE/URI lists) and an advanced half that adds a few dashes of syntactic sugar.
Re: URI Fragments
Nathan,

I'm not sure it's correct to refer to your examples as primary and secondary resources. As you point out, it is not true that

> if I remove http://www.w3.org/People/Berners-Lee/card then http://www.w3.org/People/Berners-Lee/card#i no longer exists.

since the first URI refers to an information resource, while the second refers to a non-information resource.

You seem to take this as an argument against fragment identifiers, but it doesn't just apply to hash URIs. If the server www.w3.org goes down, then all URIs that it dereferences become non-dereferenceable, whether they are hash URIs or slash URIs.

Now, must we stop using a URI when the server that dereferences it goes down? I think there are cases where the answer is no, where it makes sense to continue using the URI as an identifier, even if the URI is no longer valid as an address. In the above case, there are many webpages making assertions about http://www.w3.org/People/Berners-Lee/card#i , and those assertions are valid, regardless of the existence of the server www.w3.org.

Making those assertions easy to find might be a challenge, of course, which is why I would like to see RDF browsers do more than simply issue a GET on a URI when trying to resolve it.

Joel.

On Fri, 12 Mar 2010, Nathan wrote:
> [...]
Re: RDF Serializations
On Fri, Mar 12, 2010 at 5:23 AM, Nathan nat...@webr3.org wrote: What I'm gunning for in the end, is to only expose all linked data / rdf as static RDF+XML documents within my application - would this in any way make the data less linked because some clients don't support RDF+XML or could I take it for granted that everybody (for instance everybody on this list) could handle this serialization? At some point you've got to put your foot down and stop supporting new output formats. If I invent one tomorrow, that doesn't mean you have to support it. There are three standards that I see in use: (i) RDF/XML for relatively small triple sets (triples about a subject) that are published in the typical linked data style and are not embedded in other documents. RDF/XML is particularly used for bnode-heavy applications such as OWL schemas. (ii) RDFa for embedding triples in other documents, again in the linked data context where any individual document contains just a small fraction of the data in the system. (iii) Turtle-family serializations for large whole-system dumps, such as the DBpedia dumps. RDF/XML isn't my favorite serialization, but it really does seem to be the most widespread; I think all linked data systems are going to support it for input, and ought to support it for output, unless they are going the "publish RDFa embedded in documents" route. I think the software complexity argument against RDF/XML is weak these days, because we've had a decade to get good RDF/XML parsers. If you're working in some mainstream language, it's just something you can download and run. My more serious beef with RDF/XML is pedagogical: it's not a good way to teach people RDF because it's not immediately obvious to the beginner where exactly the triples are. RDF data modelling is actually incredibly simple, but you wouldn't know that if you started with RDF/XML. Turtle, however, helps you understand RDF at a triple-by-triple level...
Once you've gotten some experience with that, RDF/XML makes a lot more sense.
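As a rough illustration of the "triple-by-triple" point (not something posted in the thread), here is a minimal Python sketch that prints a tiny graph one triple per line as N-Triples, the simplest member of the Turtle family. The example.org URI, the sample data and the helper name `to_ntriples` are assumptions made up for this example; it only handles URIs and simple one-line literals.

```python
# Hypothetical example data: each tuple is one (subject, predicate, object) triple.
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ME = "http://example.org/people/alice#me"  # made-up subject URI

TRIPLES = [
    (ME, RDF_TYPE, ("uri", FOAF + "Person")),
    (ME, FOAF + "name", ("literal", "Alice")),
]


def to_ntriples(triples):
    """Serialize (s, p, o) tuples as N-Triples lines.

    Simplified on purpose: objects are either URIs or plain one-line
    literals (no language tags, datatypes, blank nodes or newline escapes).
    """
    lines = []
    for s, p, (kind, value) in triples:
        if kind == "uri":
            obj = f"<{value}>"
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ntriples(TRIPLES))
```

Each output line is exactly one statement, which is what makes this style easy to read for a beginner; the same two statements in RDF/XML would be nested inside a single `rdf:Description` element.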
Re: URI Fragments
Hi Nathan, On 12 Mar 2010, at 14:00, Nathan wrote: Last question(s) related to fragments.. if I have: http://example.org/something http://example.org/something#a Those are two unique URIs and thus two unique resources (?) Yes. And the semantics of a fragment means that http://example.org/something#a is a secondary resource, where http://example.org/something is the primary resource (?) Then if I delete a Primary resource, the secondary resources must also be deleted, true / false (?). Here's my take on this. The web is about representations of information resources. If you add RDF to the picture, then it's also about descriptions of arbitrary entities. On the web, you can create and delete representations. You can create and delete descriptions. But you cannot create or delete resources. For example, if you do an HTTP DELETE request to a URI, the representations at that URI are deleted. As a side effect, something in your system (file, database record, purchase order) might be deleted as well, because your system intrinsically connects the representation to that system-internal entity, but that side effect is part of the application's internals and not a concern for the web interface. So, you can't really “delete” those primary and secondary resources. But if you delete all the representations of a primary resource, then this will delete the authoritative descriptions of the secondary resources, because those live inside the representations. Best, Richard Here are some examples, which may seem like over kill but some are interesting and generally I *feel* rules like this should be either always true, or always false, never varying. examples: if I remove a database table, then all it's rows also no longer exist. if I remove London then the Tower of London also no longer exists. if somebody removes me, then my arms also no longer exist. if I remove test.html then test.html#whatever no longer exists. 
if I remove test.rdf then test.rdf#this no longer exists if I remove http://www.w3.org/People/Berners-Lee/card then http://www.w3.org/People/Berners-Lee/card#i no longer exists. conversely: if I remove a row, the table still exists if I remove the Tower of London, London still exists if you remove my arms, I still exists and I'll find another way to type. if I remove test.html#whatever test.html still exists if I remove test.rdf#this, test.rdf still exists if I remove http://www.w3.org/People/Berners-Lee/card#i then http://www.w3.org/People/Berners-Lee/card still exists. If the above is true (secondary resource must also be deleted on removal of primary resource), then I should never use a fragment Identifier to refer to a non-virtual object (i.e. me a Person) - because I can't be deleted by simply removing a resource. (?) Regards! Nathan
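Richard's point that the descriptions of secondary resources "live inside the representations" of the primary resource follows from how clients handle fragments: the fragment is stripped before the request is made, so the server only ever sees the primary URI. A minimal sketch using the Python standard library (the helper name `request_target` is an assumption made up for this example):

```python
from urllib.parse import urldefrag


def request_target(uri):
    """Split a URI into the part sent over HTTP and the client-side fragment."""
    primary, fragment = urldefrag(uri)
    return primary, fragment


if __name__ == "__main__":
    primary, fragment = request_target("http://www.w3.org/People/Berners-Lee/card#i")
    print(primary)   # the only part a server is asked for
    print(fragment)  # interpreted by the client against whatever representation comes back
```

So an HTTP DELETE (or GET) can only ever target `.../card`; what happens to `.../card#i` is a consequence of what the retrieved representations say, not something the protocol acts on directly.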
Re: URI Fragments
joel sachs wrote: Nathan, I'm not sure it's correct to refer to your examples as primary and secondary resources. As you point out, it is not true that if I remove http://www.w3.org/People/Berners-Lee/card then http://www.w3.org/People/Berners-Lee/card#i no longer exists. since the first URI refers to an information resource, while the second refers to a non-information resource. with regards primary and secondary: http://www.ietf.org/rfc/rfc3986.txt states: The fragment identifier component of a URI allows indirect identification of a secondary resource by reference to a primary resource and additional identifying information. The identified secondary resource may be some portion or subset of the primary resource, some view on representations of the primary resource, or some other resource defined or described by those representations. http://www.w3.org/TR/webarch/#fragid states: The fragment identifier component of a URI allows indirect identification of a secondary resource by reference to a primary resource and additional identifying information hence the usage :) You seem to take this as an argument against fragment identifiers, but it doesn't just apply to hash URIs. If the server www.w3.org goes down, then all URIs that it dereferences become non-dereferenceable, whether they are hash URIs or slash URIs. I hope this doesn't come across wrong, but if we're saying that one server can go down, then all servers can go down; and all linked data can no longer be dereferenced; so for the sake of this conversation and the whole linked data thing in general I'm most interested in addressing what to do whilst the servers are live. 
As for an argument against fragment identifiers - I'm not against them; however if we take the case of TimBL's card; personally I can't see any reason why he couldn't have a personal uri of say http://www.w3.org/TimBL which 303 See Other through to his card; then his personal uri is a resource all of its own and independent of any representation; thus allowing representations to be moved around / deleted without any effect on his personal URI, and further allow for multiple information resources describing him, with different media-types. This is digressing though and probably not worth discussing. Now, must we stop using a URI when the server that dereferences it goes down? I think there are cases where the answer is no, where it makes sense to continue using the URI as an identifier, even if the URI is no longer valid as an address. In the above case, there are many webpages making assertions about http://www.w3.org/People/Berners-Lee/card#i , and those assertions are valid, regardless of the existence of the server www.w3.org. Making thoses assertions easy to find might be a challenge, of course, which is why I would like to see rdf browsers do more than simply issue a GET on a URI when trying to resolve it. Unsure, and I think about these things often; all I will say is that it appears the web of data is fault tolerant thanks in part to sameAs and dcterms:replaces - personally I would like to see usage of 410 Gone and 301 Moved Permanently to help matters but that's a different (yet related) topic. Back to the point in hand, surely if I delete a Primary resource, the secondary resources must also be deleted, and this stands true in 99% of use-cases; would that not indicate that the use-case where it doesn't appear like it should be true may be implemented incorrectly? especially if that same use-case can be implemented in a different way and allow this statement to be true 100% of the time. (trying to tread cautiously here)! 
regards, nathan On Fri, 12 Mar 2010, Nathan wrote: Hi Again :) Last question(s) related to fragments.. if I have: http://example.org/something http://example.org/something#a Those are two unique URIs and thus two unique resources (?) And the semantics of a fragment means that http://example.org/something#a is a secondary resource, where http://example.org/something is the primary resource (?) Then if I delete a Primary resource, the secondary resources must also be deleted, true / false (?). Here are some examples, which may seem like over kill but some are interesting and generally I *feel* rules like this should be either always true, or always false, never varying. examples: if I remove a database table, then all it's rows also no longer exist. if I remove London then the Tower of London also no longer exists. if somebody removes me, then my arms also no longer exist. if I remove test.html then test.html#whatever no longer exists. if I remove test.rdf then test.rdf#this no longer exists if I remove http://www.w3.org/People/Berners-Lee/card then http://www.w3.org/People/Berners-Lee/card#i no longer exists. conversely: if I remove a row, the table still exists if I remove the Tower of London, London still exists if you remove my arms, I still exists
Re: URI Fragments
Thanks for your reply Richard, I'm going to go balls-out today and challenge a bit of this for the sake of argument: Richard Cyganiak wrote: Hi Nathan, On 12 Mar 2010, at 14:00, Nathan wrote: Last question(s) related to fragments.. if I have: http://example.org/something http://example.org/something#a Those are two unique URIs and thus two unique resources (?) Yes. And the semantics of a fragment means that http://example.org/something#a is a secondary resource, where http://example.org/something is the primary resource (?) Then if I delete a Primary resource, the secondary resources must also be deleted, true / false (?). Here's my take on this. The web is about representations of information resources. If you add RDF to the picture, then it's also about descriptions of arbitrary entities. On the web, you can create and delete representations. You can create and delete descriptions. But you cannot create or delete resources. I'd argue that a resource is anything that can be named (or assigned a URI), regardless of whether it has a representation or not. Even without a representation a resource could still be reserved (which allows references to be made to a concept before any realization of that concept exists - although I've yet to confirm if 204 could be used for this..); in another use-case though a resource like /news/latest may be nothing more than a conceptual map to another resource (served via a 3xx code) - this is a resource with no representation, which can be both created and deleted surely? In another case; let's say planned to lease a /London_Office (resource) which I then described with a representation and 303'd to; then I decided not to lease the /London_Office so deleted the representation /and/ the resource because /London_Office isn't something that can be named because it no longer exists, was never realized, and moreover I want it removed because it was a painful loss. Thus, can you delete resources? 
or another way, can you delete a conceptual map? I can't really respond to anything below this until the aforementioned has been addressed; other than one small point. For example, if you do an HTTP DELETE request to a URI, the representations at that URI are deleted. As a side effect, something in your system (file, database record, purchase order) might be deleted as well, because your system intrinsically connects the representation to that system-internal entity, but that side effect is part of the application's internals and not a concern for the web interface. So, you can't really “delete” those primary and secondary resources. But if you delete all the representations of a primary resource, then this will delete the authoritative descriptions of the secondary resources, because those live inside the representations. if I remove the section and the reference test.html#whatever from test.html; have I not deleted that secondary resource? it can't be named any more, or referenced, or.. and so on Best, Richard Thanks again, Nathan Here are some examples, which may seem like over kill but some are interesting and generally I *feel* rules like this should be either always true, or always false, never varying. examples: if I remove a database table, then all it's rows also no longer exist. if I remove London then the Tower of London also no longer exists. if somebody removes me, then my arms also no longer exist. if I remove test.html then test.html#whatever no longer exists. if I remove test.rdf then test.rdf#this no longer exists if I remove http://www.w3.org/People/Berners-Lee/card then http://www.w3.org/People/Berners-Lee/card#i no longer exists. conversely: if I remove a row, the table still exists if I remove the Tower of London, London still exists if you remove my arms, I still exists and I'll find another way to type. 
if I remove test.html#whatever test.html still exists if I remove test.rdf#this, test.rdf still exists if I remove http://www.w3.org/People/Berners-Lee/card#i then http://www.w3.org/People/Berners-Lee/card still exists. If the above is true (secondary resource must also be deleted on removal of primary resource), then I should never use a fragment Identifier to refer to a non-virtual object (i.e. me a Person) - because I can't be deleted by simply removing a resource. (?) Regards! Nathan
Re: RDF Serializations
On 12 Mar 2010, at 10:41, Rob Vesse wrote: Hi All, I've been putting some thought in to RDF Serializations in the context of linked data; and ever increasingly I'm questioning why I feel the need to offer the same RDF graphs serialized in different formats. I guess a specific questions would be, does anybody operate a linked data consuming library that doesn't support a particular serialization? Yes I expect they do because unfortunately RDF/XML is still the only officially endorsed W3C standard... No longer true: RDFa in XHTML is a recommendation.[1] I've been corrected on that in the past :-) N-Triples is, together with RDF/XML, the best supported serialisation. Damian [1] http://www.w3.org/TR/rdfa-syntax/
Re: URI Fragments
Nathan wrote: Hi Again :) Last question(s) related to fragments.. if I have: http://example.org/something http://example.org/something#a Those are two unique URIs and thus two unique resources (?) My world view (i.e. I don't do Resource and Information Resource lingo): Careless and dangerous, but accurate. 1. http://example.org/something -- a resource URI 2. http://example.org/something#a -- a resource URI Less confusing, assuming you are have a # terminated URI pattern in play: 1. http://example.org/something -- a resource URL 2. http://example.org/something#a -- a data object URI (if we are talking about a commonly used Linked Data pattern, then URL above would be conduit to the EAV model based representation of the description of this data object) And the semantics of a fragment means that http://example.org/something#a is a secondary resource, where http://example.org/something is the primary resource (?) Sorta. Then if I delete a Primary resource, the secondary resources must also be deleted, true / false (?). Not necessarily, this really depends on the Linked Data pattern you've adopted re. generic HTTP URIs. Basically, the pattern you've adopted such that that you to Reference a Data Object and Access a Representation of its Description via a single URI. Here are some examples, which may seem like over kill but some are interesting and generally I *feel* rules like this should be either always true, or always false, never varying. examples: if I remove a database table, then all it's rows also no longer exist. if I remove London then the Tower of London also no longer exists. if somebody removes me, then my arms also no longer exist. if I remove test.html then test.html#whatever no longer exists. if I remove test.rdf then test.rdf#this no longer exists if I remove http://www.w3.org/People/Berners-Lee/card then http://www.w3.org/People/Berners-Lee/card#i no longer exists. 
No, you've lost access to description of: http://www.w3.org/People/Berners-Lee/card#i, of course it still exists :-) conversely: if I remove a row, the table still exists if I remove the Tower of London, London still exists if you remove my arms, I still exists and I'll find another way to type. if I remove test.html#whatever test.html still exists if I remove test.rdf#this, test.rdf still exists if I remove http://www.w3.org/People/Berners-Lee/card#i then http://www.w3.org/People/Berners-Lee/card still exists. How do you remove: http://www.w3.org/People/Berners-Lee/card#i ? Let's say you take it out of http://www.w3.org/People/Berners-Lee/card, then for agents that seek description of http://www.w3.org/People/Berners-Lee/card#i via aforementioned URL, you get nothing. Nothing stops the http://www.w3.org/People/Berners-Lee/card#i description existing in my linked data space :-) If the above is true (secondary resource must also be deleted on removal of primary resource), Not true . then I should never use a fragment Identifier to refer to a non-virtual object (i.e. me a Person) - because I can't be deleted by simply removing a resource. (?) Best to think about the issue of Identifier as absolutely distinct from Representation. Links: 1. http://www.cs.cmu.edu/afs/cs.cmu.edu/user/clamen/OODBMS/Manifesto/htManifesto/node4.html -- might come in handy re. Identifier matters . Kingsley Regards! Nathan -- Regards, Kingsley Idehen President CEO OpenLink Software Web: http://www.openlinksw.com Weblog: http://www.openlinksw.com/blog/~kidehen Twitter/Identi.ca: kidehen
Re: URI Fragments (typo fixed version)
Nathan wrote: Hi Again :) Last question(s) related to fragments.. if I have: http://example.org/something http://example.org/something#a Those are two unique URIs and thus two unique resources (?) My world view (i.e. I don't do Resource and Information Resource lingo): Careless and dangerous, but accurate. 1. http://example.org/something -- a resource URI 2. http://example.org/something#a -- a resource URI Less confusing, assuming you are have a # terminated URI pattern in play: 1. http://example.org/something -- a resource URL 2. http://example.org/something#a -- a data object URI (if we are talking about a commonly used Linked Data pattern, then URL above would be conduit to the EAV model based representation of the description of this data object) And the semantics of a fragment means that http://example.org/something#a is a secondary resource, where http://example.org/something is the primary resource (?) Sorta. Then if I delete a Primary resource, the secondary resources must also be deleted, true / false (?). Not necessarily, this really depends on the Linked Data pattern you've adopted re. generic HTTP URIs. Basically, the pattern you've adopted such that: you can Reference a Data Object and Access a Representation of its Description, via a single Generic HTTP URI. Here are some examples, which may seem like over kill but some are interesting and generally I *feel* rules like this should be either always true, or always false, never varying. examples: if I remove a database table, then all it's rows also no longer exist. if I remove London then the Tower of London also no longer exists. if somebody removes me, then my arms also no longer exist. if I remove test.html then test.html#whatever no longer exists. if I remove test.rdf then test.rdf#this no longer exists if I remove http://www.w3.org/People/Berners-Lee/card then http://www.w3.org/People/Berners-Lee/card#i no longer exists. 
No, you've lost access to description of: http://www.w3.org/People/Berners-Lee/card#i, of course it still exists :-) conversely: if I remove a row, the table still exists if I remove the Tower of London, London still exists if you remove my arms, I still exists and I'll find another way to type. if I remove test.html#whatever test.html still exists if I remove test.rdf#this, test.rdf still exists if I remove http://www.w3.org/People/Berners-Lee/card#i then http://www.w3.org/People/Berners-Lee/card still exists. How do you remove: http://www.w3.org/People/Berners-Lee/card#i ? Let's say you take it out of http://www.w3.org/People/Berners-Lee/card, then for agents that seek description of http://www.w3.org/People/Berners-Lee/card#i via aforementioned URL, you get nothing. Nothing stops the http://www.w3.org/People/Berners-Lee/card#i description existing in my linked data space :-) If the above is true (secondary resource must also be deleted on removal of primary resource), Not true . then I should never use a fragment Identifier to refer to a non-virtual object (i.e. me a Person) - because I can't be deleted by simply removing a resource. (?) Best to think about the issue of Identifier as absolutely distinct from Representation. Links: 1. http://www.cs.cmu.edu/afs/cs.cmu.edu/user/clamen/OODBMS/Manifesto/htManifesto/node4.html -- might come in handy re. Identifier matters . -- Regards, Kingsley Idehen President CEO OpenLink Software Web: http://www.openlinksw.com Weblog: http://www.openlinksw.com/blog/~kidehen Twitter/Identi.ca: kidehen
Re: URI Fragments (typo fixed version)
Kingsley Idehen wrote: How do you remove: http://www.w3.org/People/Berners-Lee/card#i ? Let's say you take it out of http://www.w3.org/People/Berners-Lee/card, then for agents that seek description of http://www.w3.org/People/Berners-Lee/card#i via aforementioned URL, you get nothing. Nothing stops the http://www.w3.org/People/Berners-Lee/card#i description existing in my linked data space :-) Exactly. How *do* you remove a resource from the web of linked data? Let's just suppose that the high court has instructed it; it *must* happen. How?
Re: URI Fragments
joel sachs wrote: Nathan, A couple of points ... On Fri, 12 Mar 2010, Nathan wrote: joel sachs wrote: Nathan, I'm not sure it's correct to refer to your examples as primary and secondary resources. As you point out, it is not true that if I remove http://www.w3.org/People/Berners-Lee/card then http://www.w3.org/People/Berners-Lee/card#i no longer exists. since the first URI refers to an information resource, while the second refers to a non-information resource. with regards primary and secondary: http://www.ietf.org/rfc/rfc3986.txt states: The fragment identifier component of a URI allows indirect identification of a secondary resource by reference to a primary resource and additional identifying information. The identified secondary resource may be some portion or subset of the primary resource, some view on representations of the primary resource, or some other resource defined or described by those representations. http://www.w3.org/TR/webarch/#fragid states: The fragment identifier component of a URI allows indirect identification of a secondary resource by reference to a primary resource and additional identifying information hence the usage :) But http://www.w3.org/TR/webarch/#fragid stresses that the terms primary and secondary apply only in the context of a particular URI; they do not imply any sort of you must have the first to have the second significance. This is in contrast to database tables and database rows, or London and the Tower of London. So lessons that you draw from resources that are primary and secondary in an ontological sense do not apply to the resources in a fragment URI. (I'm referring here to your last sentence of this email, which states that if I delete a Primary resource, the secondary resources must also be deleted, and this stands true in 99% of use-cases; would that not indicate that the use-case where it doesn't appear like it should be true may be implemented incorrectly?) 
noted; we can agree though that if you remove the representation then the fragment-identified resource can no longer be dereferenced, yes? whereas if you use a non-fragment uri to identify a resource and 303 it through to its representation, then you are free to have multiple mediatypes and move the representation without losing the dereferencing capability. or am I wrong here? You seem to take this as an argument against fragment identifiers, but it doesn't just apply to hash URIs. If the server www.w3.org goes down, then all URIs that it dereferences become non-dereferenceable, whether they are hash URIs or slash URIs. I hope this doesn't come across wrong, but if we're saying that one server can go down, then all servers can go down; and all linked data can no longer be dereferenced; Surely the Web of Data should have a little bit of fault tolerance? yes, which is why I said: Unsure, and I think about these things often; all I will say is that it appears the web of data is fault tolerant thanks in part to sameAs and dcterms:replaces - personally I would like to see usage of 410 Gone and 301 Moved Permanently to help matters but that's a different (yet related) topic. regards!
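The hash-versus-303 comparison Nathan describes can be sketched with a toy in-memory "server". This is only an illustration under stated assumptions: the example.org URLs, the dict-based `server` model and the helper names `dereference` and `move_document` are all made up here, and a real publisher could also leave a 301 at the old document URL, which the sketch deliberately omits.

```python
from urllib.parse import urldefrag


def dereference(uri, server, max_hops=5):
    """Return (status, final_url) for a GET, following 301/303 redirects.

    `server` is a toy dict mapping a URL to (status, location_or_None).
    """
    url, _fragment = urldefrag(uri)  # the fragment is never sent to the server
    for _ in range(max_hops):
        status, location = server.get(url, (404, None))
        if status in (301, 303) and location:
            url = location
            continue
        return status, url
    raise RuntimeError("too many redirects")


def move_document(server, old, new):
    """Move a document to a new URL and repoint redirects that targeted it."""
    server[new] = server.pop(old)
    for url, (status, location) in list(server.items()):
        if location == old:
            server[url] = (status, new)


# Hash pattern: the person is card#i, described inside the document /card.
hash_server = {"http://example.org/card": (200, None)}

# Slash pattern: the person URI 303-redirects to a document describing them.
slash_server = {
    "http://example.org/nathan": (303, "http://example.org/card"),
    "http://example.org/card": (200, None),
}

if __name__ == "__main__":
    for server in (hash_server, slash_server):
        move_document(server, "http://example.org/card", "http://example.org/profile.rdf")
    print(dereference("http://example.org/card#i", hash_server))
    print(dereference("http://example.org/nathan", slash_server))
```

In this toy model the hash URI stops resolving once its document moves, while the 303'd URI keeps working because only the redirect target changes. Whether that difference matters in practice depends on how much control the publisher has over the old document URL.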
Re: URI Fragments (typo fixed version)
Hi, exactly.. how *DO* you remove a resource from the web of linked data? let's just suppose that the high court has instructed it; it *must* happen - how? What would you do for a document? It's on your web site. It's also in the Google cache and the Wayback Machine. What do you do? What are your legal requirements, and what are your practical limitations? Cheers, L. -- Leigh Dodds Programme Manager, Talis Platform Talis leigh.do...@talis.com http://www.talis.com
Re: URI Fragments
Hi Nathan, On 12 Mar 2010, at 16:46, Nathan wrote: Then if I delete a Primary resource, the secondary resources must also be deleted, true / false (?). The web is about representations of information resources. If you add RDF to the picture, then it's also about descriptions of arbitrary entities. On the web, you can create and delete representations. You can create and delete descriptions. But you cannot create or delete resources. I'd argue that a resource is anything that can be named (or assigned a URI), regardless of whether it has a representation or not. Even without a representation a resource could still be reserved (which allows references to be made to a concept before any realization of that concept exists - although I've yet to confirm if 204 could be used for this..); I would agree with everything above. But I'd say that from the web POV, you can't do anything useful with a resource that is reserved but doesn't have a representation or provides some other useful response (such as 303) when resolved. in another use-case though a resource like /news/latest may be nothing more than a conceptual map to another resource (served via a 3xx code) - Side note: I interpret 301, 302 and 307 as “try over there to get a representation of this resource”. So if you get a representation from the target, then I'd consider that a representation of the original resource. This interpretation is not backed by any spec or other authoritative document, but for me it makes the picture cleaner. 303 of course is explicit about that the representation is of a *different* resource. this is a resource with no representation, which can be both created and deleted surely? Well it can be created and deleted, but the web (by which I mean, HTTP and URIs) provides no standard way of creating or deleting resources that have no representation. You have to use some nonstandard mechanism of your own invention for this (which of course can be built on standard operations, e.g. POST). 
Hence my view that the creation or deletion of the resource is a side effect of something that you invoke that's “outside” of the web. In another case; let's say I planned to lease a /London_Office (resource) which I then described with a representation and 303'd to; then I decided not to lease the /London_Office so deleted the representation /and/ the resource because /London_Office isn't something that can be named because it no longer exists, was never realized, and moreover I want it removed because it was a painful loss. Thus, can you delete resources? or another way, can you delete a conceptual map? I suppose you can delete them, but there is no operation for doing so in HTTP. Hence you'll have to devise your own mechanism for this. And whether deleting foo also deletes foo#bar is something that depends on how you model your domain concepts as resources, representations and descriptions. In general, the HTTP and URI specs don't constrain this. So, you can't really “delete” those primary and secondary resources. But if you delete all the representations of a primary resource, then this will delete the authoritative descriptions of the secondary resources, because those live inside the representations. if I remove the section and the reference test.html#whatever from test.html; have I not deleted that secondary resource? it can't be named any more, or referenced, or.. and so on Good point, in this case you are right. For HTML documents, foo.html#whatever, if defined at all, will be a named element within the HTML document (see RFC 2854). It's pretty clear what it means to delete a named element in an HTML document (there are even DOM operations for doing so), so it's pretty safe to say that you are deleting the secondary resource. This is because RFC 2854 constrains the semantics of fragment IDs on HTML documents in such a way that it becomes clear what creating and deleting them means.
The same would apply to any other media type where fragments identify parts of the document. This is not the case for RDF media types, where the semantics of fragment IDs is essentially: “They identify whatever the full hash URI identifies according to the RDF graph in the document” (too lazy to dig out the reference -- the media type registration RFC for application/rdf+xml, which points to some section of some RDF spec that according to my understanding can be summarised as above). So perhaps I should say that *in general* you cannot really delete resources, unless some spec (especially media type specs) defines the semantics of the resources in such a way that you can delete them. Feel free to challenge some more ;-) All the best, Richard Best, Richard Thanks again, Nathan Here are some examples, which may seem like over kill but some are interesting and generally I *feel* rules like this should be either always true, or always
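Richard's HTML case can be made concrete with a short sketch: for text/html, a fragment designates an element named by an `id` attribute (or an `<a name>` anchor), so removing that element leaves the fragment with nothing to identify. The class and function names and the sample markup below are assumptions made up for illustration; only the standard-library `html.parser` module is used.

```python
from html.parser import HTMLParser


class FragmentFinder(HTMLParser):
    """Record the tag of the first element that a fragment would designate."""

    def __init__(self, fragment):
        super().__init__()
        self.fragment = fragment
        self.found = None

    def handle_starttag(self, tag, attrs):
        if self.found is not None:
            return
        attr_map = dict(attrs)
        is_id_match = attr_map.get("id") == self.fragment
        is_anchor_match = tag == "a" and attr_map.get("name") == self.fragment
        if is_id_match or is_anchor_match:
            self.found = tag


def resolve_fragment(html_text, fragment):
    """Return the tag name the fragment points at, or None if it is undefined."""
    finder = FragmentFinder(fragment)
    finder.feed(html_text)
    return finder.found


if __name__ == "__main__":
    before = '<h1>Test</h1><div id="whatever">a section</div>'
    after = "<h1>Test</h1>"  # the same page with the section removed
    print(resolve_fragment(before, "whatever"))
    print(resolve_fragment(after, "whatever"))
```

Nothing comparable can be written for an RDF document: there the fragment does not select a part of the document, so there is no element whose removal would count as deleting the thing identified.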
Re: URI Fragments
Richard Cyganiak wrote: Hi Nathan, On 12 Mar 2010, at 16:46, Nathan wrote: Then if I delete a Primary resource, the secondary resources must also be deleted, true / false (?). The web is about representations of information resources. If you add RDF to the picture, then it's also about descriptions of arbitrary entities. On the web, you can create and delete representations. You can create and delete descriptions. But you cannot create or delete resources. I'd argue that a resource is anything that can be named (or assigned a URI), regardless of whether it has a representation or not. Even without a representation a resource could still be reserved (which allows references to be made to a concept before any realization of that concept exists - although I've yet to confirm if 204 could be used for this..); I would agree with everything above. But I'd say that from the web POV, you can't do anything useful with a resource that is reserved but doesn't have a representation or provides some other useful response (such as 303) when resolved. you can prevent it being used as an identifier for something else and prevent possible ambiguity. For instance I'd like to reserve http://webr3.org/nathan as my webid until I have a foaf profile up there. Change this scenario to a shared server managed space and you need to create that conceptual map even if it does map to an empty set. (?) in another use-case though a resource like /news/latest may be nothing more than a conceptual map to another resource (served via a 3xx code) - Side note: I interpret 301, 302 and 307 as “try over there to get a representation of this resource”. So if you get a representation from the target, then I'd consider that a representation of the original resource. This interpretation is not backed by any spec or other authoritative document, but for me it makes the picture cleaner. 303 of course is explicit about that the representation is of a *different* resource. 
ahh a side note of my own on that point: 301 Moved Permanently would be perfect for changing uri references in the web of linked data The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs. Clients with link editing capabilities ought to automatically re-link references to the request-target to one or more of the new references returned by the server, where possible. but only when the resource doesn't have a fragment; otherwise it would be ambiguous. this is a resource with no representation, which can be both created and deleted surely? Well it can be created and deleted, but the web (by which I mean, HTTP and URIs) provides no standard way of creating or deleting resources that have no representation. You have to use some nonstandard mechanism of your own invention for this (which of course can be built on standard operations, e.g. POST). Hence my view that the creation or deletion of the resource is a side effect of something that you invoke that's “outside” of the web. Personally I think it does.. 410 Gone The requested resource is no longer available at the server and no forwarding address is known. This condition is expected to be considered permanent. Clients with link editing capabilities SHOULD delete references to the request-target after user approval. to me that says resource is deleted (gone), permanently, delete all references if you can please. again though, only when the resource doesn't have a fragment; otherwise it would be ambiguous. In another case; let's say planned to lease a /London_Office (resource) which I then described with a representation and 303'd to; then I decided not to lease the /London_Office so deleted the representation /and/ the resource because /London_Office isn't something that can be named because it no longer exists, was never realized, and moreover I want it removed because it was a painful loss. Thus, can you delete resources? 
or another way, can you delete a conceptual map? I suppose you can delete them, but there is no operation for doing so in HTTP. Hence you'll have to devise your own mechanism for this. And whether deleting foo also deletes foo#bar is something that depends on how you model your domain concepts as resources, representations and descriptions. In general, the HTTP and URI specs don't constrain this. see above re 410 Gone. So, you can't really “delete” those primary and secondary resources. But if you delete all the representations of a primary resource, then this will delete the authoritative descriptions of the secondary resources, because those live inside the representations. if I remove the section and the reference test.html#whatever from test.html; have I not deleted that secondary resource? it can't be named any more, or referenced, or.. and so on Good point, in this case you are right. For HTML documents,
Re: URI Fragments (typo fixed version)
Nathan wrote: Kingsley Idehen wrote: How do you remove: http://www.w3.org/People/Berners-Lee/card#i ? Let's say you take it out of http://www.w3.org/People/Berners-Lee/card, then agents that seek a description of http://www.w3.org/People/Berners-Lee/card#i via the aforementioned URL get nothing. Nothing stops the http://www.w3.org/People/Berners-Lee/card#i description existing in my linked data space :-) exactly.. how *DO* you remove a resource from the web of linked data? let's just suppose that the high court has instructed it; it *must* happen - how? Well, the high court will have to figure out what a Web Eraser is, and then how it should be constructed :-) I think on the Web you could kinda achieve: Don't Say It Again. I really think that's it, bar attempting to generate content that pushes all resource descriptions out of relatively reasonable scope (i.e. to the edges of the Web of Linked Data core, to the degree that can even be established). For now, like Diamonds, Object References are forever :-) -- Regards, Kingsley Idehen President & CEO OpenLink Software Web: http://www.openlinksw.com Weblog: http://www.openlinksw.com/blog/~kidehen Twitter/Identi.ca: kidehen
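One detail behind this exchange is that the fragment of a hash URI is never sent to the server: a client strips it and requests the document URI, so any status code it gets back is about the document. A minimal Python sketch of that step follows; the function name document_uri is made up for illustration and is not taken from anyone's code in this thread.

```python
from urllib.parse import urldefrag


def document_uri(uri: str) -> str:
    """Return the URI an HTTP client would actually request.

    The fragment (everything from '#') is handled on the client side
    and is not part of the HTTP request.
    """
    doc, _fragment = urldefrag(uri)
    return doc


# Hypothetical usage with the URI discussed above.
card_i = "http://www.w3.org/People/Berners-Lee/card#i"
print(document_uri(card_i))
```

This is why a response such as 410 Gone can only speak unambiguously about the document, not about card#i itself.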
Re: URI Fragments (typo fixed version)
Leigh Dodds wrote: Hi, exactly.. how *DO* you remove a resource from the web of linked data? let's just suppose that the high court has instructed it; it *must* happen - how? What would you do for a document? It's on your web site. It's also in the Google cache and the Wayback Machine. What do you do? What are your legal requirements, and what are your practical limitations? maybe we can address this for the web of linked data resources before the same issues arise..? 410 Gone and obedient http clients with link editing capabilities. google cache remove, would be interesting to test the 410 with them though: http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=164734 wayback: http://www.archive.org/about/exclude.php note the identical ways of doing it; robots.txt handles the current web of documents (pretty much).
Re: URI Fragments (typo fixed version)
Nathan wrote: Leigh Dodds wrote: Hi, exactly.. how *DO* you remove a resource from the web of linked data? let's just suppose that the high court has instructed it; it *must* happen - how? What would you do for a document? It's on your web site. It's also in the Google cache and the Wayback Machine. What do you do? What are your legal requirements, and what are your practical limitations? maybe we can address this for the web of linked data resources before the same issues arise..? 410 Gone and obedient http clients with link editing capabilities. google cache remove, would be interesting to test the 410 with them though: http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=164734 wayback: http://www.archive.org/about/exclude.php note the identical ways of doing it; robots.txt handles the current web of documents (pretty much). How does that remove TimBL's URI in my linked data space? A URI that is owl:sameAs my local URI for him? -- Regards, Kingsley Idehen President & CEO OpenLink Software Web: http://www.openlinksw.com Weblog: http://www.openlinksw.com/blog/~kidehen Twitter/Identi.ca: kidehen
Re: URI Fragments (typo fixed version)
Kingsley Idehen wrote: Nathan wrote: Leigh Dodds wrote: Hi, exactly.. how *DO* you remove a resource from the web of linked data? let's just suppose that the high court has instructed it; it *must* happen - how? What would you do for a document? Its on your web site. Its also in the Google cache and the Wayback Machine. What do you do? What are your legal requirements, and what are your practical limitations? maybe we can address this for the web of linked data resources before the same issues arise..? 410 Gone and obedient http clients with link editing capabilities. google cache remove, would be interesting to test the 410 with them though: http://www.google.com/support/webmasters/bin/answer.py?hl=enanswer=164734 wayback: http://www.archive.org/about/exclude.php note the identical way's of doing it; robots.txt handles the current web of documents (pretty much). How does that remove TimBL's URI in my linked data space? A URI that is owl:sameAs my local URI for him? hence why the subject is URI Fragments :) as I said earlier: however if we take the case of TimBL's card; personally I can't see any reason why he couldn't have a personal uri of say http://www.w3.org/TimBL which 303 See Other through to his card; then his personal uri is a resource all of its own and independent of any representation; thus allowing representations to be moved around / deleted without any effect on his personal URI, and further allow for multiple information resources describing him, with different media-types. and obviously as mentioned a few minutes ago in a reply to Richard, the 410 Gone could only apply (unambiguously) to resources without a hash. All in I'm gunning for Roy T. 
Fielding's original single version of a resource (and no fragments, except for a few use-cases which I'll cover later), none of this two kinds of resource, and use HTTP status codes and dereferencing to:

1: 204 No Content = resource which maps to an empty set / reserved resource that can't be used for anything else
2: 303 See Other = indicates that the requested resource does not have a representation of its own that can be transferred by the server over HTTP (the way linked data already uses it)
3: 301 Moved Permanently = The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs. Clients with link editing capabilities ought to automatically re-link references to the request-target to one or more of the new references returned by the server
4: 410 Gone = The requested resource is no longer available at the server and no forwarding address is known. This condition is expected to be considered permanent. Clients with link editing capabilities SHOULD delete references to the request-target after user approval.

Regardless of semantic web, linked data is bound to http for the time being thanks to dereferencing, and http does give us the utilities to do everything we need with regard to linked data. as for it not being bound in the future; all of these use cases can also be represented in rdf with existing ontologies (dcterms:replaces and so forth), and when something can't, simply make a predicate that allows it, shouldn't be too hard.. further into the future I can't see any reason why we can't simply: http://a.org/b owl:sameAs aprotocol://a.org/http://a.org/b

I'm strongly putting focus back on the conceptual mapping side of resources; that allows for anything and covers everything, afaict we have no need to use #fragments at all. regards!
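To make the four-way mapping above a bit more concrete, here is a rough Python sketch of how a link-editing client might act on those status codes. The function name link_action and the action labels are invented for illustration; this reflects the reading of the codes proposed in this mail, not any spec-defined client behaviour.

```python
from typing import Optional, Tuple


def link_action(status: int, location: Optional[str] = None) -> Tuple[str, Optional[str]]:
    """Map an HTTP status code to a (hypothetical) link-editing action.

    Returns a pair (action, target URI or None).
    """
    if status == 204:
        # Resource is reserved / maps to an empty set: keep the reference.
        return ("reserved", None)
    if status == 303:
        # Resource has no representation of its own: fetch the description.
        return ("see-description", location)
    if status == 301:
        # Resource has a new permanent URI: re-link references to it.
        return ("relink", location)
    if status == 410:
        # Resource is gone for good: drop references (after user approval).
        return ("unlink", None)
    return ("no-change", None)


# Hypothetical usage: a reference whose target answered 301.
print(link_action(301, "http://example.org/new"))
```

Note that the URI passed to the server never includes a fragment, which is why this only works cleanly for resources without a hash.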
For Discussion..
Hi All, following on from many previous emails; here's something to ponder:

- multiple quads or named graphs in a single rdf graph
- an asserted named graph
- a quoted named graph
- never describing a subject directly

hopefully should allow a groundwork for provenance, trust, and many other things I'm sure you are all aware of. Looks quite interesting when you generate a visual graph of the data model too (using say w3c rdf validator). simply putting this forward to the group for feedback and thoughts:

--- in n3 / Ntriples:

@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.org/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix w: <http://webr3.org/ns#> .

<http://webr3.org/nathan#authoritative> rdf:type w:NamedGraph ;
    dcterms:isPartOf <http://webr3.org/#graph> ;
    w:assertedBy <http://webr3.org/nathan> .

<http://webr3.org/nathan#me> rdf:type foaf:Person ;
    rdfs:isDefinedBy <http://webr3.org/nathan#authoritative> ;
    owl:sameAs <http://webr3.org/nathan> ;
    rdfs:seeAlso <http://sameas.org/n3?uri=http://webr3.org/nathan> ;
    foaf:mbox nat...@webr3.org .

<http://somesite.org/people/nathan#quoted> rdf:type w:NamedGraph .

<http://somesite.org/people/nathan#me> rdfs:isDefinedBy <http://somesite.org/people/nathan#quoted> ;
    owl:sameAs <http://webr3.org/nathan> .
--- in rdf+xml too:

<rdf:RDF xmlns="http://webr3.org/nathan#"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:foaf="http://xmlns.org/foaf/0.1/"
    xmlns:log="http://www.w3.org/2000/10/swap/log#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:w="http://webr3.org/ns#">
  <rdf:Description rdf:about="http://somesite.org/people/nathan#me">
    <rdfs:isDefinedBy rdf:resource="http://somesite.org/people/nathan#quoted"/>
    <owl:sameAs rdf:resource="http://webr3.org/nathan"/>
  </rdf:Description>
  <w:NamedGraph rdf:about="http://somesite.org/people/nathan#quoted">
  </w:NamedGraph>
  <w:NamedGraph rdf:about="http://webr3.org/nathan#authoritative">
    <dcterms:isPartOf rdf:resource="http://webr3.org/#graph"/>
    <w:assertedBy rdf:resource="http://webr3.org/nathan"/>
  </w:NamedGraph>
  <foaf:Person rdf:about="http://webr3.org/nathan#me">
    <rdfs:isDefinedBy rdf:resource="http://webr3.org/nathan#authoritative"/>
    <rdfs:seeAlso rdf:resource="http://sameas.org/n3?uri=http://webr3.org/nathan"/>
    <owl:sameAs rdf:resource="http://webr3.org/nathan"/>
    <foaf:mbox>nat...@webr3.org</foaf:mbox>
  </foaf:Person>
</rdf:RDF>

---

Many Regards, Nathan
SPARQL: sorting resources by label?
The closest I get is the following SPARQL query:

SELECT DISTINCT ?subj ?label
WHERE {
    GRAPH ?graph {
        ?subj ?pred ?obj .
        OPTIONAL {
            ?subj ?labelPred ?label .
            FILTER (
                (?labelPred = <http://www.w3.org/2000/01/rdf-schema#label>) # (1)
            )
            FILTER ( isLiteral(?label) )
        }
    }
    FILTER (?graph = <http://hypergraphs.de/TestGraph>)
}
ORDER BY ?label ?subj

Comments:
- This solution also works for multiple label predicates: if there are subproperties of rdfs:label, the unary disjunction (1) simply gets more components.
- ?graph is necessary, because Sesame does not support datasets and I want to restrict the query to all graphs that are currently visible.
- This query returns unlabeled resources first (?label is unbound), then labeled resources. Better would be to show labeled resources first. Best would be to mix them, where unlabeled resources are sorted according to their qname.

Can this be improved? Thanks for any comments or suggestions...

Axel -- axel.rauschma...@ifi.lmu.de http://www.pst.ifi.lmu.de/~rauschma/
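One way around the ordering problem is to leave ORDER BY out and sort the (resource, label) rows on the client. The Python sketch below shows both variants asked for above: labeled resources first, and labeled and unlabeled resources mixed with the qname standing in for a missing label. The prefix map, the row shape (uri, label-or-None) and all function names are assumptions made for this illustration, not part of Sesame or any SPARQL library.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, Optional[str]]  # (resource URI, label or None)

# Hypothetical prefix map used to abbreviate URIs to qnames.
PREFIXES: Dict[str, str] = {"ex": "http://example.org/"}


def qname(uri: str, prefixes: Dict[str, str] = PREFIXES) -> str:
    """Abbreviate a URI using the longest matching namespace, if any."""
    best = None
    for prefix, ns in prefixes.items():
        if uri.startswith(ns) and (best is None or len(ns) > len(best[1])):
            best = (prefix, ns)
    if best is None:
        return uri
    return f"{best[0]}:{uri[len(best[1]):]}"


def display_text(row: Row) -> str:
    """Text shown to the user: the label, or the qname if unlabeled."""
    uri, label = row
    return label if label is not None else qname(uri)


def labeled_first(rows: List[Row]) -> List[Row]:
    """Labeled resources first, then unlabeled ones by qname."""
    return sorted(rows, key=lambda r: (r[1] is None, display_text(r).casefold(), r[0]))


def mixed(rows: List[Row]) -> List[Row]:
    """Labeled and unlabeled resources interleaved by display text."""
    return sorted(rows, key=lambda r: (display_text(r).casefold(), r[0]))


rows = [
    ("http://example.org/b", None),
    ("http://example.org/a", "Zebra"),
    ("http://example.org/c", "apple"),
]
print(labeled_first(rows))
print(mixed(rows))
```

The trade-off is that the whole result set has to be fetched before sorting, so this only suits lists small enough to hold in the GUI anyway.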
Re: SPARQL: sorting resources by label?
On 13 March 2010 04:16, Axel Rauschmayer a...@rauschma.de wrote: Thanks for any comments or suggestions... I'm a little perturbed that you have to use something so convoluted to get labels. Why not something just like (whatever graph) SELECT ?o WHERE { ?s rdfs:label ?o }, or at worst an OPTIONAL on maybe dc:label or whatever..? - are the objects of any labels resources? Can you please clarify what you are looking for, and explain further - I honestly hope you are missing something there. If there is something wrong with the material, the problems should be surfaced and fixed (and no doubt will be for the next rev, if need be) Cheers, Danny. http://danny.ayers.name
Re: SPARQL: sorting resources by label?
I have a GUI data structure that is a pair (resource, label). The label is used for humans, the resource is used to process RDF. If I want SPARQL to produce a list of these pairs ordered by label, this is the simplest query that I can think of. This is but a start, I will later insert more FILTERs (for faceted navigation etc.). Axel -- axel.rauschma...@ifi.lmu.de http://www.pst.ifi.lmu.de/~rauschma/
Re: SPARQL: sorting resources by label?
Addendum: If one wants to produce a table | URI | label | types |, SPARQL becomes even more unwieldy. Just think of sorting the type column by the label of the types, or even of producing a Java object for each row. The problem is that SPARQL does not support result rows in non-first normal form (NFNF), i.e. rows in which a cell such as "types" holds a set of values. Maybe DESCRIBE can be used in the future for this? Axel -- axel.rauschma...@ifi.lmu.de http://www.pst.ifi.lmu.de/~rauschma/
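Until something like that exists, the nesting can be done after the query: SPARQL returns one flat row per (URI, label, type) combination, and the client folds those into one row per URI with a list of types. A minimal Python sketch follows, assuming flat rows arrive as (uri, label, type) tuples with None for unbound values; the function name nest_rows and the "ex:" identifiers are made up for this example.

```python
from typing import Dict, List, Optional, Tuple

FlatRow = Tuple[str, Optional[str], Optional[str]]   # (uri, label, type)
NestedRow = Tuple[str, Optional[str], List[str]]     # (uri, label, [types])


def nest_rows(rows: List[FlatRow]) -> List[NestedRow]:
    """Fold flat result rows into one row per URI with a sorted type list."""
    nested: Dict[str, dict] = {}
    for uri, label, type_uri in rows:
        entry = nested.setdefault(uri, {"label": label, "types": set()})
        if entry["label"] is None:
            entry["label"] = label          # keep the first bound label seen
        if type_uri is not None:
            entry["types"].add(type_uri)
    # dicts preserve insertion order, so URIs stay in result order
    return [(uri, e["label"], sorted(e["types"])) for uri, e in nested.items()]


flat = [
    ("ex:a", "A", "ex:T2"),
    ("ex:a", "A", "ex:T1"),
    ("ex:b", None, None),
]
print(nest_rows(flat))
```

Here the types are sorted by identifier; sorting them by their own labels would need a second lookup from type to label.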
encoding sparql queries made to dbpedia
Hello, While programmatically accessing DBpedia through GET using Jersey, I had to URL-encode the query as UTF-8 and then replace each + with %20. Does anyone know of a better way to do this? Thanks, Monika -- Dr Monika Solanki F27 Department of Computer Science University of Leicester Leicester LE1 7RH United Kingdom Tel: +44 116 252 3828 Google: 52.653791,-1.158414 http://www.cs.le.ac.uk/people/ms491
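For comparison, the "+" comes from form-style encoding (application/x-www-form-urlencoded), where a space is written as "+"; percent-encoding writes it as "%20" directly, so no replace step is needed. The sketch below shows the idea in Python rather than Java/Jersey, so it is only an illustration of the encoding, not a Jersey answer. The helper name sparql_get_url is invented here, and the endpoint URL and the "query" parameter name are the usual SPARQL protocol conventions, assumed rather than taken from the original post.

```python
from urllib.parse import quote, urlencode


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a GET URL for a SPARQL query, encoding spaces as %20.

    urlencode() defaults to quote_plus (space -> '+'); passing
    quote_via=quote switches to plain percent-encoding. Text is
    encoded as UTF-8 by default.
    """
    return endpoint + "?" + urlencode({"query": query}, quote_via=quote)


# Hypothetical usage against the public DBpedia endpoint.
url = sparql_get_url("http://dbpedia.org/sparql", "SELECT ?s WHERE { ?s ?p ?o }")
print(url)
```

Whether the "+" form is actually rejected depends on the server; in a query string both forms are commonly accepted, so the replacement may only be needed for particular endpoints.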