Re: SPARQL results in RDF
Hi Hugh,

You can get results in RDF if you use CONSTRUCT -- which is basically a special case of SELECT that returns 3-tuples and uses set semantics (no duplicates) -- but I imagine you are aware of this. Returning RDF for SELECT, where the result set consists of n-tuples with n != 3, is difficult because there is no direct way to represent them. Also problematic is that there *is* a concept of order in SPARQL query results, while there is none in RDF, and SELECT's bag semantics allow duplicates, which likewise do not fit RDF. These could be kludged with reification, but that is not very elegant. So most SELECT results are not directly representable in RDF.

Cheers,
-w
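To make the contrast concrete, here is a sketch (the ex: vocabulary and the data are invented): the CONSTRUCT query's results are themselves triples, so any RDF syntax can carry them, while the SELECT query's results are an ordered sequence of 4-tuples with possible duplicates, for which RDF has no direct representation:

```sparql
PREFIX ex: <http://example.org/>

# CONSTRUCT: the result is an RDF graph -- a duplicate-free set of
# triples -- so it serialises directly as Turtle, RDF/XML, etc.
CONSTRUCT { ?city ex:population ?pop }
WHERE { ?city ex:population ?pop }

# SELECT (shown commented out; one query per request): the result is a
# *sequence* of 4-tuples under bag semantics, ordered by population --
# neither the arity, the order, nor the duplicates map onto triples.
# SELECT ?city ?pop ?country ?year
# WHERE { ?city ex:population ?pop ;
#              ex:country ?country ;
#              ex:founded ?year }
# ORDER BY DESC(?pop)
```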
Linked Prolog
On Thu, 20 Jun 2013 01:28:54 +, エリクソン トーレ <t-eriks...@so.taisho.co.jp> said:

] ex:distance ex:earth ex:moon 381550 25150 u:km.
]
] (Ab)using RDF I was able to (barely) document my semantics directly in
] turtle. Where is the semantics and syntax of your example described?
] Your data might be linked, but as a prospective consumer of it I'm
] feeling a bit lost :-)

I just made it up. But not out of thin air. It's basically a prolog assertion with a turtle-esque surface syntax -- that's why the predicate comes first, so you can have arbitrary arity. The definitions of ex:distance (and ex:moon, ex:earth, u:km) could be obtained by dereferencing those URIs in the same way it's done with RDF. In fact it takes the two best ideas from RDF -- URIs as identifiers for real-world things, and the mechanism of dereferencing these URIs to get more information. Pace assertions about a normative meaning of Linked Data from the RDF WG (of which I am a member), I think these two ideas are the essence of Linked Data.

I'm not seriously advocating this right now; it's just an example or thought experiment to answer your question, and there's too much sunk investment in RDF for such a radical change. In fact if we were making radical changes, thinking about lambda expressions might be better than doing it this way. Maybe for RDF 3.0...

-w
Re: Proof: Linked Data does not require RDF
On Tue, 18 Jun 2013 23:32:42 +, エリクソン トーレ <t-eriks...@so.taisho.co.jp> said:

] I would be interested in seeing some linked data that is incompatible
] with RDF while still adhering to rules like using global identifiers
] and typed links.

    @prefix ex: <http://example.org/> .
    @prefix u: <http://example.org/units> .

    ex:distance ex:earth ex:moon 381550 25150 u:km.

This relation has a typed link (ex:distance) between two non-informational resources (ex:earth, ex:moon). It has a distance that has units as well as a datatype, and a +/- uncertainty thrown in for good measure. I could even imagine the ex:distance predicate to be self-describing in the usual way, defining its arity and the meaning and type of its arguments.

I think this can quite sensibly be called Linked Data, and whilst with sufficient contortions (reification, abuse of datatypes, perhaps anonymous or parametrised predicates) it can be shoehorned into RDF, it really doesn't happen naturally or obviously enough that it could be called compatible, in my opinion.

Happy hacking,
-w
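To illustrate the kind of contortion meant here, the standard n-ary relation pattern would shoehorn the same fact into legal RDF roughly like this (all of the DistanceMeasurement vocabulary is invented for the sketch):

```turtle
@prefix ex: <http://example.org/> .
@prefix u: <http://example.org/units> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# The 5-ary distance relation becomes an anonymous node with five arcs.
[] a ex:DistanceMeasurement ;
    ex:from ex:earth ;
    ex:to ex:moon ;
    ex:value "381550"^^xsd:integer ;
    ex:uncertainty "25150"^^xsd:integer ;
    ex:unit u:km .
```

Legal RDF, but the single self-describing ex:distance predicate has dissolved into five, which is rather the point.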
Re: Content negotiation for Turtle files
On Wed, 06 Feb 2013 11:45:10 +, Richard Light <rich...@light.demon.co.uk> said:

] In a web development context, JSON would probably come second for me
] as a practical proposition, in that it ties in nicely with
] widely-supported javascript utilities.

If it were up to me, XML with all the pointy brackets that make my eyes bleed would be deprecated everywhere. Most or all modern programming languages have good support for JSON, the web browsers do natively as well, and it's much easier to work with since it mostly maps directly to built-in datatypes.

] To me, Turtle is symptomatic of a world in which people are still
] writing far too many Linked Data examples and resources by hand, and
] want something that is easier to hand-write than RDF/XML. I don't
] really see how that fits in with the promotion of the idea of
] machine-processible web-based data.

Kind of agree. Turtle is a relic of trying to make a machine-readable quasi-prose representation of data, suitable for both machines and people. But it's not general enough -- you can only use it to write RDF, which means you need specialised tools. It's saddening because (especially with some of the N3 enhancements) it's quite an elegant approach.

Cheers,
-w
WebID vs. JSON (Was: Re: Think before you write Semantic Web crawlers)
What does WebID have to do with JSON? They're somehow representative of two competing trends. The RDF/JSON, JSON-LD, etc. work is supposed to be about making it easier to work with RDF for your average programmer, to remove the need for complex parsers, and generally to lower the barriers.

The WebID arrangement is about raising barriers. Not the same kind of barriers -- certainly the intent isn't to make programmers' lives more difficult, but rather to provide a good way to do distributed authentication without falling into the traps of PKI and such. While I like WebID, and I think it is very elegant, the fact is that I can use just about any HTTP client to retrieve a document, whereas getting RDF processing clients, agents, whatever, to do it will require quite a lot of work [1]. This is one reason why, for example, 4store's arrangement of /sparql/ for read operations and /data/ and /update/ for write operations is *so* much easier to work with than Virtuoso's OAuth and WebID arrangement -- I can just restrict access using all of the normal tools like apache, nginx, squid, etc.

So in the end we have some work being done to address the perception that RDF is difficult to work with, and on the other hand a suggestion of widespread deployment of authentication infrastructure which, whilst obviously filling a need, stands to make working with the data behind it more difficult. How do we balance these two tendencies?

[1] examples of non-WebID aware clients: rapper / rasqal, python rdflib, curl, the javascript engine in my web browser that doesn't properly support client certificates, etc.

--
William Waites <w...@styx.org>
http://river.styx.org/ww/
sip:w...@styx.org
F4B3 39BF E775 CF42 0BAB 3DF0 BE40 A6DF B06F FD45
Re: WebID vs. JSON (Was: Re: Think before you write Semantic Web crawlers)
* [2011-06-22 16:00:49 +0100] Kingsley Idehen <kide...@openlinksw.com> writes:

] explain to me how the convention you espouse enables me confine access
] to a SPARQL endpoint for:
]
] A person identified by URI based Name (WebID) that a member of a
] foaf:Group (which also has its own WebID).

This is not a use case I encounter much. Usually I have some application code that needs write access to the store and some public code (maybe javascript in a browser, maybe some program run by a third party) that needs read access. If the answer is to teach my application code about WebID, it's going to be a hard sell, because really I want to be working on other things than protocol plumbing. If you then go further and say that *all* access to the endpoint needs to use WebID because of resource-management issues, then every client now needs to do a bunch of things that end with shaving a yak before they can even start on whatever they were meant to be working on.

On the other hand, arranging things so that access control can be done by existing tools without burdening the clients is a lot easier, if less general. And easier is what we want working with RDF to be.

Cheers,
-w
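Concretely, the kind of arrangement meant here is a few lines of ordinary web-server configuration -- an nginx sketch, assuming 4store-style paths and a backend on port 8000 (all names invented for illustration):

```nginx
# Read operations: open to the world, no client changes needed.
location /sparql/ {
    proxy_pass http://localhost:8000;
}

# Write operations: ordinary HTTP basic auth, which every
# HTTP client already knows how to speak.
location ~ ^/(data|update)/ {
    auth_basic "store write access";
    auth_basic_user_file /etc/nginx/htpasswd;
    proxy_pass http://localhost:8000;
}
```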
Re: Squaring the HTTP-range-14 circle [was Re: Schema.org in RDF ...]
* [2011-06-14 08:55:09 -0700] Pat Hayes <pha...@ihmc.us> writes:

] Well, you have got me confused. Are you saying here that it does
] in fact make sense to say that a description of the eiffel tower
] is 356M tall?

I'm just saying that things like this will be published because the publisher is confused, or mistaken, or doesn't think that making the distinction is important or convenient, and consumers of the data have to deal with it. We should encourage the publishers to do a better job, but some of them will balk, and sometimes, as with the schema.org announcement that started this thread, big, important publishers with a lot of influence will balk. If we're lucky we can convince them to fix it; otherwise writers of software that consumes the data and tries to reason with it have to work out a way to be robust in the face of this kind of ambiguity. That's all.

-w
Re: Schema.org in RDF ...
* [2011-06-07 09:22:01 +0100] Michael Hausenblas <michael.hausenb...@deri.org> writes:

] Something I don't understand. If I read well all savvy discussions
] so far, publishers behind http://schema.org URIs are unlikely to
] ever provide any RDF description,
]
] What makes you so sure about that not one day in the (near?) future
] the Schema.org URIs will serve RDF or JSON, FWIW, additionally to
] HTML? ;)

I suspect the prevailing view within Google is that content negotiation is not used in the real world. For example, https://groups.google.com/group/golang-nuts/msg/b882b153a3acd58e (Brad was the creator of LiveJournal, now at Google, and I don't think his view expressed there is uncommon). So perhaps not the *near* future. Some other arrangement that does not use the Accept header (and that seems to mean different URIs then) is probably more likely, but this is just a guess.

Cheers,
-w
implied datasets
This is the RDF version of the question I just sent to the CKAN list [1]. It is somewhat a policy question, and I believe that in RDF terms the open world means the answer is basically: yes, you can say what you want.

Consider the diagram here,

    http://semantic.ckan.net/group/?group=http://ckan.net/group/lld

this is interconnections between library datasets. You'll notice there is a partition. This partition is not really there. Here's why. In library world, perhaps more than elsewhere, it is common to do things like this:

    <http://example.org/issn/1234-5678> a bibo:Journal;
        blah blah blah some descriptions;
        owl:sameAs <urn:issn:1234-5678>.

This is because there are standard identifiers for lots of things that are found in libraries, and they even have a urn namespace. So when publishing this data it is a lot easier to use these identifiers than to go out and use something like Silk to try to find links. They're already implied by the identifiers we have in hand.

So given two such datasets, they are indeed connected in the way we think of RDF datasets as being connected -- not necessarily with semantics as strict as owl:sameAs (we would probably not choose to actually materialise its productions here, especially since the entities might be modelled in different, incompatible ways, and owl:sameAs is really not the right predicate to be using), but at least connected with semantics along the lines of rdfs:seeAlso. The point is, the two datasets are transitively connected. But because we have no extant dataset that contains all the ISSNs, particularly all ISSNs where the identifier is expressed as a urn: URI, we have nothing to put in our voiD linkset -- which is how the relationships between these datasets are represented at a high level. So we have an apparent partition.

What I propose to do here is invent an implied dataset, the one that contains in principle the entire list of ISSNs. Something like,

    urn:issn:- a rdf:Resource.
    urn:issn:-0001 a rdf:Resource.
    ...

but which actually should contain X a rdf:Resource for everything in the valid lexical space of urn:issn, which may be (countably) infinite for all I know. Then for each dataset that I have that uses links to this space, I count them up and make a linkset pointing at this imaginary dataset. Obviously the same strategy works anywhere there exist standard identifiers that are not URIs in HTTP.

Does this make sense? Can we sensibly talk about, and even assert the existence of, a dataset of infinite size (whatever existence means)? Is this an abuse of DCat/voiD? Are this class of datasets subsets of sameAs.org (assuming sameAs.org to be complete in principle)?

Cheers,
-w

[1] http://lists.okfn.org/pipermail/ckan-discuss/2011-May/001269.html
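In voiD terms the proposal might be written something like this (the ex: names and the triple count are invented; ex:issn-space is the implied dataset):

```turtle
@prefix void: <http://rdfs.org/ns/void#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix ex: <http://example.org/datasets#> .

# The implied dataset: in principle, every identifier in urn:issn space.
ex:issn-space a void:Dataset ;
    dcterms:description "The urn:issn: URI space, implied, never materialised" .

# A concrete library dataset and its linkset into the implied one.
ex:ds1 a void:Dataset .
ex:ds1-issn-links a void:Linkset ;
    void:subjectsTarget ex:ds1 ;
    void:objectsTarget ex:issn-space ;
    void:linkPredicate owl:sameAs ;
    void:triples 12345 .
```

Two datasets each carrying such a linkset into ex:issn-space are then visibly connected through it, which is what removes the apparent partition from the diagram.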
Re: implied datasets
* [2011-05-23 11:34:56 -0400] glenn mcdonald <gl...@furia.com> writes:

] It seems to me that this is another demonstration of confusion that wouldn't
] happen if we all understood RDF IDs to be pure identifiers that belong to
] the graph representation of a dataset and nothing else. ISSN numbers are not
] graph-node IDs, they are real-world conceptual identifiers like social
] security numbers or SKUs or country codes. Many different data-structures
] might reference them in very different ways, so it should be fairly clear
] that they cannot uniquely identify anything but themselves, and thus they
] should themselves be represented in RDF as nodes. So the above should be
] more like:

Hi Glenn,

That may be so, but it misses the point. The point is that there is a field, be it a URI or a literal, however modelled, that can be used to join between two datasets. This join field is hidden, in that there exists no (known) dataset that contains all possible values it can take on. So you have a situation, when you are trying to describe datasets, where you can say that DS1 and DS2 are indirectly linked, and you want to make that link explicit so that you can put it on diagrams and such.

Saying DS1 indirectlyLinkedTo DS2 is no good, because then you get O(n^2) such statements, which makes your visualisation messy; furthermore you don't know, without examining them, that they have any common values on the join field, so they may not actually be linked except in a degenerate sense. Inventing a dataset that contains only the join field lets you say something useful and coherent about the relationship between DS1 and DS2. There is nothing in this that requires the datasets themselves to be RDF. See my other post to ckan-discuss on the same topic expressed in terms of the relationships between CSV files.

Cheers,
-w
Re: implied datasets
* [2011-05-23 14:46:47 +0100] Leigh Dodds <leigh.do...@talis.com> writes:

] I'm not sure that the dataset is imaginary, but what you're doing
] seems eminently sensible to me. I've been working on a little project
] that I hope to release shortly that aims to facilitate this kind of
] linking, especially where those non-URI identifiers, or Literal Keys
] [1] are used to build patterned URIs.

The thing is, as with Hugh's suggestion, as a curator of datasets I have little control or influence over how the dataset authors choose to do this. I have noticed a common pattern though (urn:issn for example), and encouraging patterns like this is helpful, I think.

] It may be more natural to think of these more as services though than
] datasets. i.e. a service that accepts some keys as input and returns a
] set of assertions. In this case the assertions would be links to other
] datasets.

This is a bit different. I was thinking of an implied dataset that would have no links outwards at all.

] Subsets if they only asserted sameAs links, but I think you're
] suggesting that this may be too strict. I think there's potentially a
] whole set of related predicate based services [2] that provide
] useful indexes of existing datasets, or expose additional annotations
] of extra sources.

So this would be a separation of edge-labelled graphs into a bunch of perhaps more manageable basic (V,E) graphs. An interesting way of chopping things up.

The reason I think sameAs is too strict, aside from people putting sameAs when they really mean similarTo, can be shown by another library example. Broadly there seem to be two strategies for representing things like books, the flat BIBO style and the more elaborate FRBR/WEMI style. So if I have two datasets, one in each, I might have something like,

    ds1:flc a bibo:Book;
        dc:title "The Feynman Lectures on Computation";
        dc:creator [ foaf:name "Richard Feynman" ];
        dc:language "eng";
        owl:sameAs <urn:isbn:0738202967>.

    ds2:flc a frbr:Manifestation;
        frbr:manifestationOf [
            a frbr:Expression;
            dc:language "en";
            frbr:expressionOf [
                a frbr:Work;
                dc:title "The Feynman Lectures on Computation";
                dc:creator [ foaf:name "Richard Feynman" ]
            ]
        ];
        owl:sameAs <urn:isbn:0738202967>.

Both authors have done something prima facie reasonable with the sameAs, but if you actually run it transitively you get into trouble. This also goes to what Glenn was saying. These datasets are obviously related in a meaningful way, and there may well be useful ways for someone who studies them to draw links between them, but it isn't as simple as saying they both have things of the same type. In fact, which type assertions are appropriate to clarify the relationship between these datasets is exactly the kind of analysis that I would want to facilitate, not try to do up front. What I can say is that they both have references (that may or may not be strictly believable) to this funny non-dereferenceable URI (or equivalently, string literal of a certain kind).

Cheers,
-w
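Run the owl:sameAs machinery naively over those two snippets (prefixes as there) and by symmetry and transitivity you entail, among other things:

```turtle
# Both resources are owl:sameAs <urn:isbn:0738202967>, so they are
# owl:sameAs each other, and each inherits the other's types:
ds1:flc owl:sameAs ds2:flc .
ds1:flc a bibo:Book, frbr:Manifestation .
```

A single node that is at once a flat BIBO book and a FRBR manifestation is exactly the sort of muddle meant by "trouble" above.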
Re: See UK
* [2011-05-21 14:59:26 +0300] Denny Vrandecic <denny.vrande...@kit.edu> writes:

] Very impressive!

Yes indeed!

] How much of rewriting would be needed to use data from other countries?

Sadly the data seems to stop at Hadrian's wall. Perhaps this should properly be See England (and possibly Wales) :(

Cheers,
-w
ANN: Semantic CKAN - Revisited
A new version of RDF infrastructure for CKAN has now been deployed at http://semantic.ckan.net/ -- this represents a complete reimplementation and re-thinking of how to manage a distributed set of metadata catalogues.

Features:

* Aggregation of all known CKAN instances (if there are any missing, please let me know and I'll add them)
* Search and filtering across all data sources
* Dataset metadata represented using DCat and, if available, voiD
* Retrieval of RDF representations using content-type negotiation
* Where voiD information is known, navigable visualisations of the datasets and their neighbours, likewise for curated groups
* CKAN-compatible API for using existing CKAN tools to inspect the aggregated data from one place

Some interesting pages to look at:

DBpedia
    http://semantic.ckan.net/record/dcc6715c-bf94-4a89-bbf3-35933da795a5.html
Linked Library Data
    http://semantic.ckan.net/group/?group=http://ckan.net/group/lld
Re: ANN: Semantic CKAN - Revisited
* [2011-04-17 19:58:28 +0200] Pierre-Yves Vandenbussche <py.vandenbuss...@gmail.com> writes:

] Nice work William,

Thank you Pierre-Yves.

] Do you plan to add navigation links between ckan.net and semantic.ckan.net?

From semantic.ckan.net there are links back to *.ckan.net on each catalogue record page. On ckan.net (and the others once they are upgraded) there are links to the various RDF representations, and ckan.net will also content-type negotiate and redirect to semantic.ckan.net, but there is no link to the HTML page and visualisations -- I leave that to the discretion of the other CKAN developers.

There are some corner cases where this is not true -- these old CKAN instances http://tinyurl.com/5skavxk don't give back a URI for the dataset in their API calls, and the result is that we end up with these datasets being blank nodes. Hopefully they will be upgraded soon.

There is also no particular way to find out who is responsible for the maintenance of a CKAN site and what version of the software it is running. In practice it usually ends up being OKFN staff, so sending a message to the ckan-discuss list is likely to alert the right person, but this is not a very scalable situation. There is a ticket to add an API call to CKAN to address this.

Cheers,
-w
Re: LOD Cloud Cache Stats
So I don't have answers to your questions, but I do have some observations about the results, particularly the counts of distinct predicates. The top one is rdf:type, which makes sense. Below that we have ones used in reification. Who knew there was actually that much reified data out there? I wonder where this comes from, and what about the consensus that this is not a good idea and should be deprecated?

    SELECT DISTINCT ?graph, COUNT(?s) AS ?count
    WHERE {
        GRAPH ?graph {
            ?s ?p <http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement>
        }
    }
    ORDER BY DESC(?count) LIMIT 50

This query times out, but it would be interesting to know the answer: who is the source of all of these reifications?

Next is rdfs:label -- ok, fine. After that, a sizeable chunk of data has to do with rows and columns in CSV tables that comes from data.gov. How is a mechanical transliteration from CSV to RDF without any modelling useful? It just makes the data a couple of orders of magnitude bigger and a few more orders of magnitude more cumbersome to deal with. I mean, being able to refer to a specific spreadsheet cell is useful, but how does actually materialising all of them do anything but take up disk space and slow down queries?

Cheers,
-w
Re: Introducing Vocabularies of a Friend (VOAF)
* [2011-01-25 11:21:45 -0500] Kingsley Idehen <kide...@openlinksw.com> writes:

] Hmm. Is it the Name or Description that's important?
]
] But what about discerning meaning from the VOAF graph?

Humans looking at documents and trying to understand a system do so in a very different way from machines. While what you suggest might be strictly true according to the way RDF and formal logic work, it isn't the way humans work (otherwise the strong AI project of the past half-century might have succeeded by now). So we should try to arrange things in a way that is both consistent with what the machines want and as easy as possible for humans to understand. That Hugh, an expert in the domain, had trouble figuring it out due to poetic references to well known concepts suggests that there is some room for improvement.

Cheers,
-w
Re: URI Comparisons: RFC 2616 vs. RDF
* [2011-01-20 14:29:35 +] Nathan <nat...@webr3.org> writes:

] RDF Publishers MUST perform Case Normalization and Percent-Encoding
] Normalization on all URIs prior to publishing. When using relative URIs
] publishers SHOULD include a well defined base using a serialization
] specific mechanism. Publishers are advised to perform additional
] normalization steps as specified by URI (RFC 3986) where possible.
]
] RDF Consumers MAY normalize URIs they encounter and SHOULD perform
] Case Normalization and Percent-Encoding Normalization.
]
] Two RDF URIs are equal if and only if they compare as equal,
] character by character, as Unicode strings.
]
] For many reasons it would be good to solve this at the publishing phase,
] allow normalization at the consuming phase (can't be precluded as
] intermediary components may normalize), and keep simple case sensitive
] string comparison throughout the stack and specs (so implementations
] remain simple and fast.)
]
] Does anybody find the above disagreeable?

Sounds about right to me, but what about port numbers, http://example.org/ vs http://example.org:80/?

-w
Re: URI Comparisons: RFC 2616 vs. RDF
* [2011-01-19 11:11:20 -0500] Kingsley Idehen <kide...@openlinksw.com> writes:

] On 1/19/11 10:59 AM, Nathan wrote:
] htTp://lists.W3.org/Archives/Public/public-lod/2011Jan/ - Personally
] I'd hope that any statements made using these URIs (asserted by man or
] machine) would remain valid regardless of the (incorrect?-)casing.
]
] Okay for Data Source Address Ref. (URL), no good for Entity (Data Item
] or Data Object) Name Ref., bar system specific handling via IFP property
] or owl:sameAs :-)

FWIW I've just added a FuXi builtin for the curate tool [1] that does URI comparisons using ll.uri [2] (deliberately pushing the choice of place on the ladder into a library). It is used like this:

    @prefix curate: <http://eris.okfn.org/ww/2010/12/curate#>.

    { ?s1 ?p1 ?o1 . ?s2 ?p2 ?o2 . ?s1 curate:cmpURI ?s2 } => { ?s1 = ?s2 }.

And results in statements like this:

    <HTTP://example.org:80/> = <HTTP://example.org:80/>, <http://EXAMPLE.ORG/>, <http://example.org/> .
    <http://EXAMPLE.ORG/> = <HTTP://example.org:80/>, <http://EXAMPLE.ORG/>, <http://example.org/> .
    <http://example.org/> = <HTTP://example.org:80/>, <http://EXAMPLE.ORG/>, <http://example.org/> .

Cheers,
-w

[1] https://bitbucket.org/okfn/curate/src/1f6ba3c360c3/curate/builtins.py#cl-9
[2] http://www.livinglogic.de/Python/url/Howto.html
Re: Property for linking from a graph to HTTP connection meta-data?
* [2011-01-17 16:39:27 +0100] Martin Hepp <martin.h...@ebusiness-unibw.org> writes:

] Does anybody know of a standard property for linking a RDF graph to a
] http:GetRequest, http:Connection, or http:Response instance? Maybe
] rdfs:seeAlso (@TBL: ;-))?

If you suppose that the name of the graph is the same as the request URI (it will not always be, of course), you can link in the other direction from http:Request using http:requestURI. I am not sure that http:requestURI has a standard inverse though.

Cheers,
-w
Re: Property for linking from a graph to HTTP connection meta-data?
* [2011-01-17 23:09:01 +0100] Martin Hepp <martin.h...@ebusiness-unibw.org> writes:

] # Link the graph to the HTTP header info from the data transformation
] foo:dataset rdfs:seeAlso foo:ResponseMetaData .

Actually this seems like a use case for OPMV. So I think you'd do something like,

    foo:dataset opmv:wasGeneratedBy [
        a opmv:Process;
        opmv:used foo:ResponseMetaData;
        opmv:used <http://example.org/foo.xml>
    ].

This would have the side-effect of making your graph an opmv:Artifact, but that actually makes sense.

Cheers,
-w
Re: Is it best practices to use a rdfs:seeAlso link to a potentially multimegabyte PDF?, existing predicate for linking to PDF?
* [2011-01-10 08:55:59 +] Phil Archer <phil.arc...@talis.com> writes:

] However... a property should not imply any content type AFAIAC. That's
] the job of the HTTP Headers. If software de-references an rdfs:seeAlso
] object and only expects RDF then it should have a suitable accept
] header. If the server can't respond with that content type, there are
] codes to handle that.

I disagree that we should rely on HTTP headers for this. Consider local processing of a large multi-graph dataset. These kinds of properties can act as hints to process one graph or another without the need to dereference anything. (I tend to think of a graph as equivalent to the document obtained by dereferencing the graph's name.) Slightly more esoteric are graphs made available over ftp, finger, freenet, etc. Let's take advantage of HTTP where appropriate, but not mix up the transport and content unnecessarily.

Cheers,
-w

--
William Waites <w...@styx.org>
http://eris.okfn.org/ww/
sip:w...@styx.org
9C7E F636 52F6 1004 E40A E565 98E3 BBF3 8320 7664
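For instance, a publisher could carry the hint in the data itself, something like this (using dcterms:format with a media-type literal; whether that is the best predicate for the job is debatable, so treat it as a sketch with invented example.org names):

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

<http://example.org/doc> rdfs:seeAlso <http://example.org/report.pdf> .

# The hint: a consumer can decide not to dereference a multi-megabyte
# PDF without ever making an HTTP request.
<http://example.org/report.pdf> dcterms:format "application/pdf" .
```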
Re: Is vCard range restriction on org:siteAddress necessary?
* [2011-01-04 11:49:43 +] Dave Reynolds <dave.e.reyno...@gmail.com> writes:

] Is VCard that bad? It fits your example below just fine.

The only problem I see with the example is that we don't have counties in Scotland, we have districts. In Quebec and Louisiana and other historically Catholic places we have parishes. Is Scotland a state in the American sense? Not really. You could use things like vc:county and vc:state and just say that the naming is bad, I guess. Geonames tackles this problem in a language-neutral way by having several levels of administrative areas, but they also construct a hierarchy, which might be a little verbose for this use case.

Cheers,
-w
CKAN Curation Tool
Hi all,

I did some work over the past couple of days to try to imagine how a package curation tool might work. This means a tool that looks at packages on CKAN, applies some rules, and produces some output. The output might be instructions to add a tag to a package, or it might be to add a package to a group. This last is the main use case, really -- trying to answer the question: given a package and some rules about group membership, does it qualify?

I tried to approach this in a general way, and what I arrived at was actually quite easy to implement. On the other hand it is a command line tool, and writing rules, whilst straightforward enough, requires some knowledge of inference rules. Ideas on how to make it more user friendly are more than welcome.

Here's a very brief summary of how it works. It first reads an RDF description of a package and a set of rules. The set of rules can include operators like "try to get this web page" or even "add this tag to the package" or "add this package to a group". It compiles the ruleset and then feeds the description through, triggering these operations. Any inferred statements and relationships are printed out and (optionally) any desired changes are saved back to CKAN through the API.

A somewhat longer explanation, with worked examples, can be found at

    http://packages.python.org/curate/overview.html

For this to be truly useful, a much larger library of built-in predicates and a good bunch of example rulesets would be necessary at the very least. Comments and suggestions most welcome -- indeed eagerly sought.

Cheers,
-w
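To give a flavour of what a rule might look like, something along these lines (N3-ish; the ckan: predicate names here are invented for illustration -- the real built-ins are described in the documentation linked above):

```n3
@prefix ckan: <http://example.org/ckan-ns#> .

# If a package carries the "bibliographic" tag, infer that it
# qualifies for membership in the lld group.
{ ?pkg ckan:tag "bibliographic" }
    => { ?pkg ckan:memberOf <http://ckan.net/group/lld> } .
```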
Re: Is 303 really necessary?
* [2010-11-27 15:24:53 -0500] Tim Berners-Lee <ti...@w3.org> writes:

] http://www.w3.org/2008/site/images/logo-w3c-mobile-lg.png
] we know we can use to refer to the image in its PNG version.
] http://www.w3.org/2008/site/images/logo-w3c-mobile-lg
] we know nothing about.
]
] Because the fetch returned a content-location header,
] we are now not allowed to use that URI to refer to anything -- it could
] after all refer to the Eiffel Tour, or the W3C as an organization
] according to the new system.
]
] Does this make sense?

Yes and no. I see the distinction between representation and description, but I don't think the line is necessarily so sharp. For example, you could make http://www.w3.org/2008/site/images/logo-w3c-mobile-lg respond with,

    The W3C logo, white text on a teal background, the characters W and 3
    apparently raised and C apparently sunken. It is the large version of
    the logo intended for use with mobile browsers.

when asked for text/plain (pace alt). Maybe this is useful for blind people. For them it functions as a representation but is written using descriptive language. I could imagine formalising the descriptive language in RDF and returning that when asked for a different content-type. Maybe I should do some background reading in semiotics to get this clearer in my mind. In the meantime,

    % curl -I http://bnb.bibliographica.org/entry/GB5105626
    HTTP/1.0 303 See Other
    Server: nginx/0.7.65
    Date: Sat, 27 Nov 2010 23:44:54 GMT
    Content-Type: text/html; charset=UTF-8
    Content-Length: 0
    Pragma: no-cache
    Cache-Control: no-cache
    Vary: Accept
    Location: http://bnb.bibliographica.org/entry/GB5105626.rdf
    X-Cache: MISS from localhost
    X-Cache-Lookup: MISS from localhost:80
    Via: 1.0 localhost (squid/3.0.STABLE19)
    Connection: close

Cheers,
-w
Re: Is 303 really necessary?
* [2010-11-26 15:15:42 +0100] Bob Ferris z...@elbklang.net écrit: ] ] I wrote a note as an attempt to clarify a bit the terms Resource, ] Information Resource and Document and their relations (from my point of ] view). Maybe this helps to figure out the bits of the current confusion. So taking a cue from this thread, I've implemented something that I think is in line with the original suggestion for a new dataset that I'm working on. If you request, e.g. http://bnb.bibliographica.org/entry/GB8102507 with an Accept header indicating an interest in RDF data, you will get a 200 response with a Content-Location header indicating that what is returned is actually the GB8102507.rdf document. It seems to me that this is enough information that a client needn't be confused between the document and the book, A good man in Africa. There is foaf:primaryTopic linkage in the document that should also adequately explain the state of affairs. However it seems that some clients are confused -- tabulator for instance as was pointed out in irc the other day. My question is, should I change the behaviour to the standard 303 redirect or leave it as a stake in the ground saying that this is a reasonable arrangement? Cheers, -w -- William Waites http://eris.okfn.org/ww/foaf#i 9C7E F636 52F6 1004 E40A E565 98E3 BBF3 8320 7664
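The two arrangements being weighed in the message above, the standard 303 redirect versus the 200 with a Content-Location header, can be put side by side in a small sketch. The function and the exact header set are illustrative assumptions, not the bibliographica.org implementation.

```python
def respond(uri, use_303):
    """Build (status, headers) for a request for a non-information
    resource URI, under the two arrangements discussed above."""
    doc = uri + ".rdf"
    if use_303:
        # Standard httpRange-14 behaviour: redirect to the document.
        return 303, {"Location": doc, "Vary": "Accept"}
    # The stake-in-the-ground: answer directly, but say via
    # Content-Location that what is returned is the .rdf document.
    return 200, {"Content-Location": doc, "Vary": "Accept"}

uri = "http://bnb.bibliographica.org/entry/GB8102507"
redirect = respond(uri, use_303=True)
direct = respond(uri, use_303=False)
```

Either way the client has enough information to distinguish the document GB8102507.rdf from the book itself; the question is which form confused clients like tabulator handle correctly.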
FW: Failed to port datastore to RDF, will go Mongo
Friedrich, I'm forwarding your message to one of the W3 lists. Some of your questions could be easily answered (e.g. for euro in your context, you don't have a predicate for that, you have an Observation with units of a currency and you could take the currency from dbpedia, the predicate is units). But I think your concerns are quite valid generally and your experience reflects that of most web site developers that encounter RDF. LOD list, Friedrich is a clueful developer, responsible for http://bund.offenerhaushalt.de/ amongst other things. What can we learn from this? How do we make this better? -w - Forwarded message from Friedrich Lindenberg friedr...@pudo.org - From: Friedrich Lindenberg friedr...@pudo.org Date: Wed, 24 Nov 2010 11:56:20 +0100 Message-Id: a9089567-6107-4b43-b442-d09dcc0c3...@pudo.org To: wdmmg-discuss wdmmg-disc...@lists.okfn.org Subject: [wdmmg-discuss] Failed to port datastore to RDF, will go Mongo (reposting to list): Hi all, As an action from OGDCamp, Rufus and I agreed that we should resume porting WDMMG to RDF in order to make the data model more flexible and to allow a merger between WDMMG, OffenerHaushalt and similar other projects. After a few days, I'm now over the whole idea of porting WDMMG to RDF. Having written a long technical pro/con email before (that I assume contained nothing you don't already know), I think the net effect of using RDF would be the following: * Lots of coolness, sucking up to linked data people. * Further research regarding knowledge representation. vs. * Unstable and outdated technological base. No triplestore I have seen so far seemed on par with MySQL 4. * No freedom wrt to schema, instead modelling overhead. Spent 30 minutes trying to find a predicate for Euro. * Scares off developers. Invested 2 days researching this, which is how long it took me to implement OHs backend the first time around. Project would need to be sustained through linked data grad students. 
* Less flexibility wrt to analytics, querying and aggregation. SPARQL not so hot. * Good chance of chewing up the UI, much harder to implement editing. I normally enjoy learning new stuff. This is just painful. Most of the above points are probably based on my ignorance, but it really shouldn't take a PhD to process some gov spending tables. I'll now start a mongo effort because I really think this should go schema-free + I want to get stuff moving. If you can hold off loading Uganda and Israel for a week that would of course be very cool, we could then try to evaluate how far this went. Progress will be at: http://bitbucket.org/pudo/wdmmg-core Friedrich ___ wdmmg-discuss mailing list wdmmg-disc...@lists.okfn.org http://lists.okfn.org/mailman/listinfo/wdmmg-discuss - End forwarded message - -- William Waites http://eris.okfn.org/ww/foaf#i 9C7E F636 52F6 1004 E40A E565 98E3 BBF3 8320 7664
Re: FW: Failed to port datastore to RDF, will go Mongo
... on the plus side, Friedrich wrote: ] * Lots of coolness, sucking up to linked data people. I don't see these as particularly good things in themselves. The solutions have to be obviously technically sound and convenient to use. Drinking the kool-aid is not helpful. * [2010-11-24 08:05:08 -0500] Kingsley Idehen kide...@openlinksw.com écrit: ] ] Is your data available as a dump? UK data for 2009 that I made is available at: http://semantic.ckan.net/dataset/cra/2009/dump.nt.bz2 http://semantic.ckan.net/dataset/cra/2009/dump.nq.bz2 But this was done more or less by hand, and repurposing the CSV -> SDMX scripts (this was done before QB became best practice) is not easy. Still, from a modelling perspective they might be a good starting point. But having to ask a question in the right place and the answer being a good starting point is maybe different from doing a google search and finding easy-to-follow recipes that can immediately be plugged into some web app. Cheers, -w -- William Waites http://eris.okfn.org/ww/foaf#i 9C7E F636 52F6 1004 E40A E565 98E3 BBF3 8320 7664
Re: Failed to port datastore to RDF, will go Mongo
* [2010-11-24 22:44:53 +] Toby Inkster t...@g5n.co.uk écrit: ] ] Or, to put a different slant on it: a competent developer who has spent ] years using SQL databases day-to-day finds it easier to use SQL and the ] relational data model than a different data model and different query ] language that he's spent a few days trying out. I don't think that's what's happening here, or at least not entirely. People coming from an RDB background expect things like SUM, COUNT, INSERT, DELETE, not to mention GROUP BY, to work. But SPARQL 1.1 is still very new, and each store implements these features in slightly different ways with slightly different syntax, sometimes requiring workarounds in application code. With RDBs we have good libraries for abstracting away such differences; with RDF stores we still require people to pay a lot closer attention to what the underlying plumbing is and how it works (and whether the binary package they got with their OS might be out of date or has to be compiled from source or even patched - the horror!). These things prevent people from getting on with what they see as the task at hand. Cheers, -w -- William Waites http://eris.okfn.org/ww/foaf#i 9C7E F636 52F6 1004 E40A E565 98E3 BBF3 8320 7664
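The kind of query people expect to just work -- roughly SELECT ?dept (SUM(?amt) AS ?total) ... GROUP BY ?dept in SPARQL 1.1 -- has a simple relational reading. Sketched over an in-memory list of triples (the ex: predicate names and amounts are invented for illustration):

```python
from collections import defaultdict

# What a SPARQL 1.1 GROUP BY + SUM amounts to, over in-memory triples.
# The ex: predicates and the spending figures are made up.
triples = [
    ("ex:o1", "ex:dept", "Health"),  ("ex:o1", "ex:amount", 100),
    ("ex:o2", "ex:dept", "Health"),  ("ex:o2", "ex:amount", 50),
    ("ex:o3", "ex:dept", "Defence"), ("ex:o3", "ex:amount", 70),
]

def sum_by_dept(ts):
    """Group observations by department and sum their amounts."""
    dept = {s: o for s, p, o in ts if p == "ex:dept"}
    totals = defaultdict(int)
    for s, p, o in ts:
        if p == "ex:amount":
            totals[dept[s]] += o
    return dict(totals)

totals = sum_by_dept(triples)
```

The point is not that this is hard to express; it is that each store dialect expresses it slightly differently, which is exactly the friction described above.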
ANN: British National Bibliography
Following up on the earlier announcement [1] that the British Library [2] has made the British National Bibliography [3] available under a public domain dedication, the JISC Open Bibliography [4] project has worked to make this data more useable. The data has been loaded into a Virtuoso store that is queryable through the SPARQL Endpoint [5], and the URIs that we have assigned each record use the ORDF [6] software to make them dereferenceable, supporting content auto-negotiation as well as embedding RDFa in the HTML representation. The data contains some 3 million individual records and some 173 million triples. Indexing the data was a very CPU intensive process taking approximately three days. Transforming and loading the source data took about five hours. For more detail see http://eris.okfn.org/ww/2010/11/bl 1. http://openbiblio.net/2010/11/17/jisc-openbibliography-british-library-data-release/ 2. http://www.bl.uk/ 3. http://www.bl.uk/bibliographic/natbib.html 4. http://openbiblio.net/ 5. http://bnb.bibliographica.org/sparql 6. http://ordf.org/ -- William Waites http://eris.okfn.org/ww/foaf#i 9C7E F636 52F6 1004 E40A E565 98E3 BBF3 8320 7664
Re: ANN: British National Bibliography
* [2010-11-22 16:25:35 +] Richard Light rich...@light.demon.co.uk écrit: ] ] But is it reliable? I looked up the book I wrote (Presenting XML, SAMS ] Net, 1997) and find that it claims it was written by Laura Alschuler. ] How did that happen? That's what's in the source data, the 111443rd entry in BNBrdfdc12.xml. I have no idea of the error rate for problems of this type in the data - finding that out is a research project in itself. A next step is to feed it through the link discovery tools at DERI and Berlin (are they both the same (Silk) or is DERI's something different?) and then see what kinds of inconsistencies we can find. A bit easier, what we really need now is a good way to make these corrections and feed them back into the BL. For the RDF-clued like yourself, I could take a patch in the form of a corrected graph and put it in the store. Cheers, -w -- William Waites http://eris.okfn.org/ww/foaf#i 9C7E F636 52F6 1004 E40A E565 98E3 BBF3 8320 7664
Re: [open-bibliography] ANN: British National Bibliography
(trimming the Cc a little bit but as Peter rightly points out we should probably trim it further for continued discussion) * [2010-11-22 17:33:33 +] Richard Light rich...@light.demon.co.uk écrit: ] Absolutely, though lets hope it's a random error and not something ] systematic. I went to download the file in question to have a look, but ] as it's a 450MB XML document, which Firefox is gallantly trying to load ] for me to read, I suspect I will fail. Any chance of these resources ] being offered as zip files? The cleaned record, which I would agree should not be deleted but superseded, can be retrieved as http://bnb.bibliographica.org/entry/GB97W9726.rdf So what do we do about this? If it won't appear in further corrected data from the BL, we should mint a new URI for it. This might be directly in bibliographica.org. The identifier/slug shouldn't be used because that's the BNB identifier. Easiest thing is just to make a hash. So if you do a search now you'll see two records for that book, the incorrect one from the original data and a hand-made one based on that record and what I could easily find with google. So the new record is at: http://bibliographica.org/entry/c4bb7da2c60413acc06f2369746da92b (anyone with a suggestion about how to make better identifiers please pipe up). As far as downloading the source data, I would suggest using wget(1). Cheers, -w -- William Waites http://eris.okfn.org/ww/foaf#i 9C7E F636 52F6 1004 E40A E565 98E3 BBF3 8320 7664
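The "just make a hash" approach to minting a slug like the one above could look like this. A sketch only: I don't know what serialisation bibliographica.org actually hashes, so the canonicalisation of the record fields here is an assumption.

```python
import hashlib

def mint_identifier(record_fields):
    """Mint a content-derived slug for a hand-made record.

    Hash a canonical serialisation of the record so the same record
    always yields the same identifier. The tab-separated, key-sorted
    serialisation here is an invented convention, not bibliographica's.
    """
    canonical = "\n".join(
        f"{k}\t{v}" for k, v in sorted(record_fields.items())
    )
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()

slug = mint_identifier({"title": "Presenting XML",
                        "creator": "Richard Light"})
uri = "http://bibliographica.org/entry/" + slug
```

One nice property of content-derived identifiers: two people independently minting a slug for the same record end up with the same URI.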
Re: [open-bibliography] ANN: British National Bibliography
* [2010-11-22 20:35:55 +] Richard Light rich...@light.demon.co.uk écrit: ] Thanks for this. A follow-up nitpick: the sameAs URL ] ] http://bibliographica.org/entity/f30566181677c26b17a024c0145f91cd ] ] for the author gives a 404. Right. There is (as yet) no particular graph there, so what happens is that a SPARQL CONSTRUCT is done with that URI as subject. I've turned on sameAs processing in the query (only that one) and now it should give some more useful information. Time will tell how well that scales. Cheers, -w -- William Waites http://eris.okfn.org/ww/foaf#i 9C7E F636 52F6 1004 E40A E565 98E3 BBF3 8320 7664
Re: Semantic Ambiguity
* [2010-11-12 15:33:01 +0100] Henry Story henry.st...@bblfish.net écrit: ] I'd start differently. Start with the social web, and simple terms such ] as foaf and sioc. The build up meanings from the ground up, piece by ] piece by introducing value at each point in the game. FOAF avoided a minefield by using foaf:knows instead of e.g. foaf:friend. Still, what exactly it means that a foaf:knows b is kept deliberately vague. It probably has as many interpretations as there are FOAF profiles. Maybe there is some basic consensus about the meaning which is the intersection of all (non-pathological) interpretations. But choosing the appropriate interpretation depends very much on the context or purpose of the communication or task at hand. In your slides I think you have implicitly assumed a context which has something to do with very basic questions of identity -- this is useful but is hardly the only context in which foaf:knows links between people can be considered and it isn't at all clear if the assumptions you make will hold in other contexts. ] Global naming is going to be useful, but by taking such a big ] problem, the linked data community is just confronting many big problems ] simultaneously, which is why it can seem intractable. The network effect ] will end up working itself out. This seems very hand-wavy to me. I agree that global naming is useful. But sorting out the myriad interpretations of these global names is a hard problem that I don't think is going to just work itself out. Cheers, -w -- William Waites http://eris.okfn.org/ww/foaf#i 9C7E F636 52F6 1004 E40A E565 98E3 BBF3 8320 7664 pgpUQHUVhWR40.pgp Description: PGP signature
Semantic Ambiguity
On Fri, Nov 12, 2010 at 08:40:14AM -0500, Patrick Durusau wrote: Semantic ambiguity isn't going to go away. It is part and parcel of the very act of communication. [...] Witness the lack of uniform semantics in the linked data community over something as common as sameAs. As the linked data community expands, so are the number of interpretations of sameAs. Why can't we fashion solutions for how we are rather than wishing for solution for how we aren't? I was at a lecture by Dave Robertson [0] the other day where he talked about some of the ideas behind one of his current projects [1]. Particularly relevant was the idea of completely abandoning any attempts at global semantics and instead working on making sure the semantics are clear on a local communication channel (as I understood it). So maybe that would mean a different meaning for sameAs in different datasets, and that's just fine as long as the reader is aware of that and fashions some transformation from their notion of sameAs to their peer's, mutatis mutandis for other predicates and classes. In some ways this is similar to how we use language. If I'm talking to a computer scientist I'll use a different but overlapping sub-language of English than if I'm talking to the postman. If I'm talking to a non-native English speaker I'll modify my speech so as to be more easily understood. Around here, tea means supper but a short distance to the South it more likely means a snack with cakes and cucumber sandwiches. The important thing is a context of communication which modifies -- and disambiguates -- meaning. This might be touched on in the RDF Semantics with the not often mentioned idea of an interpretation of a graph. How does this square with the apparent tendency to want to treat statements as overarching universal truths? Cheers, -w [0] http://www.dai.ed.ac.uk/groups/ssp/members/dave.htm [1] http://socialcomputer.eu/
Re: Is 303 really necessary?
On Fri, Nov 05, 2010 at 09:34:43AM +, Leigh Dodds wrote: Are you suggesting that Linked Data crawlers could/should look at the status code and use that to infer new statements about the resources returned? If so, I think that's the first time I've seen that mentioned, and am curious as to why someone would do it. Surely all of the useful information is in the data itself. Provenance and debugging. It would be quite possible to record the fact that this set of triples, G, were obtained by dereferencing this uri N, at a certain time, from a certain place, with a request that looked like this and a response that had these headers and response code. This is the class of information that is kept for [0]. If N appeared in G, that could lead directly to inferences involving the provenance information. If later reasoning is concerned at all with the trustworthiness or up-to-dateness of the data it could look at this as well. Keeping this quantity of information around might quickly turn out to be too data-intensive to be practical, but that's more of an engineering question. I think it does make some sense to do this in principle at least. Cheers, -w [0] http://river.styx.org/ww/2010/10/corscheck
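Recording that class of information alongside the fetched triples could be as simple as the following. A sketch under stated assumptions: the record shape and field names are invented, and no real dereferencing is done here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class FetchProvenance:
    """What a crawler could keep for each dereference: which URI N was
    fetched, when, with what request, and what headers/status came back
    alongside the graph G. Field names are illustrative."""
    uri: str
    status: int
    request_headers: dict
    response_headers: dict
    fetched_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

# A hypothetical 303 response recorded for later reasoning.
prov = FetchProvenance(
    uri="http://example.org/resource",
    status=303,
    request_headers={"Accept": "application/rdf+xml"},
    response_headers={"Location": "http://example.org/resource.rdf"},
)
# The 303 status here is what would license inferences about the URI;
# fetched_at supports later up-to-dateness reasoning.
```

Whether storing one of these per fetch is practical at crawl scale is the engineering question raised above.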
Re: Is 303 really necessary?
On Thu, Nov 04, 2010 at 01:22:09PM +, Ian Davis wrote: Hi all, The subject of this email is the title of a blog post I wrote last night questioning whether we actually need to continue with the 303 redirect approach for Linked Data. My suggestion is that replacing it with a 200 is in practice harmless and that nothing actually breaks on the web. Please take a moment to read it if you are interested. cf. other discussion about RDF URI References and IRIs: where a resource is given an IRI that is not a valid URI, as far as HTTP is concerned we can't dereference it properly, so we need some kind of document/description indirection... Though in general I think the best practice is only to give resources IRIs that are also valid URIs... Cheers, -w
Re: Please allow JS access to Ontologies and LOD
On Wed, Oct 27, 2010 at 09:00:52PM +, Hugh Glaser wrote: Great stuff - thanks for the advice. Done for sameas.org and *.rkbexplorer.com However, did it via .htaccess, and would prefer to do it in /etc/httpd/http.conf, not least because the vhosts seems to make it end up with two of them (which I assume is not illegal?) Can anyone tell me the http.conf line that does the same thing, to help a lazy citizen :-) Same as in .htaccess I believe. I just put Header add Access-Control-Allow-Origin * for ckan.net in the apache config. Cheers, -w
Re: WordNet RDF
On 10-09-20 23:11, Vasiliy Faronov wrote: Have you looked at the GOLD ontology[1]? [1] http://linguistics-ontology.org/gold/ No, somehow I had missed that. It looks like just the thing! (could benefit from some examples though, sample sentences and how they would be represented with GOLD). Thank you for the link! -w -- William Waites w...@styx.org Mob: +44 789 798 9965 Fax: +44 131 464 4948 CD70 0498 8AE4 36EA 1CD7 281C 427A 3F36 2130 E9F5 signature.asc Description: OpenPGP digital signature
Re: WordNet RDF
On 10-09-20 12:45, Antoine Isaac wrote: Very interesting! I'm curious though: what's the application scenario that made you create this version? (hopefully this is closely enough related that my reply below isn't a non-sequitur) I worked on a toy NLP bot that might expose some real uses for representing natural language in RDF [0]. The basic premise was to allow users to describe bibliographic data (works and authors and such) in simple natural language sentences and have it output RDF (FRBR-esque) [1]. (Motivated partly by the fact that I am terrible at user interface design and had a very hard time trying to make a web interface that allowed users to enter data with anything other than a very simple structure). One vocabulary that I missed while doing this is something to represent parts of speech and grammatical syntax in natural language. I invented something ad-hoc but it might be useful to have a more completely thought out way to do this. You can see some examples in the first link. How do you make the distinction between the two situations--I mean, based on which elements in the Wordnet data? The approach that I took -- and keep in mind this was a toy, I have doubts about the scalability of doing things this way -- was to (1) parse the natural language sentence into an annotated syntax tree as an intermediate form (represented in RDF) and then (2) run specially crafted N3 inference rules over it to generate the desired output. The inference rules encode the semantic relationships between concepts existing in (or across) sentences. I mostly worked with inference rules that hinged on the main verb in the sentence (which also happens to be the top of the syntax tree). In principle, a complete enough set of such inference rules (most likely restricted to a particular domain of discourse; a truly general set would be very hard if it is possible at all) would resolve the ambiguity. 
In the case that makes sense there would be useful entailments, in the case that doesn't there wouldn't. I saw this kind of resolution of syntactic ambiguity happen a couple of times. Resolution of homonyms might work similarly. I'm not so sure the structure of creating a class hierarchy based on orthographical accident makes sense. Where the words do have a common conceptual root, certainly. But in the crack example I don't think so. They are (probably) completely different concepts that just happen to be denoted by the same string. I might be wrong but I don't think that wordnet contains enough information to make this choice. Cheers, -w [0] http://blog.okfn.org/2010/08/09/cataloguing-bibliographic-data-with-natural-language-and-rdf/ [1] http://pastebin.ca/1913826 -- William Waites w...@styx.org Mob: +44 789 798 9965 Fax: +44 131 464 4948 CD70 0498 8AE4 36EA 1CD7 281C 427A 3F36 2130 E9F5 signature.asc Description: OpenPGP digital signature
Re: New LOD ESW wikipage about Data Licensing
An idle thought. Suppose I take two datasets, licensed differently, and combine them. Maybe I do something clever to capture provenance information in how they are combined (a combination of opmv and evopat comes to mind). If the licenses are defined at a suitable granularity (is the cc vocabulary enough?) I can then derive the resulting terms by doing something like the intersection of rights granted in the source licenses. So (copyleft \cap public domain) = copyleft, etc. I wonder about constructing inference rules for this... If the combination is done in a way that is reversible, simply selecting some triples from different sources, for example, rather than putting provenance and license information on graphs [0], putting it on individual triples might be nice. But then we need some token for a triple, ideally in a global way where if the same triple occurs independently in two places, two people making tokens for it will end up with the same token... Hrmmm... As I said, idle thoughts... Cheers, -w [0] I'm not sure graph isn't a misnomer, or at least loose language. An RDF graph is a set, I think, and you can make a standard graph relative to a predicate by taking vertices from subject and object and edges from IEXT(predicate). Is this splitting hairs? -- William Waites w...@styx.org Mob: +44 789 798 9965 Fax: +44 131 464 4948 CD70 0498 8AE4 36EA 1CD7 281C 427A 3F36 2130 E9F5 signature.asc Description: OpenPGP digital signature
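The intersection-of-rights idea above sketches easily if each license is modelled as a set of granted permissions. To be clear about assumptions: the permission names and the three license models below are invented stand-ins, not the cc vocabulary terms, and real license compatibility is messier than set intersection.

```python
# Licenses modelled as sets of granted permissions; combining datasets
# grants only the intersection of the source grants. Permission names
# are invented stand-ins for something like cc: vocabulary terms.
LICENSES = {
    "public-domain": {"reproduce", "distribute", "derive", "relicense"},
    "copyleft":      {"reproduce", "distribute", "derive"},
    "no-derivs":     {"reproduce", "distribute"},
}

def combined_terms(*license_names):
    """Rights over the combined dataset: intersect the source grants."""
    return set.intersection(*(LICENSES[n] for n in license_names))

# (copyleft ∩ public-domain) = copyleft, as in the message above.
both = combined_terms("copyleft", "public-domain")
```

An inference rule version of this would just restate the same set operation over license descriptions attached to graphs (or, per the idle thought, to individual triples).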
Re: Next version of the LOD cloud diagram. Please provide input, so that your dataset is included.
On 10-09-08 17:47, Ted Thibodeau Jr wrote: On Sep 8, 2010, at 01:31 AM, Peter DeVries wrote: I am kind of annoyed by the CKAN site. I'm right there with you, Peter. Anja, you say you can edit without logging in but please note that the doc page [1] about this database says -- • Please register to CKAN bevor editing or adding any packages. The login issues should be fixed now. Something had changed at Google and Yahoo that was causing them to return 501 Unimplemented errors when the association was made. Updating python-openid to a newer version (2.2.5) appears to have solved the problem. Please let me know if anyone has further troubles logging in. When I ignore that and do dive into editing DBpedia's listing, I discover -- (Leaving your comments intact for the ckan-discuss list, you are correct that it is an RDBMS system and that starts showing through clearly when people used to thinking in an RDF or EAV way start throwing data at it. I am particularly interested in looking at ways to improve this, keeping in mind that it is a running system with real users and a lot of effort that has gone into building it -- so we need to be gentle). - The notes field uses Markdown markup, which I've never encountered anywhere else, and must now learn (or fake). - There must be a singular author, with a singular email address. DBpedia doesn't have a singular author, and there are several URIs which might be relevant to have here -- and they are not mailto: URIs. The best is an http: URI ... but there is no way to make this present, except as part of the literal associated with the mailto: URI. - There must be a singular maintainer, with a singular email address. Same issues as with author. - There are 14+ CKAN Resource links listed [2] in the documentation, but the form appears to only take 5 (at least, 4 were previously filled on the DBpedia page, and filling in the 5th didn't magically cause a 6th to open, nor was there a link to create a 6th). OH! 
Until I Preview the page -- and now there's an empty set of Resource boxes ... so I can add one more, and Preview, and maybe add one more, and Preview, and maybe Painful. - The licensure choices separate CC-ShareAlike and CC-Attribution, but do not list CC-Attribution-ShareAlike [3]. cc-by-sa is distinct from cc-by -- and also from cc-by-nc-sa (CC-Attribution-NonCommercial- ShareAlike), among others. Clarity of presentation is VERY important for licensing! - There appears to be an arbitrary limit on the number of Extras key-value pairs associated with any given data set ... which means that *truly* densely connected data sets will be short-changed. From all I can see here, this is an RDB-based thing, not RDF-based. That's disappointing, to say the least. All in all, the experience is challenging at best, when listing one data set. But I have several more to deal with, and today's the deadline! Hurrah! *sighs* Ted [1] http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation#How_do_I_add_a_dataset_to_CKAN_or_edit_an_existing_dataset.3F [2] http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation#CKAN_resource_links [3] http://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License -- A: Yes. http://www.guckes.net/faq/attribution.html | Q: Are you sure? | | A: Because it reverses the logical flow of conversation. | | | Q: Why is top posting frowned upon? Ted Thibodeau, Jr. // voice +1-781-273-0900 x32 Evangelism Support //mailto:tthibod...@openlinksw.com // http://twitter.com/TallTed OpenLink Software, Inc. 
// http://www.openlinksw.com/ 10 Burlington Mall Road, Suite 265, Burlington MA 01803 http://www.openlinksw.com/weblogs/uda/ OpenLink Blogs http://www.openlinksw.com/weblogs/virtuoso/ http://www.openlinksw.com/blog/~kidehen/ Universal Data Access and Virtual Database Technology Providers -- William Waites william.wai...@okfn.org Mob: +44 789 798 9965Open Knowledge Foundation Fax: +44 131 464 4948Edinburgh, UK RDF Indexing, Clustering and Inferencing in Python http://ordf.org/
Re: Next version of the LOD cloud diagram. Please provide input, so that your dataset is included.
On 10-09-08 18:36, Leigh Dodds wrote: Hi, I've updated several packages and not had any issues. While CKAN may not be to everyone's taste, it's much, much better than the previous approaches, which were largely opaque Thank you for this Leigh. The fact that we will now have more structured data describing the cloud so that it can be analysed is another big win. Converting the data to RDF is easy. The CKAN API is simple and easy to use. In fact, for python hackers, there is http://bitbucket.org/ww/ckanrdf which is another kettle of fish, but it will crawl the CKAN API and put a DCAT representation into an rdflib store (the precise way this is handled -- see http://ordf.org/ -- is another topic that I would be very happy to discuss). Anyone is, of course, perfectly welcome to roll their own as was done for the current LOD work. Maybe the grumbling can be converted into useful contributions to the CKAN code base, which is open source and being used by a number of different organisations. No one has bothered to create anything better in the past, so using what's available and looking for ways to improve it seems like a more constructive approach IMHO. I would suggest that unless there is particular interest on the public-lod list about the workings of CKAN and how it could be improved that we could continue discussion on the ckan-disc...@lists.okfn.org list. Contributions of code, ideas and (constructive) criticism alike are more than welcome. Cheers, -w -- William Waites william.wai...@okfn.org Mob: +44 789 798 9965Open Knowledge Foundation Fax: +44 131 464 4948Edinburgh, UK RDF Indexing, Clustering and Inferencing in Python http://ordf.org/
CKAN API Performance Fixed
As was noticed during the crawling of the CKAN API over the past few days for creating the diagram, there were some performance problems. These were caused by one of the other ckan sites (there are about a dozen localised ones) that was misbehaving and eating CPU. The main (and largest) site has been moved to a dedicated machine and some caching has been put in place to further boost performance (though changes may take up to 15 minutes to appear in the API). The misbehaving site, which was heavily customised has been temporarily disabled as well. Please let me know if there are any further problems, but with luck it should be smooth sailing from now on. Cheers, -w -- William Waites william.wai...@okfn.org Mob: +44 789 798 9965Open Knowledge Foundation Fax: +44 131 464 4948Edinburgh, UK RDF Indexing, Clustering and Inferencing in Python http://ordf.org/
Re: Org. Namespace Example
On 10-06-23 23:31, Toby Inkster wrote: Firstly, bridges and beaches are not typically considered organisations. Sentient, self-organised bridges and beaches? On second thought maybe they should be foaf:Person -w -- William Waites william.wai...@okfn.org Mob: +44 789 798 9965Open Knowledge Foundation Fax: +44 131 464 4948Edinburgh, UK RDF Indexing, Clustering and Inferencing in Python http://ordf.org/
Re: Organization types predicates vs classes
On 10-06-08 04:27, Todd Vincent wrote: By adding OrganizationType to the Organization data model, you provide the ability to modify the type of organization and can then represent both (legal) entities and (legally unrecognized) organizations.

:foo rdf:type SomeKindOfOrganisation .

vs.

:foo org:organisationType SomeKindOfOrganisation .

I don't really see the need for an extra predicate with almost identical semantics to rdf:type. There is nothing stopping a subject from having more than one type. Having a special predicate doesn't really help with modification; you could easily do the same thing with rdf:type and still run up against the problem that there is no good way of specifying *when* a particular statement is true (OPMV notwithstanding). Cheers, -w -- William Waites william.wai...@okfn.org Mob: +44 789 798 9965Open Knowledge Foundation Fax: +44 131 464 4948Edinburgh, UK
Re: Organization types predicates vs classes
On 10-06-08 11:48, Dan Brickley wrote: Yes, exactly. The schema guarantees things will have multiple types. The art is to know when to bother mentioning each type. Saying things are an rdfs:Resource is rarely interesting. FWIW, I actually put (using an inferencer) rdfs:Resource on everything in [1][2] because I use the fresnel vocabulary to display things. This means I can make a generic lens like this, :resourceLens a fresnel:Lens ; fresnel:purpose fresnel:defaultLens ; fresnel:classLensDomain rdfs:Resource ; fresnel:showProperties ( rdf:type fresnel:allProperties ) . to use as a default. [1] http://knowledgeforge/pdw/ordf/ [2] http://bibliographica.org/ -- William Waites william.wai...@okfn.org Mob: +44 789 798 9965Open Knowledge Foundation Fax: +44 131 464 4948Edinburgh, UK
Re: Organization ontology
On 10-06-03 09:01, Dan Brickley wrote: I don't find anything particularly troublesome about the org: vocab on this front. If you really want to critique culturally-loaded ontologies, I'd go find one that declares class hierarchies with terms like 'Terrorist' without giving any operational definitions... I must admit when I looked at the org vocabulary I had a feeling that there were some assumptions buried in it but discarded a couple of draft emails trying to articulate it. I think it stems from org:FormalOrganization being a thing that is legally recognized and org:OrganizationalUnit (btw, any particular reason for using the North American spelling here?) being an entity that is not recognised outside of the FormalOrg. Organisations can become recognised in some circumstances despite never having solicited outside recognition from a state -- this might happen in a court proceeding after some collective wrongdoing. Conversely you might have something that can behave like a kind of organisation, e.g. a class in a class-action lawsuit, without the internal structure present in most organisations. Is a state an Organisation? Organisational units can often be semi-autonomous (e.g. legally recognised) subsidiaries of a parent or holding company. What about quangos or crown-corporations (e.g. corporations owned by the state)? They have legal recognition but are really like subsidiaries or units. Some types of legally recognised organisations don't have a distinct legal personality, e.g. a partnership or unincorporated association, so they cannot be said to have rights and responsibilities; rather the members have joint (or joint and several) rights and responsibilities. This may seem like splitting hairs but from a legal perspective it's an important distinction, at least in some legal environments. The description provided in the vocabulary is really only true for corporations or limited companies. 
I think the example, eg:contract1, is misleading since this is an inappropriate way to model a contract. A contract has two or more parties. A contract might include a duty to fill a role on the part of one party but it is not normally something that has to do with membership. Membership usually has a particular meaning as applied to cooperatives and not-for-profits. They usually wring their hands extensively about what exactly membership means. This concept normally doesn't apply to other types of organisations and does not normally have much to do with the concept of a role. The president of ${big_corporation} cannot be said to have any kind of membership relationship to that corporation, for example. I think there might be more, but I don't think it's a problem of embedding Westminster assumptions because I don't think the vocabulary fits very well even in the UK and commonwealth countries when you start looking at it closely. Thoughts? Cheers, -w -- William Waites william.wai...@okfn.org Mob: +44 789 798 9965Open Knowledge Foundation Fax: +44 131 464 4948Edinburgh, UK