Denny,

the statement-level granularity you're describing is achieved by RDF reification. However, you describe it as a "deprecated mechanism" for provenance without backing that up.
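
To make this concrete, here is a minimal sketch of how reification could carry a per-statement reference for the use case quoted below. The ex: namespace and the ex:population and ex:accordingTo properties are hypothetical names chosen for illustration; only the rdf: terms are standard RDF vocabulary.

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace, for illustration only

# The reified statement: "Berlin has 3,337,000 citizens"
ex:Statement1 a rdf:Statement ;
    rdf:subject   ex:Berlin ;
    rdf:predicate ex:population ;      # hypothetical property
    rdf:object    3337000 ;
    # Provenance attached to the statement itself (hypothetical property)
    ex:accordingTo <http://www.worldatlas.com/citypops.htm> .
```

The object of ex:accordingTo could just as well be a finer-grained URI denoting a single sentence in the source, which is the level of detail the use case asks for.
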
Why do you think there must be a better mechanism? Maybe you should take another look at reification, or lower your provenance requirements, at least initially?

Martynas
graphity.org

On Jun 22, 2012 5:20 PM, "Denny Vrandečić" <denny.vrande...@wikimedia.de> wrote:
> Here's the use case:
>
> Every statement in Wikidata will have a URI. Every statement can have
> one or more references.
> In many cases, the reference might be text on a website.
>
> Whereas it is always possible (and probably what we will do first) as
> well as correct to state:
>
> Statement1 accordingTo SlashDot .
>
> it would be preferable to be a bit more specific on that, and most
> preferably it would be to go all the way down to the sentence saying
>
> Statement1 accordingTo X .
>
> with X being a URI denoting the sentence that I mean in a specific
> Slashdot article.
>
> I would prefer a standard or widely adopted way to do that, and
> NIF-URIs seem to be a viable solution for that. We will come back to
> this once we start modeling references in more detail.
>
> The reference could be pointing to a book, to a video, to a
> Mesopotamian stone tablet, etc. (OK, I admit that the different media
> types will be differently prioritized).
>
> I hope this helps,
> Cheers,
> Denny
>
> 2012/6/21 Sebastian Hellmann <hellm...@informatik.uni-leipzig.de>:
> > Hello Denny,
> > I was traveling for the past few weeks and can finally answer your email.
> > See my comments inline.
> >
> > On 05/29/2012 05:25 PM, Denny Vrandečić wrote:
> >
> > Hello Sebastian,
> >
> >
> > Just a few questions - as you note, it is easier if we all use the same
> > standards, and so I want to ask about the relation to other related
> > standards:
> > * I understand that you dismiss IETF RFC 5147 because it is not stable
> > enough, right?
> >
> > The offset scheme of NIF is built on this RFC.
> > So the following would hold:
> > @prefix ld: <http://www.w3.org/DesignIssues/LinkedData.html#> .
> > @prefix owl: <http://www.w3.org/2002/07/owl#> .
> > ld:offset_717_729 owl:sameAs ld:char=717,12 .
> >
> >
> > We might change the syntax and reuse the RFC syntax, but it has several
> > issues:
> > 1. The optional part is not easy to handle, because you would need to add
> > owl:sameAs statements:
> >
> > ld:char=717,12;length=12,UTF-8 owl:sameAs ld:char=717,12;length=12 .
> > ld:char=717,12;length=12,UTF-8 owl:sameAs ld:char=717,12 .
> > ld:char=717,12;UTF-8 owl:sameAs ld:char=717,12;length=9876 .
> >
> > So theoretically ok, but annoying to implement and check.
> >
> > 2. When implementing web services, NIF allows the client to choose the
> > prefix:
> > http://nlp2rdf.lod2.eu/demo/NIFStemmer?input-type=text&nif=true&prefix=http%3A%2F%2Fthis.is%2Fa%2Fslash%2Fprefix%2F&urirecipe=offset&input=President+Obama+is+president
> > returning URIs like <http://this.is/a/slash/prefix/offset_10_15>
> > So RFC 5147 would look like:
> > <http://this.is/a/slash/prefix/char=717,12>
> > <http://this.is/a/slash/prefix/char=717,12;UTF-8>
> > or
> > <http://this.is/a/slash/prefix?char=717,12>
> > <http://this.is/a/slash/prefix?char=717,12;UTF-8>
> >
> > 3. Characters like = and , prevent the use of prefixes:
> > echo "@prefix ld: <http://www.w3.org/DesignIssues/LinkedData.html#> .
> > @prefix owl: <http://www.w3.org/2002/07/owl#> .
> > ld:offset_717_729 owl:sameAs ld:char=717,12 .
> > " > test.ttl ; rapper -i turtle test.ttl
> >
> > 4. Implementation is a little bit more difficult, given that:
> > $arr = explode("_", "offset_717_729");
> > switch ($arr[0]) {
> >     case 'offset':
> >         $begin = $arr[1];
> >         $end = $arr[2];
> >         break;
> >     case 'hash':
> >         $clength = $arr[1];
> >         $slength = $arr[2];
> >         $hash = $arr[3];
> >         // merge remaining parts with '_'
> >         $rest = implode("_", array_slice($arr, 4));
> >         break;
> > }
> >
> > 5. The RFC assumes a certain MIME type, i.e. plain text. NIF makes a
> > broader assumption.
> >
> > * what is the relation to the W3C media fragment URIs? Did not find a
> > pointer there.
> >
> > They are designed for media such as images and video, not strings.
> > Potentially, the same principle can be applied, but it is not yet
> > engineered/researched.
> >
> > * any plans of standardizing your approach?
> >
> > We will do NIF 2.0 as a community standard and finish it in a couple of
> > months. It will be published under open licences, so anybody, W3C or ISO,
> > might pick it up easily. Other than that, there are plans by several EU
> > projects (see e.g. here
> > http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jun/0101.html )
> > and a US project to use it, and there are already several third-party
> > implementations. We would rather have it adopted first on a large
> > scale and then standardized properly, i.e. by the W3C. This worked quite
> > well for the FOAF project or for RDB2RDF Mappers.
> > Chances for fast standardization are not so unlikely, I would assume.
> >
> > We would strongly prefer to just use a standard instead of advocating
> > contenders for one -- if one exists.
> >
> > You might want to look at:
> > http://www.w3.org/community/openannotation/wiki/TextCommentOnWebPage
> > and the same highlighting here:
> > http://pcai042.informatik.uni-leipzig.de/~swp12-9/vorprojekt/index.php?annotation_request=http%3A%2F%2Fwww.w3.org%2FDesignIssues%2FLinkedData.html%23hash_10_12_60f02d3b96c55e137e13494cf9a02d06_Semantic%2520Web
> >
> > NIF equivalent (4 triples instead of 14 and only one generated uuid):
> > ld:hash_10_12_60f02d3b96c55e137e13494cf9a02d06_Semantic%20Web a str:String ;
> >     oa:hasBody [
> >         oa:annotator <mailto:Bob> ;
> >         cnt:chars "Hey Tim, good idea that Semantic Web!"
> >     ] .
> >
> > So you might not think in a "contender" way. Approaches are complementary.
> > NIF is simpler and the URIs have some features that might be wanted
> > (stability, uniqueness, easy to implement).
> > This is why I was asking for your *use case* .
> >
> > Note that there are still some problems when annotating DOM with URIs,
> > e.g. XPointer is abandoned and was never finished. XPath has its limits and
> > is also expensive (i.e. SAX not possible).
> > I think there is no proper solution as of now.
> > All the best,
> > Sebastian
> >
> >
> > Cheers,
> > Denny
> >
> >
> > 2012/5/18 Sebastian Hellmann <hellm...@informatik.uni-leipzig.de>
> >
> > Hello again,
> > maybe the question I asked was lost, as the text was TL;DR.
> >
> > I heard that it is planned to track the provenance of facts, e.g. Berlin has
> > 3,337,000 citizens, found here:
> > http://www.worldatlas.com/citypops.htm
> > Do you have a place where the use case and the requirements are documented
> > for this? Or is it out of scope?
> > Will it be coarse-grained, i.e. website level? Or fine-grained, i.e. text
> > paragraph level? See e.g. how Berlin is highlighted here:
> > http://pcai042.informatik.uni-leipzig.de/~swp12-9/vorprojekt/index.php?annotation_request=http%3A%2F%2Fwww.worldatlas.com%2Fcitypops.htm%23hash_4_30_7449e732716c8e68842289bf2e6667d5_Berlin%2C%2520Germany%2520-%25203%2C
> > in this very early prototype.
> >
> > Could you give me a link where I can read more about any Wikidata plans
> > in this direction?
> > Sebastian
> >
> >
> > On 05/16/2012 09:10 AM, Sebastian Hellmann wrote:
> >
> > Dear all,
> > (Note: I could not find the document where your requirements regarding
> > the tracking of facts on the web are written, so I am giving a general
> > introduction to NIF.
> > Please send me a link to the document that specifies
> > your need for tracing facts on the web, thanks)
> >
> > I would like to point your attention to the URIs used in the NLP
> > Interchange Format (NIF).
> > NIF-URIs are quite easy to use, understand and implement. NIF has a
> > one-triple-per-annotation paradigm. The latest documentation can be found
> > here:
> > http://svn.aksw.org/papers/2012/WWW_NIF/public/string_ontology.pdf
> >
> > The basic idea is to use URIs with hash fragment ids to annotate or mark
> > pages on the web:
> > An example is the first occurrence of "Semantic Web" on
> > http://www.w3.org/DesignIssues/LinkedData.html
> > as highlighted here:
> > http://pcai042.informatik.uni-leipzig.de/~swp12-9/vorprojekt/index.php?annotation_request=http%3A%2F%2Fwww.w3.org%2FDesignIssues%2FLinkedData.html%23hash_10_12_60f02d3b96c55e137e13494cf9a02d06_Semantic%2520Web
> >
> > Here is a NIF example for linking a part of the document to the DBpedia
> > entry of the Semantic Web:
> > <http://www.w3.org/DesignIssues/LinkedData.html#offset_717_729>
> >     a str:StringInContext ;
> >     sso:oen <http://dbpedia.org/resource/Semantic_Web> .
> >
> >
> > We are currently preparing a new draft for the spec 2.0. The old one can
> > be found here:
> > http://nlp2rdf.org/nif-1-0/
> >
> > There are several EU projects that intend to use NIF. Furthermore, it is
> > easier for everybody, if we standardize a Web annotation format together.
> > Please give feedback on your use cases.
> > All the best,
> > Sebastian
> >
> >
> > --
> > Dipl. Inf. Sebastian Hellmann
> > Department of Computer Science, University of Leipzig
> > Projects: http://nlp2rdf.org , http://dbpedia.org
> > Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> > Research Group: http://aksw.org
> >
> >
> > _______________________________________________
> > Wikidata-l mailing list
> > Wikidata-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikidata-l
> >
> >
> > --
> > Dipl. Inf. Sebastian Hellmann
> > Department of Computer Science, University of Leipzig
> > Projects: http://nlp2rdf.org , http://dbpedia.org
> > Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> > Research Group: http://aksw.org
>
> --
> Project director Wikidata
> Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
> Tel. +49-30-219 158 26-0 | http://wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
> Registered in the register of associations of the Amtsgericht
> Berlin-Charlottenburg under number 23855 B. Recognized as a non-profit
> by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.