Re: [Wikidata-l] Another birthday gift: SPARQL queries in the browser over Wikidata RDF dumps using Linked Data Fragments

2014-10-30 Thread Cristian Consonni
2014-10-30 19:41 GMT+01:00 Cristian Consonni :
> 2014-10-30 18:05 GMT+01:00 Markus Krötzsch :
>> Awesome :-) Small note: I just got a "Bad Gateway" when trying
>> http://data.wikidataldf.com/ but it now seems to work.
>
> I was restarting the server; in fact, I have now also uploaded the
> wikidata-terms dump.
> (the wikidata-statements file is not cooperating, though :( )

Ok, now I have managed to add the Wikidata statements dump too.

If somebody would like to add some example SPARQL queries, that would be awesome.
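For a start, something along these lines should work against the terms dump
(untested, and I am going from memory on the vocabulary, so please
double-check the predicates against the dump before relying on it):

PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Italian label of Douglas Adams (Q42)
SELECT ?label
WHERE {
  wd:Q42 rdfs:label ?label .
  FILTER(lang(?label) = "it")
}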

>> It also seems that some of your post answers the question from my previous
>> email. That sounds as if it is pretty hard to create HDT exports (not much
>> surprise there). Maybe it would be nice to at least reuse the work: could we
>> re-publish your HDT dumps after you created them?
>
> yes, sure, here they are:
> http://wikidataldf.com/download/

I should add that, yes, it is pretty hard to create the HDT files since the
process requires an awful lot of RAM, and I don't know if I will be able
to produce them in the future.

C

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Wikidata RDF

2014-10-30 Thread Cristian Consonni
2014-10-30 17:34 GMT+01:00 Markus Krötzsch :
> On 30.10.2014 11:49, Cristian Consonni wrote:
>>
>> 2014-10-29 22:59 GMT+01:00 Lydia Pintscher :
>>>
>>> Help with this would be awesome and totally welcome. The tracking bug
>>> is at https://bugzilla.wikimedia.org/show_bug.cgi?id=48143
>>
>>
>> Speaking of totally awesome (aehm :D):
>> * see: http://wikidataldf.com
>> * see this other thread:
>> https://lists.wikimedia.org/pipermail/wikidata-l/2014-October/004920.html
>>
>> (If I can ask, having the RDF dumps in HDT format [again, see the
>> other thread] would be really helpful)
>
>
> We are using OpenRDF. Can it do HDT? If yes, this would be easy to do. If
> no, it would be easier to use a standalone tool to transform our dumps. We
> could still do this. Do you have any recommendation what we could use there
> (i.e., a memory-efficient command-line conversion script for N3 -> HDT)?

It seems that OpenRDF does not support HDT creation (see [1]).
I have been using the rdf2hdt tool, obtained by compiling the devel
branch of the hdt-cpp library[2], which is developed by the group that
is proposing the standard implementation to the W3C.
C

[1] https://openrdf.atlassian.net/browse/SES-1874
[2] https://github.com/rdfhdt/hdt-cpp/tree/devel

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Categories in Wikidata

2014-10-30 Thread Gerard Meijssen
Hoi,
A category or a list includes entries that fulfill certain criteria. For
instance, many awards are given to humans, so we can define on the list "is
a list of" "human" with a qualifier "award received" "Award name". Over 1400
categories have such definitions [1]. When you choose any of them, there is
an icon for the "Reasonator". It shows up to 500 items that fit the
definition, regardless of whether they are in one, all or no categories.

The Reasonator will show up to 500 people on the award's page and it will
show the award on each person's page; it will show the first 500 items that
fit the query definition, and all the others can be shown separately.

I am actively harvesting categories in this way, based on the content of
categories on many Wikipedias. As a result, at least half a million
statements have been made, probably a lot more.

What will be of real interest to you is that for all those 1400 categories
the entries will show in the Reasonator. When there are more than 500,
there is an option to see all of them on a different screen.
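For those who prefer SPARQL, the query behind [1] (CLAIM[31:4167836] AND
CLAIM[360:5]) would look roughly like the sketch below; the wdt: "direct
claim" predicates are hypothetical and do not match the current RDF exports:

PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?category
WHERE {
  ?category wdt:P31  wd:Q4167836 .  # instance of: Wikimedia category
  ?category wdt:P360 wd:Q5 .        # is a list of: human
}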
Thanks,
  GerardM




[1]
http://tools.wmflabs.org/autolist/autolist1.html?q=CLAIM%5B31%3A4167836%5D%20AND%20CLAIM%5B360%3A5%5D

On 30 October 2014 16:09, Nicholas Humfrey 
wrote:

>  Hello,
>
>  [I think this has come up before, but I can't find a recent thread that directly
> relates to this]
>
>
>  Are there any plans to directly relate entities in Wikidata with categories
> in Wikidata – and avoid the duplication across all the different page
> languages?
>
>
>  For example this book:
>  http://www.wikidata.org/wiki/Q3235393
> 
>  http://en.wikipedia.org/wiki/Half_of_a_Yellow_Sun
> 
>
>  And the English Wikipedia page is a member of this category:
> http://en.wikipedia.org/wiki/Category:War_novels
> 
>
>  Which has this Wikidata ID associated with it:
> http://www.wikidata.org/wiki/Q8170055
> 
>
>
>  Currently we could resolve it using the following, slightly convoluted
> workflow:
>
>1. Get Wikidata ID and resolve to 1 (or more) Wikipedia language pages
>2. For each Wikipedia page, lookup a list of categories and gather
>together
>3. From each category lookup a Wikidata ID for that category page
>4. Remove duplicate Wikidata IDs
>5. Lookup each of the Wikidata IDs in Wikidata to get title/description
>
> Going from a Wikidata Category to a list of Wikidata entities could be
> done in a similar way.
>
>  Is there a better way of doing this now or in the near future?
>
>
>  nick.
>
>
> ___
> Wikidata-l mailing list
> Wikidata-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>
>
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Another birthday gift: SPARQL queries in the browser over Wikidata RDF dumps using Linked Data Fragments

2014-10-30 Thread Cristian Consonni
2014-10-30 18:05 GMT+01:00 Markus Krötzsch :
> Hi Cristian,
>
> Awesome :-) Small note: I just got a "Bad Gateway" when trying
> http://data.wikidataldf.com/ but it now seems to work.

I was restarting the server; in fact, I have now also uploaded the
wikidata-terms dump.
(the wikidata-statements file is not cooperating, though :( )

> It also seems that some of your post answers the question from my previous
> email. That sounds as if it is pretty hard to create HDT exports (not much
> surprise there). Maybe it would be nice to at least reuse the work: could we
> re-publish your HDT dumps after you created them?

yes, sure, here they are:
http://wikidataldf.com/download/

C

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Another birthday gift: SPARQL queries in the browser over Wikidata RDF dumps using Linked Data Fragments

2014-10-30 Thread Markus Krötzsch

Hi Cristian,

Awesome :-) Small note: I just got a "Bad Gateway" when trying 
http://data.wikidataldf.com/ but it now seems to work.


It also seems that some of your post answers the question from my 
previous email. That sounds as if it is pretty hard to create HDT 
exports (not much surprise there). Maybe it would be nice to at least 
reuse the work: could we re-publish your HDT dumps after you created 
them? I thought about creating HDT right away but this is quite hard 
since the order is based on URL strings and must thus be different from 
any order one could establish "naturally" on the Wikidata data.


Cheers,

Markus


On 30.10.2014 11:32, Cristian Consonni wrote:

Dear all,

I wanted to join in and give my birthday present to Wikidata (I  am a
little bit late, though!)
(also, honestly, I didn't recall it was Wikidata's birthday, but it is
a nice occasion :P)

Here it is:
http://wikidataldf.com

What is LDF?
LDF stands for Linked Data Fragments, a new system for querying
RDF datasets that sits midway between having a SPARQL endpoint
and downloading the whole thing.

More formally LDF is «a publishing method [for RDF datasets] that
allows efficient offloading of query execution from servers to clients
through a lightweight partitioning strategy. It enables servers to
maintain availability rates as high as any regular HTTP server,
allowing querying to scale reliably to much larger numbers of
clients»[1].

This system was devised by Ruben Verborgh, Miel Vander Sande and Pieter
Colpaert at Multimedia Lab (Ghent University) in Ghent, Belgium.
You can read more about it: http://linkeddatafragments.org/

What is Wikidata LDF?
Using the software by Verborgh et al. I have set up the website
http://wikidataldf.com, which contains:
* an interface to browse the RDF data and query it using the
Triple Pattern Fragments client
* a web client where you can compose and execute SPARQL queries

This is not, strictly speaking, a SPARQL endpoint (not all the SPARQL
standard is implemented and it is slower, but it should be more
reliable, if you are interested in details, please do read more at the
link above).

The data are, for the moment, limited to the sitelinks dump, but I am
working towards adding the other dumps. I have taken the Wikidata RDF
dumps as of Oct 13th, 2014[2].

To use them I had to convert them into HDT format[3a][3b], using the
hdt-cpp library[3c] (devel branch), which is taking quite a lot of resources
and computing time for the whole dumps; that's the reason why I
haven't published the rest yet ^_^.

DBpedia also has this[4]:
http://fragments.dbpedia.org/

All the software used is available under the MIT license in the LDF
repos on GitHub[5a], and the (two-page) website is also available
here[5b].

I would like to thank Ruben for his feedback and his presentation
about LDF at SpazioDati in Trento, Italy (here are the slides[6]).

All this said, happy birthday Wikidata.

Cristian

[1] http://linkeddatafragments.org/publications/ldow2014.pdf
[2] https://tools.wmflabs.org/wikidata-exports/rdf/exports/
[3a] http://www.rdfhdt.org/
[3b] http://www.w3.org/Submission/HDT-Implementation/
[3c] https://github.com/rdfhdt/hdt-cpp
[4] http://sourceforge.net/p/dbpedia/mailman/message/32982329/
[5a] see the Browser.js, Server.js and Client.js repos in
https://github.com/LinkedDataFragments
[5b] https://github.com/CristianCantoro/wikidataldf
[6] 
http://www.slideshare.net/RubenVerborgh/querying-datasets-on-the-web-with-high-availability

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l




___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Another birthday gift: SPARQL queries in the browser over Wikidata RDF dumps using Linked Data Fragments

2014-10-30 Thread Kingsley Idehen

On 10/30/14 9:29 AM, Kingsley Idehen wrote:

On 10/30/14 6:32 AM, Cristian Consonni wrote:

Dear all,

I wanted to join in and give my birthday present to Wikidata (I am a
little bit late, though!)
(also, honestly, I didn't recall it was Wikidata's birthday, but it is
a nice occasion :P)

Here it is:
http://wikidataldf.com

What is LDF?
LDF stands for Linked Data Fragments, a new system for querying
RDF datasets that sits midway between having a SPARQL endpoint
and downloading the whole thing.

More formally LDF is «a publishing method [for RDF datasets] that
allows efficient offloading of query execution from servers to clients
through a lightweight partitioning strategy. It enables servers to
maintain availability rates as high as any regular HTTP server,
allowing querying to scale reliably to much larger numbers of
clients»[1].

This system was devised by Ruben Verborgh, Miel Vander Sande and Pieter
Colpaert at Multimedia Lab (Ghent University) in Ghent, Belgium.
You can read more about it: http://linkeddatafragments.org/

What is Wikidata LDF?
Using the software by Verborgh et al. I have set up the website
http://wikidataldf.com, which contains:
* an interface to browse the RDF data and query it using the
Triple Pattern Fragments client
* a web client where you can compose and execute SPARQL queries

This is not, strictly speaking, a SPARQL endpoint (not all the SPARQL
standard is implemented and it is slower, but it should be more
reliable, if you are interested in details, please do read more at the
link above).

The data are, for the moment, limited to the sitelinks dump, but I am
working towards adding the other dumps. I have taken the Wikidata RDF
dumps as of Oct 13th, 2014[2].

To use them I had to convert them into HDT format[3a][3b], using the
hdt-cpp library[3c] (devel branch), which is taking quite a lot of resources
and computing time for the whole dumps; that's the reason why I
haven't published the rest yet ^_^.

DBpedia also has this[4]:
http://fragments.dbpedia.org/

All the software used is available under the MIT license in the LDF
repos on GitHub[5a], and the (two-page) website is also available
here[5b].

I would like to thank Ruben for his feedback and his presentation
about LDF at SpazioDati in Trento, Italy (here are the slides[6]).

All this said, happy birthday Wikidata.

Cristian

[1] http://linkeddatafragments.org/publications/ldow2014.pdf
[2] https://tools.wmflabs.org/wikidata-exports/rdf/exports/
[3a] http://www.rdfhdt.org/
[3b] http://www.w3.org/Submission/HDT-Implementation/
[3c] https://github.com/rdfhdt/hdt-cpp
[4] http://sourceforge.net/p/dbpedia/mailman/message/32982329/
[5a] see the Browser.js, Server.js and Client.js repos in
https://github.com/LinkedDataFragments
[5b] https://github.com/CristianCantoro/wikidataldf
[6] http://www.slideshare.net/RubenVerborgh/querying-datasets-on-the-web-with-high-availability



Yep! And for publishing some of the information above into the Linked 
Open Data Cloud, from this thread, via nanotation:



a schema:WebPage;
rdfs:label "Wikidata LDF";
skos:altLabel "Wikidata Linked Data Fragments" ;
dcterms:hasPart , 
;
xhv:related , 
 ;

rdfs:comment """
I wanted to join in and give my birthday present 
to Wikidata (I  am a

little bit late, though!)
(also, honestly, I didn't recall it was Wikidata's 
birthday, but it is

a nice occasion :P)
""" ;

rdfs:comment """
What is Wikidata LDF?
Using the software by Verborgh et al. I have setup 
the website

http://wikidataldf.com that contains:
* an interface to navigate in the RDF data and 
query them using the

Triple Pattern Fragments client
* a web client where you can compose and execute 
SPARQL queries


This is not, strictly speaking, a SPARQL endpoint 
(not all the SPARQL
standard is implemented and it is slower, but it 
should be more
reliable, if you are interested in details, please 
do read more at the

link above).

The data are, for the moment, limited to the 
sitelinks dump but I am
working towards adding the other dump. I have 
taken the Wikidata RDF

dumps as of Oct, 13th 2014[2].

To use them I had to convert them in HDT 
format[3a][3b], using the
hdt-cpp library[3c] (devel) (which is taking quite 
a lot of resources
and computing time for the whole dumps, that's the 
reason why I

haven't published the rest yet ^_^).
""" ;
dcterms:references 


Re: [Wikidata-l] Wikidata RDF

2014-10-30 Thread Paul Houle
Here's my take.

RDF standards,  in themselves,  don't address all of the issues needed in a
data wiki.  I've been thinking about the math for data wikis and it seems
to me you could have a bipartite system where you have "the fact" and then
"the operational metadata about the fact" and these are conceptually two
different things.  Then you can query against the operational metadata to
project out RDF or similar facts.

Had the Wikidata people had the right idea at the time and had good luck,
 they might have been able to build such a system.  As it is they came up
with a plan they were able to execute and they did it well.

The real trouble with an RDF export from Wikidata is that if you do a
complete export,  you're going to get something tangled up that you can't
query easily with SPARQL.

There are really two answers to this.  One is the essentially
forward-chaining approach of "create a canonical data model" (it doesn't
need to be the "one ring to rule them all" for the whole web,  just the one
you need for your job); then you can project out an RDF graph where you've
expressed your opinions about all the opinions in Wikidata.
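Concretely,  "projecting out" would be something like a CONSTRUCT query over
the tangled export;  the vocabulary below is made up,  it is only meant to
show the shape of the thing:

PREFIX ex: <http://example.org/>

# Project a simple canonical graph out of statement-shaped data.
CONSTRUCT { ?item ex:height ?cm }
WHERE {
  ?item ex:heightStatement ?st .
  ?st   ex:value ?cm ;
        ex:unit  ex:centimetre .
}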

There's a backwards-chaining kind of strategy where you,  effectively,  try
running the query multiple times with different strategies and then do data
clean-up post-query.  That's an interesting topic too,  one that again
demands something beyond ordinary RDF.  Since RDF and SPARQL are so well
specified, it is also possible to extend them to do new things,  such as
"tainting" facts with their origin and propagating it to the output.

I think people are also realizing that ISO Common Logic is a superset of RDF,
and it is really about time that support for arity > 2 predicates came
around.
Note that arity > 2 already exists in W3C specifications,  in that the
fundamental object in SPARQL is a "SPARQL result set", which is an
arbitrary-length tuple of nodes.  It is clear what should happen if you
write a pattern like

{
   ?s ?p ?o1 ?o2 ?o3 .
}

This also gives a more direct mapping from SQL to SPARQL,  one that would
be comfortable if there was some syntactic sugar to specify fields by names.

Yes,  you can fake it by writing triple patterns,  but in practice people
struggle to even get simple SQL-like queries to work right,  and can't do
the very simple things people did with production rules systems back in the
1970s.
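To be concrete about the "faking":  the single five-node pattern above turns
into a chain of ordinary triple patterns threaded through an intermediate
node,  e.g. (made-up vocabulary again):

PREFIX ex: <http://example.org/>

SELECT ?person ?value ?unit
WHERE {
  ?person ex:heightStatement ?st .
  ?st     ex:value ?value ;
          ex:unit  ?unit .
}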

OWL was designed on the basis of math,  not on the basis of "what are the
requirements for large scale data integration".  Thus it lacks very basic
facilities,  such as numeric conversions between,  say,  heights,  in
different units.





On Thu, Oct 30, 2014 at 9:08 AM, Kingsley Idehen 
wrote:

>  On 10/29/14 5:59 PM, Lydia Pintscher wrote:
>
> Hey Phillip :)
>
> On Wed, Oct 29, 2014 at 7:41 PM, Phillip Rhodes 
>  wrote:
>
> > FWIW, put me in the camp of "people who want to see wikidata available
> > via RDF" as well.  I won't argue that RDF needs to be the *native*
> > format for Wikidata, but I think it would be a crying shame for such a
> > large knowledgebase to be cut off from seamless integration with the
> > rest of the LinkedData world.
> >
> > That said, I don't really care if RDF/SPARQL support come later and
> > are treated as an "add on", but I do think Wikidata should at least
> > have that as a goal for "eventually".  And if I can help make that
> > happen, I'll try to pitch in however I can.   I have some experiments
> > I'm doing now, working on some new approaches to scaling RDF
> > triplestores, so using the Wikidata data may be an interesting testbed
> > for that down the road.
> >
> > And on a related note - and apologies if this has been discussed to
> > death, but I haven't been on the list since the beginning - but I am
> > curious if there is any formal collaboration
> > (in-place|proposed|possible) between dbpedia and wikidata?
>
>  Help with this would be awesome and totally welcome. The tracking bug
> is at https://bugzilla.wikimedia.org/show_bug.cgi?id=48143
>
>
> Lydia,
>
> Linked Open Data URIs for tracking issues such as the one above:
>
> [1]
> http://linkeddata.uriburner.com/about/id/entity/https/bugzilla.wikimedia.org/show_bug.cgi?id=48143
> [2]
> http://bit.ly/vapour-report-sample-wikidata-issue-tracking-entity-http-uri
> -- vapour report on the Linked Data URI above
> [3] http://linkeddata.uriburner.com/c/9BTVWIGG -- use of #this to make a
> Linked Open Data URI "on the fly" (no owl:sameAs reasoning and inference
> applied)
> [4] http://linkeddata.uriburner.com/c/8GUIAJ -- ditto, but with
> owl:sameAs reasoning and inference applied.
>
> Since this mailing list is online, I can also add some RDF statements into
> this post. Basically, this turns said post (or any other such conversation)
> into a live Linked Open Data creation and publication mechanism, by way of
> nanotation [1].
>
>
> ## Nanotation Start ##
>
>
> 
> 

Re: [Wikidata-l] Wikidata RDF

2014-10-30 Thread Magnus Manske
Hi,

I am running the Wikidata query tool (WDQ) at http://wdq.wmflabs.org/

WDQ can run many advanced queries, but I am using my bespoke query language.

I could try to write a wrapper around it, but have not had much (aka
"none") experience with SPARQL. Are there some common use case examples
(even fictional ones) I could look at, or does anyone want to collaborate
on a wrapper?
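To make it concrete, the kind of query I would want to translate is "all
humans with occupation politician", which in WDQ is CLAIM[31:5] AND
CLAIM[106:82955]. My naive guess at a SPARQL equivalent (untested, and the
direct-claim predicates are made up) would be something like:

PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?item
WHERE {
  ?item wdt:P31  wd:Q5 .       # instance of: human
  ?item wdt:P106 wd:Q82955 .   # occupation: politician
}

If that is roughly the shape of what people write, a wrapper looks doable.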

Cheers,
Magnus

On Thu, Oct 30, 2014 at 4:19 PM, Phillip Rhodes 
wrote:

> On Thu, Oct 30, 2014 at 4:42 AM, Markus Krötzsch
>  wrote:
> > Hi Phillip,
> >
> > Are you aware of the Wikidata RDF exports at
> > http://tools.wmflabs.org/wikidata-exports/rdf/ ? Do they meet your
> > requirements for now or do you need something different? If you have
> > specific plans for the RDF, I would be curious to learn about them.
>
> Only in passing, I'm only just starting to really dip my toes into the
> Wikidata waters now.  Offhand
> I'd say that having RDF dumps is great, depending on how frequently
> they are exported.  Of course
> I'd love to see live access to the current data via SPARQL in general,
> but my specific use-case
> can be driven off exports.
>
> Basically, I work on applying Semantic Web technology to enterprise /
> organizational
> knowledge management, using tools like Jena and Stanbol.   As part of
> that, we
> do content enhancement and automatic entity linking with Stanbol.
> Right now we mainly
> use dbpedia for that, but I'm trying to figure out how data from
> Wikidata will play into this
> as well.
>
>
> Phil
>
> ___
> Wikidata-l mailing list
> Wikidata-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Wikidata RDF

2014-10-30 Thread Markus Krötzsch

On 30.10.2014 11:49, Cristian Consonni wrote:

2014-10-29 22:59 GMT+01:00 Lydia Pintscher :

Help with this would be awesome and totally welcome. The tracking bug
is at https://bugzilla.wikimedia.org/show_bug.cgi?id=48143


Speaking of totally awesome (aehm :D):
* see: http://wikidataldf.com
* see this other thread:
https://lists.wikimedia.org/pipermail/wikidata-l/2014-October/004920.html

(If I can ask, having the RDF dumps in HDT format [again, see the
other thread] would be really helpful)


We are using OpenRDF. Can it do HDT? If yes, this would be easy to do. 
If no, it would be easier to use a standalone tool to transform our 
dumps. We could still do this. Do you have any recommendation what we 
could use there (i.e., a memory-efficient command-line conversion script 
for N3 -> HDT)?


Markus



___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Wikidata RDF

2014-10-30 Thread Phillip Rhodes
On Thu, Oct 30, 2014 at 4:42 AM, Markus Krötzsch
 wrote:
> Hi Phillip,
>
> Are you aware of the Wikidata RDF exports at
> http://tools.wmflabs.org/wikidata-exports/rdf/ ? Do they meet your
> requirements for now or do you need something different? If you have
> specific plans for the RDF, I would be curious to learn about them.

Only in passing, I'm only just starting to really dip my toes into the
Wikidata waters now.  Offhand
I'd say that having RDF dumps is great, depending on how frequently
they are exported.  Of course
I'd love to see live access to the current data via SPARQL in general,
but my specific use-case
can be driven off exports.

Basically, I work on applying Semantic Web technology to enterprise /
organizational
knowledge management, using tools like Jena and Stanbol.   As part of that, we
do content enhancement and automatic entity linking with Stanbol.
Right now we mainly
use dbpedia for that, but I'm trying to figure out how data from
Wikidata will play into this
as well.


Phil

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


[Wikidata-l] Missing Wikipedia links tool - thought

2014-10-30 Thread Amir E. Aharoni
ZOMFG, the tool that Denny introduced yesterday as a birthday gift is
unbelievably useful and fun.

Here are a few thoughts I had about it:

I went over all the pages for the Hebrew-English pair. There were only 36,
and that is suspiciously low. Were all the articles in these languages
tested by this tool or only a subset?

Even though almost all of the tool's suggestions were correct, it would be
problematic to fix these automatically. There were several types of article
pairs:
* Unrelated because one of the suggested pages was a disambiguation page
and the other was not. Sometimes there was a link to the correct related
page from the disambig page. If anybody makes a new version, this certainly
should be corrected.
* Related, but with explicit interlanguage links in the articles' source
code. This required old-style interwiki conflict resolution. There was a
surprisingly high number of these. I managed to resolve all the conflicts
manually, but it did take a few minutes for each case. Examples from
en.wikipedia: [[Bombe]], [[Bomba (cryptography)]], [[Diary of a Wimpy
Kid]], [[PFLAG]].
* Related, with a Wikidata item for each page, but without conflicts, so
easily mergeable. This can be done by a bot once it is identified for sure.

Adding links to a page without any language links shows a box to write a
language and a target title, and that's it. Adding a link to a new language
to a page which already has some interlanguage links opens the whole item
page in Wikidata (a whole other website!) and requires scrolling, editing
the links, and in many cases - merging the items manually. The result is
actually the same, so it would be very nice if the second case weren't
so complicated.

That's it for now - I hope somebody finds it useful :)

I finished with Hebrew, and I'm going on to Russian, which has over a
thousand article pairs. IT'S INSANELY FUN.

--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
‪“We're living in pieces,
I want to live in peace.” – T. Moore‬

2014-10-29 19:56 GMT+02:00 Denny Vrandečić :

> Folks,
>
> as you know, many Googlers are huge fans of Wikipedia. So here’s a little
> gift for Wikidata’s second birthday.
>
> Some of my smart colleagues at Google have run a few heuristics and
> algorithms in order to discover Wikipedia articles in different languages
> about the same topic which are missing language links between the articles.
> The results contain more than 35,000 missing links with a high confidence
> according to these algorithms. We estimate a precision of about 92+% (i.e.
> we assume that less than 8% of those are wrong, based on our evaluation).
> The dataset covers 60 Wikipedia language editions.
>
> Here are the missing links, available for download from the WMF labs
> servers:
>
> https://tools.wmflabs.org/yichengtry/merge_candidate.20141028.csv
>
> The data is published under CC-0.
>
> What can you do with the data? Since it is CC-0, you can do anything you
> want, obviously, but here are a few suggestions:
>
> There’s a small tool on WMF labs that you can use to verify the links (it
> displays the articles side by side from a language pair you select, and
> then you can confirm or contradict the merge):
>
> https://tools.wmflabs.org/yichengtry
>
> The tool does not do the change in Wikidata itself, though (we thought it
> would be too invasive if we did that). Instead, the results of the human
> evaluation are saved on WMF labs. You are welcome to take the tool and
> extend it with the possibility to upload the change directly on Wikidata,
> if you so wish, or, once the data is verified, to upload the results.
>
> Also, Magnus Manske is already busy uploading the data to the Wikidata
> game, so you can very soon also play the merge game on the data directly.
> He is also creating the missing items on Wikidata. Thanks Magnus for a very
> pleasant cooperation!
>
> I want to call out to my colleagues at Google who created the dataset -
> Jiang Bian and Si Li - and to Yicheng Huang, the intern who developed the
> tool on labs.
>
> I hope that this small data release can help a little with further
> improving the quality of Wikidata and Wikipedia! Thank you all, you are
> awesome!
>
> Cheers,
> Denny
>
>
>
> On Wed Oct 29 2014 at 10:52:05 AM Lydia Pintscher <
> lydia.pintsc...@wikimedia.de> wrote:
>
> Hey folks :)
>
> Today Wikidata is turning two. It amazes me what we've achieved in
> just 2 years. We've built an incredible project that is set out to
> change the world. Thank you everyone who has been a part of this so
> far.
> We've put together some notes and opinions. And there are presents as
> well! Check them out and leave your birthday wishes:
> https://www.wikidata.org/wiki/Wikidata:Second_Birthday
>
>
> Cheers
> Lydia
>
> --
> Lydia Pintscher - http://about.me/lydia.pintscher
> Product Manager for Wikidata
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesells

[Wikidata-l] Categories in Wikidata

2014-10-30 Thread Nicholas Humfrey
Hello,

[I think this has come up before, but I can't find a recent thread that directly
relates to this]


Are there any plans to directly relate entities in Wikidata with categories in
Wikidata – and avoid the duplication across all the different page languages?


For example this book:
http://www.wikidata.org/wiki/Q3235393
http://en.wikipedia.org/wiki/Half_of_a_Yellow_Sun

And the English Wikipedia page is a member of this category:
http://en.wikipedia.org/wiki/Category:War_novels

Which has this Wikidata ID associated with it:
http://www.wikidata.org/wiki/Q8170055


Currently we could resolve it using the following, slightly convoluted workflow:

  1.  Get the Wikidata ID and resolve it to 1 (or more) Wikipedia language pages
  2.  For each Wikipedia page, look up its categories and gather them together
  3.  For each category, look up the Wikidata ID for that category page
  4.  Remove duplicate Wikidata IDs
  5.  Look up each of the Wikidata IDs in Wikidata to get its title/description

Going from a Wikidata Category to a list of Wikidata entities could be done in 
a similar way.
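What I would really like is for those five steps to collapse into a single
query once such a direct link exists, something like this sketch (the
"category" property and the ex: prefix are made up, purely for illustration):

PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX ex:   <http://example.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?category ?label
WHERE {
  wd:Q3235393 ex:category ?category .   # Half of a Yellow Sun
  ?category rdfs:label ?label .
}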

Is there a better way of doing this now or in the near future?


nick.

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Wikidata turns two!

2014-10-30 Thread Allan J. Aguilar

On Wed, 29 Oct 2014 18:51:24 +0100,
Lydia Pintscher wrote:
> Hey folks :)
> 
> Today Wikidata is turning two. It amazes me what we've achieved in
> just 2 years. We've built an incredible project that is set out to
> change the world. Thank you everyone who has been a part of this so
> far.
> We've put together some notes and opinions. And there are presents as
> well! Check them out and leave your birthday wishes:
> https://www.wikidata.org/wiki/Wikidata:Second_Birthday
> 
> 
> Cheers
> Lydia
> 

Happy birthday! I wrote a little note about it on my blog:
https://editandowikimedia.wordpress.com/2014/10/29/wikidata-2-anos/

Cheers!
-- 
Allan J. Aguilar
ral...@vmail.me - ralgis@freenode - al...@jabber.ccc.de
PGP: B387 F3B1 0F2C F46B 36AD  FAFF 7BC3 594D F7C0 E1A3
OTR: E95CB6E6 22751983 CA8F3F67 3DFACBFF 0FA3A1BC
userralgis@Twitter - User:Ralgis@Wikimedia
https://editandowikimedia.wordpress.com
https://libredebian.wordpress.com
https://revistasifra.wordpress.com/
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Another birthday gift: SPARQL queries in the browser over Wikidata RDF dumps using Linked Data Fragments

2014-10-30 Thread Kingsley Idehen

On 10/30/14 6:32 AM, Cristian Consonni wrote:

Dear all,

I wanted to join in and give my birthday present to Wikidata (I  am a
little bit late, though!)
(also, honestly, I didn't recall it was Wikidata's birthday, but it is
a nice occasion :P)

Here it is:
http://wikidataldf.com

What is LDF?
LDF stands for Linked Data Fragments, a new system for querying
RDF datasets that sits midway between having a SPARQL endpoint
and downloading the whole thing.

More formally LDF is «a publishing method [for RDF datasets] that
allows efficient offloading of query execution from servers to clients
through a lightweight partitioning strategy. It enables servers to
maintain availability rates as high as any regular HTTP server,
allowing querying to scale reliably to much larger numbers of
clients»[1].

This system was devised by Ruben Verborgh, Miel Vander Sande and Pieter
Colpaert at Multimedia Lab (Ghent University) in Ghent, Belgium.
You can read more about it: http://linkeddatafragments.org/

What is Wikidata LDF?
Using the software by Verborgh et al. I have set up the website
http://wikidataldf.com, which contains:
* an interface to browse the RDF data and query it using the
Triple Pattern Fragments client
* a web client where you can compose and execute SPARQL queries

This is not, strictly speaking, a SPARQL endpoint (not all the SPARQL
standard is implemented and it is slower, but it should be more
reliable, if you are interested in details, please do read more at the
link above).

The data are, for the moment, limited to the sitelinks dump, but I am
working towards adding the other dumps. I have taken the Wikidata RDF
dumps as of Oct 13th, 2014[2].

To use them I had to convert them into HDT format[3a][3b], using the
hdt-cpp library[3c] (devel branch), which is taking quite a lot of resources
and computing time for the whole dumps; that's the reason why I
haven't published the rest yet ^_^.

DBpedia also has this[4]:
http://fragments.dbpedia.org/

All the software used is available under the MIT license in the LDF
repos on GitHub[5a], and the (two-page) website is also available
here[5b].

I would like to thank Ruben for his feedback and his presentation
about LDF at SpazioDati in Trento, Italy (here are the slides[6]).

All this said, happy birthday Wikidata.

Cristian

[1] http://linkeddatafragments.org/publications/ldow2014.pdf
[2] https://tools.wmflabs.org/wikidata-exports/rdf/exports/
[3a] http://www.rdfhdt.org/
[3b] http://www.w3.org/Submission/HDT-Implementation/
[3c] https://github.com/rdfhdt/hdt-cpp
[4] http://sourceforge.net/p/dbpedia/mailman/message/32982329/
[5a] see the Browser.js, Server.js and Client.js repos in
https://github.com/LinkedDataFragments
[5b] https://github.com/CristianCantoro/wikidataldf
[6] http://www.slideshare.net/RubenVerborgh/querying-datasets-on-the-web-with-high-availability


Yep! And for publishing some of the information above into the Linked 
Open Data Cloud, from this thread, via nanotation:



a schema:WebPage;
rdfs:label "Wikidata LDF";
skos:altLabel "Wikidata Linked Data Fragments" ;
dcterms:hasPart , 
;
xhv:related , 
 ;

rdfs:comment """
I wanted to join in and give my birthday present to 
Wikidata (I  am a

little bit late, though!)
(also, honestly, I didn't recall it was Wikidata's 
birthday, but it is

a nice occasion :P)
""" ;

rdfs:comment """
What is Wikidata LDF?
Using the software by Verborgh et al. I have setup 
the website

http://wikidataldf.com that contains:
* an interface to navigate in the RDF data and 
query them using the

Triple Pattern Fragments client
* a web client where you can compose and execute 
SPARQL queries


This is not, strictly speaking, a SPARQL endpoint 
(not all the SPARQL
standard is implemented and it is slower, but it 
should be more
reliable, if you are interested in details, please 
do read more at the

link above).

The data are, for the moment, limited to the 
sitelinks dump but I am
working towards adding the other dump. I have taken 
the Wikidata RDF

dumps as of Oct, 13th 2014[2].

To use them I had to convert them in HDT 
format[3a][3b], using the
hdt-cpp library[3c] (devel) (which is taking quite 
a lot of resources
and computing time for the whole dumps, that's the 
reason why I

haven't published the rest yet ^_^).
""" ;
dcterms:references 
,



Re: [Wikidata-l] Wikidata RDF

2014-10-30 Thread Kingsley Idehen

On 10/29/14 5:59 PM, Lydia Pintscher wrote:

Hey Phillip :)

On Wed, Oct 29, 2014 at 7:41 PM, Phillip Rhodes
  wrote:

> FWIW, put me in the camp of "people who want to see wikidata available
> via RDF" as well.  I won't argue that RDF needs to be the *native*
> format for Wikidata, but I think it would be a crying shame for such a
> large knowledgebase to be cut off from seamless integration with the
> rest of the LinkedData world.
>
> That said, I don't really care if RDF/SPARQL support come later and
> are treated as an "add on", but I do think Wikidata should at least
> have that as a goal for "eventually".  And if I can help make that
> happen, I'll try to pitch in however I can.   I have some experiments
> I'm doing now, working on some new approaches to scaling RDF
> triplestores, so using the Wikidata data may be an interesting testbed
> for that down the road.
>
> And on a related note - and apologies if this has been discussed to
> death, but I haven't been on the list since the beginning - but I am
> curious if there is any formal collaboration
> (in-place|proposed|possible) between dbpedia and wikidata?

Help with this would be awesome and totally welcome. The tracking bug
is at https://bugzilla.wikimedia.org/show_bug.cgi?id=48143


Lydia,

Linked Open Data URIs for tracking issues such as the one above:

[1] 
http://linkeddata.uriburner.com/about/id/entity/https/bugzilla.wikimedia.org/show_bug.cgi?id=48143
[2] 
http://bit.ly/vapour-report-sample-wikidata-issue-tracking-entity-http-uri 
-- vapour report on the Linked Data URI above
[3] http://linkeddata.uriburner.com/c/9BTVWIGG -- use of #this to make a 
Linked Open Data URI "on the fly" (no owl:sameAs reasoning and inference 
applied)
[4] http://linkeddata.uriburner.com/c/8GUIAJ -- ditto, but with 
owl:sameAs reasoning and inference applied.


Since this mailing list is online, I can also add some RDF statements 
into this post. Basically, this turns said post (or any other such 
conversation) into a live Linked Open Data creation and publication 
mechanism, by way of nanotation [1].



## Nanotation Start ##


xhv:related  ;
is foaf:primaryTopic of , 
 
.


## Nanotation End ##

Links:

[1] http://kidehen.blogspot.com/2014/07/nanotation.html -- Nanotation
[2] 
http://linkeddata.uriburner.com/about/html/{url-of-this-reply-once-its-live} 
-- URL pattern that will show the effects (reified statements/claims
amongst other things) of the nanotations above.


--
Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this



___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Super Lachaise, a mobile app based on Wikidata

2014-10-30 Thread Pierre-Yves Beaudouin
>
> Dear Pierre-Yves,
>
> On 10/28/2014 05:41 PM, Pierre-Yves Beaudouin wrote:
>
> > I don't know because I'm not the developer of the app and my knowledge
> > is limited in this area. For many years now, I have been collecting data
> > (information, photos, coordinates) about the cemetery. I've published
> > everything on Commons, Wikidata and OSM, so developers can do something
> > smart with that ;)
>
> How do you get the geocoordinates for the individual graves? Looking at
> http://www.superlachaise.fr/ I see Guillaume Apollinaire. His Wikidata item
> https://www.wikidata.org/wiki/Q133855 has no geodata. The cemetery link
> and Findagrave don't seem to have geodata either.
>
>
> - Finn Årup Nielsen
>
>
Hi,

The geocoordinates are not on Wikidata. They are on OpenStreetMap and most
of them are also on Wikimedia Commons.

I didn't put the geocoordinates on Wikidata because I'm still thinking about how
to do it: put the geocoordinates on the biography (with a qualifier) or on
the monument. And I didn't start the mass item creation of monuments
because I'm waiting for the quantity datatype with units. That's why a lot of
my data are still on Commons:

https://commons.wikimedia.org/wiki/Category:Grave_of_Guillaume_Apollinaire

Pyb
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Birthday gift: Missing Wikipedia links (was Re: Wikidata turns two!)

2014-10-30 Thread Adrian Lang
Hi Denny,

great tool! I couldn't find the source code, though. Can you point me
to the repository it's hosted at?

Regards,
Adrian


On Wed, Oct 29, 2014 at 6:56 PM, Denny Vrandečić  wrote:
> Folks,
>
> as you know, many Googlers are huge fans of Wikipedia. So here’s a little
> gift for Wikidata’s second birthday.
>
> Some of my smart colleagues at Google have run a few heuristics and
> algorithms in order to discover Wikipedia articles in different languages
> about the same topic which are missing language links between the articles.
> The results contain more than 35,000 missing links with a high confidence
> according to these algorithms. We estimate a precision of about 92+% (i.e.
> we assume that less than 8% of those are wrong, based on our evaluation).
> The dataset covers 60 Wikipedia language editions.
>
> Here are the missing links, available for download from the WMF labs
> servers:
>
> https://tools.wmflabs.org/yichengtry/merge_candidate.20141028.csv
>
> The data is published under CC-0.
>
> What can you do with the data? Since it is CC-0, you can do anything you
> want, obviously, but here are a few suggestions:
>
> There’s a small tool on WMF labs that you can use to verify the links (it
> displays the articles side by side from a language pair you select, and then
> you can confirm or contradict the merge):
>
> https://tools.wmflabs.org/yichengtry
>
> The tool does not do the change in Wikidata itself, though (we thought it
> would be too invasive if we did that). Instead, the results of the human
> evaluation are saved on WMF labs. You are welcome to take the tool and
> extend it with the possibility to upload the change directly on Wikidata, if
> you so wish, or, once the data is verified, to upload the results.
>
> Also, Magnus Manske is already busy uploading the data to the Wikidata game,
> so you can very soon also play the merge game on the data directly. He is
> also creating the missing items on Wikidata. Thanks Magnus for a very
> pleasant cooperation!
>
> I want to call out to my colleagues at Google who created the dataset -
> Jiang Bian and Si Li - and to Yicheng Huang, the intern who developed the
> tool on labs.
>
> I hope that this small data release can help a little with further improving
> the quality of Wikidata and Wikipedia! Thank you all, you are awesome!
>
> Cheers,
> Denny
>
>
>
> On Wed Oct 29 2014 at 10:52:05 AM Lydia Pintscher
>  wrote:
>
> Hey folks :)
>
> Today Wikidata is turning two. It amazes me what we've achieved in
> just 2 years. We've built an incredible project that is set out to
> change the world. Thank you everyone who has been a part of this so
> far.
> We've put together some notes and opinions. And there are presents as
> well! Check them out and leave your birthday wishes:
> https://www.wikidata.org/wiki/Wikidata:Second_Birthday
>
>
> Cheers
> Lydia
>
> --
> Lydia Pintscher - http://about.me/lydia.pintscher
> Product Manager for Wikidata
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>
> Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
> Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
>
> ___
> Wikidata-l mailing list
> Wikidata-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
> ___
> Wikidata-l mailing list
> Wikidata-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-l
>

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Another birthday gift: SPARQL queries in the browser over Wikidata RDF dumps using Linked Data Fragments

2014-10-30 Thread Ruben Verborgh
Dear Cristian,

> http://wikidataldf.com

Thanks so much for this, we have featured your dataset on our homepage:
http://linkeddatafragments.org/data/

> What is LDF?

Please allow me to make a technical correction here:
Linked Data Fragments are a uniform view
on _all_ possible interfaces to Linked Data,
not just the light-weight interface that Cristian set up:
- SPARQL endpoints offer Linked Data Fragments;
  you can select a part of a dataset corresponding to a specific SPARQL query
- a server of Linked Data documents offers Linked Data Fragments;
  you can select a part of a dataset corresponding to a specific subject
- …
What all those interfaces have in common,
is that they offer some part, some fragment, of a dataset;
hence the name “Linked Data Fragments”.

In addition, we have introduced a new kind of interface:
Triple Pattern Fragments, which offer access to parts of a dataset by triple 
pattern.
This is indeed a very lightweight system for the server,
as you can host live data without much processing resources.
SPARQL queries can be executed on the client side,
as Cristian's client instance shows: http://client.wikidataldf.com/
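For example, a query like the one below is evaluated entirely on the client:
roughly speaking, the client requests the fragments matching each triple
pattern and joins them locally (the sitelinks vocabulary here is my
assumption; check the dataset for the exact predicates):

PREFIX schema: <http://schema.org/>

SELECT ?article ?entity
WHERE {
  ?article schema:about ?entity .       # fragment: ?s schema:about ?o
  ?article schema:inLanguage "nl" .     # fragment: ?s schema:inLanguage "nl"
}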

> «a publishing method [for RDF datasets] that
> allows efficient offloading of query execution from servers to clients
> through a lightweight partitioning strategy. It enables servers to
> maintain availability rates as high as any regular HTTP server,
> allowing querying to scale reliably to much larger numbers of
> clients»[1].

So the above definition is about triple pattern fragments,
not Linked Data Fragments interfaces (which include SPARQL endpoints) in 
general.



Thanks, Cristian, for setting this up!
I hope the Wikidata community finds good use for it.
Finally, live data from Wikidata can be queried with SPARQL :-)

Best,

Ruben
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Wikidata RDF

2014-10-30 Thread Cristian Consonni
2014-10-29 22:59 GMT+01:00 Lydia Pintscher :
> Help with this would be awesome and totally welcome. The tracking bug
> is at https://bugzilla.wikimedia.org/show_bug.cgi?id=48143

Speaking of totally awesome (aehm :D):
* see: http://wikidataldf.com
* see this other thread:
https://lists.wikimedia.org/pipermail/wikidata-l/2014-October/004920.html

(If I can ask, having the RDF dumps in HDT format [again, see the
other thread] would be really helpful)

C

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


[Wikidata-l] Another birthday gift: SPARQL queries in the browser over Wikidata RDF dumps using Linked Data Fragments

2014-10-30 Thread Cristian Consonni
Dear all,

I wanted to join in and give my birthday present to Wikidata (I  am a
little bit late, though!)
(also, honestly, I didn't recall it was Wikidata's birthday, but it is
a nice occasion :P)

Here it is:
http://wikidataldf.com

What is LDF?
LDF stands for Linked Data Fragments, a new system for querying
RDF datasets that sits midway between having a SPARQL endpoint
and downloading the whole thing.

More formally LDF is «a publishing method [for RDF datasets] that
allows efficient offloading of query execution from servers to clients
through a lightweight partitioning strategy. It enables servers to
maintain availability rates as high as any regular HTTP server,
allowing querying to scale reliably to much larger numbers of
clients»[1].

This system was devised by Ruben Verborgh, Miel Vander Sande and Pieter
Colpaert at Multimedia Lab (Ghent University) in Ghent, Belgium.
You can read more about it: http://linkeddatafragments.org/

What is Wikidata LDF?
Using the software by Verborgh et al. I have set up the website
http://wikidataldf.com, which contains:
* an interface to browse the RDF data and query it using the
Triple Pattern Fragments client
* a web client where you can compose and execute SPARQL queries

This is not, strictly speaking, a SPARQL endpoint (not all the SPARQL
standard is implemented and it is slower, but it should be more
reliable, if you are interested in details, please do read more at the
link above).

The data are, for the moment, limited to the sitelinks dump, but I am
working towards adding the other dumps. I have taken the Wikidata RDF
dumps as of Oct 13th, 2014[2].

To use them I had to convert them into HDT format[3a][3b], using the
hdt-cpp library[3c] (devel branch), which is taking quite a lot of resources
and computing time for the whole dumps; that's the reason why I
haven't published the rest yet ^_^.

DBpedia also has this[4]:
http://fragments.dbpedia.org/

All the software used is available under the MIT license in the LDF
repos on GitHub[5a], and the (two-page) website is also available
here[5b].

I would like to thank Ruben for his feedback and his presentation
about LDF at SpazioDati in Trento, Italy (here are the slides[6]).

All this said, happy birthday Wikidata.

Cristian

[1] http://linkeddatafragments.org/publications/ldow2014.pdf
[2] https://tools.wmflabs.org/wikidata-exports/rdf/exports/
[3a] http://www.rdfhdt.org/
[3b] http://www.w3.org/Submission/HDT-Implementation/
[3c] https://github.com/rdfhdt/hdt-cpp
[4] http://sourceforge.net/p/dbpedia/mailman/message/32982329/
[5a] see the Browser.js, Server.js and Client.js repos in
https://github.com/LinkedDataFragments
[5b] https://github.com/CristianCantoro/wikidataldf
[6] 
http://www.slideshare.net/RubenVerborgh/querying-datasets-on-the-web-with-high-availability

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Wikidata RDF

2014-10-30 Thread Markus Krötzsch

Hi Phillip,

Are you aware of the Wikidata RDF exports at 
http://tools.wmflabs.org/wikidata-exports/rdf/ ? Do they meet your 
requirements for now or do you need something different? If you have 
specific plans for the RDF, I would be curious to learn about them.


Cheers,

Markus

On 29.10.2014 19:41, Phillip Rhodes wrote:

FWIW, put me in the camp of "people who want to see wikidata available
via RDF" as well.  I won't argue that RDF needs to be the *native*
format for Wikidata, but I think it would be a crying shame for such a
large knowledgebase to be cut off from seamless integration with the
rest of the LinkedData world.

That said, I don't really care if RDF/SPARQL support come later and
are treated as an "add on", but I do think Wikidata should at least
have that as a goal for "eventually".  And if I can help make that
happen, I'll try to pitch in however I can.   I have some experiments
I'm doing now, working on some new approaches to scaling RDF
triplestores, so using the Wikidata data may be an interesting testbed
for that down the road.

And on a related note - and apologies if this has been discussed to
death, but I haven't been on the list since the beginning - but I am
curious if there is any formal collaboration
(in-place|proposed|possible) between dbpedia and wikidata?


Phil

This message optimized for indexing by NSA PRISM


On Wed, Oct 29, 2014 at 2:34 PM, Markus Krötzsch
 wrote:

Martynas,

Denny is right. You could set up a Virtuoso endpoint based on our RDF
exports. This would be quite nice to have. That's one important reason why
we created the exports, and I really hope we will soon see this happening.
We are dealing here with a very large project, and the decision for or
against a technology is not just a matter of our personal preference. If RDF
can demonstrate added value, then there will surely be resources to further
extend the support for it. So far, we are in the lead: we provide close to
one billion (!) triples Wikidata knowledge to the world. So far, there is no
known use of this data. We need to go step by step: some support from us,
some practical usage from the RDF community, some more support from us, ...

In reply to your initial email, Martynas, I have to say that you seem to
have very little knowledge about what is going on in Wikidata. If you would
follow the development reports more closely, you would know that most of the
work is going into components that RDF does not replace at all. Querying
with SPARQL is nice, but we are still more focussed on UI issues, history
management, infrastructure integration (such as pushing changes to other
sites), and many more things which are completely unrelated to RDF in every
way. Your suggestion that a single file format would somehow magically make
the construction of one of the world-largest community-edited knowledge
bases a piece of cake is just naive.

Now don't get me wrong: naive thinking has its place in Wikidata -- it's
always naive to try what others consider impossible -- but it should be
combined with some positive, forward thinking attitude. I hope that our
challenge to show the power of RDF to us can unleash some positive energies
in you :-) I am looking forward to your results (and happy to help if you
need some more details about the RDF dumps etc.).

Best wishes,

Markus


On 29.10.2014 18:26, Denny Vrandečić wrote:


Martynas,

since we had this discussion on this list previously, and again I am
irked by your claim that we could just use standard RDF tools out of the
box for Wikidata.

I will shut up and concede that you are right if you manage to set up a
standard open source RDF tool on an open source stack that contains the
Wikidata knowledge base, is keeping up to date with the rate of changes
that we have, and is able to answer queries from the public without
choking and dying for 24 hours, before this year is over. Announce a few
days in advance on this list when you will make the experiment.

Technology has advanced by three years since we made the decision not to
use standard RDF tools, so I am sure it should be much easier today. But
last time I talked with people writing such tools, they were rather
cautious due to our requirements.

We still wouldn't have proven that it could deal with the expected QPS
Wikidata will have, but heck, I would be surprised and I would admit
that I was wrong with my decision if you can do that.

Seriously, we did not snub RDF and SPARQL because we don't like it or
don't know it. We decided against it *because* we know it so well and we
realized it does not fulfill our requirements.

Cheers,
Denny

On Mon Oct 27 2014 at 6:47:05 PM Martynas Jusevičius
mailto:marty...@graphity.org>> wrote:

 Hey all,

 so I see there is some work being done on mapping Wikidata data model
 to RDF [1].

 Just a thought: what if you actually used RDF and Wikidata's concepts
 modeled in it right from the start? And used standard RDF tools, APIs,
 query language (SPA