[Wikidata-l] DBpedia-based RDF dumps for Wikidata

2015-05-15 Thread Dimitris Kontokostas
Dear all,

Following up on the early prototype we announced previously [1], we are happy
to announce a consolidated Wikidata RDF dump based on DBpedia.
(Disclaimer: this work is not related or affiliated with the official
Wikidata RDF dumps)

We provide:
 * sample data for preview http://wikidata.dbpedia.org/downloads/sample/
 * a complete dump with over 1 Billion triples:
http://wikidata.dbpedia.org/downloads/20150330/
 * a  SPARQL endpoint: http://wikidata.dbpedia.org/sparql
 * a Linked Data interface: http://wikidata.dbpedia.org/resource/Q586
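
As a minimal sketch of how the SPARQL endpoint listed above could be queried, the following Python snippet builds a standard SPARQL protocol GET URL that asks for the types of one item. The query text, the helper name build_query_url and the choice of Q586 (borrowed from the Linked Data example) are illustrative assumptions, and the endpoint may no longer be online, so the snippet only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

# Endpoint URL as given in the announcement.
ENDPOINT = "http://wikidata.dbpedia.org/sparql"


def build_query_url(item_id: str, limit: int = 10) -> str:
    """Build a SPARQL-protocol GET URL listing the rdf:type values of an item.

    The function name and the query shape are hypothetical, for illustration.
    """
    query = (
        "SELECT ?type WHERE { "
        f"<http://wikidata.dbpedia.org/resource/{item_id}> a ?type "
        f"}} LIMIT {limit}"
    )
    # The SPARQL protocol passes the query text in the "query" parameter.
    return ENDPOINT + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(build_query_url("Q586"))
```

The resulting URL can be fetched with any HTTP client; requesting a result format through the Accept header is left out here because the formats supported by this endpoint are not stated in the announcement.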

Using the Wikidata dump from March we were able to retrieve more than 1B
triples, 8.5M typed things according to the DBpedia ontology along with 48M
transitive types, 6.4M coordinates and 1.5M depictions. A complete report
for this effort can be found here:
http://svn.aksw.org/papers/2015/ISWC_Wikidata2DBpedia/public.pdf

The extraction code is now fully integrated in the DBpedia Information
Extraction Framework.

We are eagerly waiting for your feedback and your help in improving the
DBpedia to Wikidata mapping coverage:
http://mappings.dbpedia.org/server/ontology/wikidata/missing/

Best,

Ali Ismayilov, Dimitris Kontokostas, Sören Auer, Jens Lehmann, Sebastian
Hellmann

[1]
http://www.mail-archive.com/dbpedia-discussion%40lists.sourceforge.net/msg06936.html

-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig & DBpedia Association
Projects: http://dbpedia.org, http://aligned-project.eu
Homepage: http://aksw.org/DimitrisKontokostas
Research Group: http://aksw.org
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] DBpedia-based RDF dumps for Wikidata

2015-05-15 Thread Giovanni Tummarello
Hi Dimitris and everyone in the team, congratulations, great job. It will
certainly be useful.

Gio



[Wikidata-l] DBpedia-based RDF dumps for Wikidata

2015-03-11 Thread Dimitris Kontokostas
Dear all,

TL;DR: We are working on an *experimental* Wikidata RDF export based on
DBpedia and would like some feedback on our future directions.

Disclaimer: this work is not related or affiliated with the official
Wikidata RDF dumps.

Our current approach is to treat Wikidata like any other Wikipedia edition
and apply our extractors to each Wikidata page (item). This approach
generates triples in the DBpedia domain
(http://wikidata.dbpedia.org/resource/). Although this results in
duplication, since Wikidata already provides RDF, we made some different
design choices and map Wikidata data directly into the DBpedia ontology.

sample data: http://nl.dbpedia.org/downloads/wikidatawiki/sample/

experimental dump: http://nl.dbpedia.org/downloads/wikidatawiki/20150207/
(for known errors, see the notes below)

*Wikidata mapping details*

In the same way we use mappings.dbpedia.org to define mappings from
Wikipedia templates to the DBpedia ontology, we define transformation
mappings from Wikidata properties to RDF triples in the DBpedia ontology.

At the moment we provide two types of Wikidata property mappings:

a) through the mappings wiki, in the form of equivalent classes or
properties, e.g.

Property: http://mappings.dbpedia.org/index.php/OntologyProperty:BirthDate

Class: http://mappings.dbpedia.org/index.php/OntologyClass:Person

which will result in the following triples:

wd:Qx a dbo:Person

wd:Qx dbo:birthDate “”

b) transformation mappings that are (for now) defined in a JSON file [1].
At the moment we provide the following mapping options:


   - Predefined values
      - "P625": {"rdf:type":"http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing"}
        will result in: wd:Qx a geo:SpatialThing
   - Value formatting with a string containing $1
      - "P214": {"owl:sameAs": "http://viaf.org/viaf/$1"}
        will result in: wd:Qx owl:sameAs http://viaf.org/viaf/{wikidataValue} .
   - Value formatting with predefined functions. The following are supported
     for now:
      - $getDBpediaClass: returns the equivalent DBpedia class for a Wikidata
        item (using the mappings wiki)
      - $getLatitude, $getLongitude & $getGeoRss: geo-related functions

Also note that we can define multiple mappings per property to get the
Wikidata data closer to the DBpedia RDF exports e.g.:

"P625": [
    {"rdf:type":"http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing"},
    {"geo:lat":"$getLatitude"},
    {"geo:long": "$getLongitude"},
    {"georss:point":"$getGeoRss"}],
"P18": [
    {"thumbnail":"http://commons.wikimedia.org/wiki/Special:FilePath/$1?width=300"},
    {"foaf:depiction":"http://commons.wikimedia.org/wiki/Special:FilePath/$1"}],
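
To make the mapping options above more concrete, here is a rough Python sketch of how such a configuration might be turned into triples. It is not the code of the extraction framework: the function apply_mappings, the FUNCTIONS table and the assumption that a P625 value arrives as a (latitude, longitude) tuple are all hypothetical, and $getDBpediaClass is left out because it needs the mappings wiki.

```python
# Hypothetical stand-ins for the predefined functions named in the email.
# Each one receives the Wikidata value; for coordinates we assume a
# (latitude, longitude) tuple, which is an assumption of this sketch.
FUNCTIONS = {
    "$getLatitude": lambda v: str(v[0]),
    "$getLongitude": lambda v: str(v[1]),
    "$getGeoRss": lambda v: f"{v[0]} {v[1]}",
}


def apply_mappings(subject, mappings, value):
    """Return (subject, predicate, object) triples for one Wikidata value."""
    if isinstance(mappings, dict):
        # A property may carry a single mapping or a list of mappings.
        mappings = [mappings]
    triples = []
    for mapping in mappings:
        for predicate, template in mapping.items():
            if template in FUNCTIONS:
                # Value formatting with a predefined function.
                obj = FUNCTIONS[template](value)
            elif "$1" in template:
                # Value formatting with a string containing $1.
                obj = template.replace("$1", str(value))
            else:
                # Predefined value: the Wikidata value itself is ignored.
                obj = template
            triples.append((subject, predicate, obj))
    return triples


if __name__ == "__main__":
    viaf = {"owl:sameAs": "http://viaf.org/viaf/$1"}
    # "12345" is a made-up identifier used only for illustration.
    for triple in apply_mappings("wd:Qx", viaf, "12345"):
        print(triple)
```

Checking for a function name before checking for $1 keeps the three options distinct; a real implementation would also have to decide how to handle unknown function names and non-string values.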

*Qualifiers & reification*

Like Wikidata, we provide a simplified dump without qualifiers and a reified
dump with qualifiers. For the reification, however, we chose simple RDF
reification in order to reuse the DBpedia ontology as much as possible. The
reified dumps are also mapped using the same configuration.
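
For readers unfamiliar with simple RDF reification, the sketch below shows the general pattern using the standard rdf:Statement vocabulary. How the actual dumps name statement nodes or attach qualifiers is not described in this message, so the function reify, the statement identifier and the qualifier predicate in the usage example are assumptions for illustration only.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def reify(statement_id, subject, predicate, obj, qualifiers=None):
    """Describe one statement with the standard RDF reification vocabulary.

    Qualifiers are attached to the statement node as plain predicate/value
    pairs; this layout is an assumption of the sketch.
    """
    triples = [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", subject),
        (statement_id, RDF + "predicate", predicate),
        (statement_id, RDF + "object", obj),
    ]
    for q_predicate, q_value in (qualifiers or {}).items():
        triples.append((statement_id, q_predicate, q_value))
    return triples


if __name__ == "__main__":
    # All identifiers below are made up for illustration.
    for triple in reify("wd:Qx_S1", "wd:Qx", "dbo:spouse", "wd:Qy",
                        {"dbo:startDate": "2000-01-01"}):
        print(triple)
```

Because the qualifier predicates live on the statement node, they can come from the same ontology as the main statements, which fits the stated goal of reusing the DBpedia ontology.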

*Labels, descriptions, aliases and interwiki links*

We additionally defined extractors to get data other than statements. For
textual data we split the dumps into the languages that are enabled in the
mappings wiki and all the rest. We extract aliases, labels, descriptions and
site links. For interwiki links we provide links between Wikidata and
DBpedia as well as links between different DBpedia language editions.

*Properties*

We also fully extract Wikidata property pages. However, for now we don’t
apply any mappings to Wikidata properties.

*DBpedia extractors*

Some existing DBpedia extractors that provide versioning and provenance
(e.g. pageID, revisionID, etc.) also apply to Wikidata.

*Help & Feedback*

Although this is a work in progress, we wanted to announce it early and get
your feedback on the following:

   - Are we going in the right direction?
   - Did we overlook something or is something missing?
   - Are there any other mapping options we should include?
   - Where should we host the advanced JSON mappings?
      - One option is in the mappings wiki, another one is in Wikidata
        directly or a separate GitHub project


It would be great if you could help us map more data. The easiest way is
through the mappings wiki where you can define equivalent classes &
properties. See what is missing here:
http://mappings.dbpedia.org/server/ontology/wikidata/missing/

You can also provide JSON configuration, but until the code is merged it
will not be easy to do so through PRs.

Until the code is merged in the main DBpedia repo you can check it out from
here:

https://github.com/alismayilov/extraction-framework/tree/wikidataAllCommits

Notes:

   - We use Wikidata-Toolkit, which is a great project btw, for reading the
     JSON structure.
   - The full dump we provide is not complete due to a Wikidata dump export
     bug. For the same reason, the compressed files are not closed correctly.


Best,

Ali Ismayilov, Dimitris K