Hi everybody,

Oops, I didn't realize this e-mail had been moderated yesterday. It
should get through okay now!

cheers,
Gaurav

---------- Forwarded message ----------
From: Gaurav Vaidya <gau...@ggvaidya.com>
Date: 29 July 2014 20:32
Subject: [ANN] Experimental Wikimedia Commons RDF extraction with DBpedia
To: dbpedia-discussion@lists.sourceforge.net,
dbpedia-develop...@lists.sourceforge.net,
dbpedia-du...@lists.sourceforge.net, wikidat...@lists.wikimedia.org,
public-...@w3.org, Wikimedia Commons Discussion List
<common...@lists.wikimedia.org>

Hi everybody,

We are happy to announce an experimental RDF dump of the Wikimedia
Commons. A complete first draft is now available online at
http://nl.dbpedia.org/downloads/commonswiki/20140705/, and will
eventually be accessible from http://commons.dbpedia.org. A small sample
dataset, which may be easier to browse, is available on Github at
https://github.com/gaurav/commons-extraction/tree/master/commonswiki/20140101

The following datasets showcase some of the improvements that we’ve
been working on over the last two months:
 - File information (*-file-information.*) is a completely new dataset
that contains information on the files in the Commons, including file
and thumbnail URLs, file extensions, file type classes and MIME types.
 - DBpedia’s Mappings Extractor (*-mappingbased-properties.*) uses
templates stored on the Mapping server (http://mappings.dbpedia.org/)
to create RDF for information-rich templates. This system still has
some important limitations, such as not being able to process
embedded templates (e.g. license templates inside {{Information}}),
but top-level templates are completely configurable. The existing
mappings are available at
http://mappings.dbpedia.org/index.php/Mapping_commons
 - These mappings include 363 license templates that indicate licensing for
Commons files under public domain, Creative Commons and other open
access licenses. These were created by bots and still require
verification before use. They are listed at
http://mappings.dbpedia.org/index.php/Category:Commons_media_license
 - The DBpedia Geoextractor (*-geo-coordinates.*) now extracts
geographical coordinates from Commons files using the {{Location}}
template.
 - The DBpedia SKOS Extractor (*-skos-categories.*) now identifies
relationships between Commons categories, building a SKOS-based
description of the entire Commons category tree.
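
For anyone who wants a quick look at what a given dump file contains, here is a minimal Python sketch that counts how often each predicate occurs. It assumes the files are bzip2-compressed N-Triples, which this announcement does not confirm, so please check the actual file extensions in the download directory. The function names, the sample lines and the example.org URIs are all hypothetical placeholders, not real output of the extractors. The regular expression only handles simple lines with a URI subject and predicate; a proper RDF library would be the safer choice for real work.

```python
import bz2
import re
from collections import Counter

# Matches simple N-Triples lines of the form: <subject> <predicate> object .
# Assumption: subject and predicate are URIs (blank-node subjects are skipped).
TRIPLE_RE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(.+?)\s*\.\s*$')


def parse_triples(lines):
    """Yield (subject, predicate, object) tuples from N-Triples lines."""
    for line in lines:
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith('#'):
            continue
        match = TRIPLE_RE.match(line)
        if match:
            yield match.group(1), match.group(2), match.group(3)


def count_predicates(lines):
    """Count how many triples use each predicate URI."""
    return Counter(predicate for _, predicate, _ in parse_triples(lines))


def count_predicates_in_dump(path):
    """Count predicates in a bzip2-compressed N-Triples file (assumed format)."""
    with bz2.open(path, 'rt', encoding='utf-8') as handle:
        return count_predicates(handle)


# Hypothetical sample data; the URIs are placeholders, not real dump content.
SAMPLE_LINES = [
    '# a comment line',
    '<http://example.org/File:A.jpg> <http://example.org/fileExtension> "jpg" .',
    '<http://example.org/File:B.png> <http://example.org/fileExtension> "png" .',
    '<http://example.org/File:A.jpg> <http://example.org/mimeType> "image/jpeg" .',
    '',
]
```

Calling count_predicates(SAMPLE_LINES) gives a per-predicate tally for the sample above; count_predicates_in_dump() does the same for a downloaded file, given its local path.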

Please have a look and let us know what you think. We’ll be working on
a number of open tasks over the next three weeks, listed at
https://github.com/gaurav/extraction-framework/issues?state=open -- if
you see something wrong with what we’ve done above, or have an issue
you’d particularly like us to tackle, please report it there or drop
me an e-mail!

This work is sponsored by the Google Summer of Code program
(https://www.google-melange.com/gsoc/project/details/google/gsoc2014/gaurav/5676830073815040).

Thanks!

cheers,
The DBpedia Commons extraction team:
Gaurav Vaidya
Dimitris Kontokostas
Andrea Di Menna
Jimmy O’Regan
