On 5 April 2013 21:27, Andrea Di Menna <ninn...@gmail.com> wrote:
> Hi Dimitris,
>
> I am not completely following your point.
>
> How would you handle the following example? (supposing this will be
> possible with Wikipedia/Wikidata)
>
> Suppose you have
>
> {{Infobox:Test
> | name = {{#property:p45}}
> }}
>
> and a mapping
>
> {{PropertyMapping | templateProperty = name | ontologyProperty = foaf:name}}
>
> what would happen when running the MappingExtractor?
> Which RDF triples would be generated?

I think there are two questions here, and two very different approaches.

1. In the near term, I would expect that Wikipedia template calls are
modified as in your example.

How could/should DBpedia deal with this? The simplest solution seems
to be that during a preliminary step, we extract data from Wikidata
and store it. During the main extraction, whenever we find a reference
to Wikidata, we look it up and generate a triple as usual. Not a huge
change.
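
To make this concrete, here is a rough sketch in Scala (all names and ids
are invented for illustration - this is not actual DBpedia framework code):

object WikidataLookup {

  // Preliminary step: values extracted from a Wikidata dump, keyed by
  // (item id, property id) - e.g. the name property from Andrea's example.
  val wikidataValues: Map[(String, String), String] =
    Map(("Q183", "p45") -> "Germany")

  // Matches a Wikidata reference such as {{#property:p45}}.
  private val PropertyCall = """\{\{#property:(p\d+)\}\}""".r

  // Main extraction: resolve the raw template argument before the
  // PropertyMapping turns it into a triple.
  def resolve(itemId: String, rawValue: String): Option[String] =
    rawValue.trim match {
      case PropertyCall(propId) => wikidataValues.get((itemId, propId))
      case plain                => Some(plain) // ordinary literal, as today
    }
}

With something like this in place, the mapping from Andrea's example would
emit its foaf:name triples as usual, whether the template value is a literal
or a {{#property:...}} reference.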

2. In the long run though, when all data is moved to Wikidata, all
instances of a certain infobox type will look the same. It doesn't
matter anymore if an infobox is about Germany or Italy, because they
all use the same properties:

{{Infobox country
| capital = {{#property:p45}}
| population = {{#property:p42}}
... etc. ...
}}

I guess the Wikidata team has already thought of this and plans to
eventually replace the whole infobox with a small construct that simply
instructs MediaWiki to pull all the data for the item from Wikidata and
display an infobox. In that case, there will be nothing left for DBpedia to
extract.

Implementation detail: we shouldn't use a SPARQL store to look up Wikidata
data; we should keep it in memory. A SPARQL call will certainly be at least
100 times slower than a lookup in an in-memory map - probably more like
10000 times. This matters because there will be hundreds of millions of
lookups during an extraction. Keeping all inter-language links in memory
takes about 4 or 5 GB - not much. Of course, keeping all Wikidata data in
memory would take between 10 and 100 times as much RAM.
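
To illustrate (again just a sketch - the store trait and class names are
invented):

// The extraction only needs point lookups, so a plain hash map that is
// loaded once from the dump is enough: every lookup is an in-process hash
// probe, with no network round trip and no query parsing.
trait WikidataStore {
  def lookup(itemId: String, propId: String): Option[String]
}

class InMemoryWikidataStore(values: Map[(String, String), String])
    extends WikidataStore {
  def lookup(itemId: String, propId: String): Option[String] =
    values.get((itemId, propId))
}

// A SPARQL-backed store would pay connection, parsing and query evaluation
// costs on every one of those hundreds of millions of calls - hence the
// estimate of several orders of magnitude.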

Cheers,
JC

>
> Cheers
> Andrea
>
>
> 2013/4/5 Dimitris Kontokostas <jimk...@gmail.com>
>>
>> Hi,
>>
>> For me there is no reason to complicate the DBpedia framework by resolving
>> Wikidata data/templates.
>> What we could do is (try to) provide a semantic mirror of Wikidata at,
>> e.g., data.dbpedia.org. We should simplify it by mapping the data to the
>> DBpedia ontology and then use it like any other language edition we have
>> (e.g. nl.dbpedia.org).
>>
>> On dbpedia.org we already aggregate data from other language editions. For
>> now it is mostly labels & abstracts, but we could also fuse in Wikidata
>> data. This way, whatever is missing from the Wikipedia dumps will in the
>> end be filled in by the Wikidata dumps.
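>>
>> Roughly, the fusion step could look like this (sketch only, hypothetical
>> names - nothing like this exists in the framework yet):
>>
>> // For each (subject, predicate) pair, prefer the values extracted from
>> // the Wikipedia dumps and fill the gaps from the Wikidata-derived edition.
>> def fuse(fromWikipedia: Map[(String, String), Seq[String]],
>>          fromWikidata: Map[(String, String), Seq[String]])
>>     : Map[(String, String), Seq[String]] =
>>   fromWikidata ++ fromWikipedia // right-hand side wins on key collisions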
>>
>> Best,
>> Dimitris
>>
>>
>> On Fri, Apr 5, 2013 at 9:49 PM, Julien Plu
>> <julien....@redaction-developpez.com> wrote:
>>>
>>> OK, thanks for the clarification :-) That's perfect - now I'm just
>>> waiting for the dump of this data to become available.
>>>
>>> Best.
>>>
>>> Julien Plu.
>>>
>>>
>>> 2013/4/5 Jona Christopher Sahnwaldt <j...@sahnwaldt.de>
>>>>
>>>> On 5 April 2013 19:59, Julien Plu <julien....@redaction-developpez.com>
>>>> wrote:
>>>> > Hi,
>>>> >
>>>> > @Anja: Do you have a blog post or something similar that talks about
>>>> > the RDF dump of Wikidata?
>>>>
>>>> http://meta.wikimedia.org/wiki/Wikidata/Development/RDF
>>>>
>>>> @Anja: do you know when RDF dumps are planned to be available?
>>>>
>>>> > Will the French Wikidata also provide its data in RDF?
>>>>
>>>> There is only one Wikidata - neither English nor French nor any other
>>>> language. It's just data. There are labels in different languages, but
>>>> the data itself is language-agnostic.
>>>>
>>>> >
>>>> > This news interests me greatly.
>>>> >
>>>> > Best
>>>> >
>>>> > Julien Plu.
>>>> >
>>>> >
>>>> > 2013/4/5 Tom Morris <tfmor...@gmail.com>
>>>> >>
>>>> >> On Fri, Apr 5, 2013 at 9:40 AM, Jona Christopher Sahnwaldt
>>>> >> <j...@sahnwaldt.de> wrote:
>>>> >>>
>>>> >>>
>>>> >>> thanks for the heads-up!
>>>> >>>
>>>> >>> On 5 April 2013 10:44, Julien Plu
>>>> >>> <julien....@redaction-developpez.com>
>>>> >>> wrote:
>>>> >>> > Hi,
>>>> >>> >
>>>> >>> > I saw a few days ago that, for about a month now, MediaWiki has
>>>> >>> > allowed creating infoboxes (or parts of them) with the Lua
>>>> >>> > scripting language.
>>>> >>> > http://www.mediawiki.org/wiki/Lua_scripting
>>>> >>> >
>>>> >>> > So my question is: if all the data in the Wikipedia infoboxes ends
>>>> >>> > up in Lua scripts, will DBpedia still be able to retrieve all the
>>>> >>> > data as usual?
>>>> >>>
>>>> >>> I'm not 100% sure, and we should look into this, but I think that
>>>> >>> Lua is only used in template definitions, not in template calls or
>>>> >>> other places in content pages. DBpedia does not parse template
>>>> >>> definitions, only content pages. The content pages will probably
>>>> >>> only change in minor ways, if at all. For example, {{Foo}} might
>>>> >>> change to {{#invoke:Foo}}. But that's just my preliminary
>>>> >>> understanding after looking through a few tutorial pages.
>>>> >>
>>>> >>
>>>> >> As far as I can see, the template calls are unchanged for all the
>>>> >> templates, which makes sense when you consider that some of the
>>>> >> templates they've upgraded to use Lua, like Template:Coord, are used
>>>> >> on almost a million pages.
>>>> >>
>>>> >> Here are the ones which have been updated so far:
>>>> >> https://en.wikipedia.org/wiki/Category:Lua-based_templates
>>>> >> Performance improvement looks impressive:
>>>> >> https://en.wikipedia.org/wiki/User:Dragons_flight/Lua_performance
>>>> >>
>>>> >> Tom
>>>> >
>>>> >
>>>
>>>
>>
>>
>>
>> --
>> Kontokostas Dimitris
>>
>
>
