I'd say this topic can safely move out of dbpedia-discussion and to
dbpedia-developers now. :)

I agree with Jona, with one small detail: perhaps it is better not to
load everything in memory if we use a fast database such as Berkeley DB or
JDBM3. They would also allow you to run in-memory when you can splurge, or
disk-backed when resources are restricted. What do you think?
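
To make that a bit more concrete, here is a minimal sketch in Python. It is
not DBpedia framework code: the class and function names are made up for
illustration, and the standard library's dbm.dumb module stands in for
Berkeley DB or JDBM3. The point is only that the extraction code sees one
lookup interface, while the backend is either a plain dict (all in RAM) or a
disk-backed store.

```python
import dbm.dumb
import os
import tempfile


class WikidataLookup:
    """Hypothetical lookup table from (item id, property id) to a value.

    `backend` is any mutable mapping from bytes to bytes: a plain dict
    keeps everything in RAM, a dbm database keeps it on disk.
    """

    def __init__(self, backend):
        self._backend = backend

    @staticmethod
    def _key(item_id, property_id):
        # Assumed key layout for this sketch: "<item>|<property>".
        return f"{item_id}|{property_id}".encode("utf-8")

    def put(self, item_id, property_id, value):
        self._backend[self._key(item_id, property_id)] = value.encode("utf-8")

    def get(self, item_id, property_id):
        try:
            raw = self._backend[self._key(item_id, property_id)]
        except KeyError:
            return None  # nothing stored for this item/property pair
        return raw.decode("utf-8")

    def close(self):
        # dbm databases need closing; a plain dict has no close().
        closer = getattr(self._backend, "close", None)
        if closer is not None:
            closer()


def open_lookup(path=None):
    """Return an in-memory lookup if path is None, else a disk-backed one."""
    backend = {} if path is None else dbm.dumb.open(path, "c")
    return WikidataLookup(backend)


# Usage: the calling code is identical for both backends.
# The ids and the value below are placeholders, not real Wikidata content.
with tempfile.TemporaryDirectory() as tmp:
    for lookup in (open_lookup(), open_lookup(os.path.join(tmp, "wikidata"))):
        lookup.put("Q1", "p45", "example value")
        assert lookup.get("Q1", "p45") == "example value"
        lookup.close()
```

Whether the disk-backed variant is fast enough for hundreds of millions of
lookups would have to be measured; the sketch only shows that the choice can
be left to configuration.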

Cheers,
Pablo


On Fri, Apr 5, 2013 at 10:01 PM, Jona Christopher Sahnwaldt <j...@sahnwaldt.de> wrote:

> On 5 April 2013 21:27, Andrea Di Menna <ninn...@gmail.com> wrote:
> > Hi Dimitris,
> >
> > I am not completely getting your point.
> >
> > How would you handle the following example? (supposing the following
> > will be possible with Wikipedia/Wikidata)
> >
> > Suppose you have
> >
> > {{Infobox:Test
> > | name = {{#property:p45}}
> > }}
> >
> > and a mapping
> >
> > {{PropertyMapping | templateProperty = name | ontologyProperty = foaf:name}}
> >
> > what would happen when running the MappingExtractor?
> > Which RDF triples would be generated?
>
> I think there are two questions here, and two very different approaches.
>
> 1. In the near term, I would expect that Wikipedia templates are
> modified like in your example.
>
> How could/should DBpedia deal with this? The simplest solution seems
> to be that during a preliminary step, we extract data from Wikidata
> and store it. During the main extraction, whenever we find a reference
> to Wikidata, we look it up and generate a triple as usual. Not a huge
> change.
>
> 2. In the long run though, when all data is moved to Wikidata, all
> instances of a certain infobox type will look the same. It doesn't
> matter anymore if an infobox is about Germany or Italy, because they
> all use the same properties:
>
> {{Infobox country
> | capital = {{#property:p45}}
> | population = {{#property:p42}}
> ... etc. ...
> }}
>
> I guess Wikidata already thought of this and has plans to then replace
> the whole infobox by a small construct that simply instructs MediaWiki
> to pull all data for this item from Wikidata and display an infobox.
> In this case, there will be nothing left to extract for DBpedia.
>
> Implementation detail: we shouldn't use a SPARQL store to look up
> Wikidata data, we should keep them in memory. A SPARQL call will
> certainly be at least 100 times slower than a lookup in a map, but
> probably 10000 times or more. This matters because there will be
> hundreds of millions of lookup calls during an extraction. Keeping all
> inter-language links in memory takes about 4 or 5 GB - not much. Of
> course, keeping all Wikidata data in memory would take between 10 and
> 100 times as much RAM.
>
> Cheers,
> JC
>
> >
> > Cheers
> > Andrea
> >
> >
> > 2013/4/5 Dimitris Kontokostas <jimk...@gmail.com>
> >>
> >> Hi,
> >>
> >> For me there is no reason to complicate the DBpedia framework by
> >> resolving Wikidata data / templates.
> >> What we could do is (try to) provide a semantic mirror of Wikidata at,
> >> e.g., data.dbpedia.org. We should simplify it by mapping the data to the
> >> DBpedia ontology and then use it like any other language edition we have
> >> (e.g. nl.dbpedia.org).
> >>
> >> In dbpedia.org we already aggregate data from other language editions.
> >> For now it is mostly labels & abstracts but we can also fuse Wikidata
> >> data. This way, whatever is missing from the Wikipedia dumps will be
> >> filled in the end by the Wikidata dumps.
> >>
> >> Best,
> >> Dimitris
> >>
> >>
> >> On Fri, Apr 5, 2013 at 9:49 PM, Julien Plu
> >> <julien....@redaction-developpez.com> wrote:
> >>>
> >>> Ok, thanks for the clarification :-) It's perfect, now just waiting
> >>> for the dump of this data to become available.
> >>>
> >>> Best.
> >>>
> >>> Julien Plu.
> >>>
> >>>
> >>> 2013/4/5 Jona Christopher Sahnwaldt <j...@sahnwaldt.de>
> >>>>
> >>>> On 5 April 2013 19:59, Julien Plu <julien....@redaction-developpez.com>
> >>>> wrote:
> >>>> > Hi,
> >>>> >
> >>>> > @Anja : Do you have a blog post or something similar that talks
> >>>> > about the RDF dump of Wikidata?
> >>>>
> >>>> http://meta.wikimedia.org/wiki/Wikidata/Development/RDF
> >>>>
> >>>> @Anja: do you know when RDF dumps are planned to be available?
> >>>>
> >>>> > Will the French Wikidata also provide its
> >>>> > data in RDF?
> >>>>
> >>>> There is only one Wikidata - neither English nor French nor any other
> >>>> language. It's just data. There are labels in different languages, but
> >>>> the data itself is language-agnostic.
> >>>>
> >>>> >
> >>>> > This news interests me very much.
> >>>> >
> >>>> > Best
> >>>> >
> >>>> > Julien Plu.
> >>>> >
> >>>> >
> >>>> > 2013/4/5 Tom Morris <tfmor...@gmail.com>
> >>>> >>
> >>>> >> On Fri, Apr 5, 2013 at 9:40 AM, Jona Christopher Sahnwaldt
> >>>> >> <j...@sahnwaldt.de> wrote:
> >>>> >>>
> >>>> >>>
> >>>> >>> thanks for the heads-up!
> >>>> >>>
> >>>> >>> On 5 April 2013 10:44, Julien Plu
> >>>> >>> <julien....@redaction-developpez.com>
> >>>> >>> wrote:
> >>>> >>> > Hi,
> >>>> >>> >
> >>>> >>> > I saw a few days ago that MediaWiki has, for about a month now,
> >>>> >>> > allowed infoboxes (or parts of them) to be created with the Lua
> >>>> >>> > scripting language.
> >>>> >>> > http://www.mediawiki.org/wiki/Lua_scripting
> >>>> >>> >
> >>>> >>> > So my question is: if all the data in the Wikipedia infoboxes
> >>>> >>> > ends up in Lua scripts, will DBpedia still be able to retrieve
> >>>> >>> > all the data as usual?
> >>>> >>>
> >>>> >>> I'm not 100% sure, and we should look into this, but I think that
> >>>> >>> Lua is only used in template definitions, not in template calls or
> >>>> >>> other places in content pages. DBpedia does not parse template
> >>>> >>> definitions, only content pages. The content pages probably will
> >>>> >>> only change in minor ways, if at all. For example, {{Foo}} might
> >>>> >>> change to {{#invoke:Foo}}. But that's just my preliminary
> >>>> >>> understanding after looking through a few tutorial pages.
> >>>> >>
> >>>> >>
> >>>> >> As far as I can see, the template calls are unchanged for all the
> >>>> >> templates, which makes sense when you consider that some of the
> >>>> >> templates that they've upgraded to use Lua, like Template:Coord, are
> >>>> >> used on almost a million pages.
> >>>> >>
> >>>> >> Here are the ones which have been updated so far:
> >>>> >> https://en.wikipedia.org/wiki/Category:Lua-based_templates
> >>>> >> Performance improvement looks impressive:
> >>>> >> https://en.wikipedia.org/wiki/User:Dragons_flight/Lua_performance
> >>>> >>
> >>>> >> Tom
> >>>> >
> >>>> >
> >>>
> >>>
> >>>
> >>>
> >>>
> ------------------------------------------------------------------------------
> >>> Minimize network downtime and maximize team effectiveness.
> >>> Reduce network management and security costs.Learn how to hire
> >>> the most talented Cisco Certified professionals. Visit the
> >>> Employer Resources Portal
> >>> http://www.cisco.com/web/learning/employer_resources/index.html
> >>> _______________________________________________
> >>> Dbpedia-discussion mailing list
> >>> Dbpedia-discussion@lists.sourceforge.net
> >>> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
> >>>
> >>
> >>
> >>
> >> --
> >> Kontokostas Dimitris
> >>
> >>
> >>
> >>
> >
> >
> >
> >
>
>
>



-- 

Pablo N. Mendes
http://pablomendes.com
