We are currently working on something that could be extended to serve as a source for finding data conflicts and for imports. I still have to check whether it can be integrated with the primary sources tool. I hope we will have something ready in the next couple of weeks, and I'll get back to this thread then.
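To make the kind of conflict check discussed in this thread concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the tool mentioned above: the function names (`find_conflicts`, `is_refinement`, `classify`) and the sample data are hypothetical, and dates are assumed to be ISO-like strings such as `1929` or `1929-06-07`. It groups values per subject, keeps the subjects with more than one distinct value, and separates the case where one value is merely more precise than the other (1929 vs. 1929-06-07) from a real contradiction.

```python
from collections import defaultdict


def find_conflicts(pairs):
    """Return subjects that have more than one distinct value."""
    values = defaultdict(set)
    for subject, value in pairs:
        values[subject].add(value)
    return {s: sorted(v) for s, v in values.items() if len(v) > 1}


def is_refinement(coarse, fine):
    """True if `fine` extends `coarse`, e.g. '1929' -> '1929-06-07'."""
    return len(fine) > len(coarse) and fine.startswith(coarse + "-")


def classify(dates):
    """Label a set of conflicting values for one subject."""
    # Only the simple two-value case is treated as a possible refinement.
    if len(dates) == 2 and is_refinement(*sorted(dates, key=len)):
        return "refinement"
    return "contradiction"


# Hypothetical sample data, not taken from DBpedia or Wikidata.
sample = [
    ("ex:PersonA", "1929"),
    ("ex:PersonA", "1929-06-07"),
    ("ex:PersonB", "1950-01-02"),
    ("ex:PersonB", "1951-01-02"),
    ("ex:PersonC", "1980-03-04"),
]

conflicts = find_conflicts(sample)
for subject, dates in conflicts.items():
    print(subject, dates, classify(dates))
```

Values labelled as a refinement could be candidates for automatic replacement by the more precise date, while contradictions are the ones that would need a human (or a reference) to resolve.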
Best,
Dimitris

On Thu, Jun 4, 2015 at 11:49 AM, Gerard Meijssen <gerard.meijs...@gmail.com> wrote:

> Hoi,
> Markus with all due respect, we have a LOT of data in Wikidata that is
> plain wrong. When we add the missing data from DBpedia it is of a higher
> quality than what we have. Insisting that it first needs to be validated is
> foolish. It is not done for any of the work we do. All our bots make use of
> Wikipedia and in this DBpedia is no different.
>
> I do agree that it makes sense to verify the data that is different. But
> even so. When Wikidata says 1929 and DBpedia says 7-June-1929 our practise
> has been to remove the 1929 for the more precise data.
>
> Let us be pragmatic and improve our data and start with what is missing.
> Thanks,
> GerardM
>
> On 4 June 2015 at 10:31, Markus Krötzsch <mar...@semantic-mediawiki.org>
> wrote:
>
>> Hi Dimitris,
>>
>> Interesting situation. If you have contradictory data from several
>> templates, then the challenge will be to find out which information is
>> correct for importing it to Wikidata. Could your dataset maybe become an
>> input to the primary sources tool [1]? Then Wikidata users could help to
>> clean the dataset and try to find references (as you know, references are
>> quite important for Wikidata, but it would really be asking too much of
>> DBpedia to provide these).
>>
>> This could be a viable strategy to merge DBpedia data into Wikidata. This
>> email was only about person-related data, but one could do this for any
>> kind of dataset where the information in DBpedia is of relatively high
>> quality. I don't know exactly what the primary sources tool needs as input
>> (it is still beta), but I think it mainly requires that a decent quality
>> set of candidate statements is extracted and provided in some suitable
>> format.
>>
>> As a first step, it might make sense to do a scan to see how many
>> date-of-death (or whatever) statements in DBpedia are not yet found in
>> Wikidata. If it is a small dataset (e.g., only a subset of the people who
>> have died in the last year), then maybe one could also add and verify it in
>> another way, not going through primary sources. But especially for recent
>> deaths, there might be a great variety of sources (esp. newspaper articles)
>> that are not easy to find without user support.
>>
>> Regards,
>>
>> Markus
>>
>> [1] https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool
>>
>> On 04.06.2015 09:56, Dimitris Kontokostas wrote:
>>
>>> On Thu, Jun 4, 2015 at 1:18 AM, Markus Krötzsch
>>> <mar...@semantic-mediawiki.org> wrote:
>>>
>>>     On 03.06.2015 22:44, Gerard Meijssen wrote:
>>>
>>>         Hoi,
>>>         The Dutch indicated their willingness to add the dead to
>>>         Wikidata ... I add quite a few dead from other countries and
>>>         because of Jura1 Brazilians who died in 2015 have an added
>>>         significance.
>>>
>>>         Given that we CAN produce lists like this, it makes sense to
>>>         reconsider the offer by the fine people from DBpedia and have
>>>         the information they harvest from Wikipedia added automatically
>>>         to Wikidata.. One reason I pointed out on my recent blogpost..
>>>
>>>     DBpedia is getting this information from the contents of the
>>>     template Persondata as used on Wikipedia [1]. The enwiki community
>>>     just recently decided to maintain this data on Wikidata instead. I
>>>     guess this means that (English) DBpedia will not contain this data
>>>     in the future, unless they import it from Wikidata (they are
>>>     tracking the issue at [2]).
>>>
>>> Note that DBpedia gets person data information both from the persondata
>>> template and from the infobox templates using the mappings wiki.
>>> We also noted that the data between the two is many times out of sync
>>> (and usually the person data is stalled/wrong because people don't know
>>> its existence).
>>>
>>> e.g. we have 28K items with double birth dates one from the infobox and
>>> another from persondata.
>>>
>>> select count(*) where {?s dbpedia-owl:birthDate ?b1 ;
>>> dbpedia-owl:birthDate ?b2 .
>>> filter (?b1 != ?b2 && ?b1 < ?b2)}
>>>
>>> http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=select+count%28*%29+where+%7B%3Fs+dbpedia-owl%3AbirthDate+%3Fb1+%3B+dbpedia-owl%3AbirthDate+%3Fb2+.%0D%0Afilter+%28%3Fb1+%21%3D+%3Fb2+%26%26+%3Fb1+%3C+%3Fb2%29%7D&format=text%2Fhtml&timeout=30000&debug=on
>>>
>>> The persondata template is used in German Wikipedia as well. The
>>> following release has ~ 2.2M triples coming from the German persondata
>>> template (which iirc has the same problems as the English)
>>>
>>> Best,
>>> Dimitris
>>>
>>>     So you see, times are changing quickly ... but overall I hope that
>>>     this is still solving the problem you identified, in fact in a much
>>>     more direct way than one might have hoped for :-).
>>>
>>>     DBpedia may still play a role. I don't know how exactly the enwiki
>>>     community is planning to implement the move from Persondata to
>>>     Wikidata. It could be that DBpedia is the only project extracting
>>>     this data. So in a way, your suggestion might be a great idea,
>>>     though not as a long-term data maintenance plan but as a one-time
>>>     help for migration.
>>>
>>>     To support data maintenance further, it would make sense to use bots
>>>     for synching with authority files. These files also contain death
>>>     dates and they can even be used as a valid reference.
>>>
>>>     Regards,
>>>
>>>     Markus
>>>
>>>     [1] https://en.wikipedia.org/wiki/Template:Persondata
>>>     [2] https://github.com/dbpedia/extraction-framework/issues/397
>>>
>>>         Thanks,
>>>         GerardM
>>>
>>>         http://ultimategerardm.blogspot.nl/2015/06/wikidata-jurandyr-noronha-died-in-2015.html
>>>
>>>         On 3 June 2015 at 07:16, Gerard Meijssen
>>>         <gerard.meijs...@gmail.com> wrote:
>>>
>>>             Hoi,
>>>             Jura1 created a wonderful list of people who died in Brazil
>>>             in 2015 [1]. It is a page that may update regularly from
>>>             Wikidata thanks to the ListeriaBot. Obviously, there may be
>>>             a few more because I am falling ever more behind with my
>>>             quest for registering deaths in 2015.
>>>
>>>             I have copied his work and created a page for people who
>>>             died in the Netherlands in 2015 [2]. It is trivially easy to
>>>             do this and, the result is great. The result looks great, it
>>>             can be used for any country in any Wikipedia
>>>
>>>             The Dutch Wikipedia indicated that they nowadays maintain
>>>             important metadata at Wikidata. I am really happy that we
>>>             can showcase their work. It is important work because as
>>>             someone reminded me at some stage, this is part of what
>>>             amounts to the policy of living people...
>>>
>>>             Thanks,
>>>             GerardM
>>>
>>>             [1] https://www.wikidata.org/wiki/User:Jura1/Recent_deaths_in_Brazil
>>>             [2] https://www.wikidata.org/wiki/User:Jura1/Recent_deaths_in_the_Netherlands
>>>
>>>             _______________________________________________
>>>             Wikidata mailing list
>>>             Wikidata@lists.wikimedia.org
>>>             https://lists.wikimedia.org/mailman/listinfo/wikidata
>>>
>>> --
>>> Kontokostas Dimitris

--
Kontokostas Dimitris