[Wikidata] Re: Timezone, before, and after fields in JSON dump
Hi!

On Mon, Jan 10, 2022 at 4:50 PM Lydia Pintscher wrote:
> Thanks for checking. Do you have a few examples so we can have a closer look?

There are many. A few cases (many seem to be about a reference with P813):

Q5198 P21 reference 1 snaks P813 0: timezone 60
Q5664 P20 reference 1 snaks P813 0: timezone 60
Q5721 P106 reference 0 snaks P813 0: timezone 60
Q5869 P194 reference 0 snaks P813 0: timezone -5
Q5826 P194 reference 0 snaks P813 0: timezone -5
Q5816 P106 reference 0 snaks P813 0: timezone 60
Q11618 P4632 reference 0 snaks P813 0: before 1
Q12018 P4632 reference 0 snaks P813 0: before 1
Q12773 P106 reference 0 snaks P813 0: timezone 60
Q12773 P106 reference 0 snaks P813 0: timezone 60
Q13283 P1999 reference 0 snaks P813 0: after 1
Q13293 P1999 reference 0 snaks P813 0: after 1
Q13307 P1999 reference 0 snaks P813 0: after 1
Q13334 P355 reference 0 snaks P813 0: before 1
Q13353 P2853 reference 0 snaks P813 0: timezone 120
Q13361 P1999 reference 0 snaks P813 0: after 1
Q14430 P106 reference 0 snaks P813 0: timezone 60
Q14524 P106 reference 1 snaks P813 0: timezone 60
Q15174 P194 reference 0 snaks P813 0: timezone -5
Q16019 P21 reference 0 snaks P813 0: timezone 60
Q16285 P106 reference 1 snaks P813 0: timezone 60
Q16285 P106 reference 1 snaks P813 0: timezone 60
Q16389 P106 reference 0 snaks P813 0: timezone 60
Q16403 P4632 reference 0 snaks P813 0: before 1
Q16572 P194 reference 0 snaks P813 0: timezone -5
Q16967 P194 reference 0 snaks P813 0: timezone -5
Q18809 P106 reference 0 snaks P813 0: timezone 60
Q18809 P106 reference 0 snaks P813 0: timezone 60
Q19214 P1001 reference 0 snaks P813 0: timezone -5
Q20456 P4632 reference 0 snaks P813 0: before 1
Q22432 P4632 reference 0 snaks P813 0: before 1

You cannot see these fields in the web UI, but you can see them in the JSON, e.g.:

https://www.wikidata.org/wiki/Special:EntityData/Q5198.json

A few that are not related to P813:

Q28287 P2046 qualifier P585: timezone 1
Q38573 P166 qualifier P585: after 1
Q54764 P2046 qualifier P585: timezone 1
Q82986 P580: after 1

There are really many of them. I can produce the whole list if you need it.

Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m
___
Wikidata mailing list -- wikidata@lists.wikimedia.org
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org
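
A list like the one above could be produced with something along these lines. This is only a minimal sketch, not the script actually used in this thread: the function names (`all_snaks`, `nonzero_unused_fields`) are made up for illustration, the sample entity `Q0` is invented, and the code assumes the entity layout described in the Wikibase JSON documentation (claims keyed by property, each statement with `mainsnak`, optional `qualifiers`, and optional `references`).

```python
from typing import Iterator

# Fields of a time datavalue that the documentation describes as unused.
UNUSED_FIELDS = ("timezone", "before", "after")


def all_snaks(entity: dict) -> Iterator[tuple[str, dict]]:
    """Yield (location, snak) for every main, qualifier and reference snak."""
    for prop, statements in entity.get("claims", {}).items():
        for statement in statements:
            yield prop, statement["mainsnak"]
            for qprop, qsnaks in statement.get("qualifiers", {}).items():
                for snak in qsnaks:
                    yield f"{prop} qualifier {qprop}", snak
            for r, ref in enumerate(statement.get("references", [])):
                for rprop, rsnaks in ref.get("snaks", {}).items():
                    for i, snak in enumerate(rsnaks):
                        yield f"{prop} reference {r} snaks {rprop} {i}", snak


def nonzero_unused_fields(entity: dict) -> list[str]:
    """Report time values whose timezone/before/after is not zero."""
    findings = []
    for location, snak in all_snaks(entity):
        datavalue = snak.get("datavalue")
        if datavalue is None or datavalue.get("type") != "time":
            continue  # novalue/somevalue snaks and non-time values
        for field in UNUSED_FIELDS:
            value = datavalue["value"].get(field, 0)
            if value != 0:
                findings.append(f"{entity['id']} {location}: {field} {value}")
    return findings


# Invented sample entity (not real Wikidata content), for illustration only.
SAMPLE = {
    "id": "Q0",
    "claims": {
        "P21": [{
            "mainsnak": {"snaktype": "value", "property": "P21",
                         "datavalue": {"type": "string", "value": "x"}},
            "references": [{"snaks": {"P813": [{
                "snaktype": "value", "property": "P813",
                "datavalue": {"type": "time", "value": {
                    "time": "+2015-03-01T00:00:00Z", "timezone": 60,
                    "before": 0, "after": 0, "precision": 11,
                    "calendarmodel": "http://www.wikidata.org/entity/Q1985727",
                }},
            }]}}],
        }],
    },
}

for line in nonzero_unused_fields(SAMPLE):
    print(line)
```

For a document fetched from Special:EntityData, the entity itself sits under the top-level `entities` key, so it would be passed as `doc["entities"]["Q5198"]`.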
[Wikidata] Re: +0000-00-00T00:00:00Z in JSON dump
Hi!

I took some time and went over all the cases I found, and all of them were simply bad data. I suspect that most of them were added by some automated process that passed this timestamp in when there was no data. So I cleaned them up or fixed them: in a few cases the right value was "unknown" with a range, and in some cases it was 1 BCE, but in most cases I just removed the claim, because the value is not only false but simply invalid (it is not even a valid timestamp). You can see examples in my recent changes [1].

At this point I would ask how these values got in (why are they not rejected at insertion time?) and, even more interesting, why the web UI does not show any warning about them. For many other cases you get various warnings about possibly invalid data, but not here. It would be great if a warning were shown next to any such timestamp value. Of course, even better would be to prevent insertion (because in 99% of cases it means somebody is blindly inserting a default zero value).

[1] https://www.wikidata.org/w/index.php?title=Special:Contributions/Mitar&offset=&limit=500&target=Mitar

Mitar

On Mon, Jan 10, 2022 at 4:50 PM Lydia Pintscher wrote:
> Hey Mitar,
>
> Also here a few examples would help to better understand what's going on.
>
> Cheers
> Lydia
>
> On Sun, Jan 9, 2022 at 9:52 AM Mitar wrote:
> > Hi!
> >
> > I have been processing a recent Wikidata JSON dump. I have noticed
> > that some claims have +0000-00-00T00:00:00Z as the time value. My
> > understanding is that those are invalid values for time, at least
> > according to [1]. I think they can be safely removed, yes?
> >
> > [1] https://doc.wikimedia.org/Wikibase/master/php/md_docs_topics_json.html
> >
> > Mitar
>
> --
> Lydia Pintscher - http://about.me/lydia.pintscher
> Product Manager for Wikidata

--
http://mitar.tnode.com/
https://twitter.com/mitar_m
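
The kind of check asked for above (flagging or rejecting the zero timestamp) might look roughly like this. It is a sketch under assumptions, not how Wikibase validates input: the regular expression and the function name `has_zero_year` are invented here, and the rule "a zero year is never valid" is the reading taken in this thread. Note that a zero month or day alone is not flagged, since values with year precision are normally stored that way.

```python
import re

# Assumed shape of the "time" string: sign, year of 4+ digits, then
# month, day, hour, minute, second, and a literal Z.
TIME_RE = re.compile(
    r"^([+-])(\d{4,})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})Z$"
)


def has_zero_year(time_string: str) -> bool:
    """Return True for values such as +0000-00-00T00:00:00Z."""
    match = TIME_RE.match(time_string)
    if match is None:
        raise ValueError(f"unrecognised time string: {time_string!r}")
    return int(match.group(2)) == 0


print(has_zero_year("+0000-00-00T00:00:00Z"))  # the value from this thread
print(has_zero_year("+2022-00-00T00:00:00Z"))  # year precision, not flagged
print(has_zero_year("-0001-01-01T00:00:00Z"))  # a BCE year, not flagged
```

A caller could use this either to skip such claims while processing a dump or to collect them for manual cleanup, as was done here.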
[Wikidata] +0000-00-00T00:00:00Z in JSON dump
Hi!

I have been processing a recent Wikidata JSON dump. I have noticed that some claims have +0000-00-00T00:00:00Z as the time value. My understanding is that those are invalid values for time, at least according to [1]. I think they can be safely removed, yes?

[1] https://doc.wikimedia.org/Wikibase/master/php/md_docs_topics_json.html

Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m
[Wikidata] Timezone, before, and after fields in JSON dump
Hi!

I have been processing a recent Wikidata JSON dump. According to the documentation [1], a time datavalue has timezone, before and after fields, which are documented as currently not used. But I noticed that in the dump some claims do have them set. What should be done about them? Are they errors? Are they information? Can they be safely ignored? Should those claims be updated in Wikidata to remove those fields?

I can provide a list of those if anyone is interested.

[1] https://doc.wikimedia.org/Wikibase/master/php/md_docs_topics_json.html

Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m
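
For anyone wanting to reproduce this kind of scan, the dump can be read one entity at a time instead of being loaded whole. The sketch below assumes the commonly described dump layout (one large JSON array with `[` and `]` on their own lines and one entity per line, each followed by a comma); the function name `iter_entities` and the sample input are invented for illustration.

```python
import io
import json
from typing import Iterable, Iterator


def iter_entities(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one entity dict per dump line, skipping the array brackets."""
    for line in lines:
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue
        yield json.loads(line)


# Tiny invented stand-in for a dump file, for illustration only.
sample_dump = io.StringIO('[\n{"id": "Q0", "claims": {}},\n{"id": "Q00", "claims": {}}\n]\n')

for entity in iter_entities(sample_dump):
    print(entity["id"])
```

With a real compressed dump, the same function could be fed from `gzip.open(path, "rt", encoding="utf-8")`, where `path` is wherever the downloaded file lives.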