I second this!

By the way, what is the status of the missing dumps with history?
(The latest one available is from November 2014.)

Lukas

On Thu, 26.02.2015 at 14:52, Markus Kroetzsch wrote:
> Hi,
> 
> It's that time of the year again when I am sending a reminder that we
> still have broken JSON in the dump files ;-). As usual, the problem is
> that empty maps {} are serialized wrongly as empty lists []. I am not
> sure if there is any open bug that tracks this, so I am sending an
> email. There was one, but it was closed [1].
> 
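
Until this is fixed, consumers of the data can work around the map-vs-list problem on their side. The following is only a minimal sketch in Python: the function name normalize_empty_maps and the key set MAP_KEYS are assumptions made for illustration (they are not part of Wikibase), and the key set would need to be checked against the actual data model before relying on it.

```python
import json

# Assumption: keys whose values are expected to be JSON objects (maps).
# This list is illustrative and may be incomplete.
MAP_KEYS = {"labels", "descriptions", "aliases", "claims",
            "sitelinks", "snaks", "qualifiers"}


def normalize_empty_maps(node):
    """Return a copy of node where [] under a map-valued key becomes {}."""
    if isinstance(node, dict):
        return {
            key: {} if key in MAP_KEYS and value == []
            else normalize_empty_maps(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [normalize_empty_maps(item) for item in node]
    return node


# Example with a hand-written record (not copied from a real dump):
broken = json.loads('{"id": "Q3261", "claims": [], "aliases": []}')
fixed = normalize_empty_maps(broken)
```

Only empty lists under the listed keys are touched; genuine lists such as "snaks-order" or "references" are left as they are.
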
> As you know (I had sent an email a while ago), there are some remaining
> problems of this kind in the JSON dump, and also in the live exported
> JSON, e.g.,
> 
> https://www.wikidata.org/wiki/Special:EntityData/Q4383128.json
> (uses [] as a value for snaks: this item has a reference with an empty
> list of snaks, which is an error by itself)
> 
> However, the situation is considerably worse in the XML dumps, which
> have seen less usage since we have JSON, but as it turns out are still
> preferred by some users. Surprisingly (to me), the JSON content in the
> XML dumps is still not the same as in the JSON dumps. A large part of
> the records in the XML dump is broken because of the map-vs-list issue.
> 
> For example, the latest dump of current revisions [2] has countless
> instances of the problem. The first is in the item Q3261 (empty list for
> claims), but you can easily find more by grepping for things like
> 
> "claims":[]
> 
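
The grep suggested above can also be done with a short script that counts hits per key. This is a rough sketch under some assumptions: the key list SUSPECT_KEYS and the function names are made up for illustration, and it assumes that the JSON inside the XML dump may be XML-escaped (for example, quotes written as &quot;), which is why each line is unescaped before matching.

```python
import bz2
import html
import re
from collections import Counter

# Assumption: keys that should hold maps; extend as needed.
SUSPECT_KEYS = ("labels", "descriptions", "aliases", "claims")
PATTERN = re.compile(r'"(%s)":\s*\[\]' % "|".join(SUSPECT_KEYS))


def count_empty_list_keys(lines):
    """Count occurrences of "<key>":[] per key in an iterable of text lines."""
    counts = Counter()
    for line in lines:
        # Undo XML entity escaping so that &quot;claims&quot;:[] also matches.
        for match in PATTERN.finditer(html.unescape(line)):
            counts[match.group(1)] += 1
    return counts


def scan_dump(path):
    """Stream a .bz2 dump file line by line without unpacking it to disk."""
    with bz2.open(path, "rt", encoding="utf-8") as dump:
        return count_empty_list_keys(dump)
```

Calling scan_dump on a locally downloaded copy of the dump file would return a Counter with the number of hits per key; the file is read as a stream, so it does not have to fit into memory.
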
> It seems that all empty maps are serialized wrongly in this dump
> (aliases, descriptions, claims, ...). In contrast, the site's export
> simply omits the key of empty maps entirely, see
> 
> https://www.wikidata.org/wiki/Special:EntityData/Q3261.json
> 
> The JSON in the JSON dumps is the same.
> 
> Cheers,
> 
> Markus
> 
> 
> [1] https://github.com/wmde/WikibaseDataModelSerialization/issues/77
> [2]
> http://dumps.wikimedia.org/wikidatawiki/20150207/wikidatawiki-20150207-pages-meta-current.xml.bz2
> 
> 



_______________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
