Re: [Wikidata-l] question about 2 different json formats
On 10.08.2013 22:42, Jiang BIAN wrote:

> So is there a spec about the stable external format? If you could include a version number of the format used by the data, it will be much easier to write compatible code and/or notice the changes immediately.

I don't think there's a formal spec, though we really should have one. And the version number is a good idea. Put it on bugzilla, please :)

-- daniel

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] question about 2 different json formats
On Wed, Aug 7, 2013 at 10:11 PM, Denny Vrandečić denny.vrande...@wikimedia.de wrote:

> Hi Anthony,
>
> that's the internal data structure, and this is bound to change without notice. I am sorry if this caused trouble. If this is a common concern, we will start documenting and announcing those changes. It really should only concern the people processing the XML dumps.
>
> We would prefer to actually create a more stable output dump of the knowledge - I guess this would be more appreciated (like the RDF dump that Markus has posted about recently).
>
> The call to get the item description should have been
> https://www.wikidata.org/w/api.php?action=wbgetentities&format=json&ids=Q1
>
> This should provide you with a more stable answer.
>
> Cheers,
> Denny
>
> 2013/8/1 Huidong Zhang anthonyzh...@google.com
>
>> Hi,
>>
>> I noticed that the response from
>> http://www.wikidata.org/w/api.php?action=query&titles=Q1&prop=revisions&rvprop=content&format=xml
>> changed from "entity":"q1" to "entity":["item",1]. Is this change applied to all pages?
>>
>> In the latest wikidata dump (http://dumps.wikimedia.org/wikidatawiki/latest/wikidatawiki-latest-pages-meta-current.xml.bz2), both formats exist at the same time. For example, page Q100 has "entity":["item",100], while page Q10 has "entity":"q10". Is it expected? Will the next dump have the same format?
>>
>> By the way,
>> http://www.wikidata.org/w/api.php?action=query&titles=Q10&prop=revisions&rvprop=content&format=xml
>> returns "entity":["item",10].

About the inconsistency in the dump file, is there any bug entry created for this? (I can create one, if anyone can point me to the proper place to do that).

>> Thanks.
>>
>> --
>> Best wishes,
>> Anthony Zhang (Huidong Zhang)
>
> --
> Project director Wikidata
> Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
> Tel. +49-30-219 158 26-0 | http://wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 B. Recognized as a charitable organization by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.

--
Jiang BIAN
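For readers who want to try the API call Denny recommends, here is a minimal sketch of how the request URL could be assembled. The helper name `wbgetentities_url` is hypothetical and not part of any Wikidata client library; only the endpoint and the `action`, `format`, and `ids` parameters come from the message above. Joining several ids with "|" follows the general MediaWiki API convention for multi-value parameters and is an assumption here, since the thread only shows a single id.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL quoted in the thread.
API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def wbgetentities_url(*ids):
    """Build a wbgetentities request URL (hypothetical helper, not an official API)."""
    if not ids:
        raise ValueError("at least one entity id is required")
    query = urlencode({
        "action": "wbgetentities",
        "format": "json",
        # Assumption: multiple ids are separated by "|"; urlencode escapes it as %7C.
        "ids": "|".join(ids),
    })
    return f"{API_ENDPOINT}?{query}"


if __name__ == "__main__":
    print(wbgetentities_url("Q1"))
```

The sketch only builds the URL and does not perform the request, so it can be checked without network access.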
Re: [Wikidata-l] question about 2 different json formats
On 10-08-2013 10:54, Jiang BIAN wrote:

> On Wed, Aug 7, 2013 at 10:11 PM, Denny Vrandečić denny.vrande...@wikimedia.de wrote:
>
>> Hi Anthony,
>>
>> that's the internal data structure, and this is bound to change without notice. I am sorry if this caused trouble. If this is a common concern, we will start documenting and announcing those changes. It really should only concern the people processing the XML dumps.

I am one of the people processing the XML dumps, and I don't think it is a big deal. But I have had to change my parser many times to be able to parse new dumps because of changes in the format (in most cases, but not always, because of new features). I just adapt to the changes without fuss, but if the format were documented I could file bug reports whenever the format deviates from the documentation, which might be helpful to the developers.

(BTW, the time values seem to be OK again, after many syntax errors in the beginning. But the coordinate values have some strange (probably erroneous?) variations: values where the precision and/or globe is given as null, and values where the globe is given as the string "earth" instead of an entity.)

> About the inconsistency in the dump file, is there any bug entry created for this? (I can create one, if anyone can point me to the proper place to do that).

Not for my sake. I adapted to two entity formats in the dumps immediately when the new format started to appear.

Best regards,
- Byrial
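A parser that accepts both entity forms mentioned in this thread could look roughly like the following sketch. It is not Byrial's actual code. The function name `normalize_entity_id` is hypothetical, and the mapping of the kind "property" to the prefix "P" is an assumption, since the thread only shows the "item" case.

```python
# Assumption: "item" maps to the Q prefix (as in the thread); "property" -> "P" is a guess.
KIND_TO_PREFIX = {"item": "Q", "property": "P"}


def normalize_entity_id(value):
    """Return an id such as 'Q100' from either the old or the new internal form."""
    if isinstance(value, str):
        # Old form: a string such as "q10".
        prefix, number = value[:1].upper(), value[1:]
        if prefix in KIND_TO_PREFIX.values() and number.isascii() and number.isdigit():
            return prefix + number
    elif isinstance(value, (list, tuple)) and len(value) == 2:
        # New form: a pair such as ["item", 100].
        kind, number = value
        is_int = isinstance(number, int) and not isinstance(number, bool)
        if kind in KIND_TO_PREFIX and is_int and number >= 0:
            return KIND_TO_PREFIX[kind] + str(number)
    raise ValueError(f"unrecognized entity reference: {value!r}")


if __name__ == "__main__":
    print(normalize_entity_id("q10"))
    print(normalize_entity_id(["item", 100]))
```

Raising an error on anything unrecognized, rather than guessing, makes a further unannounced format change show up immediately when a new dump is processed.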
Re: [Wikidata-l] question about 2 different json formats
On 10/08/13 10:29, Byrial Jensen wrote:

> ...
>
> (BTW, the time values seem to be OK again, after many syntax errors in the beginning. But the coordinate values have some strange (probably erroneous?) variations: values where the precision and/or globe is given as null, and values where the globe is given as the string "earth" instead of an entity.)

Thanks for the warning. This was something that has been causing problems in the RDF dump too. I am now validating the globe settings more carefully.

Cheers,
Markus

>> About the inconsistency in the dump file, is there any bug entry created for this? (I can create one, if anyone can point me to the proper place to do that).
>
> Not for my sake. I adapted to two entity formats in the dumps immediately when the new format started to appear.
>
> Best regards,
> - Byrial
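The coordinate anomalies Byrial lists (null precision, null globe, and the string "earth" in place of an entity) could be flagged with a check along these lines. This is only an illustrative sketch, not the validation Markus added to the RDF export. The function name `check_coordinate` is hypothetical, the field names `precision` and `globe` follow the wording of the message, and the assumption that a valid globe is an entity URI starting with `http://www.wikidata.org/entity/` is not confirmed anywhere in the thread.

```python
# Assumption: a well-formed globe is an entity URI with this prefix.
ENTITY_URI_PREFIX = "http://www.wikidata.org/entity/"


def check_coordinate(value):
    """Return a list of problems found in one coordinate value (a dict)."""
    problems = []
    if value.get("precision") is None:
        problems.append("precision is null")
    globe = value.get("globe")
    if globe is None:
        problems.append("globe is null")
    elif globe == "earth":
        problems.append('globe is the string "earth" instead of an entity')
    elif not (isinstance(globe, str) and globe.startswith(ENTITY_URI_PREFIX)):
        problems.append(f"globe is not an entity URI: {globe!r}")
    return problems


if __name__ == "__main__":
    sample = {"latitude": 52.5, "longitude": 13.4, "precision": None, "globe": "earth"}
    for problem in check_coordinate(sample):
        print(problem)
```

Returning a list of problems instead of raising on the first one lets a dump processor log every anomaly for a value and still decide per case whether to skip or repair it.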