Pintoch added a comment.

@Lydia_Pintscher , @Smalyshev and @Tpt : is there any info about how RDF is expected to behave as an import format for Wikidata? As far as I can tell, the RDF that gets fed into the Query Service is not designed for import at all:

  • first, there is a lot of redundancy: values are represented both as simple values and as value nodes, truthy statements duplicate what the statement nodes already say, and so on. (This is absolutely not a criticism of the RDF serialization strategy: it totally makes sense as an export format!) So is there any designated subset of the exported triples that data producers would need to emit? I assume that subset would need to be as expressive as possible (so, for instance, the truthy triples would be dropped in favor of the full statement nodes). That is going to be very verbose, right?
  • second, the identifiers on the nodes are generated by Wikibase: so how does a data producer pick identifiers? Is it just going to impose its own hashes, which Wikibase will then have to respect?
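
To make the two points above concrete, here is a rough sketch of the triples the export emits for a single date-of-birth statement, plus one possible way to cut them down to a non-redundant subset. The prefixes (wd:, wdt:, p:, ps:, psv:, wds:, wdv:, wikibase:) are the ones used by the RDF export; the statement ID and the value hash are made-up placeholders, and the `import_subset` filter is only my assumption of what an "import subset" could look like, not anything that has been specified.

```python
# Triples are (subject, predicate, object) strings in prefixed form.
# The statement ID and value hash are hypothetical placeholders: in a real
# export they are generated by Wikibase, which is exactly the open question.
STATEMENT = "wds:Q42-EXAMPLE-STATEMENT-ID"
VALUE_NODE = "wdv:EXAMPLEVALUEHASH"
DATE = '"1952-03-11T00:00:00Z"^^xsd:dateTime'

exported = [
    ("wd:Q42", "wdt:P569", DATE),                # truthy triple
    ("wd:Q42", "p:P569", STATEMENT),             # link to the statement node
    (STATEMENT, "wikibase:rank", "wikibase:NormalRank"),
    (STATEMENT, "ps:P569", DATE),                # simple value
    (STATEMENT, "psv:P569", VALUE_NODE),         # link to the full value node
    (VALUE_NODE, "wikibase:timeValue", DATE),
    (VALUE_NODE, "wikibase:timePrecision", '"11"^^xsd:integer'),
    (VALUE_NODE, "wikibase:timeTimezone", '"0"^^xsd:integer'),
    (VALUE_NODE, "wikibase:timeCalendarModel",
     "<http://www.wikidata.org/entity/Q1985727>"),
]


def import_subset(triples):
    """Drop triples that are redundant with the full statement/value nodes.

    Assumed policy (not an official spec): remove every truthy (wdt:) triple,
    and remove a simple value (ps:) whenever the same statement also carries
    a full value node (psv:) for the same property.
    """
    # (statement, property) pairs that have a full value node
    has_value_node = {
        (s, p.split(":", 1)[1]) for s, p, _ in triples if p.startswith("psv:")
    }
    kept = []
    for s, p, o in triples:
        if p.startswith("wdt:"):
            continue
        if p.startswith("ps:") and (s, p.split(":", 1)[1]) in has_value_node:
            continue
        kept.append((s, p, o))
    return kept


subset = import_subset(exported)
for triple in subset:
    print(" ".join(triple), ".")
```

Even after dropping the redundant triples, one statement without qualifiers or references still takes seven triples, and the producer has to invent both the statement ID and the value hash.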

It would be great to have something other than QuickStatements to represent a data import, but I still have doubts about whether RDF is suitable for that in the first place. The good thing about RDF is that it is a standard, so many tools can deal with it. But given the issues mentioned above, I expect it is going to be quite painful to reuse these tools to produce data in the right schema, as everything is deeply reified. Anyway, if that is the path you have chosen, we need specs please!

Also, it seems that this project uses Java, so may I suggest that the reusable parts go into Wikidata-Toolkit rather than the Primary Sources Tool? Wikidata-Toolkit already has RDF export, so it would make sense for it to have RDF import too (from RDF statements to the datamodel representation, say).


TASK DETAIL
https://phabricator.wikimedia.org/T173749
