Andrawaag added a comment.
You are completely right: the same hashes are not needed to apply EntitySchemas in memory on ingestion to Wikidata. I need the hashes only as a sanity check that my script creates exactly the same RDF as Wikidata produces natively, so they matter only during the development phase of the script. Here is a notebook that contains the first prototype: <https://public.paws.wmcloud.org/User:Andrawaag/Genewiki/Wikidata_json2ttl.ipynb>

    from rdflib import Graph
    from rdflib.compare import graph_diff, to_isomorphic

    # WDqidRDFEngine is assumed to be defined in the notebook linked above
    allRD = WDqidRDFEngine(qid="Q38", fetch_all=True)
    # Load the RDF that Wikidata itself serves for the same item
    compareRDF = Graph()
    compareRDF.parse("http://www.wikidata.org/entity/Q38.ttl")
    inboth, left, right = graph_diff(to_isomorphic(compareRDF), to_isomorphic(allRD.rdf_item))
    print(len(left))  # triples found only in Wikidata's own output
    print(len(compareRDF))

If my script works, there should be no difference in the length of both graphs. Currently, that is not the case. I checked various examples and, except for the hashes in the normalized statements, the graphs appear to be equal. But if those hashes are indeed difficult to reproduce, I should think about a different test for verification.

In the actual validation script, not all of the RDF will be needed. Ignoring the labels, for example, slims down the RDF graph substantially. So I am currently building functionality into the WikidataIntegrator that allows selecting only certain parts (e.g. no truthy statements, only truthy statements, no normalized values, etc.). A notebook with that code is here: <https://public.paws.wmcloud.org/User:Andrawaag/Genewiki/wdi_rdf.ipynb>

My PHP skills are a bit rusty, but I will investigate and/or consider other test strategies.

TASK DETAIL
  https://phabricator.wikimedia.org/T283997

To: Addshore, Andrawaag
Cc: Lucas_Werkmeister_WMDE, Aklapper, Andrawaag, Invadibot, maantietaja, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Addshore, Mbch331
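
As one possible shape for the "other test" mentioned in the comment above, the sketch below compares two sets of triples after replacing hash-named nodes with identifiers derived from what each node says, so that differing hashes alone no longer count as a difference. It is only an illustration under stated assumptions: triples are plain string tuples rather than rdflib terms, the helper names (is_hashed, content_id, normalize) are hypothetical and not part of WikidataIntegrator or rdflib, hashed nodes are assumed to be those under the Wikidata value and reference namespaces, and the sample triples are made up.

```python
import hashlib

# Assumption: nodes under these namespaces carry a hash as their local name.
HASHED_PREFIXES = (
    "http://www.wikidata.org/value/",
    "http://www.wikidata.org/reference/",
)

def is_hashed(term):
    """True if the term looks like a hash-named value or reference node."""
    return term.startswith(HASHED_PREFIXES)

def content_id(node, triples, cache):
    """Build a stand-in IRI from the node's outgoing triples.

    Nested hashed nodes are replaced recursively; this assumes hashed
    nodes never point back at each other in a cycle.
    """
    if node not in cache:
        body = sorted(
            (p, content_id(o, triples, cache) if is_hashed(o) else o)
            for s, p, o in triples
            if s == node
        )
        digest = hashlib.sha1(repr(body).encode("utf-8")).hexdigest()
        cache[node] = "urn:content:" + digest
    return cache[node]

def normalize(triples):
    """Return the triples with every hashed node swapped for its content id."""
    triples = set(triples)
    cache = {}

    def swap(term):
        return content_id(term, triples, cache) if is_hashed(term) else term

    return {(swap(s), p, swap(o)) for s, p, o in triples}

# Made-up sample data: same value-node content, different hashes.
native = {
    ("wds:Q38-example", "psn:P2046", "http://www.wikidata.org/value/aaa"),
    ("http://www.wikidata.org/value/aaa", "wikibase:quantityAmount", "+301338"),
}
generated = {
    ("wds:Q38-example", "psn:P2046", "http://www.wikidata.org/value/bbb"),
    ("http://www.wikidata.org/value/bbb", "wikibase:quantityAmount", "+301338"),
}

# Symmetric difference after normalization; empty means only the hashes differed.
difference = normalize(native) ^ normalize(generated)
print(len(difference))
```

With rdflib graphs, the same idea would need the terms converted to a comparable form first; a literal that happens to start with one of the prefixes would also be misread by this simplified string check.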