[Wikidata-bugs] [Maniphest] T179681: Add HDT dump of Wikidata
jjkoehorst added a comment.

Small update from my side. After downloading the latest TTL file from Wikidata, I receive no errors but also no output. I tried the exact same command with a small dataset and that worked.

    time sudo docker run -v `pwd`:/wikidata rdfhdt/hdt-cpp:v1.3.3 rdf2hdt -f turtle -p -i wikidata/latest-all.ttl.gz wikidata/latest-all.hdt
    sudo docker run -v `pwd`:/wikidata rdfhdt/hdt-cpp:v1.3.3 rdf2hdt -f turtle -p  19.75s user 13.90s system 0% cpu 50:21:55.81 total

So I am not exactly sure what is happening. This is the same command run on a temporary file containing the first 103 lines of the Turtle file:

    time sudo docker run -v `pwd`:/wikidata rdfhdt/hdt-cpp:v1.3.3 rdf2hdt -f turtle -p -i wikidata/tmp.ttl.gz wikidata/tmp.hdt
    Predicate Bitmap in 21 usp: 0 % / 5.4 %
    Count predicates in 17 userences: 0 % / 6.75 %
    Count Objects in 8 us Max was: 8: 0 % / 27 %
    Bitmap in 9 usx bitmap: 0 % / 39.6 %
    Bitmap bits: 56 Ones: 38
    Object references in 23 usces: 0 % / 42.75 %
    Sort lists in 17 uslists: 0 % / 64.8 %
    Index generated in 119 us
    sudo docker run -v `pwd`:/wikidata rdfhdt/hdt-cpp:v1.3.3 rdf2hdt -f turtle -p  0.04s user 0.03s system 1% cpu 4.868 total

After that, I can access the turtle file on the local drive.

TASK DETAIL
  https://phabricator.wikimedia.org/T179681
EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: jjkoehorst
Cc: jjkoehorst, MPhamWMF, Daniel_Mietchen, hoo, Addshore, Smalyshev, Ladsgroup, Arkanosis, Tarrow, Lucas_Werkmeister_WMDE, Aklapper, Invadibot, maantietaja, Akuckartz, Dinadineke, DannyS712, Nandana, tabish.shaikh91, Lahi, Gq86, GoranSMilovanovic, Soteriaspace, Jayprakash12345, JakeTheDeveloper, QZanden, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, TheDJ, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
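A small test file like the 103-line sample mentioned in this comment could be produced along the following lines. This is only a sketch: the function name `head_gzip` is hypothetical and not part of hdt-cpp, and the paths in the usage comment are just the ones quoted in the comment. Note that cutting a Turtle file at an arbitrary line can split a statement, so the cut-off may need adjusting until the sample parses.

```python
import gzip
from itertools import islice


def head_gzip(src_path, dst_path, n_lines):
    """Copy the first n_lines lines of a gzipped file into a new gzipped file.

    Hypothetical helper for building a small sample of a large dump.
    Returns the number of lines actually written.
    """
    written = 0
    # Stream in binary mode so the dump is never fully loaded into memory.
    with gzip.open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        for line in islice(src, n_lines):
            dst.write(line)
            written += 1
    return written


# Example call (paths taken from the comment, assumed to exist locally):
# head_gzip("wikidata/latest-all.ttl.gz", "wikidata/tmp.ttl.gz", 103)
```

Streaming line by line keeps memory use constant regardless of the size of the full dump.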
[Wikidata-bugs] [Maniphest] T179681: Add HDT dump of Wikidata
Lydia_Pintscher closed subtask T277662: latest all rdf dump: bad IRI scheme as "Resolved".
[Wikidata-bugs] [Maniphest] T179681: Add HDT dump of Wikidata
Andrawaag added a subtask: T277662: latest all rdf dump: bad IRI scheme.
[Wikidata-bugs] [Maniphest] T179681: Add HDT dump of Wikidata
jjkoehorst added a comment.

As I was having some issues compiling the code, I used a Docker container directly for the conversion. Unfortunately, it failed because of an RDF syntax error in the latest dump. As I didn't time the run, I cannot give any details about the performance yet.

    sudo docker run -v `pwd`:/wikidata rdfhdt/hdt-cpp:v1.3.3 rdf2hdt -p -i wikidata/latest-all.nt.gz wikidata/latest-all.hdt
    error: wikidata/latest-all.nt.gz:604276348:139: bad IRI scheme char `2F'
    Catch exception load: Error parsing input.
    ERROR: Error parsing input.
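To see what the parser tripped over, the offending triple could be pulled out of the compressed dump without unpacking it. The sketch below assumes that the numbers in the error message (`604276348:139`) are a 1-indexed line and column, which is the usual `file:line:column` convention but is not confirmed in this thread. The helper names `line_at` and `context_around` are hypothetical.

```python
import gzip


def line_at(path, line_no):
    """Return line number line_no (1-indexed) of a gzipped file as bytes.

    Returns None if the file has fewer lines. The file is streamed, so
    memory use stays small, but reaching a late line still takes time.
    """
    with gzip.open(path, "rb") as fh:
        for current, line in enumerate(fh, start=1):
            if current == line_no:
                return line
    return None


def context_around(line, column, width=40):
    """Return the bytes within `width` positions of a 1-indexed column."""
    start = max(0, column - 1 - width)
    return line[start:column - 1 + width]


# Example call (path and position taken from the error message above):
# bad = line_at("wikidata/latest-all.nt.gz", 604276348)
# if bad is not None:
#     print(context_around(bad, 139))
```

Printing only a window around the reported column keeps the output readable when the triple contains a very long literal.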
[Wikidata-bugs] [Maniphest] T179681: Add HDT dump of Wikidata
Daniel_Mietchen added a comment. The 32-bit issue at https://github.com/rdfhdt/hdt-cpp/issues/135 that was mentioned above seems to be resolved, so perhaps this can be revisited now?