Re: [Wikidata-l] next 2 rounds of arbitrary access coming up
Lydia, thanks for your insights!

Egon

On Tue, May 19, 2015 at 3:07 PM, Lydia Pintscher wrote:
> On Tue, May 19, 2015 at 2:41 PM, Egon Willighagen wrote:
>> Dear Lydia,
>>
>> On Wed, May 13, 2015 at 5:20 PM, Lydia Pintscher wrote:
>>> The rollout of arbitrary access on Dutch Wikipedia
>>
>> Is there an overview of Dutch WP pages where it is being used? The
>> Berlin/Germany use case met resistance; did it need further
>> discussion and consensus first? Has it been adopted on other Dutch
>> pages? How was it received there?
>
> I don't think there is an overview. Maybe some of the nlwp editors who
> are on this list can chime in and give you links.
> It was understandably met with resistance on the page for Germany
> because it was changed by people outside that community in a
> high-profile article. We just need to let each community experiment
> with this at their own pace and let them find the right rules for how
> to use it.
>
> Cheers
> Lydia
>
> --
> Lydia Pintscher - http://about.me/lydia.pintscher
> Product Manager for Wikidata
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>
> Registered in the register of associations of the Amtsgericht
> Berlin-Charlottenburg under number 23855 Nz. Recognized as a charitable
> organization by the Finanzamt für Körperschaften I Berlin, tax number
> 27/681/51985.

--
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: -0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] next 2 rounds of arbitrary access coming up
Dear Lydia,

On Wed, May 13, 2015 at 5:20 PM, Lydia Pintscher wrote:
> The rollout of arbitrary access on Dutch Wikipedia

Is there an overview of Dutch WP pages where it is being used? The
Berlin/Germany use case met resistance; did it need further discussion
and consensus first? Has it been adopted on other Dutch pages? How was
it received there?

Egon
Re: [Wikidata-l] Where did the JSON dumps go?
On Mon, Jan 19, 2015 at 6:12 PM, Lydia Pintscher wrote:
> Apparently a small screwup deleted them all. Jan is working on a fix right
> now.

Meanwhile, this torrent of the Jan 12 (last week's) JSON dump is available:

http://academictorrents.com/details/466d6a3794328acc7c068a45f0380ef3ade8345f/tech

:)

Egon
Re: [Wikidata-l] Wikidata-Toolkit NullPointerException
OK, thanks! BTW, I could confirm that the NPE is solved by adding that
json-MMDD/ subdir...

Another question: is it possible to cancel the process of parsing a
dump file programmatically? I saw the timeout, but I am integrating
this in a GUI where the user may push a cancel button, and it would be
nice if I could propagate that and stop the actual processing...

Egon

On Sun, Jan 18, 2015 at 3:23 PM, Markus Krötzsch wrote:
> The issue was fixed in master now. I also added some more INFO-type messages
> that will report about the dump files found online and locally.
>
> Cheers,
>
> Markus
>
>
> On 18.01.2015 14:26, Markus Krötzsch wrote:
>>
>> On 18.01.2015 10:58, Egon Willighagen wrote:
>>>
>>> On Sat, Jan 17, 2015 at 11:04 PM, Markus Krötzsch wrote:
>>>>
>>>> It is easy to fix this (though I will not fix it tonight, but
>>>> tomorrow) by just adjusting the HTML strings we parse for.
>>>
>>> Sure! I have subscribed to the bug report.
>>>
>>> As an intermediate workaround for me, what file name pattern is used
>>> in the local cache?
>>>
>>> I had manually downloaded a file (and made it available as torrent
>>> because it was only at about 1 MB/s, [0]) and put this in the folder,
>>> but it was not recognized... the file on the server is:
>>>
>>> http://dumps.wikimedia.org/other/wikidata/20150112.json.gz
>>>
>>> But as 20150112.json.gz it is not detected... I noted the json-*
>>> pattern in the code, but json-20150112.json.gz didn't work either...
>>
>> The dump files are put into subdirectories of the current directory
>> ("."), for example:
>>
>> ./dumpfiles/wikidatawiki/json-20150105/20150105.json.gz
>> (JSON dump)
>>
>> ./dumpfiles/wikidatawiki/current-20141009/wikidatawiki-20141009-pages-meta-current.xml.bz2
>> (current revision XML dump)
>>
>> If you create a directory of this form and put a file in there with the
>> file name as found online, then the tool will find it.
>>
>>> BTW, a second question, is there a way to list all local (JSON) dumps
>>> using the WDTK api?
>>
>> Yes, though it's not very convenient right now. To restrict to local
>> files, you can use the DumpProcessingController in offline mode (then it
>> only looks at local files):
>>
>> DumpProcessingController dumpProcessingController =
>>     new DumpProcessingController("wikidatawiki");
>> dumpProcessingController.setOfflineMode(true);
>>
>> List<MwDumpFile> localJsonDumps =
>>     dumpProcessingController.
>>         getWmfDumpFileManager().
>>         findAllDumps(DumpContentType.JSON);
>>
>> This gives you a list of MwDumpFile objects that you can access to get
>> their date (getDateStamp()) and also to access the file contents.
>>
>> I think we should log some additional messages about the files that are
>> found and used.
>>
>> Cheers,
>>
>> Markus
>>
>>>> We should also improve our error reporting for this case, obviously.
>>>
>>> Yeah, that's an art no software I ever worked with mastered... it's
>>> hard! But it's important... I was completely looking in the wrong
>>> place... mind you, monitoring logging messages can be hard too, when
>>> WDTK is used in other environments, such as Bioclipse, and you cannot
>>> rely on those messages to show up :(
>>>
>>> Thanks for immediately looking into it and looking forward to pointers
>>> for my two questions,
>>>
>>> greetings,
>>>
>>> Egon
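The cancel-button question above is not answered in this thread. One common pattern is cooperative cancellation: the GUI sets a shared flag, and the per-record callback checks it and throws to unwind out of the processing loop. The sketch below is illustrative only and uses no Wikidata Toolkit classes. The names CancellableProcessing, CancelledException, guard and processAll are hypothetical, and processAll merely stands in for a library's dump loop. Whether WDTK lets an exception thrown from a document processor propagate out of its dump processing is an assumption that would need checking.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

/** Hypothetical sketch of cooperative cancellation; not part of Wikidata Toolkit. */
final class CancellableProcessing {

    /** Thrown from the per-record callback to unwind out of the processing loop. */
    static final class CancelledException extends RuntimeException {
        CancelledException() {
            super("processing cancelled by user");
        }
    }

    // Shared between the GUI thread (writer) and the processing thread (reader).
    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    /** Called from the GUI thread when the user pushes the cancel button. */
    void cancel() {
        cancelled.set(true);
    }

    /** Wraps a callback so that it checks the flag before handling each record. */
    <T> Consumer<T> guard(Consumer<T> delegate) {
        return record -> {
            if (cancelled.get()) {
                throw new CancelledException();
            }
            delegate.accept(record);
        };
    }

    /**
     * Stand-in for a library's processing loop (an assumption, not WDTK code).
     * Returns the number of records handled before completion or cancellation.
     */
    static <T> int processAll(List<T> records, Consumer<T> callback) {
        int handled = 0;
        try {
            for (T record : records) {
                callback.accept(record);
                handled++;
            }
        } catch (CancelledException e) {
            // Cancellation requested: stop early and report what was done so far.
        }
        return handled;
    }

    public static void main(String[] args) {
        CancellableProcessing processing = new CancellableProcessing();
        // Simulate the user pressing cancel while the second record is handled.
        Consumer<String> callback = processing.guard(id -> {
            System.out.println("processing " + id);
            if (id.equals("Q2")) {
                processing.cancel();
            }
        });
        int handled = processAll(Arrays.asList("Q1", "Q2", "Q3"), callback);
        System.out.println("handled " + handled + " records");
    }
}
```

The flag is only checked between records, so a cancel request takes effect at the next record boundary rather than immediately.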
Re: [Wikidata-l] Wikidata-Toolkit NullPointerException
On Sat, Jan 17, 2015 at 11:04 PM, Markus Krötzsch wrote:
> It is easy to fix this (though I will not fix it tonight, but tomorrow) by
> just adjusting the HTML strings we parse for.

Sure! I have subscribed to the bug report.

As an intermediate workaround for me, what file name pattern is used
in the local cache?

I had manually downloaded a file (and made it available as torrent
because it was only at about 1 MB/s, [0]) and put this in the folder,
but it was not recognized... the file on the server is:

http://dumps.wikimedia.org/other/wikidata/20150112.json.gz

But as 20150112.json.gz it is not detected... I noted the json-*
pattern in the code, but json-20150112.json.gz didn't work either...

BTW, a second question, is there a way to list all local (JSON) dumps
using the WDTK api?

> We should also improve our error reporting for this case, obviously.

Yeah, that's an art no software I ever worked with mastered... it's
hard! But it's important... I was completely looking in the wrong
place... mind you, monitoring logging messages can be hard too, when
WDTK is used in other environments, such as Bioclipse, and you cannot
rely on those messages to show up :(

Thanks for immediately looking into it and looking forward to pointers
for my two questions,

greetings,

Egon
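For the file name question, Markus's reply in this thread gives the layout ./dumpfiles/wikidatawiki/json-20150105/20150105.json.gz: the dump keeps its online file name and sits in a json-<date> subdirectory. The sketch below only builds that expected path from a file name so a manually downloaded dump can be moved to the right place. DumpCachePaths and localJsonDumpPath are hypothetical names and not part of the WDTK API, and the assumption that every JSON dump is named <8 digits>.json.gz is taken from the two examples in the thread.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Wikidata Toolkit. */
final class DumpCachePaths {

    // Assumption: online JSON dump file names look like "20150112.json.gz".
    private static final Pattern JSON_DUMP_NAME =
            Pattern.compile("(\\d{8})\\.json\\.gz");

    private DumpCachePaths() {
    }

    /**
     * Builds the location described in the thread for a local JSON dump:
     * baseDir/dumpfiles/project/json-DATE/fileName, where DATE is the
     * eight-digit prefix of the file name.
     */
    static Path localJsonDumpPath(Path baseDir, String project, String fileName) {
        Matcher matcher = JSON_DUMP_NAME.matcher(fileName);
        if (!matcher.matches()) {
            throw new IllegalArgumentException(
                    "Not a JSON dump file name: " + fileName);
        }
        String dateStamp = matcher.group(1);
        return baseDir.resolve("dumpfiles")
                .resolve(project)
                .resolve("json-" + dateStamp)
                .resolve(fileName);
    }

    public static void main(String[] args) {
        // Where the manually downloaded dump from the message would need to go.
        System.out.println(localJsonDumpPath(
                Paths.get("."), "wikidatawiki", "20150112.json.gz"));
    }
}
```

This also shows why json-20150112.json.gz was not picked up: json-<date> names the directory, while the file inside keeps the name it has on the server.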
Re: [Wikidata-l] Wikidata-Toolkit NullPointerException
OK, thanks! Please let me know if I can test anything on my system.

Egon

On 17 Jan 2015 22:50, "Markus Krötzsch" wrote:
> On 17.01.2015 22:43, Egon Willighagen wrote:
>
>> This last test from the cmd line is already with master from GitHub...
>
> Thanks, we will investigate. I created a bug report at
>
> https://github.com/Wikidata/Wikidata-Toolkit/issues/114
>
> Markus
>
>> Egon
>>
>> On 17 Jan 2015 22:40, "Markus Krötzsch" wrote:
>>
>>     Hi Egon,
>>
>>     WDTK 0.3.0 is rather old and we are about to prepare a new release
>>     (there are other issues with 0.3.0: the JSON format has changed
>>     since its release and it won't read the files anyway). Could you try
>>     if the problem occurs with the current development code at github?
>>
>>     Cheers,
>>
>>     Markus
>>
>>     On 17.01.2015 16:59, Egon Willighagen wrote:
>>
>>         Hi all,
>>
>>         I have been trying today to get the Java library Wikidata-Toolkit
>>         going, but I am about to give up... I keep running with both 0.3.0
>>         and current master into a NullPointerException... I thought it was
>>         how I called the code, and did add several System.out calls, and in
>>         the end just tried to get it running from the command line... I
>>         tried the example from the website (though replaced the Dump
>>         examples, which I don't see in master; btw, "mvn test" runs fine)
>>         using a pristine master:
>>
>>         $ cd wdtk-examples/
>>         $ mvn compile
>>         $ mvn exec:java -Dexec.mainClass="org.wikidata.wdtk.examples.EntityStatisticsProcessor"
>>
>>         In doing so, I get the same NPE:
>>
>>         *** Wikidata Toolkit: EntityStatisticsProcessor
>>         ***
>>         *** This program will download and process dumps from Wikidata.
>>         *** It will print progress information and some simple statistics.
>>         *** Results about property usage will be stored in a CSV file.
>>         *** See source code for further details.
>>         2015-01-17 16:53:00 INFO - Using download directory
>>         /home/egonw/var/Projects/GitHub/Wikidata-Toolkit/wdtk-examples/dumpfiles/wikidatawiki
>>         [WARNING]
>>         java.lang.reflect.InvocationTargetException
>>             at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>             at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>             at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>             at java.lang.reflect.Method.invoke(Method.java:606)
>>             at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
>>             at java.lang.Thread.run(Thread.java:745)
>>         Caused by: java.lang.NullPointerException
>>             at org.wikidata.wdtk.dumpfiles.DumpProcessingController.processDumpFile(DumpProcessingController.java:470)
>>             at org.wikidata.wdtk.dumpfiles.DumpProcessingController.processMostRecentDump(DumpProcessingController.java:456)
>>             at org.wikidata.wdtk.dumpfiles.DumpProcessingController.processMostRecentJsonDump(DumpProcessingController.java:426)
>>             at org.wikidata.wdtk.examples.ExampleHelpers.processEntitiesFromWikidataDump(ExampleHelpers.java:158)
>>             at org.wikidata.wdtk.examples.EntityStatisticsProcessor.main(EntityStatisticsProcessor.java:88)
>>             ... 6 more
>>
>>         I tried finding what goes wrong, but cannot grasp all the magic
>>         that is going on... the directory it reports was created, but is
>>         empty...
>>
>>         $ mvn --version
>>         Apache Maven 3.0.5
>>         Maven home: /usr/share/maven
>>         Java version: 1.7.0_65, vendor: Oracle Corporation
>>         Java home: /usr/lib/
Re: [Wikidata-l] Wikidata-Toolkit NullPointerException
This last test from the cmd line is already with master from GitHub...

Egon

On 17 Jan 2015 22:40, "Markus Krötzsch" wrote:
> Hi Egon,
>
> WDTK 0.3.0 is rather old and we are about to prepare a new release (there
> are other issues with 0.3.0: the JSON format has changed since its release
> and it won't read the files anyway). Could you try if the problem occurs
> with the current development code at github?
>
> Cheers,
>
> Markus
>
> On 17.01.2015 16:59, Egon Willighagen wrote:
>
>> Hi all,
>>
>> I have been trying today to get the Java library Wikidata-Toolkit
>> going, but I am about to give up... I keep running with both 0.3.0 and
>> current master into a NullPointerException... I thought it was how I
>> called the code, and did add several System.out calls, and in the end
>> just tried to get it running from the command line... I tried the
>> example from the website (though replaced the Dump examples, which I
>> don't see in master; btw, "mvn test" runs fine) using a pristine
>> master:
>>
>> $ cd wdtk-examples/
>> $ mvn compile
>> $ mvn exec:java -Dexec.mainClass="org.wikidata.wdtk.examples.EntityStatisticsProcessor"
>>
>> In doing so, I get the same NPE:
>>
>> *** Wikidata Toolkit: EntityStatisticsProcessor
>> ***
>> *** This program will download and process dumps from Wikidata.
>> *** It will print progress information and some simple statistics.
>> *** Results about property usage will be stored in a CSV file.
>> *** See source code for further details.
>>
>> 2015-01-17 16:53:00 INFO - Using download directory
>> /home/egonw/var/Projects/GitHub/Wikidata-Toolkit/wdtk-examples/dumpfiles/wikidatawiki
>> [WARNING]
>> java.lang.reflect.InvocationTargetException
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>     at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
>>     at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.NullPointerException
>>     at org.wikidata.wdtk.dumpfiles.DumpProcessingController.processDumpFile(DumpProcessingController.java:470)
>>     at org.wikidata.wdtk.dumpfiles.DumpProcessingController.processMostRecentDump(DumpProcessingController.java:456)
>>     at org.wikidata.wdtk.dumpfiles.DumpProcessingController.processMostRecentJsonDump(DumpProcessingController.java:426)
>>     at org.wikidata.wdtk.examples.ExampleHelpers.processEntitiesFromWikidataDump(ExampleHelpers.java:158)
>>     at org.wikidata.wdtk.examples.EntityStatisticsProcessor.main(EntityStatisticsProcessor.java:88)
>>     ... 6 more
>>
>> I tried finding what goes wrong, but cannot grasp all the magic that
>> is going on... the directory it reports was created, but is empty...
>>
>> $ mvn --version
>> Apache Maven 3.0.5
>> Maven home: /usr/share/maven
>> Java version: 1.7.0_65, vendor: Oracle Corporation
>> Java home: /usr/lib/jvm/java-7-openjdk-i386/jre
>> Default locale: en_US, platform encoding: UTF-8
>> OS name: "linux", version: "3.16.0-4-686-pae", arch: "i386", family: "unix"
>>
>> Can someone give me some pointers where and how it tests whether dump
>> files exist? Is this problem something platform dependent?
>>
>> Thanks,
>>
>> Egon
[Wikidata-l] Wikidata-Toolkit NullPointerException
Hi all,

I have been trying today to get the Java library Wikidata-Toolkit
going, but I am about to give up... I keep running with both 0.3.0 and
current master into a NullPointerException... I thought it was how I
called the code, and did add several System.out calls, and in the end
just tried to get it running from the command line... I tried the
example from the website (though replaced the Dump examples, which I
don't see in master; btw, "mvn test" runs fine) using a pristine
master:

$ cd wdtk-examples/
$ mvn compile
$ mvn exec:java -Dexec.mainClass="org.wikidata.wdtk.examples.EntityStatisticsProcessor"

In doing so, I get the same NPE:

*** Wikidata Toolkit: EntityStatisticsProcessor
***
*** This program will download and process dumps from Wikidata.
*** It will print progress information and some simple statistics.
*** Results about property usage will be stored in a CSV file.
*** See source code for further details.
2015-01-17 16:53:00 INFO - Using download directory
/home/egonw/var/Projects/GitHub/Wikidata-Toolkit/wdtk-examples/dumpfiles/wikidatawiki
[WARNING]
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.wikidata.wdtk.dumpfiles.DumpProcessingController.processDumpFile(DumpProcessingController.java:470)
    at org.wikidata.wdtk.dumpfiles.DumpProcessingController.processMostRecentDump(DumpProcessingController.java:456)
    at org.wikidata.wdtk.dumpfiles.DumpProcessingController.processMostRecentJsonDump(DumpProcessingController.java:426)
    at org.wikidata.wdtk.examples.ExampleHelpers.processEntitiesFromWikidataDump(ExampleHelpers.java:158)
    at org.wikidata.wdtk.examples.EntityStatisticsProcessor.main(EntityStatisticsProcessor.java:88)
    ... 6 more

I tried finding what goes wrong, but cannot grasp all the magic that
is going on... the directory it reports was created, but is empty...

$ mvn --version
Apache Maven 3.0.5
Maven home: /usr/share/maven
Java version: 1.7.0_65, vendor: Oracle Corporation
Java home: /usr/lib/jvm/java-7-openjdk-i386/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "3.16.0-4-686-pae", arch: "i386", family: "unix"

Can someone give me some pointers where and how it tests whether dump
files exist? Is this problem something platform dependent?

Thanks,

Egon