[Wikitech-l] Making the query/opensearch API work on a self-hosted MediaWiki server
Hi all, I'm Hicham from Paris. I just created a young company whose mission is to make the web speak :) As a first step, I'm building a French *Wolfram Alpha-like* question-answering system based on the DBpedia and MediaWiki APIs. As I had some issues getting quick API results for queries, I decided to install MediaWiki and host my own copy of the French Wikipedia. I did the install from scratch; following the online documentation, I managed to do it using MWDumper: http://izipedia.com/mediawiki/index.php?title=Wikip%C3%A9dia:Accueil_principal

*My first issue is the following:* I have an article in Wikipédia about this guy: http://izipedia.com/mediawiki/index.php?title=Andr%C3%A9_Manoukian

When I search for it on the official French Wikipedia API server I get a result: http://fr.wikipedia.org/w/api.php?action=opensearch&limit=10&format=xml&search=andre%20manoukian&namespace=0

but not on my server: http://izipedia.com/mediawiki/api.php?action=opensearch&limit=10&format=xml&search=andre%20manoukian&namespace=0

In other cases, though, opensearch works fine: http://izipedia.com/mediawiki/api.php?action=opensearch&limit=10&format=xml&search=new%20york

Any idea?

*My second issue is the following:* API calls never work with the *query* action. See for example http://www.izipedia.com/mediawiki/api.php?action=query&list=search&format=xml&srsearch=new%20york which, once again, works really nicely on the official server: http://fr.wikipedia.org/w/api.php?action=query&list=search&format=xml&srsearch=new%20york

Thank you very much in advance for your kind advice, best regards, Hicham

-- *Hicham Tahiri | Founder of Vocal Apps* *+33760747891 | www.vocal-apps.com | @vocal_apps http://www.twitter.com/vocal_apps * *24 rue de l'Est, Paris Incubateurs 75020 Paris*

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
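[Editorial note: the API URLs in this message appear to have lost their `&` parameter separators in the archive. A well-formed opensearch request can be built as follows; this is a minimal standalone sketch using only the Python standard library, with the endpoint and parameters taken from the message above.]

```python
from urllib.parse import urlencode

def opensearch_url(api_base, search, limit=10, namespace=0):
    """Build a MediaWiki opensearch API URL with properly separated parameters."""
    params = {
        "action": "opensearch",
        "limit": limit,
        "format": "xml",
        "search": search,
        "namespace": namespace,
    }
    # urlencode joins parameters with '&' and percent-encodes values.
    return api_base + "?" + urlencode(params)

url = opensearch_url("http://fr.wikipedia.org/w/api.php", "andre manoukian")
print(url)
```

Fetching that URL with any HTTP client should return the same XML the official server produces.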
Re: [Wikitech-l] Making the query/opensearch API work on a self-hosted MediaWiki server
On Wed, 2013-02-13 at 11:44 +0100, Hicham TAHIRI wrote:
> I have an article in Wikipédia about this guy: http://izipedia.com/mediawiki/index.php?title=Andr%C3%A9_Manoukian When I search for it on the official French Wikipedia API server I get a result (http://fr.wikipedia.org/w/api.php?action=opensearch&limit=10&format=xml&search=andre%20manoukian&namespace=0) but not on my server (http://izipedia.com/mediawiki/api.php?action=opensearch&limit=10&format=xml&search=andre%20manoukian&namespace=0). In other cases opensearch works fine (http://izipedia.com/mediawiki/api.php?action=opensearch&limit=10&format=xml&search=new%20york). Any idea?

Wild guess: your HTML is not well-formed (unclosed elements like p or small) because many of the templates used (e.g. the literal "{{#if: 2-7499-0796-9" in the Manoukian article) are missing / uninterpreted. The templates used in the New York article don't include such markup. Make sure templates are installed and try again? :)

andre
--
Andre Klapper | Wikimedia Bugwrangler
http://blogs.gnome.org/aklapper/

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
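[Editorial note: Andre's guess can be checked mechanically. When ParserFunctions is not installed, parser-function syntax such as `{{#if:` survives literally into the rendered page. A hypothetical standalone check (the function and sample are illustrative, not part of MediaWiki):]

```python
def has_uninterpreted_parser_functions(rendered_html):
    """Heuristic: literal parser-function syntax in rendered output
    usually means the ParserFunctions extension is not installed."""
    markers = ("{{#if:", "{{#ifeq:", "{{#switch:", "{{#expr:")
    return any(m in rendered_html for m in markers)

# Sample fragment resembling the broken Manoukian rendering described above.
sample = "<p>André Manoukian {{#if: 2-7499-0796-9 | ISBN }}</p>"
print(has_uninterpreted_parser_functions(sample))
```

Running this over the HTML of the failing article versus the working one would confirm or rule out the missing-templates theory.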
Re: [Wikitech-l] Extensions and LTS
On Tue, 2013-02-12 at 18:48 -0500, Mark A. Hershberger wrote:
> I'd like to think that extensions in [[Category:Stable_extensions]] will be maintained, but maybe that isn't right. I certainly haven't tried all of them against 1.19.

I wonder how differently "Stable extensions" is interpreted by the individuals who once upon a time added that category to an extension homepage and then forgot about it... </pessimism>

andre
--
Andre Klapper | Wikimedia Bugwrangler
http://blogs.gnome.org/aklapper/

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] Maria DB
Hi, I installed MariaDB on all my servers, including production servers, a few weeks ago, and I have found it quite stable and I like it (even the command-line tool for working with SQL is far better than the one included in the MySQL package). It's supported on all recent Ubuntu versions from 10.04 up (maybe even older). So my question is: are we going to use it in Wikimedia production? I think we could migrate the beta cluster for now; it has terrible performance and this could help. It could be a first step towards migrating the Wikimedia production cluster.

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Making the query/opensearch API work on a self-hosted MediaWiki server
Perhaps Extension:TitleKey and/or Extension:MWSearch (with the Lucene stuff). The default search with no extensions is kind of rudimentary.

-bawolff

On 2013-02-13 7:17 AM, Andre Klapper aklap...@wikimedia.org wrote:
> Wild guess: your HTML is not well-formed (unclosed elements like p or small) because many of the templates used (e.g. the literal "{{#if: 2-7499-0796-9" in the Manoukian article) are missing / uninterpreted. The templates used in the New York article don't include such markup. Make sure templates are installed and try again? :)

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Maria DB
Umm, there was a thread several months ago about how it is used on several of the slave DBs, if I recall.

-bawolff

On 2013-02-13 8:28 AM, Petr Bena benap...@gmail.com wrote:
> So my question is: are we going to use it in Wikimedia production? I think we could migrate the beta cluster for now; it has terrible performance and this could help.

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Maria DB
On Wed, Feb 13, 2013 at 8:05 AM, bawolff bawolff...@gmail.com wrote:
> Umm, there was a thread several months ago about how it is used on several of the slave DBs, if I recall.

Indeed, you're looking for "mariadb 5.5 in production for english wikipedia": http://www.gossamer-threads.com/lists/wiki/wikitech/319925

-Chad

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Maria DB
Okay, so what is the outcome? Should we migrate the beta cluster? Are we going to use it in production?

On Wed, Feb 13, 2013 at 2:08 PM, Chad innocentkil...@gmail.com wrote:
> Indeed, you're looking for "mariadb 5.5 in production for english wikipedia": http://www.gossamer-threads.com/lists/wiki/wikitech/319925

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Same wikicode - different outcomes
Hi,

On 13 February 2013 14:35, Tuszynski, Jaroslaw W. jaroslaw.w.tuszyn...@saic.com wrote:
> working in the predictable way. Some of them do not work properly and issues can be fixed by changing order of arguments in the template. For example http://commons.wikimedia.org/wiki/Creator:Auguste_Angellier does not show the content of the name field and template is in

The problem here was that the “| default=” line is not what it looks like: instead of a space, there was a literal no-break space character (U+00A0) between the pipe separator and the “default” keyword. I have fixed that in https://commons.wikimedia.org/wiki/?diff=90571366

-- [[cs:User:Mormegil | Petr Kadlec]]

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
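[Editorial note: bugs like this U+00A0 one are nearly impossible to spot by eye. A small hypothetical script, not part of any existing tool, can flag invisible characters in wikitext before they break a template:]

```python
import unicodedata

# Characters that render like ordinary spaces (or nothing) but break parsing.
SUSPECT = {"\u00a0", "\u200b", "\u2060", "\ufeff"}

def find_invisible_chars(wikitext):
    """Report suspicious invisible characters with (line, column, name) tuples."""
    hits = []
    for lineno, line in enumerate(wikitext.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPECT:
                hits.append((lineno, col, unicodedata.name(ch)))
    return hits

# The Creator template bug described above: NBSP right after the pipe.
text = "|\u00a0default= {{{name|}}}"
for hit in find_invisible_chars(text):
    print(hit)
```

Run against the raw wikitext of a misbehaving template page, this points straight at the offending byte.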
Re: [Wikitech-l] Same wikicode - different outcomes
On Wed, Feb 13, 2013 at 9:08 AM, Petr Kadlec petr.kad...@gmail.com wrote:
> The problem here was that the “| default=” line is not what it looks like: instead of a space, there was a literal no-break space character (U+00A0) between the pipe separator and the “default” keyword.

It looks like the same was the problem in http://commons.wikimedia.org/w/index.php?title=Creator:Paul_Acker&diff=prev&oldid=90023607, and is the problem in Creator:Derick Baegert.

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Corporate needs are different (RE: How can we help Corporations use MW?)
On 02/12/2013 05:30 PM, Mark A. Hershberger wrote:
> On 02/11/2013 11:25 AM, Daniel Barrett wrote:
>> Imagine if Wikipedia had a separate wiki for every city in the world. The same problem would result.
> I find it is easier to imagine what would happen if each language had a separate Wikipedia. We would end up with slightly different facts maintained on each wiki.

Come on, this would be a discussion similar to what the NPOV is concerning the Falkland Islands on the English and the Spanish Wikipedia. IMHO each community should organize its wiki on its own. Meta, MediaWiki, Commons and Wikidata already have interlanguage communities, and I think this doesn't work badly. Wikidata will be a bit different because it will integrate itself into the wikis' structures, so I think there will be discussion. It's really great that the developers give the communities the choice of whether they want to use Wikidata or not.

Cheers, Marco

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Font at dv.wikipedia and dv.wiktionary
Hi,

On 02/12/2013 10:25 PM, Ryan Kaldari wrote:
> We don't have to limit ourselves to free-license fonts for what we use on our own servers. We only have to limit ourselves to free-license fonts for what we distribute with our software. Of course, we should always try to support free-license fonts when they are available, but there is no reason for us to artificially limit ourselves to free fonts for our own projects (assuming the licensing fees are reasonable).

So, what about Gill Sans MT, used by Wikimedia and its chapters? Do we have a license that also covers using it on non-Microsoft operating systems?

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Font at dv.wikipedia and dv.wiktionary
On 02/13/2013 02:08 AM, M. Williamson wrote: Doesn't MOSS stand for Maldives Open Source Society? I can't find a lot of info on them as they don't seem to be active online at the moment, but I did find this: https://lists.ubuntu.com/archives/ubuntu-core-marketing/2009-June/00.html Thank you for tracking this down. With a little more research, I was able to find https://groups.google.com/d/topic/mlugmv/Auy_ZJpXEVg/discussion which says there is a font (Thaana1U.ttf) available under the GNU FreeFont license. -- http://hexmode.com/ There is no path to peace. Peace is the path. -- Mahatma Gandhi, Non-Violence in Peace and War ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Font at dv.wikipedia and dv.wiktionary
More research leads to this list of fonts: http://www.wazu.jp/gallery/Fonts_Thaana.html It looks like FreeSerif from https://savannah.gnu.org/projects/freefont would have appropriate glyphs and proper copyright information. -- http://hexmode.com/ There is no path to peace. Peace is the path. -- Mahatma Gandhi, Non-Violence in Peace and War ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Maria DB
On Wed, Feb 13, 2013 at 6:19 AM, Petr Bena benap...@gmail.com wrote:
> Okay, so what is the outcome? Should we migrate the beta cluster? Are we going to use it in production?

At the risk of derailing the conversation to an unrelated subject, I would rather work on finding a way to keep the DB on the beta cluster up to date than migrate to a whole different SQL implementation that is still not correct. https://bugzilla.wikimedia.org/show_bug.cgi?id=36228

-Chris

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Maria DB
As far as I know we're using a custom MariaDB version in production, not the Ubuntu version. It would be ideal to use the same versions for everything, but just switching to MariaDB isn't going to solve anything for us. Also, we're not using MariaDB everywhere in production; we're using a mix of MySQL and MariaDB. Let's not change anything until we know how that's going to work out in production.

On Wed, Feb 13, 2013 at 5:19 AM, Petr Bena benap...@gmail.com wrote:
> Okay, so what is the outcome? Should we migrate the beta cluster? Are we going to use it in production?

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] [Labs-l] Maria DB
The production migration to MariaDB was paused for a time by the EQIAD datacenter migration and issues involving other projects that took up my time, but the trial production roll-out will resume this month. All signs still point to our using it in production.

I did a lot of query testing on an enwiki MariaDB 5.5 slave over the course of more than a month before the first production deployment. Major version migrations with MySQL and derivatives are not to be taken lightly in production environments. At a minimum, one must be concerned about query optimizer changes making one particular query type significantly slower. In the case of the switch to 5.5, there are several default behavior changes over 5.1 that can break applications or change results. Hence, some serious work over a plodding time frame before that first production slave switch.

Despite those efforts, a couple of weeks after the switch, I saw a query generated by what seems to be a very rare edge case in the AFTv4 extension that violated the stricter enforcement of unsigned integer types in 5.5, breaking replication and requiring one-off rewriting and execution of the query locally to ensure data consistency before skipping over it. I opened a bug, Mathias fixed the extension, and I haven't seen any other compatibility issues from AFTv4 or anything else deployed on enwiki.

That said, other projects utilize different extensions, so all of the testing that has gone into enwiki cannot be assumed to fully cover everything else. Because of that, and because I want to continue proceeding with caution for all of our projects, this will continue to be a slow and methodical process at this stage. Bugs in extensions that aren't used by English Wikipedia may be found and require fixing along the way. As the MariaDB roll-out proceeds, I will provide updates on wikitech-l.

Best, Asher

On Wed, Feb 13, 2013 at 5:19 AM, Petr Bena benap...@gmail.com wrote:
> Okay, so what is the outcome? Should we migrate the beta cluster? Are we going to use it in production?

___ Labs-l mailing list lab...@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/labs-l
___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] [Labs-l] Maria DB
Thanks for the updates. Can you tell me what the difference is between the MariaDB version you are using and the version that is recommended for use on Ubuntu?

On Wed, Feb 13, 2013 at 6:58 PM, Asher Feldman afeld...@wikimedia.org wrote:
> The production migration to MariaDB was paused for a time by the EQIAD datacenter migration and issues involving other projects that took up my time, but the trial production roll-out will resume this month. All signs still point to our using it in production. [...]

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] Gerrit reviewer bot update
Hello all,

With the upgrade of Gerrit, I have also taken the time to improve the Reviewer bot. For those who do not know: the Reviewer bot is a tool that adds potential reviewers to new changesets, based on subscriptions [1].

One of the problems we have encountered is the use of the SSH-based Gerrit change feed. This effectively requires 100% uptime to not miss any changesets. This has now been solved: we are now using the mediawiki-commits mailing list, and are reading the messages in a batched way (every five minutes). This change also makes development testing much easier, which is a nice bonus.

On the backend, there are two changes: instead of the JSON-RPC API, we are now using the REST API, and we have moved hosting from the Toolserver to translatewiki.net (thank you, Siebrand!).

Best, Merlijn

[1] http://www.mediawiki.org/wiki/Git/Reviewers

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
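[Editorial note: the subscription matching such a bot performs can be pictured roughly as follows. This is a simplified hypothetical sketch; the real bot's data model, documented at the [1] link above, differs. The subscription table and names here are invented for illustration.]

```python
from fnmatch import fnmatch

# Hypothetical subscriptions: glob patterns over Gerrit project names -> reviewers.
SUBSCRIPTIONS = {
    "mediawiki/core": ["alice"],
    "mediawiki/extensions/*": ["bob"],
}

def reviewers_for(project):
    """Collect every subscriber whose pattern matches the changeset's project."""
    found = set()
    for pattern, names in SUBSCRIPTIONS.items():
        if fnmatch(project, pattern):
            found.update(names)
    return sorted(found)

print(reviewers_for("mediawiki/extensions/ParserFunctions"))
```

In the batched design described above, a loop would run this over every changeset announced on the mailing list in the last five minutes and add the matches as reviewers via the Gerrit REST API.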
Re: [Wikitech-l] What's our Bus Factor?
On 01/09/2013 10:24 PM, Matthew Flaschen wrote:
> On 01/09/2013 08:56 PM, Jay Ashworth wrote:
>> It's the new year, and in light of the recent poll about which devs are working on what, let me make another, albeit vaguely macabre, suggestion: If you're a developer, or other staffer, can the people around you pick up the pieces if you get hit by a bus? How badly will it impact delivery and delivery scheduling of what you're working on? This is a good reminder of yet another reason to document things. Is the institutional knowledge about our architecture and plans sufficiently well documented and spread out that we don't have anyone with an unreasonably high bus factor?
> However, a high bus factor is good. As Wikipedia states, "The bus factor is the total number of key developers who would need to be incapacitated (as by getting hit by a bus/truck) to send the project into such disarray that it would not be able to proceed." The higher this is, the less likely the project actually would be derailed for such a reason.
> Matt Flaschen

I created https://www.mediawiki.org/wiki/Developers/Maintainers partially to document the activities (highlighted in red) where our bus factor is dangerously low. And it's one of the reasons why LevelUp is useful; it's a systematic way to improve bus factor.

Code review is not just a gate to keep bugs out of code, but a social step, to get a second set of eyes on code, and to spread knowledge around. Code review is a much more natural way, rather than sending out a mass email, to keep colleagues informed about changes to areas of their interest. (Of course, if someone creates a new test methodology or wants to do something big, there should be appropriate communication about that.)

Chris Steipp and I have had some success running documentation sprints where we improve very specific bits of mediawiki.org, and I would love to help others do similarly.

-- Sumana Harihareswara Engineering Community Manager Wikimedia Foundation

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] deployment of the first phase of Wikidata on enwp
Heya :) Third time's a charm, right? We're live on the English Wikipedia with phase 1 now \o/ Details are in this blog post: http://blog.wikimedia.de/2013/02/13/wikidata-live-on-the-english-wikipedia An FAQ is being worked on at http://meta.wikimedia.org/wiki/Wikidata/Deployment_Questions Thanks everyone who helped! I'm happy to answer questions at http://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical). Please also let me know about any issues there. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Community Communications for Wikidata Wikimedia Deutschland e.V. Obentrautstr. 72 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] deployment of the first phase of Wikidata on enwp
Congrats, also on this list :-)

Cheers, Marco

Lydia Pintscher lydia.pintsc...@wikimedia.de schrieb:
> Third time's a charm, right? We're live on the English Wikipedia with phase 1 now \o/ Details are in this blog post: http://blog.wikimedia.de/2013/02/13/wikidata-live-on-the-english-wikipedia [...]

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] Page view stats we can believe in
I stumbled on the Danish Wiktionary, of all projects. Danish is the 68th biggest language of Wiktionary, and has a little more than 8,000 articles in total. Most of these articles are very short and provide no value to a reader. There is no reason to link to them, so it is very unlikely that the next user will stumble upon them unless that user is me.

Yet, wikistats tries to make me believe that this tiny project gets 400,000 or 500,000 page views each month, and has done so for a long time: http://stats.wikimedia.org/wiktionary/EN/TablesPageViewsMonthly.htm (I'm not talking about January 2012, which seems to have been an error, and reports 2-3 times that many views.)

My guess is that da.wiktionary has 4,000 page views per month, not 400,000. It's more likely that 400,000 is some background noise, an offset that should be subtracted from the number of page views for any project. If you look at the log files for just one day, you should see my IP address (85.228.something) and 3-4 other users who have been editing lately, and not many more people, but perhaps a bunch of interwiki bots. We need an explanation for these vastly inflated page view statistics.

-- Lars Aronsson (l...@aronsson.se) Aronsson Datateknik - http://aronsson.se

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Corporate needs are different (RE: How can we help Corporations use MW?)
On 2013-02-13 11:27 AM, Marco Fleckinger marco.fleckin...@wikipedia.at wrote:
> Come on, this would be a discussion similar to what the NPOV is concerning the Falkland Islands on the English and the Spanish Wikipedia. IMHO each community should organize its wiki on its own. [...]

I think you missed the point of the previous email.

-bawolff

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Fwd: Re: How to speed up the review in gerrit?
On 12/26/2012 01:09 PM, vita...@yourcmc.ru wrote:
> Actually registration is open to everyone now by simple form submission. So actually, any one developer could get any change they wanted merged. All they need to do is trivially register a second Labs account.
> Okay, but the current situation is also a problem, because with it reviewing and merging takes much more time. And as I've said, I think most extensions aren't as important as the core, and limiting approval for them to core developers is just a waste... Maybe you should add some group similar to the previous (SVN) commit access to extensions, so a wider group of people could merge changes to the extensions?

When I look at https://www.mediawiki.org/wiki/Git/Gerrit_project_ownership and its history, I see that it's reasonably easy to get Gerrit ownership of less-used and less-maintained extensions, especially ones that are not deployed on Wikimedia sites. But perhaps we should be even more open to newer contributors.

Maybe our rule should be: if an extension is not deployed on Wikimedia sites, then we should basically allow anyone to merge new code in (disallowing self-merges), unless the existing maintainers object. This would make things more flexible, encourage faster development in the extensions community, and help developers get more practice in code review so that they could also apply for maintainership of other extensions and core. What do people think?

-- Sumana Harihareswara Engineering Community Manager Wikimedia Foundation

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Making the query/opensearch API work on a self-hosted MediaWiki server
Thanks Andre & Brian! In fact I didn't install any templates! Is there a tutorial about that? Same for adding extensions: I didn't find a way to do that from the admin panel. Any link(s) will be welcome.

br, hicham

2013/2/13 Andre Klapper aklap...@wikimedia.org
> Wild guess: your HTML is not well-formed (unclosed elements like p or small) because many of the templates used (e.g. the literal "{{#if: 2-7499-0796-9" in the Manoukian article) are missing / uninterpreted. Make sure templates are installed and try again? :)

-- *Hicham Tahiri | Founder of Vocal Apps* *+33760747891 | www.vocal-apps.com | @vocal_apps http://www.twitter.com/vocal_apps * *24 rue de l'Est, Paris Incubateurs 75020 Paris*

___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Can we kill DBO_TRX? It seems evil!
It looks like Daniel's change to log implicit commits went live on the wmf cluster with the release of 1.21wmf9. Unfortunately, it doesn't appear to be as useful as hoped for tracking down nested callers of Database::begin; the majority of log entries just look like:

Wed Feb 13 22:07:21 UTC 2013 mw1146 dewiki DatabaseBase::begin: Transaction already in progress (from DatabaseBase::begin), performing implicit commit!

It seems we'd need a backtrace at this point. So I think we should revisit this issue and either:
- expand the logging to make it more useful
- disable it to prevent filling the dberror log with non-actionable messages and nothing else
- revisit the ideas of either dropping the implicit commit by use of a transaction counter, or of emulating real nested transactions via savepoints. The negative impact on concurrency due to longer-lived transactions and longer-held locks may negate the viability of the third option, even though it feels the most correct.

-Asher On Wed, Sep 26, 2012 at 4:30 AM, Daniel Kinzler dan...@brightbyte.de wrote: I have submitted two changes for review that hopefully remedy the current problems: * I1e746322 implements better documentation, more consistent behavior, and easier tracking of implicit commits in Database::begin() * I6ecb8faa restores the flushing commits that I removed a while ago under the assumption that a commit without a begin would be a no-op. I hope this addresses any pressing issues. I still think that we need a way to protect critical sections. But an RFC seems to be in order for that. -- daniel ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Making the query opensearch API working fine on a self hosted wikimedia server
Wikipedia database dumps do include the site's templates; however, you need to install all the extensions listed under Greffons de l'analyseur syntaxique on http://fr.wikipedia.org/wiki/Sp%C3%A9cial:Version In particular, the ParserFunctions extension is necessary for {{#if: (used by many Wikipedia templates) to work correctly and not show up on screen. http://www.mediawiki.org/wiki/Extension:ParserFunctions On 02/13/2013 05:13 PM, Hicham TAHIRI wrote: Thanks Andre Brian ! In fact I didn't install any template ! Is there a tutorial about that ? Same for adding extensions, I didn't find a way to that from the admin panel ? Any link(s) will be welcome 2013/2/13 Andre Klapper aklap...@wikimedia.org Wild guess: Your HTML is not well-formed (unclosed elements like p or small) because many used templates (e.g. literal {{#if: 2-7499-0796-9 in the Manoukian article) are missing / uninterpreted. Templates used in the New York article don't include such markup. Make sure templates are installed and try again? :) -- Wikipedia user PleaseStand http://en.wikipedia.org/wiki/User:PleaseStand ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Page view stats we can believe in
According to http://stats.grok.se/da.d/latest90/mandag, "mandag" has been viewed 127 times in the last 3 months, and ranks 927th. The raw pagecount files are here: http://dumps.wikimedia.org/other/pagecounts-raw/

I then took an arbitrary file and looked into it, at midnight (I guess UTC) on Feb 1st. As all projects are in this file, let's grep for Danish Wiktionary, da.d, at the beginning of the line:

grep '^da\.d\s' pagecounts-20130201-00 | wc
    569    2276   19572

This means 569 pages were accessed at least once in this hour. So let's sort by the third column, which is the page accesses. The largest counts are at the bottom, so let's take the last 20 lines:

grep '^da\.d\s' pagecounts-20130201-00 | sort -k3n,3 | tail -20
da.d pony 2 30008
da.d skak 2 44151
da.d Speciel:Eksporter/engelsk 2 7818
da.d Speciel:Eksporter/hyle 2 4630
da.d Speciel:Eksporter/krog 2 4632
da.d Speciel:Eksporter/skaml%C3%A6ber 2 4632
da.d Forside 3 96050
da.d horse 3 54974
da.d interessant 3 9339
da.d Speciel:Eksporter/arrang%C3%B8rer 3 6948
da.d Speciel:Eksporter/b%C3%B8ger 3 6948
da.d Speciel:Eksporter/forg%C3%A6ves 3 6946
da.d Speciel:Eksporter/hensigtsm%C3%A6ssig 3 6946
da.d Speciel:Eksporter/hvad 3 9900
da.d Speciel:Eksporter/indvendig 3 6948
da.d Speciel:Eksporter/k%C3%A6le 3 6948
da.d Speciel:Eksporter/monogame 3 6944
da.d Speciel:Eksporter/revet 3 6946
da.d Speciel:Eksporter/topstykke 3 6944
da.d springer 3 45292

This means that e.g. springer was supposedly accessed 3 times in that hour. The article does not exist, but there is a red link to it from http://da.wiktionary.org/wiki/Wiktionary:Top_1_(Dansk).

rupert. On Wed, Feb 13, 2013 at 10:18 PM, Lars Aronsson l...@aronsson.se wrote: I stumbled on the Danish Wiktionary, of all projects. Danish is the 68th biggest language of Wiktionary, and has a little more than 8,000 articles in total. Most of these articles are very short and provide no value to a reader.
There is no reason to link to them, and so it is very unlikely that the next user should stumble upon them unless they are me. Yet, wikistats tries to make me believe that this tiny project has 400,000 or 500,000 page views each month, and has had so for a long time, http://stats.wikimedia.org/wiktionary/EN/TablesPageViewsMonthly.htm (I'm not talking about January 2012, which seems to have been an error, and reports 2-3 times that many views.) My guess is that da.wiktionary has 4,000 page views per month, not 400,000. It's more likely that 400,000 is some background noise, an offset number that should be subtracted from the number of page views for any project. If you look at the log files for just one day, you should see my IP address (85.228.something) and 3-4 other users who have been editing lately, and not many more people, but perhaps a bunch of interwiki bots. We need an explanation for these vastly inflated page view statistics. -- Lars Aronsson (l...@aronsson.se) Aronsson Datateknik - http://aronsson.se ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
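Rupert's one-liners above can be reproduced without downloading a real dump by running them against a tiny synthetic sample in the same line format (project, page title, hourly view count, bytes transferred). The file name and contents below are invented for illustration:

```shell
# Minimal sketch of the pagecounts-raw analysis, against synthetic data.
cat > sample-pagecounts <<'EOF'
da.d Forside 3 96050
da.d springer 3 45292
da.d pony 2 30008
da.w Forside 120 500000
EOF
# Count distinct da.wiktionary ("da.d") pages accessed in this hour:
pages=$(grep -c '^da\.d ' sample-pagecounts)
# Sum the hourly view counts (third column) over those pages:
views=$(awk '$1 == "da.d" { s += $3 } END { print s }' sample-pagecounts)
echo "$pages pages, $views views"
```

On the sample this prints "3 pages, 8 views"; on a real hourly file, the first number matches what `grep | wc` reports as its line count.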
Re: [Wikitech-l] URLs for autogenerated documentation
On 02/08/2013 10:18 AM, Antoine Musso wrote: We would like to move the documentation to another host and I think it is a good time to change the URL as well to something a bit more meaningful than svn.mediawiki.org :-] I just wanted to say: YES, I agree, it is a good idea to move away from using svn in any URL having to do with our autogenerated documentation -- it's a misleading term now. :-) -- Sumana Harihareswara Engineering Community Manager Wikimedia Foundation ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Page view stats we can believe in
On 02/14/2013 12:03 AM, rupert THURNER wrote: this means 569 pages accessed in this hour, at least once. Thanks for taking the time to do this check! This number already is unreasonable for an obscure project with 8000 articles. da.d Speciel:Eksporter/engelsk 2 7818 Should Special:Export ever count as page views? Anyway, there are no humans using Special:Export on da.wiktionary in the middle of the night. this means that e.g. springer was supposedly accessed 3 times in that hour. the article does not exist, but there is a red link out of http://da.wiktionary.org/wiki/Wiktionary:Top_1_(Dansk). So are there some stupid bots that follow red links? There could be a large number of such accesses on Wiktionary (in any language) because there are so many red links. But bots should never be counted among the page views. -- Lars Aronsson (l...@aronsson.se) Aronsson Datateknik - http://aronsson.se ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Page view stats we can believe in
Hi all, Lars, Rupert, thanks for flagging this, and you are quite right: the numbers are too high because webstatscollector, the software that does the counts, just counts every request as a hit, including bots, error pages etc. I am planning on running a sprint at the Amsterdam Hackathon to build an easily queryable datastore with clean pageview counts. Please let me know if you are interested in this so I can pitch this. Best, Diederik On Wed, Feb 13, 2013 at 3:36 PM, Lars Aronsson l...@aronsson.se wrote: On 02/14/2013 12:03 AM, rupert THURNER wrote: this means 569 pages accessed in this hour, at least once. Thanks for taking the time to do this check! This number already is unreasonable for an obscure project with 8000 articles. da.d Speciel:Eksporter/engelsk 2 7818 Should Special:Export ever count as page views? Anyway, there are no humans using Special:Export on da.wiktionary in the middle of the night. this means that e.g. springer was supposedly accessed 3 times in that hour. the article does not exist, but there is a red link out of http://da.wiktionary.org/wiki/Wiktionary:Top_1_(Dansk). So are there some stupid bots that follow red links? There could be a large number of such accesses on Wiktionary (in any language) because there are so many red links. But bots should never be counted among the page views. -- Lars Aronsson (l...@aronsson.se) Aronsson Datateknik - http://aronsson.se ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] [Analytics] Fwd: Page view stats we can believe in
Hi Erik, You're quite right that the numbers are inflated, and we've been over this before [1]. Below are some sampled data for da.wiktionary from webstatscollector [2] and the squid log [3]. Bot traffic is a substantial share of 'page views' (but not the majority, as you suggest). We discussed this extensively in April and, as I remember (my mail archive is somehow incomplete), decided to implement a second cleaned-up stream without /bot/crawler/spider/http (keeping the original stream so as not to break trend lines). However, that bot-free stream (projectcounts files with an extra set of per-wiki totals) has never happened yet, and I'm pretty sure we changed plans since, and probably now wait for Kraken. Diederik, can you add to this? Oh my, I thought this was in operation already. I've actually been looking at these page view stats, and now I feel like a fool. Why not just remove these web pages at http://stats.wikimedia.org/wiktionary/EN/TablesPageViewsMonthly.htm since they contain only nonsense? Continuity with old nonsense is still nonsense, so remove everything now and start a new project with real numbers. [1] On April 8, 2012 you reported a similar issue for the Swedish Wikipedia. I checked back then one hour of sampled squid log. 9 out of 13 requests were bots. Nobody doubts that the Swedish Wikipedia has a substantial amount of human traffic. But for smaller projects, the background noise will dominate. If bots are 9 out of 13 requests to sv.wikipedia (really?), they can easily be 99% of traffic to da.wiktionary. One easy way to tell would be to observe the daily rhythm. Since Swedish and Danish are limited to one timezone, traffic in the middle of the night should be much smaller than mid-day traffic. But bots could be operating all night, all day. So the least active hour is probably the background noise from bots.
-- Lars Aronsson (l...@aronsson.se) Aronsson Datateknik - http://aronsson.se ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
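The /bot/crawler/spider/http user-agent heuristic mentioned above can be sketched as a filter over a sampled request log. The log format and file name here are invented for illustration; the real squid log has many more fields:

```shell
# Sketch of the bot filter discussed in this thread: drop any request whose
# user-agent matches bot/crawler/spider/http (case-insensitively).
# Synthetic log: first column is the URL, the rest of the line is the UA.
cat > sample-requests <<'EOF'
/wiki/Forside Mozilla/5.0 (X11; Linux x86_64)
/wiki/springer Googlebot/2.1 (+http://www.google.com/bot.html)
/wiki/horse SomeSpider/1.0
/wiki/pony Mozilla/5.0 (Windows NT 6.1)
EOF
human=$(awk '{ ua = tolower($0); sub(/^[^ ]+ /, "", ua);
               if (ua !~ /bot|crawler|spider|http/) n++ }
             END { print n + 0 }' sample-requests)
echo "human requests: $human"
```

On the sample, two of the four requests survive the filter; note the heuristic is crude — a UA merely containing a URL (hence "http") is treated as a bot.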
Re: [Wikitech-l] Fwd: Re: How to speed up the review in gerrit?
On Wed, Feb 13, 2013 at 4:35 PM, Sumana Harihareswara suma...@wikimedia.org wrote: Maybe our rule should be: if an extension is not deployed on Wikimedia sites, then we should basically allow anyone to merge new code in (disallowing self-merges), unless the existing maintainers object. Having a "can review all extensions" group is easy, but allowing for exemptions will be a pain to manage the ACLs for. For every extension that opts out of being reviewed by this group, we'd have to adjust its ACL to block the inherited permissions. -Chad ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Fwd: Re: How to speed up the review in gerrit?
On 02/13/2013 07:57 PM, Chad wrote: On Wed, Feb 13, 2013 at 4:35 PM, Sumana Harihareswara suma...@wikimedia.org wrote: Maybe our rule should be: if an extension is not deployed on Wikimedia sites, then we should basically allow anyone to merge new code in (disallowing self-merges), unless the existing maintainers object. Having a can review all extensions group is easy, but allowing for exemptions will be a pain to manage the ACLs for. For every extension that opts out of being reviewed by this group, we'd have to adjust its ACL to block the inherited permissions. How about instead of can review all extensions, we make it easier to request review rights on non-WMF extensions? You still have to ask for each extension you want, but if the maintainer's okay with it (or not around), the burden of proof is less. No one really needs review rights on *all* the non-deployed extensions, only the few they work on. Matt Flaschen ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Gerrit reviewer bot update
Any chance the reviewer-bot can support additional triggers? For example, diff content, commit-message content, etc. Also, it would be nice to specify whether the currently supported filters should apply to ANY of the files in the change (the current behavior) or ALL of the changed files. --Waldir On Wed, Feb 13, 2013 at 8:19 PM, Merlijn van Deen valhall...@arctus.nl wrote: Hello all, With the upgrade of Gerrit, I have also taken the time to improve the Reviewer bot. For those who do not know: the reviewer bot is a tool that adds potential reviewers to new changesets, based on subscriptions [1]. One of the problems we have encountered is the use of the SSH-based Gerrit change feed. This effectively requires 100% uptime to not miss any changesets. This has now been solved: we are now using the mediawiki-commits mailing list, and are reading the messages in a batched way (every five minutes). This change also makes development testing much easier, which is a nice bonus. On the backend, there are two changes: instead of the JSON RPC api, we are now using the REST api, and we have moved hosting from the toolserver to translatewiki.net (thank you, Siebrand!) Best, Merlijn [1] http://www.mediawiki.org/wiki/Git/Reviewers ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] 'Page View' Stats for the Timed Media Handler
Greetings all, Victor is releasing a video tomorrow for Valentine's Day, and whilst I was discussing it with him, the topic of how many users actually watch our videos came up. Do we currently have a way of collecting play/click stats for content played by the TimedMediaHandler off of Commons? If not, does anyone have any ideas on how to go about getting this information? ~Matt Walker Wikimedia Foundation Fundraising Technology Team ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] [Analytics] Fwd: Page view stats we can believe in
Lars, I think you are overdoing it. The reports are not nonsense, but have over time become more inaccurate than some other stats we present. Actually, if the reports had mentioned 'pages served' rather than 'page views', they would still be spot on. Of course I also would have hoped this filter to be implemented by now. But sometimes projects take longer than planned, at WMF like everywhere else. The stats still show a breakdown per language, and relative growth, assuming bot activity is more or less consistent from one month to another (of course not over longer periods). The last quote I got (in April?) is that overall 40% of traffic is bot related. That could be more now. Erik -Original Message- From: Lars Aronsson [mailto:l...@aronsson.se] Sent: Thursday, February 14, 2013 1:28 AM To: Erik Zachte Cc: 'A mailing list for the Analytics Team at WMF and everybody who has an interest in Wikipedia and analytics.'; Wikimedia developers Subject: Re: [Analytics] Fwd: [Wikitech-l] Page view stats we can believe in Hi Erik, You're quite right numbers are inflated, and we've been over this before [1]. Below are some sampled data for da.wiktionary from webstatscollector [2] and squid log [3] Bot traffic is a substantial share of 'page views' (but not the majority as you suggest). We discussed this extensively in April and as I remember (my mail archive is somehow incomplete) decided to implement a second cleaned-up stream without /bot/crawler/spider/http (keeping the original stream so as not to break trend lines) However that bot free stream (projectcounts files with extra set of per wiki totals) never happened yet, and I'm pretty sure we changed plans since, and probably now wait for Kraken. Diederik can you add to this? Oh my, I thought this was in operation already. I've actually been looking at these page view stats, and now I feel like a fool.
Why not just remove these web pages at http://stats.wikimedia.org/wiktionary/EN/TablesPageViewsMonthly.htm since they contain only nonsense? Continuity with old nonsense is still nonsense, so remove everything now and start a new project with real numbers. [1] On April 8, 2012 you reported a similar issue for Swedish Wikipedia. I checked by then one hour of sampled squid log. 9 out of 13 requests were bots. Nobody doubts that the Swedish Wikipedia has a substantial amount of human traffic. But for smaller projects, the background noise will dominate. If bots are 9 out of 13 requests to sv.wikipedia (really?), they can easily be 99% of traffic to da.wiktionary. One easy way to tell would be to observe the daily rhythm. Since Swedish and Danish are limited to one timezone, traffic in the middle of the night should be much smaller than mid-day traffic. But bots could be operating all night, all day. So the least active hour is probably the background noise from bots. -- Lars Aronsson (l...@aronsson.se) Aronsson Datateknik - http://aronsson.se ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Making the query opensearch API working fine on a self hosted wikimedia server
I believe search is done on non-expanded wikitext, so I doubt that parser functions make much of a difference (however, it is important in order for your wikis not to be broken for human visitors). The two extensions I was talking about were: https://www.mediawiki.org/wiki/Extension:TitleKey which makes certain things be case insensitive. And more importantly https://www.mediawiki.org/wiki/Extension:MWSearchalong with https://www.mediawiki.org/wiki/Extension:Lucene-search . This will significantly improve your search results and is what Wikipedia uses for search. However, be warned that this is a hard extension to install. (Probably the hardest to install of any MediaWiki extension.) -bawolff On 2013-02-13 6:34 PM, Kevin Israel pleasest...@live.com wrote: Wikipedia database dumps do include the site's templates; however, you need to install all the extensions listed under Greffons de l'analyseur syntaxique on http://fr.wikipedia.org/wiki/Sp%C3%A9cial:Version In particular, the ParserFunctions extension is necessary for {{#if: (used by many Wikipedia templates) to work correctly and not show up on screen. http://www.mediawiki.org/wiki/Extension:ParserFunctions On 02/13/2013 05:13 PM, Hicham TAHIRI wrote: Thanks Andre Brian ! In fact I didn't install any template ! Is there a tutorial about that ? Same for adding extensions, I didn't find a way to that from the admin panel ? Any link(s) will be welcome 2013/2/13 Andre Klapper aklap...@wikimedia.org Wild guess: Your HTML is not well-formed (unclosed elements like p or small) because many used templates (e.g. literal {{#if: 2-7499-0796-9 in the Manoukian article) are missing / uninterpreted. Templates used in the New York article don't include such markup. Make sure templates are installed and try again?
:) -- Wikipedia user PleaseStand http://en.wikipedia.org/wiki/User:PleaseStand ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Fwd: Re: How to speed up the review in gerrit?
On Wed, Feb 13, 2013 at 8:12 PM, Matthew Flaschen mflasc...@wikimedia.org wrote: How about instead of can review all extensions, we make it easier to request review rights on non-WMF extensions? You still have to ask for each extension you want, but if the maintainer's okay with it (or not around), the burden of proof is less. No one really needs review rights on *all* the non-deployed extensions, only the few they work on. I tend to agree. And I think this is pretty much the standard we have (consensus by existing devs). For anything that's not maintained by anyone, I think we can just AGF. -Chad ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Fwd: Re: How to speed up the review in gerrit?
On 02/14/2013 02:12 AM, Matthew Flaschen wrote: On 02/13/2013 07:57 PM, Chad wrote: On Wed, Feb 13, 2013 at 4:35 PM, Sumana Harihareswara suma...@wikimedia.org wrote: Maybe our rule should be: if an extension is not deployed on Wikimedia sites, then we should basically allow anyone to merge new code in (disallowing self-merges), unless the existing maintainers object. Having a can review all extensions group is easy, but allowing for exemptions will be a pain to manage the ACLs for. For every extension that opts out of being reviewed by this group, we'd have to adjust its ACL to block the inherited permissions. How about instead of can review all extensions, we make it easier to request review rights on non-WMF extensions? Good idea, but in general there could just be 3+ different classes of extensions? The class can be calculated by its importance, e.g. installed on WMF-sites, number of other wikis using it, etc. You still have to ask for each extension you want, but if the maintainer's okay with it (or not around), the burden of proof is less. No one really needs review rights on *all* the non-deployed extensions, only the few they work on. Matt Flaschen ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Making the query opensearch API working fine on a self hosted wikimedia server
On 02/13/2013 09:20 PM, Brian Wolff wrote: And more importantly https://www.mediawiki.org/wiki/Extension:MWSearchalong There was a typo. The URL is https://www.mediawiki.org/wiki/Extension:MWSearch Matt Flaschen ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Fwd: Re: How to speed up the review in gerrit?
On Wed, Feb 13, 2013 at 9:22 PM, Marco Fleckinger marco.fleckin...@wikipedia.at wrote: Having a can review all extensions group is easy, but allowing for exemptions will be a pain to manage the ACLs for. For every extension that opts out of being reviewed by this group, we'd have to adjust its ACL to block the inherited permissions. How about instead of can review all extensions, we make it easier to request review rights on non-WMF extensions? Good idea, but in general there could just be 3+ different classes of extensions? The class can be calculated by its importance, e.g. installed on WMF-sites, number of other wikis using it, etc. Having classes of extensions is difficult to maintain from an ACL standpoint. Permissions in Gerrit are directly inherited (and there's no multiple inheritance), so things in mediawiki/extensions/* all have the same permissions. So having rules that apply to only some of those repositories requires editing ACLs for each repository in each group. -Chad ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Fwd: Re: How to speed up the review in gerrit?
On 02/14/2013 03:28 AM, Chad wrote: On Wed, Feb 13, 2013 at 9:22 PM, Marco Fleckinger marco.fleckin...@wikipedia.at wrote: Having a can review all extensions group is easy, but allowing for exemptions will be a pain to manage the ACLs for. For every extension that opts out of being reviewed by this group, we'd have to adjust its ACL to block the inherited permissions. How about instead of can review all extensions, we make it easier to request review rights on non-WMF extensions? Good idea, but in general there could just be 3+ different classes of extensions? The class can be calculated by its importance, e.g. installed on WMF-sites, number of other wikis using it, etc. Having classes of extensions is difficult to maintain from an ACL standpoint. Permissions in Gerrit are directly inherited (and there's no multiple inheritance), so things in mediawiki/extensions/* all have the same permissions. So having rules that apply to only some of those repositories requires editing ACLs for each repository in each group. Sorry, I think you misunderstood me. I meant classes like: * Used by WMF * non-WMF very important * non-WMF important * non-WMF less important * non-WMF unimportant No multiple inheritance will be needed for this model. Cheers, Marco ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Fwd: Re: How to speed up the review in gerrit?
On Wed, Feb 13, 2013 at 9:33 PM, Marco Fleckinger marco.fleckin...@wikipedia.at wrote: On 02/14/2013 03:28 AM, Chad wrote: On Wed, Feb 13, 2013 at 9:22 PM, Marco Fleckinger marco.fleckin...@wikipedia.at wrote: Having a can review all extensions group is easy, but allowing for exemptions will be a pain to manage the ACLs for. For every extension that opts out of being reviewed by this group, we'd have to adjust its ACL to block the inherited permissions. How about instead of can review all extensions, we make it easier to request review rights on non-WMF extensions? Good idea, but in general there could just be 3+ different classes of extensions? The class can be calculated by its importance, e.g. installed on WMF-sites, number of other wikis using it, etc. Having classes of extensions is difficult to maintain from an ACL standpoint. Permissions in Gerrit are directly inherited (and there's no multiple inheritance), so things in mediawiki/extensions/* all have the same permissions. So having rules that apply to only some of those repositories requires editing ACLs for each repository in each group. Sorry, I think you misunderstood me. I meant classes like: * Used by WMF * non-WMF very important * non-WMF important * non-WMF less important * non-WMF unimportant No multiple inheritance will be needed for this model. Having these groups in Gerrit, or just in practice for application for permissions? Having groups like this in Gerrit would be a pain, that's what I'm saying. -Chad ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Can we kill DBO_TRX? It seems evil!
On 2013-02-13 6:24 PM, Asher Feldman afeld...@wikimedia.org wrote: Unfortunately, it doesn't appear to be as useful as hoped for tracking down nested callers of Database::begin, the majority of log entries just look like: Wed Feb 13 22:07:21 UTC 2013 mw1146 dewiki DatabaseBase::begin: Transaction already in progress (from DatabaseBase::begin), performing implicit commit! For starters, why isn't wfWarn being called with a second argument of 2? Why isn't $fname defaulting to wfGetCaller() if unspecified? Sure, a full backtrace or other more complicated logging may be useful/needed, but as a start we know that saying "called from Database::begin" is pretty useless. Before worrying too much about logging on production, why not deal with the ones that occur regularly on testing wikis? They are extremely easy to reproduce locally and happen quite regularly. (For example on every purge action.) -bawolff ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] [Analytics] Fwd: Page view stats we can believe in
On 02/14/2013 02:56 AM, Erik Zachte wrote: Lars, I think you are overdoing it. The reports are not nonsense, but have over time become more inaccurate than some other stats we present. Actually if the reports would have mentioned 'pages served' rather than 'page views' they still would be spot on. No, nobody in the web business counts bot accesses. Page views are human page views. You need to filter out bots, API calls, and non-page fetches. The main Wikistats, counting articles and users, is very accurate, and these nonsense page view stats give Wikistats a bad name. Plus, they are used by all the GLAM projects to show museums how many people view pictures from their museum, and now that's all fake and exaggeration. It's 2-3 years wasted. Please don't waste any more years or months of our time. We now have to go back to museums and apologize. The stats still show a breakdown per language, No, that's exactly what fails. Wikistats indicates that Wiktionary has more page views than Wikisource, and I believed this, and it surprised me, but now I understand that we are counting bots that follow red links, and that is a sport Wiktionary will always win. Humans tend to read Wikisource, but bots are drawn to spend time in the link mazes of Wiktionary. and relative growth, assuming bot activity is more or less consistent from one month to another (of course not over longer periods). Last quote I got (in April?) is that overall 40% of traffic is bot related. That could be more now. And it's far more for smaller projects, and for link-intensive Wiktionary, and for those languages of Wikipedia that create articles by bots, such as Dutch, Swedish, Vietnamese and Volapük. This bot-created article about a spider has been viewed 12 times in the last 30 days, but only by bots? http://nl.wikipedia.org/wiki/Acantheis_variatus http://stats.grok.se/nl/latest/Acantheis_variatus Bots creating articles and bots reading them, what a joke! And they are creating articles about spiders!
-- Lars Aronsson (l...@aronsson.se) Aronsson Datateknik - http://aronsson.se ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Fwd: Re: How to speed up the review in gerrit?
On 14.02.2013 6:33, Marco Fleckinger wrote: On 02/14/2013 03:28 AM, Chad wrote: On Wed, Feb 13, 2013 at 9:22 PM, Marco Fleckinger marco.fleckin...@wikipedia.at wrote: Having a can review all extensions group is easy, but allowing for exemptions will be a pain to manage the ACLs for. For every extension that opts out of being reviewed by this group, we'd have to adjust its ACL to block the inherited permissions. How about instead of can review all extensions, we make it easier to request review rights on non-WMF extensions? Good idea, but in general there could just be 3+ different classes of extensions? The class can be calculated by its importance, e.g. installed on WMF-sites, number of other wikis using it, etc. Having classes of extensions is difficult to maintain from an ACL standpoint. Permissions in Gerrit are directly inherited (and there's no multiple inheritance), so things in mediawiki/extensions/* all have the same permissions. So having rules that apply to only some of those repositories requires editing ACLs for each repository in each group. Sorry, I think you misunderstood me. I meant classes like: * Used by WMF * non-WMF very important * non-WMF important * non-WMF less important * non-WMF unimportant No multiple inheritance will be needed for this model. That is a really great idea. If mediawiki/extensions/(wmf|non-wmf-unimportant|non-wmf-important)/* subdirectories were introduced, such a classification should encourage extension developers to improve their extensions so they can move up from non-WMF unimportant to non-WMF important, and maybe higher. I have not developed for MW in the last year; however, this idea is great, and I give my personal +100 for having an extension importance hierarchy in the repository. It should be easier for reviewers as well. Maybe even corporate donation campaigns for non-WMF very important extensions can be introduced at some later stage, which is even better for making more extensions useful and stable.
Dmitriy

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Fwd: Re: How to speed up the review in gerrit?
On 02/14/2013 12:04 AM, Dmitriy Sintsov wrote:
> That is a really great idea. If there were mediawiki/extensions/(wmf|non-wmf-unimportant|non-wmf-important)/* subdirectories introduced, such classification should encourage extension developers to improve their extensions so they can move up from "non-WMF unimportant" to "non-WMF important" and maybe higher.

I'm wary of giving such classifications too much weight, given that they're inherently subjective. Remember how many completely private wikis there are, which don't disclose which extensions they run.

Matt Flaschen
Re: [Wikitech-l] Fwd: Re: How to speed up the review in gerrit?
Also, the locations of extensions should be stable. If the repos start moving around as extensions move up and down the ladder, it will cause confusion.

-bawolff
Re: [Wikitech-l] Fwd: Re: How to speed up the review in gerrit?
On 14.02.2013 9:14, Brian Wolff wrote:
> Also, the locations of extensions should be stable. If the repos start moving around as extensions move up and down the ladder it will cause confusion.

The disadvantages are really small compared to the huge advantages for both reviewers and extension authors. Another advantage is that the importance and stability of an extension would be evaluated by an experienced WMF developer, rather than by authors self-placing their extension's description page into [[Category:Stable extensions]].

Dmitriy
Re: [Wikitech-l] Fwd: Re: How to speed up the review in gerrit?
I would consider "git pull" mysteriously stopping to work, because somebody moved the git repo, to be a pretty big disadvantage.

-bawolff
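(Editorial note: concretely, every existing clone keeps pointing at the old location until its owner repoints the remote by hand. A minimal sketch, using hypothetical URLs and a throwaway local repository:)

```shell
# Sketch of what every clone owner would have to do by hand after a
# repository is moved on the server (URLs are hypothetical).
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo

# The clone originally tracked the old location:
git remote add origin https://gerrit.example.org/mediawiki/extensions/non-wmf-important/Foo

# After the move, pulls against the old URL fail until the remote
# is repointed manually:
git remote set-url origin https://gerrit.example.org/mediawiki/extensions/wmf/Foo

# Verify which URL the clone now tracks:
git remote get-url origin
```

Multiply that manual step by every developer and every CI checkout of a moved extension, and the cost of reshuffling repositories becomes clear.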
Re: [Wikitech-l] Fwd: Re: How to speed up the review in gerrit?
On 02/14/2013 12:25 AM, Brian Wolff wrote:
> I would consider git pull to mysteriously stop working because somebody moved the git repo to be a pretty big disadvantage.

Similar to what I said earlier, I don't think anyone needs to request review/merge rights for all "non-WMF less important" extensions (or another grouping like that). They can request rights for the extensions they're actually working on. When granting these rights, we should simply weigh, as a social matter, who's using the extension (that we know of), how important it is (subjective), and whether it's being actively maintained (generally pretty clear). I don't think we need formal groups beyond whether the WMF is using it (which is already marked clearly).

Matt Flaschen