Re: [Toolserver-l] interwiki.py

2012-01-16 Thread Hydriz Wikipedia

As an interwiki bot runner myself, I find this plan a little too constrained. 
Many of us run interwiki bots with many different configurations, and once this 
MMP project is created, some of the configurations we use would no longer be 
available under the new plan. I can't think of many examples, but judging from 
top -c, I can tell that the other bot runners run their bots differently from mine.
Personally, I would rather we wait for the Pywikipedia devs to fix that script, 
install more memory for interwiki bots, or create another custom login server 
just for running interwiki bots.
Your plan is generally okay; it is just that with only 5 people running this 
project, out of many, many bot operators, it's quite hard to choose. It's best if 
people don't run multiple interwiki bots for one project (especially 
enwiktionary, which has an overload of interwiki bots).

Regards,
Hydriz

From: w...@daniel.baur4.info
To: toolserver-l@lists.wikimedia.org
Date: Sun, 15 Jan 2012 17:38:40 +0100
Subject: Re: [Toolserver-l] interwiki.py

Hello,
At Sunday 15 January 2012 17:13:26 DaB. wrote:
 Isn't this a bit too many interwiki 
 bots?
 
Yes, there are, although the problem is not the CPU load but the memory usage. 
The best solution would of course be for the MediaWiki devs to finally get rid of 
interwiki links in the article text, but I have the feeling that will not 
happen soon. The second-best solution would be for the interwiki.py developers to 
fix their code, but there too I have the feeling that it will take some time.
 
So here is my plan to fix the problem on our (the TS) side:
1.) I create an MMP called interwiki-bot (or something).
2.) YOU (the ts-users) choose (by election, by appointment, by playing musical 
chairs, I don't care) 5 of you who will become members of that MMP by 
15 February. Only rule: 1 of the 5 has to be an active user of a 
non-Wikipedia project (like Wikisource or Wiktionary or so).
3.) The members of the MMP create a Wikimedia project account (like 
ts-interwikibot or something) and request global bot status by 1 April.
4.) After 2 April no one is allowed to run an interwiki bot except the MMP.
 
Any problems with my plan?
 
Sincerely,
DaB.
 
-- 
Userpage: [[:w:de:User:DaB.]] ― PGP: 2B255885

___
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: 
https://wiki.toolserver.org/view/Mailing_list_etiquette 

Re: [Toolserver-l] interwiki.py

2012-01-16 Thread K. Peachey
Perhaps you should look at why people are running with different
settings, then standardize. Interwiki bots should all be doing
roughly the same job, shouldn't they? So what's with the different
settings?



Re: [Toolserver-l] interwiki.py

2012-01-16 Thread Hydriz Wikipedia

Sometimes these bot operators might just want to run the bot on a few pages, or 
with some special settings that I don't know of. Still, entirely blocking people 
from running interwiki bots is quite ridiculous.

Regards,
Hydriz

 From: p858sn...@gmail.com
 Date: Mon, 16 Jan 2012 18:09:12 +1000
 To: toolserver-l@lists.wikimedia.org
 Subject: Re: [Toolserver-l] interwiki.py
 
 Perhaps you should look at why people are running with different
 settings, then standardize. Interwiki bots should all be doing
 roughly the same job, shouldn't they? So what's with the different
 settings?
 

Re: [Toolserver-l] interwiki.py

2012-01-16 Thread Krinkle
On Sun, Jan 15, 2012 at 5:38 PM, DaB. w...@daniel.baur4.info wrote:


 So here is my plan to fix the problem on our (the TS) side:



+1 :)

Re: [Toolserver-l] interwiki.py

2012-01-16 Thread Merlijn van Deen
2012/1/16 Hydriz Wikipedia ad...@wikisorg.tk:
 Personally, I would rather we wait for the Pywikipedia devs to fix that script,

This is not going to happen anytime soon. Considering the state of the
code base (two hundred exceptions for three hundred wikis, long
functions and no automated testing - and thus practically untestable),
and the state of the InterLanguage extension ('will be installed
soon'), no one is really willing to invest a lot of time in tracking
memory usage and reducing it.

The only reasonable action we can take to reduce the memory
consumption is to let the OS do its job in freeing memory: using one
process to track pages that have to be corrected (using the database,
if possible), and one process to do the actual fixing (interwiki.py).
This should be reasonably easy to implement (i.e. use a pywikibot page
generator to generate a list of pages, use a database layer to track
interlanguage links and popen('interwiki.py page') if this is a
fixable situation)
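
A minimal sketch of the two-process split described above, assuming that
interwiki.py accepts a page title as its only argument. The names
build_command, fix_pages and is_fixable are hypothetical, and the page list
and the database check are left to the caller, since neither is specified
here.

```python
import subprocess
import sys
from typing import Callable, Iterable, List


def build_command(page_title: str, script: str = "interwiki.py") -> List[str]:
    """Command line for one short-lived worker (argument layout is an assumption)."""
    return [sys.executable, script, page_title]


def fix_pages(
    titles: Iterable[str],
    is_fixable: Callable[[str], bool],
    run: Callable[[List[str]], int] = subprocess.call,
) -> List[str]:
    """Run one child process per fixable page and return the titles that failed."""
    failed = []
    for title in titles:
        # The tracking process only decides; it never loads the bot itself.
        if not is_fixable(title):
            continue
        # The child exits after one page, so the OS reclaims all of its memory.
        if run(build_command(title)) != 0:
            failed.append(title)
    return failed
```

Because the long-running tracker holds only titles, its own memory stays
small, and whatever the worker leaks is gone when the worker exits.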

Best,
Merlijn



Re: [Toolserver-l] interwiki.py

2012-01-16 Thread Hydriz Wikipedia

Yes, so our issues here are probably the lack of coordination among bot owners 
and the memory usage. Shouldn't we write a simple script that automatically 
frees old memory used by the interwiki script? DaB's idea was okay for me, 
except for the point that no one else could run the interwiki script anymore, 
which is ridiculous to me.
Maybe the MMP can be used to ensure that there are no overlapping bots? All 
interwiki bot owners should join this project, find an available wiki that no 
one has taken up, and ask for clearance to run their own interwiki bot 
there.

Regards,
Hydriz

 From: valhall...@arctus.nl
 Date: Mon, 16 Jan 2012 09:19:19 +0100
 To: toolserver-l@lists.wikimedia.org
 Subject: Re: [Toolserver-l] interwiki.py
 
 2012/1/16 Hydriz Wikipedia ad...@wikisorg.tk:
  Personally, I would rather we wait for the Pywikipedia devs to fix that script,
 
 This is not going to happen anytime soon. Considering the state of the
 code base (two hundred exceptions for three hundred wikis, long
 functions and no automated testing - and thus practically untestable),
 and the state of the InterLanguage extension ('will be installed
 soon'), no one is really willing to invest a lot of time in tracking
 memory usage and reducing it.
 
 The only reasonable action we can take to reduce the memory
 consumption is to let the OS do its job in freeing memory: using one
 process to track pages that have to be corrected (using the database,
 if possible), and one process to do the actual fixing (interwiki.py).
 This should be reasonably easy to implement (i.e. use a pywikibot page
 generator to generate a list of pages, use a database layer to track
 interlanguage links and popen('interwiki.py page') if this is a
 fixable situation)
 
 Best,
 Merlijn
 

Re: [Toolserver-l] interwiki.py

2012-01-16 Thread Andre Engels
2012/1/16 Hydriz Wikipedia ad...@wikisorg.tk:
 Yes, so our issues here are probably the lack of coordination among bot owners
 and the memory usage. Shouldn't we write a simple script that automatically
 frees old memory used by the interwiki script? DaB's idea was okay for me,
 except for the point that no one else could run the interwiki script anymore,
 which is ridiculous to me.

 Maybe the MMP can be used to ensure that there are no overlapping bots? All
 interwiki bot owners should join this project, find an available wiki that
 no one has taken up, and ask for clearance to run their own
 interwiki bot there.

Even bots on different wikis will have a large overlap. Perhaps we
should restrict not-'selected' interwiki bots to running with -back
set (for autonomous runs on the main namespace of Wikipedia, because I
think that that's where most bots are active)?

-- 
André Engels, andreeng...@gmail.com


Re: [Toolserver-l] interwiki.py

2012-01-16 Thread Hercule Hercule
I think that the most important difference is the home wiki for the
launch.

2012/1/16 K. Peachey p858sn...@gmail.com

 Perhaps you should look at why people are running with different
 settings, then standardize. Interwiki bots should all be doing
 roughly the same job, shouldn't they? So what's with the different
 settings?


Re: [Toolserver-l] interwiki.py

2012-01-16 Thread DaB.
Hello,
At Monday 16 January 2012 13:41:28 DaB. wrote:
 As an interwiki bot runner myself, I find this plan a little too
 constrained.

It is a little bit drastic, yes, but it will work. There were some other ideas 
in the past (see Chris Grant's mail), but they didn't work in the end. The new 
plan has the following advantages:
- YOU (the ts-users) make the rules and decide who should be in the MMP (BTW, I 
never said that the people in the MMP should make the rules),
- The roots can contact the group quite easily instead of speaking to dozens 
of users,
- The Wikimedia project people (Wikipedians, Wikisourcers, etc.) have only 1 
contact address too,
- Cases where bot A removes a link and bot B puts it back 5 minutes later 
will be greatly reduced.

 Many of us run interwiki bots with many different
 configurations, and once this MMP project is created, some of the configurations
 we use would no longer be available under the new plan. I can't think of
 many examples, but judging from top -c, I can tell that the other bot runners run
 their bots differently from mine.

Like Hercule said, that should mostly be the homewiki; and it should 
be no problem for the MMP people to switch the homewiki now and then (e.g. if 
they run 5 instances of their bot and change the homewiki every hour, then 
every project is the homewiki every 3 days).
The MMP should also only be for interwiki bots which run permanently; if a user 
runs a bot because a wiki needs to change 100 interwiki links on a 
one-time basis, that's no problem.
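
The rotation arithmetic above can be checked with a short sketch. The figure
of 360 wikis is an assumption, chosen to be in the range of the roughly three
hundred wikis mentioned earlier in this thread, and the function name is
hypothetical.

```python
def rotation_days(wiki_count: int, instances: int, hours_per_homewiki: float) -> float:
    """Days until every wiki has been the homewiki once."""
    # Each instance serves one homewiki per time slot, so all instances
    # together cover `instances` wikis per slot.
    slots_needed = wiki_count / instances
    return slots_needed * hours_per_homewiki / 24


# Assumed figures: 360 wikis, 5 bot instances, homewiki changed every hour.
print(rotation_days(360, 5, 1))  # 3.0
```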

 Personally, I would rather we wait for the
 Pywikipedia devs to fix that script, install more memory for interwiki
 bots, or create another custom login server just for running interwiki
 bots.

Throwing more hardware at a problem doesn't fix the problem at all and, like 
Merlijn already wrote, I doubt that the pywikipedia devs will fix the problem 
soon (they have known about it for years, and don't seem to care that after some 
time a simple Python script needs more memory than a Java program *including* 
the virtual machine!).

 Your plan is generally okay; it is just that with only 5 people running
 this project, out of many, many bot operators, it's quite hard to choose. It's
 best if people don't run multiple interwiki bots for one project
 (especially enwiktionary, which has an overload of interwiki bots).

In theory, 1 user would be enough to run an interwiki bot (or several 
instances of it) for all wikis. I increased the number to 5 to make sure that 
there is always somebody to control the bots.

 
 Regards,
 Hydriz

Sincerely,
DaB.

-- 
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885



Re: [Toolserver-l] interwiki.py

2012-01-16 Thread Tim Landscheidt
Merlijn van Deen valhall...@arctus.nl wrote:

 Personally, I would rather we wait for the Pywikipedia devs to fix that script,

 This is not going to happen anytime soon. Considering the state of the
 code base (two hundred exceptions for three hundred wikis, long
 functions and no automated testing - and thus practically untestable),
 and the state of the InterLanguage extension ('will be installed
 soon'), no one is really willing to invest a lot of time in tracking
 memory usage and reducing it.

 The only reasonable action we can take to reduce the memory
 consumption is to let the OS do its job in freeing memory: using one
 process to track pages that have to be corrected (using the database,
 if possible), and one process to do the actual fixing (interwiki.py).
 This should be reasonably easy to implement (i.e. use a pywikibot page
 generator to generate a list of pages, use a database layer to track
 interlanguage links and popen('interwiki.py page') if this is a
 fixable situation)

We could also move the pressure: Labs' bot running infrastructure
doesn't seem to be /that/ far from opening.  If interwiki bots
were running there, it would allow the foundation to judge
whether pushing for the deployment of InterLanguage isn't worth
it in the end.

  Meanwhile I think DaB.'s proposal is very adequate.

Tim




Re: [Toolserver-l] interwiki.py

2012-01-16 Thread K. Peachey
On Tue, Jan 17, 2012 at 3:02 AM, Tim Landscheidt t...@tim-landscheidt.de 
wrote:
 We could also move the pressure: Labs' bot running infrastructure
 doesn't seem to be /that/ far from opening.  If interwiki bots
 were running there, it would allow the foundation to judge
 whether pushing for the deployment of InterLanguage isn't worth
 it in the end.
Labs isn't a fix-all solution for situations like these. Since the
issue is that interwiki.py has memory management problems, apparently
among others, I would guess Ryan would be hesitant to have it
running on the Labs platform unless those issues were resolved, even
though Labs is designed more around virtual containers than around a
shared system like the one the toolserver operates.


Re: [Toolserver-l] interwiki.py

2012-01-16 Thread Sumana Harihareswara
On 01/16/2012 01:39 PM, K. Peachey wrote:
 On Tue, Jan 17, 2012 at 3:02 AM, Tim Landscheidt t...@tim-landscheidt.de 
 wrote:
 We could also move the pressure: Labs' bot running infrastructure
 doesn't seem to be /that/ far from opening.  If interwiki bots
 were running there, it would allow the foundation to judge
 whether pushing for the deployment of InterLanguage isn't worth
 it in the end.
 Labs isn't a fix-all solution for situations like these. Since the
 issue is that interwiki.py has memory management problems, apparently
 among others, I would guess Ryan would be hesitant to have it
 running on the Labs platform unless those issues were resolved, even
 though Labs is designed more around virtual containers than around a
 shared system like the one the toolserver operates.

Since "is this something Labs could do?" has come up, please feel free
to add features and functionality you'd like in Labs at

https://www.mediawiki.org/wiki/Wikimedia_Labs/Toolserver_features_wanted

-- 
Sumana Harihareswara
Volunteer Development Coordinator
Wikimedia Foundation
