Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-10-01 Thread Tim Landscheidt
(anonymous) wrote:

 We currently have no plans for having the user databases on the same
 servers as the replicated databases. Direct joins will not be
 possible, so tools will need to be modified.

 This is unfortunate, and a huge step backwards from the situation on
 the toolserver.

 For example, the project I maintain on toolserver (the enwiki WP 1.0
 assessment data) has user database tables with several million rows of
 data about articles, from which it needs to select the data for pages
 from fixed categories on the wiki, which themselves could have a few
 thousand members. The natural way to do this is to join against the
 categorylinks table. Any non-join solution is going to be much, much
 less efficient.

 A key role of the toolserver setup was that it allowed these sorts of
 joins. Web hosting is cheap and data about the live wiki is already
 available in non-joinable form through the API with no replag.

Even more: If Labs replication isn't bound by Toolserver
tradition, it would be *very* nice not to fragment the data
according to the different WMF clusters, plus Commons or
not, plus (separate) user databases or not, but have one
cluster where users can join as logic suggests.  As Toolserver
merges Commons onto other clusters already, this seems
to be possible with MySQL.

Tim


___
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: 
https://wiki.toolserver.org/view/Mailing_list_etiquette


Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-09-28 Thread Bryan Tong Minh
On Thu, Sep 27, 2012 at 7:58 PM, Platonides platoni...@gmail.com wrote:
 On 27/09/12 01:07, Ryan Lane wrote:
 We currently have no plans for having the user databases on the same
 servers as the replicated databases. Direct joins will not be
 possible, so tools will need to be modified.

 -50

 It's such a useful feature that it would be worth setting up local
 MySQL slaves in order to have them.
 I know, the all-powerful labs environment is unable to run a mysql
 instance, but we could use MySQL cluster, trading memory (available) to
 get joins (denied).


 I'm not the one setting up the databases. If you want information
 about why this won't be available, talk to Asher (binasher in
 #wikimedia-operations on Freenode). Maybe he can be convinced
 otherwise.

 Of course, in the production cluster we don't do joins this way. We
 handle the joins in the app logic, which is a more appropriate way of
 doing this.

 I disagree. In production you can just create a new table in the wiki
 db. We can't create new tables there in the toolserver (the dbs are a
 mirror of what there is in production). Thus, we create a new db in the
 same server and use a cross-db join instead of joining a new table.

 Joining tables from several wikis is probably more unusual, with the
 exception of Commons, which is more often joined to others, as the
 Commons images are also used on the local wikis.

Which brings us to the next point: will the Commons database be
replicated to all clusters, as on the Toolserver?


Bryan



Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-09-27 Thread Ariel T. Glenn
On Wed, 26-09-2012, at 23:38 +, Tim Landscheidt
wrote:
 (anonymous) wrote:
 
  [...]
  Ryan Lane wrote:
  If WMF becomes evil, fork the entire infrastructure into EC2,
  Rackspace cloud, HP cloud, etc. and bring the community operations
  people along for the ride. Hell, use the replicated databases in Labs
  to populate your database in the cloud.
 
  Tim Landscheidt wrote:
  But the nice thing about Labs is that you can try out
  (replicable :-)) replication setups at no cost, and don't have
  to make upfront investments in hardware, etc., so when the time
  comes, you can just upload your setup to EC2 or whatever and
  have a working Wikipedia clone running in a manageable
  timeframe.
 
  This is not an easy task. Replicating the databases is enormously
  challenging (they're huge datasets in the cases of the big wikis) and
  they're constantly changing. If you tried to rely on dumps alone, you'd
  always be out of date by at least two weeks (assuming dumps are working
  properly). Two weeks on the Internet is a lot of time.
 
 I don't know if this is not an easy task, but you are probably
 right.  So what?  If a scenario of WMF turning rogue
 couldn't bear losing two weeks of edits while saving almost
 a decade, we should work on ways to do incremental dumps.
 

In fact there are (experimental) adds/changes dumps, so while it might
not be a 5 minute procedure to get that data into your copy, and
deletions and suppressions wouldn't be covered, the amount of data that
would be lost would be pretty small.

Ariel



Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-09-27 Thread Ryan Lane
On Thu, Sep 27, 2012 at 1:36 AM, Federico Leva (Nemo)
nemow...@gmail.com wrote:
 In general in Labs we don't have a large need for a queuing system
 right now.

 Of course, because nobody is using it right now. I suppose Toolserver didn't
 need it when it had only a few users consuming its resources.


I should know better than to feed a troll, but Labs is relatively
heavily used. At this moment there are 233 virtual machines running
across 125 projects. It's actively used by quite a number of bots
(which have already moved from Toolserver). It's being used by the
following teams:

* Analytics
* Editor-engagement
* Visual editor
* Global education
* QA
* Mobile
* Pediapress
* Localization
* Wikidata
* Operations
* Fundraising
* Core services

Many of those teams host multiple active projects.

Additionally, we have a number of volunteer driven projects. Here's a
few choice ones:

* Bots
* Deployment-prep
* Maps (for OpenStreetMap)
* Wikistats
* Wikitrust
* Signwriting
* Phabricator
* Metavidwiki
* Huggle
* Glam
* Wiki loves monuments
* Blamemaps
* Counter vandalism network

It was used extensively during Google summer of code by the students
and mentors. It's also used very heavily during hackathons; most
projects demo at the end with Labs.

These projects aren't in great need of a queue because they don't
fight against each other for shared resources. When bots and tools are
added that need to do expensive, long-running queries against a set of
common databases we'll likely need some form of queuing system, but it
hasn't been a high priority since we haven't been working on
Toolserver-like features.
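
To illustrate the kind of queuing described above, here is a minimal
Python sketch that caps how many expensive jobs run at once and makes
the rest wait their turn. It is not SGE and not anything Labs has
committed to; run_jobs, max_running and the fake slow_query are
hypothetical names, and a thread pool on one machine stands in for a
real scheduler shared between users.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_jobs(jobs, max_running):
    """Run callables with at most max_running active at once.

    Jobs beyond the limit wait in the executor's internal queue.
    Results are returned in submission order.
    """
    with ThreadPoolExecutor(max_workers=max_running) as pool:
        futures = [pool.submit(job) for job in jobs]
        return [future.result() for future in futures]


class ConcurrencyProbe:
    """Hypothetical helper that records how many jobs overlap."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = 0
        self.peak = 0

    def slow_query(self, name, seconds=0.05):
        # Stand-in for an expensive query against a shared database.
        with self._lock:
            self._running += 1
            self.peak = max(self.peak, self._running)
        time.sleep(seconds)
        with self._lock:
            self._running -= 1
        return name


if __name__ == "__main__":
    probe = ConcurrencyProbe()
    jobs = [lambda n=n: probe.slow_query(f"job-{n}") for n in range(6)]
    print(run_jobs(jobs, max_running=2))
    print("peak concurrency:", probe.peak)
```

With separate instances per project there is nothing to cap; a limit
like max_running only starts to matter once many tools share the same
database servers.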

- Ryan



Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-09-27 Thread Andrei Cipu
 Additionally, we have a number of volunteer driven projects. Here's a
 few choice ones:
 
 * Bots
 * Deployment-prep
 * Maps (for OpenStreetMap)
 * Wikistats
 * Wikitrust
 * Signwriting
 * Phabricator
 * Metavidwiki
 * Huggle
 * Glam
 * Wiki loves monuments
 * Blamemaps
 * Counter vandalism network
 

Where can we find more information about these projects, especially OSM and WLM?




Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-09-27 Thread Carl (CBM)
On Wed, Sep 26, 2012 at 2:25 PM, Ryan Lane rl...@wikimedia.org wrote:
 We currently have no plans for having the user databases on the same
 servers as the replicated databases. Direct joins will not be
 possible, so tools will need to be modified.

This is unfortunate, and a huge step backwards from the situation on
the toolserver.

For example, the project I maintain on toolserver (the enwiki WP 1.0
assessment data) has user database tables with several million rows of
data about articles, from which it needs to select the data for pages
from fixed categories on the wiki, which themselves could have a few
thousand members. The natural way to do this is to join against the
categorylinks table. Any non-join solution is going to be much, much
less efficient.

A key role of the toolserver setup was that it allowed these sorts of
joins. Web hosting is cheap and data about the live wiki is already
available in non-joinable form through the API with no replag.

- Carl



Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-09-27 Thread Platonides
On 27/09/12 17:21, Andrei Cipu wrote:
 Additionally, we have a number of volunteer driven projects. Here's a
 few choice ones:

 * Bots
 * Deployment-prep
 * Maps (for OpenStreetMap)
 * Wikistats
 * Wikitrust
 * Signwriting
 * Phabricator
 * Metavidwiki
 * Huggle
 * Glam
 * Wiki loves monuments
 * Blamemaps
 * Counter vandalism network

 
 Where can we find more information about these projects, especially OSM and 
 WLM?

You can go to https://labsconsole.wikimedia.org/ List projects, take a
look at project members and bug them to tell you their evil plans :)

The project for WLM is actually one for making a tool for judging the
images (Wlmjudging). There are a couple of VMs; I don't know if it
has produced anything. I'd ask Ynhockey about it.

All real work for Wiki Loves Monuments is done on the Toolserver (plus
http://wlm.wikimedia.org/ which has a copy of the data produced at TS).




Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-09-27 Thread Platonides
On 27/09/12 01:07, Ryan Lane wrote:
 We currently have no plans for having the user databases on the same
 servers as the replicated databases. Direct joins will not be
 possible, so tools will need to be modified.

 -50

 It's such a useful feature that it would be worth setting up local
 MySQL slaves in order to have them.
 I know, the all-powerful labs environment is unable to run a mysql
 instance, but we could use MySQL cluster, trading memory (available) to
 get joins (denied).

 
 I'm not the one setting up the databases. If you want information
 about why this won't be available, talk to Asher (binasher in
 #wikimedia-operations on Freenode). Maybe he can be convinced
 otherwise.
 
 Of course, in the production cluster we don't do joins this way. We
 handle the joins in the app logic, which is a more appropriate way of
 doing this.

I disagree. In production you can just create a new table in the wiki
db. We can't create new tables there in the toolserver (the dbs are a
mirror of what there is in production). Thus, we create a new db in the
same server and use a cross-db join instead of joining a new table.
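
A minimal sketch of the cross-db join described above, using Python's
sqlite3 module with ATTACH DATABASE as a stand-in for two databases
living on one MySQL server. The schema name userdb, the ratings table
and the sample rows are hypothetical; categorylinks with cl_from and
cl_to follows the MediaWiki table of that name.

```python
import sqlite3


def build_demo():
    """One connection, two databases: 'main' (wiki mirror) and 'userdb'."""
    conn = sqlite3.connect(":memory:")
    # A second, separate in-memory database reachable from the same server.
    conn.execute("ATTACH DATABASE ':memory:' AS userdb")
    conn.execute(
        "CREATE TABLE main.categorylinks (cl_from INTEGER, cl_to TEXT)"
    )
    conn.execute(
        "CREATE TABLE userdb.ratings (page_id INTEGER PRIMARY KEY, quality TEXT)"
    )
    conn.executemany(
        "INSERT INTO main.categorylinks VALUES (?, ?)",
        [(1, "Physics"), (2, "Physics"), (3, "Chemistry")],
    )
    conn.executemany(
        "INSERT INTO userdb.ratings VALUES (?, ?)",
        [(1, "FA"), (2, "Stub"), (3, "GA"), (4, "B")],
    )
    return conn


def ratings_in_category(conn, category):
    """Join the user table against the wiki table in a single query."""
    cursor = conn.execute(
        """
        SELECT r.page_id, r.quality
        FROM userdb.ratings AS r
        JOIN main.categorylinks AS cl ON cl.cl_from = r.page_id
        WHERE cl.cl_to = ?
        ORDER BY r.page_id
        """,
        (category,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    print(ratings_in_category(build_demo(), "Physics"))
```

The point is only that both tables are reachable from one connection,
so the database can do the matching itself and return just the rows
that are wanted.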

Joining tables from several wikis is probably more unusual, with the
exception of Commons, which is more often joined to others, as the
Commons images are also used on the local wikis.



Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-09-27 Thread Federico Leva (Nemo)

Ryan wrote:
 I should know better than to feed a troll, but Labs is relatively
 heavily used.

I'm sorry if that was perceived as trolling. I know most of that stuff,
I like it, I've advertised some of those projects quite a lot myself
etc., but it seems to be nowhere close to a few hundred users doing
very expensive and maybe silly things like those DaB. mentioned: queries
lasting days, grep or sed run on hundreds of GiB of stuff, scripts
taking 4 GiB of memory etc.; not to mention all sorts of queries which
can be triggered by any of the tens of thousands of users of the web
tools on TS.
That said, of course you know better what the capacity of Labs is; what I
can be sure of is that it's not infinite, as you almost seem to claim.
Anyway, the problem is not (yet) whether typical TS stuff will have enough
resources but whether it will be possible at all (current answer: no, never!
change how you do things instead!).


Nemo



Re: [Toolserver-l] Reasons for not migrating to Tool Lab-OSM

2012-09-27 Thread Mike Dupont
Here! I am interested in that!
Let me know!

On Fri, Sep 28, 2012 at 7:03 AM, Ryan Lane rl...@wikimedia.org wrote:

 The long-term plan is to have OSM in production. OSM in Labs is meant
 for puppetization, test, and development. I think we even have the
 hardware for OSM in production. Someone just needs to put the effort
 in for puppetization




-- 
James Michael DuPont
Member of Free Libre Open Source Software Kosova http://flossk.org
Saving wikipedia(tm) articles from deletion http://SpeedyDeletion.wikia.com
Contributor FOSM, the CC-BY-SA map of the world http://fosm.org
Mozilla Rep https://reps.mozilla.org/u/h4ck3rm1k3
Free Software Foundation Europe Fellow http://fsfe.org/support/?h4ck3rm1k3

Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-09-26 Thread Tim Landscheidt
(anonymous) wrote:

 [...]
 I think I'd add "general direction of centralizing everything under a single
 Wikimedia Foundation is a bad idea" as a permanent blocker. Maybe there's a
 reasonable case for why deprecating the Toolserver and creating Wikimedia
 Labs is a great idea, but I don't see it yet.

 I don't see why each (Wikimedia) chapter shouldn't have its own replica of
 the databases. We want free content to be free (and re-used and re-mixed and
 whatever else). If you're going to invest in infrastructure, I think it
 makes more sense to bolster replication support than try to compete with the
 Toolserver.

 That said, pooled resources can sometimes be a smart move to save on
 investments such as hardware. Chapters working together is not a bad thing
 (I believe some chapters donated to Wikimedia Deutschland for Toolserver
 support in the past). But the broader point is that users should be very
 cautious of the general direction that a Wikimedia (Foundation) Labs is
 headed and ask whether it's really a good idea iff it means the destruction
 of free-standing projects such as the Toolserver.

IMHO you have to differentiate between data and function.
It makes no sense to build artificial obstacles when setting
up some tool that can only be reasonably used with the live
dataset.  On the other hand, preparing for a day where WMF
turns rogue is never wrong.

  But the nice thing about Labs is that you can try out
(replicable :-)) replication setups at no cost, and don't have
to make upfront investments in hardware, etc., so when the time
comes, you can just upload your setup to EC2 or whatever and
have a working Wikipedia clone running in a manageable
timeframe.

Tim




Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-09-26 Thread Ryan Lane
 temporary blockers
 * no replication of wikimedia wiki databases
 ** joining of user databases with wiki databases

We currently have no plans for having the user databases on the same
servers as the replicated databases. Direct joins will not be
possible, so tools will need to be modified.
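
One way a tool might be modified when direct joins are not available is
to do the join in application code: read the page IDs from the
replica, then look them up in the user database in batches. What
follows is a minimal sketch under that assumption, with two unrelated
in-memory SQLite databases standing in for the two servers. The ratings
table, batch_size and the function names are hypothetical, and this is
not a description of how production code does it.

```python
import sqlite3


def make_demo_connections():
    """Create two unrelated databases: a wiki replica and a user db."""
    wiki = sqlite3.connect(":memory:")
    wiki.execute("CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT)")
    wiki.executemany(
        "INSERT INTO categorylinks VALUES (?, ?)",
        [(1, "Physics"), (2, "Physics"), (3, "Chemistry")],
    )
    user = sqlite3.connect(":memory:")
    user.execute(
        "CREATE TABLE ratings (page_id INTEGER PRIMARY KEY, quality TEXT)"
    )
    user.executemany(
        "INSERT INTO ratings VALUES (?, ?)",
        [(1, "FA"), (2, "Stub"), (3, "GA"), (4, "B")],
    )
    return wiki, user


def chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def ratings_in_category(wiki, user, category, batch_size=500):
    """Emulate the join: fetch IDs from one server, look them up on the other."""
    page_ids = [
        row[0]
        for row in wiki.execute(
            "SELECT cl_from FROM categorylinks WHERE cl_to = ?", (category,)
        )
    ]
    results = []
    for batch in chunks(page_ids, batch_size):
        # One placeholder per ID keeps the query parameterised.
        placeholders = ",".join("?" * len(batch))
        query = (
            "SELECT page_id, quality FROM ratings "
            f"WHERE page_id IN ({placeholders})"
        )
        results.extend(user.execute(query, batch).fetchall())
    return sorted(results)


if __name__ == "__main__":
    wiki, user = make_demo_connections()
    print(ratings_in_category(wiki, user, "Physics"))
```

Every ID now travels from one server to the tool and back to the other
server, and each batch is a separate round trip, which is the extra
cost a tool takes on when the join moves out of the database.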

 * no support for script execution dependency (on ts: currently done by sge)

There's less of a need for this in Labs. If whatever you are running
is really expensive, you can have your own instance. That said, I was
looking at integrating a global queuing system. It won't be SGE,
though.

If someone is really keen on SGE, then I recommend they work with us
to puppetize it. Thankfully, Open Grid Engine is already packaged in
Ubuntu, which should make that much easier.

 * no support for servlets


I'm not sure what you mean by servlet?

 missing support blockers
 * no support for new users not familiar with unix based systems

Can you describe how this is handled in Toolserver currently?

 * no transparent updating of packages with security problems/bug


Ubuntu has unattended-upgrades. It's generally enabled on instances.

 permanent blockers
 * license problems (i wrote code at work for my company and reuse parts for
 my bot framework. I have not the right to declare this code as open source
 which is needed by labs policy.)

This will continue to be a permanent blocker.

You can't decide that on your own, but you can ask your employer if
you can open source the code.

 * no DaB.


I'd love DaB to help us improve Labs.

Everything about Labs is fully open. Anyone can help build it, even
the production portions.

- Ryan



Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-09-26 Thread Daniel Schwen
At the risk of outing myself as naive: I do not see this as a
problem like MZMcBride does. I think the foundation should have earned
our trust by now and them locking down the data does not seem like a
credible threat to me.
In any case:

a) you can download dumps to access the data independently from WMF
b) the replication to the TS is already at the mercy of WMF. The TS
does not make the data any free-er.

Best,
Dschwen


 I think I'd add "general direction of centralizing everything under a single
 Wikimedia Foundation is a bad idea" as a permanent blocker. Maybe there's a
 reasonable case for why deprecating the Toolserver and creating Wikimedia
 Labs is a great idea, but I don't see it yet.

 I don't see why each (Wikimedia) chapter shouldn't have its own replica of
 the databases. We want free content to be free (and re-used and re-mixed and
 whatever else). If you're going to invest in infrastructure, I think it
 makes more sense to bolster replication support than try to compete with the
 Toolserver.



Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-09-26 Thread Erik Moeller
On Wed, Sep 26, 2012 at 10:15 AM, MZMcBride z...@mzmcbride.com wrote:
 I think I'd add "general direction of centralizing everything under a single
 Wikimedia Foundation is a bad idea" as a permanent blocker.

As others have noted, there's a difference between offering data
(which we do - we've spent a lot of time, money and effort to ensure
that stuff like dumps.wikimedia.org works reliably even at enwiki
scale) and providing a working environment for the dev community.

Having a primary working environment like Labs makes sense in much the
same way that it makes sense to have a primary multimedia repository
like Commons (and Wikidata, and in future probably a gadget
repository, a Lua script repository, etc.). It enables community
network effects and economies of scale that can't easily be replicated
and reduces wasteful duplication of effort.

That said, I'd love to make more real-time data feeds available for
third parties in general. The analytics team is currently looking into
offering a sensible alternative to the IRC feed for edit metadata, for
example.

Erik
-- 
Erik Möller
VP of Engineering and Product Development, Wikimedia Foundation

Support Free Knowledge: https://wikimediafoundation.org/wiki/Donate



Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-09-26 Thread Ryan Lane
 As others have noted, there's a difference between offering data
 (which we do - we've spent a lot of time, money and effort to ensure
 that stuff like dumps.wikimedia.org works reliably even at enwiki
 scale) and providing a working environment for the dev community.

 Having a primary working environment like Labs makes sense in much the
 same way that it makes sense to have a primary multimedia repository
 like Commons (and Wikidata, and in future probably a gadget
 repository, a Lua script repository, etc.). It enables community
 network effects and economies of scale that can't easily be replicated
 and reduces wasteful duplication of effort.


I'd like to go a little further on this point.

One of the goals of Labs is to have a fully virtualized clone of our
entire infrastructure that is also completely puppetized in a way
that's reusable by third parties. If you're worried about WMF, then
you should participate in Labs. You should help puppetize and should
help make everything usable by non-WMF entities.

Bringing community operations members back into the operations of the
site is another one of the goals of Labs. If we have enough community
operations people, then the projects aren't dependent on the knowledge
of the staff to survive.

If WMF becomes evil, fork the entire infrastructure into EC2,
Rackspace cloud, HP cloud, etc. and bring the community operations
people along for the ride. Hell, use the replicated databases in Labs
to populate your database in the cloud.

- Ryan



Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-09-26 Thread Ryan Lane
 Yes, there's a difference. But in this case, as far as I understand it, a
 direct cost (or casualty) of setting up Wikimedia Labs is the existence of
 the Toolserver. Does Wikimedia need a great testing infrastructure? Yes, of
 course. (And it's not as though the Toolserver has ever been without its
 share of issues; I'm not trying to white-wash the past here.) But the
 question is: if such a Wikimedia testing infrastructure comes at the cost of
 losing the Toolserver, is that acceptable?


This is a straw man argument. The mere existence of Labs doesn't mean
the loss of Toolserver.

Labs is more than just a testing infrastructure. It's an
infrastructure for creating things, for enabling volunteer operations,
for bringing operations and development together, for integrating
other projects, and for providing free hosting to projects that may
not have it otherwise. Labs just also happens to need some of the same
features as Toolserver.

Again, as I've mentioned, Labs' purpose isn't to be a Toolserver
replacement. Its vision is much, much larger than what the Toolserver
can do.

 Ryan Lane wrote:
 If WMF becomes evil, fork the entire infrastructure into EC2,
 Rackspace cloud, HP cloud, etc. and bring the community operations
 people along for the ride. Hell, use the replicated databases in Labs
 to populate your database in the cloud.

 Tim Landscheidt wrote:
 But the nice thing about Labs is that you can try out
 (replicable :-)) replication setups at no cost, and don't have
 to make upfront investments in hardware, etc., so when the time
 comes, you can just upload your setup to EC2 or whatever and
 have a working Wikipedia clone running in a manageable
 timeframe.

 This is not an easy task. Replicating the databases is enormously
 challenging (they're huge datasets in the cases of the big wikis) and
 they're constantly changing. If you tried to rely on dumps alone, you'd
 always be out of date by at least two weeks (assuming dumps are working
 properly). Two weeks on the Internet is a lot of time.

 But more to the point, even if you suddenly had a lot of infrastructure
 (bandwidth for constantly retrieving the data, space to store it all, and
 extra memory and CPU to allow users to, y'know, do something with it) and
 even if you suddenly had staff capable of managing these databases, not
 every table is even available currently. As far as I'm aware,
 http://dumps.wikimedia.org doesn't include tables such as user,
 ipblocks, archive, watchlist, any tables related to global images or
 global user accounts, and probably many others. I'm not sure a full audit
 has ever been done, but this is partially tracked by
 https://bugzilla.wikimedia.org/show_bug.cgi?id=25602.

 So beyond the silly simplicity of the suggestion that one could simply "move
 to the cloud!", there are currently technical impossibilities to doing so.


The same impossibilities apply to forking any single CC project
online. We're not allowed by our privacy policy (and very likely by
law) to provide that information. It's absurd to fault us on this. I
guess we're being evil by not being evil.

We're providing every single other needed piece of the puzzle required
for forking.

- Ryan



Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-09-26 Thread Hersfold
You may not have meant for it to lead to the end of the Toolserver, but 
apparently that's how WMDE is taking it, and it sounds like that's going 
to be the inevitable result. To say otherwise is rather naive at this 
point, given the size of the threads talking about this.



User:Hersfold
hersfoldw...@gmail.com



Re: [Toolserver-l] Reasons for not migrating to Tool Lab

2012-09-26 Thread Ryan Lane
On Wed, Sep 26, 2012 at 6:29 PM, Hersfold hersfoldw...@gmail.com wrote:
 You may not have meant for it to lead to the end of the Toolserver, but
 apparently that's how WMDE is taking it, and it sounds like that's going to
 be the inevitable result. To say otherwise is rather naive at this point,
 given the size of the threads talking about this.


I'll be honest, I don't really care about the politics behind any of
this, and I'm going to ignore anything more related to that. WMDE
dropping Toolserver is their decision and it doesn't affect how Labs
will operate in the future.

Labs is adding infrastructure needed to support Toolserver users. If
there's anything the Toolserver community needs that isn't in our
current roadmap, I'm more than happy to work through those issues with the
community. The environment isn't going to be exactly the same, so
tools and bots may need to be modified. We can provide the necessary
resources, access, and training to integrate into the new environment.
WMDE will be providing resources to help with migrations.

Overall the environment provided by Labs has the ability to be much
more flexible and much more powerful than Toolserver. I hope everyone
migrates over, but I'll understand if anyone feels like it's too much
work.

- Ryan
