Re: [Wikitech-l] Can we help Tor users make legitimate edits?

2013-01-04 Thread aude
On Sat, Jan 5, 2013 at 4:27 AM, Risker  wrote:

> Bawolff has it right, pretty much. For legitimate users, an IPBE (IP block
> exemption) can be handed out. We have very limited human resources on the
> projects themselves to handle the issuing of tokens and IPBEs as it is.
>
> For me, this is largely a philosophical argument; yes, it would be in
> keeping with the "everyone can edit" ethic to enable Tor editing. For a
> very small number of WMF projects, it might attract a greater number of
> editors; if a project considers Tor editing appropriate, it would be nice
> to find a way to exempt that project from the general prohibition. On the
> other hand, for the vast majority of projects, it would attract more
> problems and/or require excess attention from the limited number of
> volunteers (i.e., checkusers) who are qualified to determine whether an
> IPBE or "Tor token" is appropriate for a specific user. On some projects,
> almost every editor who has ever been found to use [not yet blocked] Tor
> IPs was identified as such because of a legitimate concern about that
> editor's behaviour.
>

I hope we don't (or only rarely) checkuser accounts that are behaving
properly, so I don't think we'd necessarily find many of the well-behaved
Tor users.

Of course, we do find the badly behaved accounts.

Cheers,
Katie

>
> Risker/Anne



-- 
@wikimediadc / @wikidata


Re: [Wikitech-l] Can we help Tor users make legitimate edits?

2013-01-04 Thread Risker
On 4 January 2013 20:44, bawolff  wrote:

> On Fri, Jan 4, 2013 at 9:53 AM, Tyler Romeo  wrote:
> [..]
> > As far as a solution goes, I have a complete codebase for
> > Extension:TokenAuth, which allows users to have MediaWiki sign a blinded
> > token, which can then be used to bypass a specific IP block in order to
> > log in and edit. It is almost ready; there are just a few functionality
> > problems with the JavaScript crypto library.
>
> That sounds really cool. However, I'm not sure how it solves the problem.
> If we allow people to get tokens signed that let them bypass the Tor
> blocks, we may as well not hand out Tor blocks in the first place (if
> everyone can get a blinded token), or hand out the overrides via the IP
> block exempt group (if we limit who can get such tokens).
>
Bawolff has it right, pretty much. For legitimate users, an IPBE (IP block
exemption) can be handed out. We have very limited human resources on the
projects themselves to handle the issuing of tokens and IPBEs as it is.

For me, this is largely a philosophical argument; yes, it would be in
keeping with the "everyone can edit" ethic to enable Tor editing. For a
very small number of WMF projects, it might attract a greater number of
editors; if a project considers Tor editing appropriate, it would be nice
to find a way to exempt that project from the general prohibition. On the
other hand, for the vast majority of projects, it would attract more
problems and/or require excess attention from the limited number of
volunteers (i.e., checkusers) who are qualified to determine whether an
IPBE or "Tor token" is appropriate for a specific user. On some projects,
almost every editor who has ever been found to use [not yet blocked] Tor
IPs was identified as such because of a legitimate concern about that
editor's behaviour.

Risker/Anne


Re: [Wikitech-l] Can we help Tor users make legitimate edits?

2013-01-04 Thread bawolff
On Fri, Jan 4, 2013 at 9:53 AM, Tyler Romeo  wrote:
[..]
> As far as a solution goes, I have a complete codebase for
> Extension:TokenAuth, which allows users to have MediaWiki sign a blinded
> token, which can then be used to bypass a specific IP block in order to log
> in and edit. It is almost ready; there are just a few functionality
> problems with the JavaScript crypto library.

That sounds really cool. However, I'm not sure how it solves the problem.
If we allow people to get tokens signed that let them bypass the Tor
blocks, we may as well not hand out Tor blocks in the first place (if
everyone can get a blinded token), or hand out the overrides via the IP
block exempt group (if we limit who can get such tokens).

-bawolff



Re: [Wikitech-l] LevelUp (sequel to "What do you want to learn?" & 20% time)

2013-01-04 Thread Sumana Harihareswara
On 11/21/2012 07:10 PM, Sumana Harihareswara wrote:
> LevelUp is a mentorship program that will start in January 2013 and that
> replaces the "20% time" policy
> https://www.mediawiki.org/wiki/Wikimedia_engineering_20%25_policy for
> Wikimedia Foundation engineers.  Technical contributors, volunteer or
> staff, have the opportunity to participate; see
> https://www.mediawiki.org/wiki/Mentorship_programs/LevelUp for more details.
> 
> We started 20% time to ensure that Wikimedia Foundation engineers would
> spend at least 20% of each week on tasks that directly serve the
> Wikimedia developer and user community, including bug triage, code
> review, extension review, documentation, urgent bugfixes, and so on. It
> had various flaws: one day every week, I made people task-switch, which
> got in the way of their deadlines, and it was perceived as a chore that
> always needed doing.
> 
> It felt like enforcing a rota to do the dishes.  So instead, let's build
> a dishwasher.  :-)  We can cross-train each other and fill in the empty
> rows on the maintainership table
> https://www.mediawiki.org/wiki/Developers/Maintainers so our whole
> community gains the capacity to get stuff done faster.
> 
> If you've been frustrated because of code review delays, I want you to
> sign up for LevelUp -- by March 2013 you could be a comaintainer of a
> codebase and be merging and improving other people's patchsets, which
> will give them more time and incentive to merge yours. :-)
> 
> When I asked what people wanted to learn, I got a variety of responses
> -- including "MediaWiki in general", "puppet", "networking", and "JS,
> PHP, HTML, CSS, SQL" -- all of which you can learn through LevelUp.
> When I asked how you wanted to learn, all of you said you wanted
> real-life, hands-on work with mentors who could answer your questions.
> Here you go. :-)
> 
> I won't be starting the matchmaking process in earnest till I come back
> from the Thanksgiving break on Monday, but I will reply to talk page
> messages and emails then. :-)

Sorry for the delay on this.  I am doing matchmaking now -- I have
started off with the people who had already contacted me to tell me what
they are interested in learning or teaching for January-March 2013.
I'll list matches, and people awaiting matching, at
https://www.mediawiki.org/wiki/Mentorship_programs/LevelUp/Q1_2013 ; I'm
maintaining that page, so please let me add you.

You can sign up now by emailing me and telling me what you'd like to
learn or teach.  I can't absolutely guarantee you that I can match you
with someone, but the probability is very high.

-- 
Sumana Harihareswara
Engineering Community Manager
Wikimedia Foundation



[Wikitech-l] revamped/updated QA docs on mediawiki.org

2013-01-04 Thread Chris McMahon
I've sorted, linked, tagged, organized, and gardened our collection of QA
pages on mw.o to make them more useful. Of course there is always more to do,
so comments, criticism, and edits are welcome.

http://www.mediawiki.org/wiki/QA

-Chris


Re: [Wikitech-l] Adapting Visual Editor for 1.19 - [FCKeditor issues]

2013-01-04 Thread Thomas Gries
Am 04.01.2013 19:57, schrieb Mark A. Hershberger:
> Even more reason for us to get VE working against 1.19. I saw that
> some people have added instructions for FCKeditor on 1.19, but VE
> would be much more compelling if we can get it working. 

Yes!

Switching between MediaWiki wiki syntax and the FCKeditor WYSIWYG mode so
often created problems, and one misses so many of the advanced wiki syntax
features and parser functions, that I simply decided to disable FCKeditor
for all my users.





Re: [Wikitech-l] Adapting Visual Editor for 1.19 - [FCKeditor issues]

2013-01-04 Thread Mark A. Hershberger
On 01/04/2013 01:26 PM, Thomas Gries wrote:
> I must warn users, because FCKeditor itself has known security issues.

Even more reason for us to get VE working against 1.19.  I saw that some
people have added instructions for FCKeditor on 1.19, but VE would be
much more compelling if we can get it working.


-- 
http://hexmode.com/

Language will always shift from day to day. It is the wind blowing
through our mouths. -- http://hexm.de/np



Re: [Wikitech-l] Adapting Visual Editor for 1.19 - [FCKeditor issues]

2013-01-04 Thread Thomas Gries
Am 04.01.2013 18:02, schrieb Mark A. Hershberger:
>
> * FCKeditor is no longer supported for MediaWiki, but people are still
> using it and, for some reason, like what it provides.  If we can make
> the Visual Editor available to them, I'm hoping the need for FCKeditor
> will disappear.
Hi Mark, and users of FCKeditor,

Regarding FCKeditor and the extension
https://www.mediawiki.org/wiki/Extension:FCKeditor_%28by_Mafs%29, which I
co-authored many years ago but have since stopped using:

I must warn users, because FCKeditor itself has known security issues.

Some of these issues are severe, and they can quickly be found with your
favorite search engine; see, for example,
http://www.cvedetails.com/vulnerability-list/vendor_id-2724/Fckeditor.html




Re: [Wikitech-l] Generating documentation from JavaScript doc comments

2013-01-04 Thread Matthew Flaschen
On 01/04/2013 08:00 AM, Krinkle wrote:
> Doxygen is indeed not meant for JavaScript. With some hacks it can be tricked 
> into reading comment blocks from javascript files, but that won't scale for 
> our code base, nor will it be enough to make a useful structure given the 
> dynamic way JavaScript works.
> 
> JSDoc is pretty solid, though there are some concerns:
> * The syntax is somewhat foreign compared to what we're doing right now
> * Development is unclear (v2 on google-code has been discontinued, v3 on 
> github is a rewrite still being worked on)
> * Written in JavaScript, but doesn't run on Node; it requires Java.

One that does run on node is YUIDoc.  I'm using an older version
successfully for ProveIt, and hopefully it has improved since then.
http://yui.github.com/yuidoc/

> I've recently looked into a documentation generator for VisualEditor and 
> though I haven't stopped looking yet, I'm currently pausing rather long at 
> JSDuck. It is very well engineered and especially optimised for modern 
> JavaScript (inheritance, mixins, event emitters, override/overload methods 
> from another module, modules, etc.).

How do you do modules?  I don't see @module at
https://github.com/senchalabs/jsduck/wiki/Guide .  JSDoc
(http://usejsdoc.org/#JSDoc3_Tag_Dictionary) and YUIDoc
(http://yui.github.com/yuidoc/syntax/index.html) both have them.

I think the module concept is important, since we have so many, and many
(e.g. the API ones) just modify existing classes.

> It is also easy to extend when needing to implement custom @tags.

That's good, and we could use it to implement module.

> I've set up a vanilla install for VisualEditor's code base here:
> 
> http://integration.wmflabs.org/mwext-VisualEditor-docs/

The docs definitely look great.  I like (among other things) that they
link to docs for the native types.

> Right now, like MediaWiki core, VisualEditor is just documenting code 
> loosely, not following any particular doc-syntax, so we're bound to require a 
> few tweaks[1] no matter which framework we choose. Our current syntax is just 
> something we came up with loosely based on what we're used to with Doxygen.

Right, we're going to have to change stuff either way, so the important
thing is choosing something solid.

> Right now the demo on labs only uses the "Doc" app of JSDuck, but it also
> supports Guides, Demos, interactive live-editable Examples, and more.
> 
> A few random things I like in particular about JSDuck are:
> * Support documenting parameters of callback functions
> * Support documenting events emitted by a class/module
> * Option to show/hide inherited methods and other characteristics
> * Support to completely document objects for @param and @return (like @param 
> {Object} foo, @param {number} foo.bar)

Those do sound pretty cool.

> If it works out, I think we can get this going for MediaWiki core as well.

Great.  I was thinking we could start with a Labs install just to see
how it looks initially.

> Regardless of the framework we choose, we should set it up to be generated 
> for branches and update on merge from jenkins's post-merge hooks. Come to 
> think of it, we should probably do that for the PHP/Doxygen as well (is that 
> still running from the cronjob on svn.wikimedia.org?).

Agreed, auto-updating docs on merge would be nice.

Matt Flaschen



Re: [Wikitech-l] Adapting Visual Editor for 1.19

2013-01-04 Thread Gabriel Wicke
On 01/04/2013 09:02 AM, Mark A. Hershberger wrote:
> There is a dependency on Parsoid and node.js, of course, that
> FCKeditor doesn't need, but I'm assuming right now that if MediaWiki
> works with the extension, then the Parsoid instance will just run.

This should be the case. Parsoid only interacts with MediaWiki through
long-established web API calls, so it should be compatible with very old
MediaWiki versions.
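
For illustration, here is one example of the kind of long-established API
call involved: fetching a page's wikitext through api.php with action=query
(a hedged sketch, runnable with Node.js 18+; exactly which calls Parsoid
makes is not detailed here).

const params = new URLSearchParams({
  action: "query",
  prop: "revisions",
  rvprop: "content",
  titles: "Main Page",
  format: "json",
});

fetch("https://en.wikipedia.org/w/api.php?" + params)
  .then((res) => res.json())
  .then((data) => {
    // In the classic JSON format, revision text lives under the "*" key.
    const pages = data.query.pages;
    const page = pages[Object.keys(pages)[0]];
    console.log(page.revisions[0]["*"].slice(0, 200)); // first 200 chars
  });
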
-- 
Gabriel Wicke
Senior Software Engineer
Wikimedia Foundation



Re: [Wikitech-l] Adapting Visual Editor for 1.19

2013-01-04 Thread Chris McMahon
On Fri, Jan 4, 2013 at 10:04 AM, David Gerard  wrote:

> On 4 January 2013 17:02, Mark A. Hershberger  wrote:
>
> > Is anyone else interested in helping to make this happen?
>
>
> I have no coding ability but would LOVE this for our work 1.19
> instances, and would be most pleased to test.
>
>
I think it would be valuable to have a coordinated effort to test Visual
Editor at a time when such a project could provide useful feedback to the
VE development team.

-Chris


Re: [Wikitech-l] Adapting Visual Editor for 1.19

2013-01-04 Thread David Gerard
On 4 January 2013 17:02, Mark A. Hershberger  wrote:

> Is anyone else interested in helping to make this happen?


I have no coding ability but would LOVE this for our work 1.19
instances, and would be most pleased to test.


- d.



[Wikitech-l] Adapting Visual Editor for 1.19

2013-01-04 Thread Mark A. Hershberger
Would it be possible to adapt the Visual Editor to run under 1.19?

I have a couple of reasons for wanting that:

* 1.19 is our LTS release.  Visual Editor looks awesome and I'd like to
provide people who are stuck on older versions of MediaWiki with a
persuasive reason to upgrade to 1.19 at least.

* FCKeditor is no longer supported for MediaWiki, but people are still
using it and, for some reason, like what it provides.  If we can make
the Visual Editor available to them, I'm hoping the need for FCKeditor
will disappear.

There is a dependency on Parsoid and node.js, of course, that FCKeditor
doesn't need, but I'm assuming right now that if MediaWiki works with the
extension, then the Parsoid instance will just run.

I'll be setting up and testing Visual Editor soon, looking at what is needed
to get it running against 1.19, and seeing how feasible that even is.

Is anyone else interested in helping to make this happen?


-- 
http://hexmode.com/

Language will always shift from day to day. It is the wind blowing
through our mouths. -- http://hexm.de/np



Re: [Wikitech-l] Can we help Tor users make legitimate edits?

2013-01-04 Thread Tyler Romeo
On the topic of whether Tor users' inability to edit is a concern: I believe
it is. Because of Tor blocks, it is sometimes extremely difficult, or even
impossible, for some users to edit Wikipedia. I believe we should give these
users the opportunity to contribute rather than punish them because of
others who misuse Tor for spamming and sockpuppetry.

As far as a solution goes, I have a complete codebase for
Extension:TokenAuth, which allows users to have MediaWiki sign a blinded
token, which can then be used to bypass a specific IP block in order to log
in and edit. It is almost ready; there are just a few functionality
problems with the JavaScript crypto library.
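
For anyone unfamiliar with the blinding step, here is a minimal textbook-RSA
blind-signature sketch (runnable with Node.js; the token string, key sizes,
and helper names are illustrative assumptions, not Extension:TokenAuth's
actual code, and a real deployment needs proper padding and a vetted crypto
library):

const crypto = require("crypto");

// Toy RSA key pair; real keys are 2048+ bits.
const p = 1000003n, q = 1000033n;
const n = p * q, e = 65537n;

function modPow(b, x, mod) {              // b^x mod mod, square-and-multiply
  let result = 1n;
  for (b %= mod; x > 0n; x >>= 1n, b = (b * b) % mod) {
    if (x & 1n) result = (result * b) % mod;
  }
  return result;
}

function modInv(a, mod) {                 // a^-1 mod mod, extended Euclid
  let [r0, r1, s0, s1] = [a % mod, mod, 1n, 0n];
  while (r1 !== 0n) {
    const quot = r0 / r1;
    [r0, r1] = [r1, r0 - quot * r1];
    [s0, s1] = [s1, s0 - quot * s1];
  }
  return ((s0 % mod) + mod) % mod;
}

const d = modInv(e, (p - 1n) * (q - 1n)); // server's private exponent

// User: hash the token, then blind it with a secret random factor r.
const hex = crypto.createHash("sha256").update("tor-edit-token").digest("hex");
const m = BigInt("0x" + hex) % n;
const r = 12345679n;                      // in practice: random, gcd(r, n) = 1
const blinded = (m * modPow(r, e, n)) % n;

// Server: signs the blinded value without ever learning m.
const blindSig = modPow(blinded, d, n);   // (m * r^e)^d = m^d * r (mod n)

// User: strips the blinding factor, leaving an ordinary signature on m.
const sig = (blindSig * modInv(r, n)) % n;
console.log(modPow(sig, e, n) === m);     // true: the signature verifies

The point of the blinding is that the server can decide who gets a signature
without being able to link the finished token back to the signing request.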

*--*
*Tyler Romeo*
Stevens Institute of Technology, Class of 2015
Major in Computer Science
www.whizkidztech.com | tylerro...@gmail.com


On Sat, Dec 29, 2012 at 7:12 PM, Platonides  wrote:

> On 28/12/12 18:29, Tilman Bayer wrote:
> > On Fri, Dec 28, 2012 at 1:26 AM, Sumana Harihareswara wrote:
> >> I've floated this problem past Tor and privacy people, and here are a
> >> few ideas:
> >>
> >> 1) Just use the existing mechanisms more leniently.  Encourage the
> >> communities (Wikimedia & Tor) to use
> >> https://en.wikipedia.org/wiki/Wikipedia:Request_an_account (to get an
> >> account from behind Tor) and to let more people get IP block exemptions
> >> even before they've made any edits (< 30 people have gotten exemptions
> >> on en.wp in 2012).  Add encouraging "get an exempt account" language to
> >> the "you're blocked because you're using Tor" messaging.  Then if
> >> there's an uptick in vandalism from Tor then they can just tighten up
> >> again.
>
> This seems the right approach.
>
>
> >> 2) Encourage people with closed proxies to re-vitalize
> >> https://en.wikipedia.org/wiki/Wikipedia:WOCP .  Problem: using closed
> >> proxies is okay for people with some threat models but not others.
>
>
> I didn't know about it. This is an interesting concept. It would be
> possible to set up some 'public Wikipedia proxies' (e.g. by a European
> chapter) and encourage their use.
> It would still be possible to checkuser people going through them, but
> a two-tier process would be needed (wiki checkuser + proxy admin), thus
> protecting against a “rogue checkuser” (is that the primary concern of
> good editors wishing to use proxies?). We could use that setup to gain
> information about usage (e.g. that it was 90% spam).
>
>
> >> 3) Look at Nymble - http://freehaven.net/anonbib/#oakland11-formalizing
> >> and http://cgi.soic.indiana.edu/~kapadia/nymble/overview.php .  It
> would
> >> allow Wikimedia to distance itself from knowing people's identities, but
> >> still allow admins to revoke permissions if people acted up.  The user
> >> shows a real identity, gets a token, and exchanges that token over tor
> >> for an account.  If the user abuses the site, Wikimedia site admins can
> >> blacklist the user without ever being able to learn who they were or
> >> what other edits they did.  More: https://cs.uwaterloo.ca/~iang/ Ian
> >> Goldberg's, Nick Hopper's, and Apu Kapadia's groups are all working on
> >> Nymble or its derivatives.  It's not ready for production yet, I bet,
> >> but if someone wanted a Big Project
> >
> > As Brad and Ariel point out, Nymble in the form described on the linked
> > project page does not seem to allow long-term blocks, and cannot deal
> with
> > dynamic IPs. In other words, it would only provide the analogue of
> > autoblock functionality for Tor users. The linked paper by Henry and
> > Goldberg is more realistic about these limitations, discussing IP
> addresses
> > only as one of several possible "unique identifiers" (§V). From the
> > concluding remarks to that chapter, it seems most likely that they would
> > recommend "some form of PKI or government ID-based registration" for our
> > purposes.
>
> Requiring a government ID for connecting through Tor would be even worse
> for privacy.
>
> I completely agree that matching against the IP address used to request
> the Nymble token is not enough. Maybe if the tokens were instead based on
> ISP+zone geolocation, that could be a way. Still, that would miss
> linkability for vandals who use, e.g., both their home and work
> connections.
>
>
> > 3a) A token authorization system (perhaps a MediaWiki extension) where
> > the server blindly signs a token, and then the user can use that token
> > to bypass the Tor blocks.  (Tyler mentioned he saw this somewhere in a
> > Bugzilla suggestion; I haven't found it.)
>
> Bug 3729 ?
>
>
> >> Thoughts? Are any of you interested in working on this problem?  #tor on
> >> the OFTC IRC server is full of people who'd be interested in talking
> >> about this.
>
> This is a social problem. We have the tools to fix it (account creation
> + IP block exemption). If someone asked me for that (in a project where
> I can grant it) because they are censored by their government, I would
> gladly do so.
> That also means that when they replaced 'Jimbo' with

Re: [Wikitech-l] monitoring / control system for bots

2013-01-04 Thread Matma Rex

On Fri, 04 Jan 2013 05:42:45 +0100, Lars Aronsson  wrote:


> On 01/02/2013 06:11 PM, Matthew Flaschen wrote:
> > Every wiki has a different approach to bots.  But for English Wikipedia,
> > that is not how the approval process
> > (https://en.wikipedia.org/wiki/Wikipedia:BOTAPPROVAL) works:
> >
> > "Small changes, for example to fix problems or improve the operation of
> > a particular task, are unlikely to be an issue, but larger changes
> > should not be implemented without some discussion. Completely new tasks
> > usually require a separate approval request. Bot operators may wish to
> > create a separate bot account for each task."
>
> That is what the rules say, but do you have any science
> to back up that this is also how it works in practice?
> How many bot accounts are revoked each month
> because their owners were naughty and used their bots
> in a different manner from what they applied for?
> The idea with a bot account, after all, is that nobody
> bothers to watch your edits in the Recent Changes.
>
> I think you can go forward if you accept that there are
> some bots that run like a machinery, according to the
> rules, and other bot accounts that are used like a more
> advanced browser for a creative and spontaneous user.



You are both assuming that there are no other wikis except for the English 
Wikipedia.

For example, on pl.wiki there are basically only two kinds of bots:
interwiki-only and multipurpose. As long as you're not breaking anything with
the bot and not making any controversial changes, once you've gotten the flag
you can do any task you deem necessary. A bot control system in this case
simply wouldn't work.

Not to mention that I think *most* of the bots on pl.wiki are run from users'
home computers, most often with AWB or a local pywikipedia install, though
there are at least three people who use their own libraries, including myself.

And if this is an en.wiki-only matter, this isn't really the right list to 
discuss it.



Re: [Wikitech-l] Wikidata change propagation

2013-01-04 Thread Daniel Kinzler
Thanks Rob for starting the conversation about this.

I have explained our questions about how to run updates in the mail titled
"Running periodic updates on a large number of wikis", because I feel that this
is a more general issue, and I'd like to decouple it a bit from the Wikidata
specifics.

I'll try to reply and clarify some other points below.

On 03.01.2013 23:57, Rob Lanphier wrote:
> The thing that isn't covered here is how it works today, which I'll
> try to quickly sum up.  Basically, it's a single cron job, running on
> hume[1].  
[..]
> When a change is made on wikidata.org with the intent of updating an
> arbitrary wiki (say, Hungarian Wikipedia), one has to wait for this
> single job to get around to running the update on whatever wikis are
> in line prior to Hungarian WP before it gets around to updating that
> wiki, which could be hundreds of wikis.  That isn't *such* a big deal,
> because the alternative is to purge the page, which will also work.

Worse: currently, we would need one cron job for each wiki to update. I have
explained this some more in the "Running periodic updates" mail.

> Another problem is that this is running on a specific, named machine.
> This will likely get to be a big enough job that one machine won't be
> enough, and we'll need to scale this up.

My concern is not so much scalability (the updater will just be a dispatcher,
shoveling notifications from one wiki's database to another) but the lack of
redundancy. We can't simply configure the same cron job on another machine in
case the first one crashes. That would lead to conflicts and duplicates. See the
"Running periodic updates" mail for more.

> The problem is that we don't have a good plan for a permanent solution
> nailed down.  It feels like we should make this work with the job
> queue, but the worry is that once Wikidata clients are on every single
> wiki, we're going to basically generate hundreds of jobs (one per
> wiki) for every change made on the central wikidata.org wiki.

The idea is for the dispatcher jobs to look at all the updates on wikidata that
have not yet been handed to the target wiki, batch them up, wrap them in a Job,
and post them to the target wiki's job queue. When the job is executed on the
target wiki, the notifications can be further filtered, combined and batched
using local knowledge. Based on this, the required purging is performed on the
client wiki, rerender/link-update jobs are scheduled, etc.
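
As a toy, self-contained model of that batching (in JavaScript for brevity;
the real dispatcher would be PHP working against the wikis' databases and job
queues, and all names below are invented):

// Changes logged on the repo (wikidata.org), oldest first.
const changes = [
  { id: 1, entity: "Q64" },
  { id: 2, entity: "Q42" },
  { id: 3, entity: "Q64" },
];
const dispatched = { huwiki: 0, dewiki: 2 }; // last change id sent per wiki
const jobQueues = { huwiki: [], dewiki: [] };

function dispatch(wiki) {
  const pending = changes.filter((c) => c.id > dispatched[wiki]);
  if (pending.length === 0) return;
  // One batched job instead of one job per change; the target wiki
  // filters and coalesces further using its local usage information.
  jobQueues[wiki].push({ type: "handleChanges", ids: pending.map((c) => c.id) });
  dispatched[wiki] = pending[pending.length - 1].id;
}

dispatch("huwiki");
console.log(jobQueues.huwiki); // [ { type: "handleChanges", ids: [1, 2, 3] } ]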

However, the question of where, when and how to run the dispatcher process
itself is still open, which is what I hope to change with the "Running periodic
updates" mail.

-- daniel



Re: [Wikitech-l] Generating documentation from JavaScript doc comments

2013-01-04 Thread Krinkle
On Dec 28, 2012, at 5:05 AM, Matthew Flaschen  wrote:

> We have all these JavaScript documentation comments, but we're not
> actually generating docs.  This has been talked about before, e.g.
> http://www.gossamer-threads.com/lists/wiki/wikitech/208357?do=post_view_threaded,
> https://bugzilla.wikimedia.org/show_bug.cgi?id=40143,
> https://www.mediawiki.org/wiki/Requests_for_comment/Documentation_overhaul#Implementation_ideas
> .
> 
> I don't think Doxygen is the best choice, though.  JavaScript is really
> put forth as a sort-of afterthought.
> 
> I suggest JSDoc (http://usejsdoc.org/), simply because it's a standard
> library and has has been put forward in the past, with good rationale.
> 
> I know there are other good ones too.
> 
> What do you think?
> 
> Matt Flaschen

Doxygen is indeed not meant for JavaScript. With some hacks it can be tricked 
into reading comment blocks from javascript files, but that won't scale for our 
code base, nor will it be enough to make a useful structure given the dynamic 
way JavaScript works.

JSDoc is pretty solid, though there are some concerns:
* The syntax is somewhat foreign compared to what we're doing right now
* Development is unclear (v2 on google-code has been discontinued, v3 on github
is a rewrite still being worked on)
* Written in JavaScript, but doesn't run on Node; it requires Java.
* Features appear to cover the general cross-language cases, but are too
limited when trying to document more complex JavaScript solutions (e.g.
VisualEditor's code base).

I've recently looked into a documentation generator for VisualEditor and though 
I haven't stopped looking yet, I'm currently pausing rather long at JSDuck. It 
is very well engineered and especially optimised for modern JavaScript 
(inheritance, mixins, event emitters, override/overload methods from another 
module, modules, etc.).

It is also easy to extend when needing to implement custom @tags.

I've set up a vanilla install for VisualEditor's code base here:

http://integration.wmflabs.org/mwext-VisualEditor-docs/

Right now, like MediaWiki core, VisualEditor is just documenting code loosely, 
not following any particular doc-syntax, so we're bound to require a few 
tweaks[1] no matter which framework we choose. Our current syntax is just 
something we came up with loosely based on what we're used to with Doxygen.

Right now the demo on labs only uses the "Doc" app of JSDuck, but it also
supports Guides, Demos, interactive live-editable Examples, and more.

A few random things I like in particular about JSDuck are:
* Support documenting parameters of callback functions
* Support documenting events emitted by a class/module
* Option to show/hide inherited methods and other characteristics
* Support to completely document objects for @param and @return (like @param 
{Object} foo, @param {number} foo.bar; see the sketch after this list)
* Live search and permalinks
* Markdown all the way + duck extension for doc specific syntax (e.g. @link and 
#method)
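
To give a flavor of that object and callback documentation, a doc block in
JSDuck syntax might look like the following sketch (the class and members are
invented for the example):

/**
 * Fetch a node and hand the result to a callback.
 *
 * @param {Object} options Request options
 * @param {string} options.title Title of the page to fetch
 * @param {number} [options.timeout=5000] Timeout in milliseconds
 * @param {Function} callback Called when the request finishes
 * @param {Error|null} callback.err Error, or null on success
 * @param {Object} callback.node The fetched node
 * @return {boolean} Whether the request was queued
 */
ve.Example.prototype.fetchNode = function (options, callback) { /* ... */ };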

If it works out, I think we can get this going for MediaWiki core as well.

Regardless of the framework we choose, we should set it up to be generated for 
branches and update on merge from jenkins's post-merge hooks. Come to think of 
it, we should probably do that for the PHP/Doxygen as well (is that still 
running from the cronjob on svn.wikimedia.org?).

-- Krinkle

[1] Perfectionist alert, this commit does more than just the necessary 
"tweaks": https://gerrit.wikimedia.org/r/42221/




[Wikitech-l] Running periodic updates on a large number of wikis.

2013-01-04 Thread Daniel Kinzler
This is a follow-up to Rob's mail "Wikidata change propagation". I feel that the
question of running periodic jobs on a large number of wikis is a more generic
one, and deserves a separate thread.

Here's what I think we need:

1) Only one process should be performing a given update job on a given wiki.
This avoids conflicts and duplicates during updates.

2) No single server should be responsible for running updates on a given wiki.
This avoids a single point of failure.

3) The number of processes running update jobs (let's call them workers) should
be independent of the number of wikis to update. For better scalability, we
should not need one worker per wiki.

Such a system could be used in many scenarios where a scalable periodic update
mechanism is needed. For Wikidata, we need it to let the Wikipedias know when
data they are using from Wikidata has been changed.

Here is what we have come up with so far for that use case:

Currently:
* there is a maintenance script that has to run for each wiki
* the script is run periodically from cron on a single box
* the script uses a pid file to make sure only one instance is running.
* the script saves its last state (continuation info) in a local state file.

This isn't good: It will require one process for each wiki (soon, all 280 or so
Wikipedias), and one cron entry for each wiki to fire up that process.

Also, the update process for a given wiki can only be configured on a single
box, creating a single point of failure. If we had a cron entry for wiki X on
two boxes, both processes could end up running concurrently, because they won't
see each other's pid file (and even if they did, via NFS or so, they wouldn't be
able to detect whether the process with the id in the file is alive or not).

And if the state file or pid file gets lost or becomes inaccessible, hilarity
ensues.


Soon:
* We will implement a DB-based locking/coordination mechanism that ensures that
only one worker will be updating any given wiki, starting where the previous job
left off. The details are described in
.

* We will still be running these jobs from cron, but we can now configure a
generic "run ubdate jobs" call on any number of servers. Each one will create
one worker, that will then pick a wiki to update and lock it against other
workers until it is done.

There is however no mechanism to keep worker processes from piling up if
performing an update run takes longer than the time it takes for the next worker
to be launched. So the frequency of the cron job has to be chosen fairly low,
increasing update latency.

Note that each worker decides at runtime which wiki to update. That means it can
not be a maintenance script running with the target wiki's configuration. Tasks
that need wiki specific knowledge thus have to be deferred to jobs that the
update worker posts to the target wiki's job queue.


Later:
* Let the workers run persistently, each running its own poll-work-sleep loop
with configurable batch size and sleep time.
* Monitor the workers and re-launch them if they die.

This way, we can easily scale by tuning the expected number of workers (or the
number of servers running workers). We can further adjust the update latency by
tuning the batch size and sleep time for each worker.
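
As a rough sketch of such a persistent worker (runnable with Node.js; the
in-process lock set merely stands in for the DB-based coordination described
above, and all names are invented):

const wikis = ["enwiki", "dewiki", "huwiki"];
const locked = new Set(); // stand-in for a DB lock row with an expiry time

function tryLockWiki() {  // pick any wiki not currently being worked on
  for (const w of wikis) {
    if (!locked.has(w)) {
      locked.add(w);
      return w;
    }
  }
  return null;
}

async function worker(batchSize, sleepMs) {
  while (true) {          // the poll-work-sleep loop
    const wiki = tryLockWiki();
    if (wiki !== null) {
      try {
        console.log("updating " + wiki + ", up to " + batchSize + " changes");
        // ... fetch pending changes, post batched jobs, save position ...
      } finally {
        locked.delete(wiki); // release so other workers can pick it up
      }
    }
    await new Promise((resolve) => setTimeout(resolve, sleepMs));
  }
}

worker(100, 1000);        // several such workers can run side by side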

One way to implement this would be via puppet: puppet would be configured to
ensure that a given number of update workers is running on each node. For
starters, two or three boxes running one worker each, for redundancy, would be
sufficient.

Is there a better way to do this? Using start-stop-daemon or something like
that? A grid scheduler?

Any input would be great!

-- daniel


