[Wikitech-l] secure.wikimedia.org is no more

2012-11-14 Thread Faidon Liambotis
Hi,

Following last year's Native HTTPS efforts¹, I've pushed a change² today
that redirects all the old secure.wikimedia.org URLs to the respective
native HTTPS ones, e.g.
 https://secure.wikimedia.org/wikipedia/en/wiki/Main_Page gets redirected to
 https://en.wikipedia.org/wiki/Main_Page

The redirects are HTTP temporary redirects (302) for now. I'll soon
switch them to permanent ones (301); please do let me know if you see
any breakage in the meantime.
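
In mod_rewrite terms, the rule is conceptually something like the
following sketch (illustrative only; see the Gerrit change² for the
actual deployed configuration):

  # Old scheme: https://secure.wikimedia.org/wikipedia/<lang>/wiki/<page>
  # New scheme: https://<lang>.wikipedia.org/wiki/<page>
  RewriteEngine On
  # R=302 (temporary) for now, to be switched to R=301 (permanent):
  RewriteRule ^/wikipedia/([a-z-]+)/(.*)$ https://$1.wikipedia.org/$2 [R=302,L]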

Regards,
Faidon

¹: 
http://blog.wikimedia.org/2011/10/03/native-https-support-enabled-for-all-wikimedia-foundation-wikis/
²: https://gerrit.wikimedia.org/r/#/c/13429/



Re: [Wikitech-l] secure.wikimedia.org is no more

2012-11-14 Thread Faidon Liambotis
On Wed, Nov 14, 2012 at 01:48:27PM -0500, Derric Atzrott wrote:
> >Following last year's Native HTTPS efforts¹, I've pushed a change² today
> >that redirects all the old secure.wikimedia.org URLs to the respective
> >native HTTPS ones, e.g.
> > https://secure.wikimedia.org/wikipedia/en/wiki/Main_Page gets redirected to
> > https://en.wikipedia.org/wiki/Main_Page
> 
> Does anyone know if EFF's HTTPS Everywhere extension is set up to redirect to
> secure.wikimedia.org?  If so, someone might want to let them know that we've
> made this change.
> 
> I'll volunteer to do so if no one else wishes to.

HTTPS Everywhere is currently set up to redirect using the native HTTPS
support (http://en.wp -> https://en.wp); it used to redirect to
secure.wikimedia.org, but Roan Kattouw and Sam Reed updated it quite a
while ago. secure.wikimedia.org never supported plain HTTP, and HTTPS
requests to it are handled by the new redirects without any privacy
loss, so there's nothing to add to HTTPS Everywhere that I can see.
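
For reference, HTTPS Everywhere rules for the native setup are small
XML rulesets along these lines (a simplified sketch, not the
extension's actual Wikipedia ruleset):

  <ruleset name="Wikipedia (sketch)">
    <target host="en.wikipedia.org" />
    <rule from="^http://en\.wikipedia\.org/"
          to="https://en.wikipedia.org/" />
  </ruleset>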

Thanks for the offer though.

Regards,
Faidon

PS. Fun fact: HTTPS Everywhere's git master already has rules for
Wikidata & Wikivoyage, thanks to the always awesome Reedy.



Re: [Wikitech-l] Proposed new WMF browser support framework for MediaWiki

2012-11-20 Thread Faidon Liambotis
On Tue, Nov 20, 2012 at 05:19:51PM -0800, James Forrester wrote:
> In WMF Engineering, we've been struggling with what we mean by 'supporting'
> browsers, and how we can match limited developer time to our natural desire
> to make everyone happy.
> 
> So, to turn this mass of text into an 'ask', I would love the thoughts of
> this list about this. Do you think this might work? Is "making sure all the
> different parts of MediaWiki keep working with [browser X]" something
> you'd see yourself volunteering to do?

So, I think you're intermixing two different things: MediaWiki support
and "WMF Engineering" or "cluster" browser support. They're not exactly
the same thing.

For example, browsers make a huge difference in the SSL features they
support. So, we currently don't do SNI, as it's unsupported on certain
browsers/platforms¹ (mainly: Windows XP and Android < 3). Other SSL
features (e.g. RFC 5077) are in a similar state, and I guess we can
think of other such features not exclusive to MediaWiki.
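
As an aside, SNI behavior is easy to observe from the command line with
OpenSSL (the hostname below is just an example):

  # An SNI-capable client sends the hostname during the TLS handshake,
  # letting the server pick the right certificate:
  openssl s_client -connect en.wikipedia.org:443 -servername en.wikipedia.org

  # Clients without SNI (e.g. browsers on Windows XP) omit it, so the
  # server can only present its default certificate:
  openssl s_client -connect en.wikipedia.org:443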

Should this policy be expanded to cover such cases too? I think so. We
should definitely put in more effort, though, and expand your use cases
to cover ops use cases too.

Regards,
Faidon

¹: 
http://en.wikipedia.org/wiki/Server_Name_Indication#Browsers_with_support_for_TLS_server_name_indication.5B5.5D



Re: [Wikitech-l] Proposed new WMF browser support framework for MediaWiki

2012-11-20 Thread Faidon Liambotis
On Tue, Nov 20, 2012 at 05:46:22PM -0800, Brion Vibber wrote:
> "Current and immediately-previous" releases are also really hard to match
> up between projects on fast release cycles (like Chrome and Firefox which
> are pushing out new "major versions" every couple months) and those where
> "major versions" only change a few times per decade, like IE.
> 
> Supporting Chrome 22 (23 - 1) and supporting IE 9 (10 - 1) are totally
> different animals with different usage profiles. Really nobody should be
> running Chrome 22 -- it probably means your computer's broken and not
> installing updates -- but IE 9's all over the place -- as is 8.

Agreed. IE 9 is only supported from Vista onwards and Windows XP is
21.29% of our user base according to the latest stats¹. I'm not sure
it's realistic to say that 20% of our user base may just "happen to
work" by luck.

Regards,
Faidon

¹: http://stats.wikimedia.org/wikimedia/squids/SquidReportOperatingSystems.htm



Re: [Wikitech-l] Proposed new WMF browser support framework for MediaWiki

2012-11-21 Thread Faidon Liambotis
On Wed, Nov 21, 2012 at 09:17:24AM -0800, James Forrester wrote:
> Those numbers are people using Windows XP, not people using Windows XP
> with IE. I believe the numbers for (XP && IE) are going to be
> substantially lower - probably half - but still far too high to
> discount. 

Doh, my bad.

> However, you are right that Windows XP is likely to become the next
> barrier to proper Web development after IE6, and perhaps we should
> instead make an exception for IE compared to the other big four
> browsers and suggest supporting current, and two immediately-previous
> versions.

Well, I think with this exception you're trying to adjust the "latest
browser" rule to cover special cases, instead of acknowledging these
special cases for what they are.

e.g. if IE 11 comes out in 6 months while IE 8/Windows XP remains
prevalent, are we going to adjust the rule to say "three
immediately-previous versions"? What if Microsoft suddenly decides to
use the Chrome/Firefox versioning scheme?

The real reason is that you want Windows XP support, so you might just
as well put that in the rules, instead of extrapolating from the
browser's platform support.  Also, do note another thing from the other
sub-thread: SNI works on IE 8 on Vista or later, but not on IE 8 on
Windows XP, so the browser version rule won't work well in this case.

I think there is a more general flaw here, as also evident with the
Android exception. The problem is that you just can't drop support for
browsers that have a large market share, no matter when they were
released. I think that market share needs to be factored into the
policy, or else we'll end up adding exceptions to the rules every time
our policy dictates that we're going to drop support for an older but
popular platform.

> Given that I suggested "I'd be happy to talk through the individual
> browser-level decisions but it might be easier to agree that we want
> to focus browser support before we decide the exact focus of this."
> I'm assuming this means you're happy with the overall policy and we're
> just bike-shedding over which versions of which browsers? ;-)

Well, it makes complete sense to me to have a supported-browsers
policy, if that's what you're asking. On the other hand, I'm no
MediaWiki or web developer, so my view should be weighed accordingly :)

Regards,
Faidon



Re: [Wikitech-l] Limiting storage/generation of thumbnails without loss of functionality

2013-01-23 Thread Faidon Liambotis
On Wed, Jan 23, 2013 at 01:53:39PM -0800, Aaron Schulz wrote:
> I'd strongly suggest considering this kind of approach.

Ditto. Among other benefits already mentioned, having a predetermined
set of sizes would help greatly in the architecture and capacity
planning of media storage, as well as in various maintenance tasks that
we occasionally need to do (like replicating thumbnails across
datacenters).
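
As a trivial illustration of the idea (hypothetical sizes and code, not
an actual proposal), requests would be snapped to a fixed set of widths:

  # Hypothetical Python sketch: snap arbitrary thumbnail requests to a
  # predetermined set of widths, so storage holds a bounded number of
  # renditions per image.
  BUCKETS = [120, 240, 320, 480, 640, 800, 1024]

  def bucket_width(requested):
      """Return the smallest predefined width that covers the request."""
      for width in BUCKETS:
          if width >= requested:
              return width
      return BUCKETS[-1]

  assert bucket_width(200) == 240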

Regards,
Faidon



Re: [Wikitech-l] [Labs-l] Maria DB

2013-02-14 Thread Faidon Liambotis
On Thu, Feb 14, 2013 at 05:14:31PM +0100, Mark Bergsma wrote:
> Debug information is *highly useful* in a production setup, and we try
> to run all our core applications with it so we have a chance to debug
> issues when they occur.
> 
> I think the only reason distributions omit debug information is to
> save disk space.

A lot of Debian packages ship a -dbg package alongside the main one,
containing the stripped-out debug symbols under /usr/lib/debug (gdb
loads those automatically, based either on the filename or on the
build-id). The toolchain handles this more or less automatically, but
it still takes maintainer action to define the separate package and
upload it with every package upload.
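
In practice that looks roughly like this ("foo" being a placeholder
package name):

  # install the main package plus its detached debug symbols
  $ apt-get install foo foo-dbg
  # gdb then picks up the symbols from /usr/lib/debug automatically,
  # matched by path or by the binary's build-id
  $ gdb /usr/bin/foo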

Ubuntu has experimented in the past with the concept of automatically
generating and shipping symbols for *all* packages, packaged up as
"ddebs" (same format as .deb) and shipped via a separate repository
that isn't mirrored by all of the downstream mirrors.

This was years ago; I'm not sure what has happened since then. I
remember it being discussed in Debian as well, but it was never
adopted, probably because no one ever implemented it :)

For MySQL/MariaDB, it seems that the Debian packages don't ship a -dbg
package by default. That's a shame; we can ask for one. As for the rest
of Asher's changes, I'd love to find a way to make stock packages work
in our production setup, but I'm not sure if the maintainer would
welcome the extra complexity of conditionally switching behaviors. We
can try if you're willing to, Asher :)

Regards,
Faidon


Re: [Wikitech-l] Etherpad Lite labs server going down today at 22:00 UTC (14:00 Pacific) for upgrade

2013-02-28 Thread Faidon Liambotis
On Thu, Feb 28, 2013 at 11:53:45AM -0800, Mark Holmquist wrote:
> As absurd as this is for me to be sending out a warning about taking
> down a labs service, this seems appropriately courteous especially
> given the amount of use this instance has been getting.

I've been in quite important meetings with 15+ attendees where Etherpad
Lite was used exclusively -- so, clearly not for testing or staging
purposes, which makes Labs the wrong place to host this. Can you
coordinate with us (the operations team) to move the service into a
production environment?

Thanks,
Faidon


Re: [Wikitech-l] Proposal: Wikitech contributors

2013-04-03 Thread Faidon Liambotis
On Tue, Apr 02, 2013 at 05:45:58AM -0700, Quim Gil wrote:
> * wikitech.wikimedia.org would become the one and only site for our
> open source software contributors, powered by semantic software and
> an ontology of categories shared across wiki pages, Bugzilla and
> hopefully Gerrit.

This is excellent. We recently had the merger of labsconsole and
wikitech, and this is a great (and ambitious!) next step. Kudos.

One thing that I'd like to see though -and which I think is central to
the success of your proposal- would be the allocation of more
engineering resources for wikitech. Ryan and Andrew are doing an
excellent job holding the fort almost by themselves, but if we're to do
this, I'd like to see more of an effort and commitment from other
people/teams.

Having the same person do the UI, be the sole maintainer of the
OpenStack & authentication MW extensions, and handle
operations/upgrades for the wiki (among a million other things) just
doesn't cut it. Both the labsconsole and wikitech experiences had
several smaller or larger issues, and while the recent efforts as part
of the merger alleviated some of them, we still have a long way to go.
For example, I think some of the OSM pages in particular could use some
help from UX experts -- no offence to Ryan, I'm sure I'd do worse; but
UX is in neither of our job descriptions.

Incorporating wikitech into the regular platform processes (deployments
& version updates, configs, etc.) would also be needed if we want to do
a reasonable job and provide contributors with a maintained platform.
From lagging MediaWiki versions to IPv6, wikitech is just not at the
same level of support as the rest of the cluster right now.

Regards,
Faidon


Re: [Wikitech-l] Countdown to SSL for all sessions?

2013-04-30 Thread Faidon Liambotis
On Tue, Apr 30, 2013 at 11:14:48AM -0700, Daniel Friesen wrote:
> On Tue, 30 Apr 2013 10:27:21 -0700, Petr Bena  wrote:
> 
> >SSL is requiring more CPU, both on server and client and disable all
> >kinds of cache (such as squid or varnish), and some browsers may have
> >problems with it OR in some countries encryption may be even illegal.
> 
> SSL does not disable caches. SSL is terminated at the cluster,
> Squid/Varnish are in on the raw HTTP and serve out HTTP EXACTLY the
> same way they serve out normal HTTP requests (they even use the
> exact same cache thanks to our protocol relative urls).
 
I can verify that the above is correct and Petr is wrong.

However, we terminate SSL before proxying to the normal caching layers,
and the infrastructure for this is too small to handle a significant
portion of the traffic (if it were bigger, it'd be a waste of resources
and hence money, considering its current usage). If we were to push
normal traffic to it, we'd quickly reach all kinds of limits, incl. CPU
and network.

That isn't to say that it's impossible to scale this infrastructure up
if needed (or, more likely, to adapt its design to incorporate such an
expansion by putting it closer to the caching layers), but keep in mind
that it's not just a matter of enabling a MediaWiki config setting; it
also involves other operations-related engineering work.

That being said, my gut tells me that making all the logins SSL-enabled
is not going to make a significant difference compared to current usage,
but I don't have any numbers to back this up right now. Maybe Ryan has
them.

Cheers,
Faidon


Re: [Wikitech-l] [WikimediaMobile] Caching Problem with Mobile Main Page?

2013-05-05 Thread Faidon Liambotis
On Fri, May 03, 2013 at 03:19:13PM -0700, Asher Feldman wrote:
> 1) Our multicast purge stream is very busy and isn't split up by cache
> type, so it includes lots of purge requests for images on
> upload.wikimedia.org.  Processing the purges is somewhat cpu intensive, and
> I saw doing so once per varnish server as preferable to twice.

I believe the plan is to split up the multicast groups *and* to filter
based on predefined regexps at the HTCP->PURGE layer, via the
varnishhtcpd rewrite. But I may be mistaken; Mark and Brandon will know
more.

> There are multiple ways to approach making the purges sent to the frontends
> actually work such as rewriting the purges in varnish, rewriting them
> before they're sent to varnish depending on where they're being sent, or
> perhaps changing how cached objects are stored in the frontend.  I
> personally think it's all an unnecessary waste of resources and prefer my
> original approach.

Although the current VCL calls vcl_recv_purge after the rewrite step
(and hence actually rewrites purges too), unless I'm mistaken this is
actually unnecessary. The incoming purges match the way the objects are
stored in the cache: both are without the .m. (et al.) prefix, as
normal "desktop" purges are matched with objects that had their URLs
rewritten in vcl_recv. Handling purges after the rewrite step might be
unnecessary, but that doesn't make it a bad idea; it doesn't hurt much,
and it's actually better, as it allows us to also purge via the
original .m. URL, which is what a person might do instinctively.

While mobile purges were indeed broken recently, in a similar way to
what you guessed, by I77b88f[1] ("Restrict PURGE lookups to mobile
domains"), they were fixed shortly after with I76e5c4[2], a full day
before the frontend cache TTL was removed.

1: 
https://gerrit.wikimedia.org/r/#q,I77b88f3b4bb5ec84f70b2241cdd5dc496025e6fd,n,z
2: 
https://gerrit.wikimedia.org/r/#q,I76e5c4218c1dec06673aa5121010875031c1a1e2,n,z

What actually broke them again this time is I3d0280[3], which stripped
absolute URIs before vcl_recv_purge, despite the latter having code
that matches only against absolute URIs. This is my commit, so I'm
responsible for this breakage -- although in my defence, I now have an
even score, for having discovered the flaw last time around :)

I've pushed and merged I08f761[4] which moves rewrite_proxy_urls after
vcl_recv_purge and should hopefully unbreak purging while also not
reintroducing BZ #47807.

3: 
https://gerrit.wikimedia.org/r/#q,I3d02804170f7e502300329740cba9f45437a24fa,n,z
4: 
https://gerrit.wikimedia.org/r/#q,I08f7615230037a6ffe7d1130a2a6de7ba370faf2,n,z

As a side note, notice how rewrite_proxy_urls & vcl_recv_purge are both
flawed in the same way: the former exists solely to work around a
Varnish bug with absolute URIs, while the latter *depends* on that bug
manifesting in order to work at all. req.url should always be a
(relative) URL, and hence the if (req.url ~ '^http:') comparison in
vcl_recv_purge should normally always evaluate to false, making the
whole function a no-op.

However, due to the bug in question, Varnish doesn't special-handle
absolute URIs, in violation of RFC 2616. This, in combination with the
fact that varnishhtcpd always sends absolute URIs (due to an
RFC-compliant behavior of LWP's proxy() method), is why we have this
seemingly wrong VCL code that nevertheless works as intended.
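
To make the interplay concrete, here is a heavily simplified VCL sketch
(not the actual production VCL) of the ordering that I08f761 restores:

  sub vcl_recv {
      # 1. Handle purges first, while req.url still holds the absolute
      #    URI that varnishhtcpd sent (left intact only because of the
      #    Varnish bug discussed below):
      if (req.request == "PURGE" && req.url ~ "^http:") {
          /* ...match against the purged host/path and purge... */
      }
      # 2. Only then rewrite proxy-style absolute URIs into the
      #    relative form the rest of the VCL expects:
      if (req.url ~ "^http://") {
          set req.url = regsub(req.url, "^http://[^/]+", "");
      }
  }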

This Varnish bug was reported upstream by Tim[5] and the fix is
currently sitting in Varnish's git master[6]. It's simple enough that
it might be worth backporting, although that might be more trouble than
it's worth, considering how it would break purges with our current
VCL :)

5: https://www.varnish-cache.org/trac/ticket/1255 
6: 
https://www.varnish-cache.org/trac/changeset/2bbb032bf67871d7d5a43a38104d58f747f2e860

Cheers,
Faidon


Re: [Wikitech-l] draft goals for Engineering Community Team for the next 12 months

2013-06-29 Thread Faidon Liambotis

Hi Sumanah,

On Fri, Jun 28, 2013 at 07:39:19PM -0400, Sumana Harihareswara wrote:

> https://www.mediawiki.org/wiki/Wikimedia_Engineering/2013-14_Goals#Wikimedia_Technical_Community
>
> The Engineering Community Team has some draft goals for what we'd like
> to achieve in the next 12 months.  We'll still be running Bugzilla,
> putting out the monthly report, running GSoC and OPW, and doing those
> other continuous tasks (as you can follow at
> https://www.mediawiki.org/wiki/Wikimedia_Platform_Engineering#Engineering_Community_Team
> ).  But what else should we be concentrating on?  This is a draft of
> what we'd like to focus on, quarter by quarter.
>
> I welcome your comments here or on the talk page.


Back in April, Quim made a proposal on this list for a plan to attract 
new contributors that included a "deeper restructuring of our community 
spaces", including a reshuffling/repurposing of wikitech/mediawiki.org.


I think the outcome of that discussion was to run an experiment and 
reevaluate, but I might have lost track -- the Wikitech contributors 
RFC[1] shows no real updates since April though.


A deeper restructuring sounds like goal material. Is this under 
consideration for the coming year?


Personally, I'd love to see some movement on the wikitech/mediawiki
split; I think it becomes increasingly important the further we invest
in projects such as Wikimedia Labs/Tool Labs and in bridging the gap
between operations and software development.


Thanks,
Faidon

1: https://www.mediawiki.org/wiki/Requests_for_comment/Wikitech_contributors


Re: [Wikitech-l] git.wikimedia.org dead?

2013-08-11 Thread Faidon Liambotis

Hi,

On Sun, Aug 11, 2013 at 12:51:15PM +0200, rupert THURNER wrote:

> As Chad points out, it's being served now
>
> it's plural (robots.txt)
>
> many thanks for getting it up quickly last time! unfortunately
> https://git.wikimedia.org is unresponsive again.


Thanks for the report! I just restarted it again. The root cause was
the same; unfortunately, it's not just zip files that kill it: Googlebot
asking for every file/revision is more than enough.

Until we have a better solution (and monitoring!) in place, I've changed
robots.txt to "Disallow: /". This means no search indexing for now,
unfortunately.


Faidon


Re: [Wikitech-l] Wikimedia's anti-surveillance plans: site hardening

2013-08-17 Thread Faidon Liambotis

On Fri, Aug 16, 2013 at 08:04:24PM -0400, Zack Weinberg wrote:
> Hi, I'm a grad student at CMU studying network security in general
> and censorship / surveillance resistance in particular. I also used
> to work for Mozilla, some of you may remember me in that capacity. My
> friend Sumana Harihareswara asked me to comment on Wikimedia's plans
> for hardening the encyclopedia against state surveillance.




First of all, thanks for your input; it's much appreciated. As I'm sure
Sumanah has already mentioned, all of our infrastructure is being
developed in the open using free software, and we'd also be very happy
to accept contributions in code/infrastructure-as-code.


That being said, literally everything in your mail has already been
considered and discussed multiple times :), plus a few things you
didn't mention (GCM ciphers, OCSP stapling, SNI & split certificates,
short-lived certificates, ECDSA certificates). A few have been
discussed on wikitech; others are under internal discussion &
investigation by some of us, with findings to be posted here too when
we have something concrete.


I don't mean this to sound rude, but I think you may be oversimplifying 
the situation quite a bit.


Enabling HTTPS for everyone on a website of our scale isn't a trivial
thing to do. Besides matters of policy -blocking Wikipedia in China
isn't something that can be done lightly- there are significant
technical restrictions. Just to give a few examples: there is no
software that can both do SPDY and take as input the key for encrypting
SSL session tokens, something that's needed if you run a cluster of
load-balancers (you also need to rotate that key periodically; a lot of
people miss this). There is no software out there that supports both a
shared SSL session cache and SPDY[1]. Etc.


DANE is great and everything, but there's no software available that
satisfies our GeoDNS requirements *and* supports DNSSEC. I know, I've
tried them all. Traditional DNSSEC signing proxies (e.g. OpenDNSSEC)
don't work at all with GeoDNS. (FWIW, we're switching to gdnsd, which
has a unique set of characteristics and whose author, Brandon Black,
was hired into the ops team shortly after we decided to switch.)


Plus, DNSSEC has only marginal adoption client-side (and DANE has none
yet) and has important side effects. For example, you're more likely to
be used as a source for DNS amplification attacks, as your responses
get larger. More importantly, you're breaking users, something that
needs to be carefully weighed against the benefits.


If you need numbers, this is from a paper from USENIX Security '13 last 
week titled "Measuring the Practical Impact of DNSSEC Deployment"[2]:
"And we show, for the first time, that enabling DNSSEC measurably 
increases end-to-end resolution failures. For every 10 clients that are 
protected from DNS tampering when a domain deploys DNSSEC, approximately 
one ordinary client (primarily in Asia) becomes unable to access the 
domain."


Is dedicating (finite) engineering time to writing the necessary code
for e.g. gdnsd to support DNSSEC, just to be able to support DANE (for
which there's exactly ZERO browser support), while at the same time
breaking a significant chunk of users, a sensible thing to do?


We'll keep wikitech -and the blog, where appropriate- up to date with
our plans as they evolve. In the meantime, feel free to dive into our
puppet repository, see our setup and make your suggestions :)


Best,
Faidon
(wmf ops)

[1]: stud does SSL well (but not SPDY), but does not pass
X-Forwarded-For, and Varnish doesn't support the PROXY protocol that
stud provides, so one of the two would need to be coded (and we've
already explored what it'd take to code it). nginx scales up and has
some support for SPDY, but doesn't have a shared-across-systems session
cache or session token key rotation support, nor does it support ECDSA.
Apache 2.4 has all that, but we're not sure of its performance
characteristics yet, plus mod-spdy won't cut it for us. Etc.


[2]: 
https://www.usenix.org/conference/usenixsecurity13/measuring-practical-impact-dnssec-deployment


Re: [Wikitech-l] trace gitblit, was: Re: Wikimedia's anti-surveillance plans: site hardening

2013-08-18 Thread Faidon Liambotis

On Sat, Aug 17, 2013 at 10:19:10PM +0200, rupert THURNER wrote:

> (2) by when you will adjust your operating guideline, so it is clear
> to faidon, ariel and others that 10 minutes tracing of an application
> and getting a holistic view is mandatory _before_ restoring the
> service, if it goes down for so often, and for days every time. the 10
> minutes more can not be noticed if it is gone for more than a day.


I think you're making several incorrect assumptions and mistakes here.

First, this is, respectfully, the wrong approach to how we should react
to emergencies. git.wm.org as a service has owners, and I'm not one of
them. Mine was a reaction to an outage of a service I know next to
nothing about, on a Sunday. The service was down, and the first
priority was to restore it and to make sure it wouldn't die again when
I wasn't looking at my screen. The outage was attributed to an overload
caused by Googlebot, and I decided that Google indexing could be
temporarily suspended until the situation could be properly assessed,
by the right people, with a clear head. I still think it was a fair
compromise to make.


Second, you're assuming that the reason for the outage was a software
bug and that a stack trace would have been helpful. I haven't followed
up, but I don't think that's correct here -- and even if it were, it's
wrong to assume it always will be; that's what post-mortems are for.
Preliminary investigation showed the website being crawled for every
single patch by every single author, plus stats going back random
amounts of time, for all Wikimedia git projects. All software will
break under the right amount of load, and there's nothing stack traces
can help with here. (rel=nofollow would help, but a stack trace
wouldn't tell you that.)


Third, you're assuming that others are not working with the gitblit
upstream. Chad is already collaborating with gitblit upstream and had
done so before this happened. He's already engaging with them on other
issues potentially relevant to this outage[1]. He also has an excellent
track record from (at least) his previous collaboration with the Gerrit
upstream. Finally, Ori contributed a patch, since merged, for one of
the root causes of gitblit outages[2].


Fourth, you're extrapolating my personal "pushing upstream" attitude
from a single incident and a single interaction with me, from there
extrapolating the team's and the Foundation's attitude, and finally
reaching the conclusion that we won't collaborate with upstreams for
HTTPS. These are all incorrect extrapolations and wild guesses.

I can tell you for a fact that you couldn't be farther from the truth.
Both I and others work closely with a large number of upstreams
regularly. I've worked with all kinds of upstreams for years, since
long before I joined the Foundation, and I'm not planning to stop
anytime soon -- it's also part of my job description and of the
organizational mandate, so it's much more than my personal modus
operandi.


Finally, specifically for the cases you mention: for HTTPS, ironically,
I sent a couple of patches to httpd-dev last week for bugs that I found
while playing with Apache 2.4 for Wikimedia's SSL cluster. One of them
came out of discussions with Ryan on SSL session security and potential
attacks[3]. As for DNS, I worked closely with upstream[4], Brandon, on
gdnsd when I was still evaluating it for use by Wikimedia (and I even
maintain it in Debian nowadays[5]); Brandon was hired by the Foundation
a few months later -- it's hard to get better relations with upstream
than having them on the team, don't you think?


I hope this addresses your concerns. If not, I'd be happy to provide
more information and take feedback. But please, let's keep it civil,
technical, on-topic and in perspective -- the issue at hand (security &
protection from state surveillance) is far too important for us to be
discussing the response to a minor gitblit outage in the same thread,
IMHO.


Best,
Faidon

1: https://code.google.com/p/gitblit/issues/detail?id=274
2: 
https://github.com/gitblit/gitblit/commit/da71511be5a4d205e571b548d6330ed7f7b72d99
3: 
http://mail-archives.apache.org/mod_mbox/httpd-dev/201308.mbox/%3c20130805111906.ga26...@tty.gr%3E
4: https://github.com/blblack/gdnsd/issues/created_by/paravoid?state=closed
5: http://packages.debian.org/sid/gdnsd


Re: [Wikitech-l] Wikimedia's anti-surveillance plans: site hardening

2013-08-22 Thread Faidon Liambotis

On Sat, Aug 17, 2013 at 05:55:36PM -0400, Sumana Harihareswara wrote:

> I suggest that we also update either
> https://meta.wikimedia.org/wiki/HTTPS or a hub page on
> http://wikitech.wikimedia.org/ or
> https://www.mediawiki.org/wiki/Security_auditing_and_response with
> up-to-date plans, to make it easier for experts inside and outside the
> Wikimedia community to get up to speed and contribute.  For topics under
> internal discussion and investigation, I would love a simple bullet
> point saying: "we're thinking about that, sorry nothing public or
> concrete yet, contact $person if you have experience to share."


This is a good suggestion. We have a pad that we'd been working on even
before this thread; a few of us (Ryan, Mark, Asher, Ken, myself) met
the other day, worked a bit on our strategy from the operations
perspective, and put out our notes at:

https://wikitech.wikimedia.org/wiki/HTTPS/Future_work

It's still a very rudimentary bullet-point summary, so it might not be
an easy read. Feel free to ask questions here or on-wiki.


There are obviously still a lot of unknowns -- we have a lot of
"evaluate this" TODO items. Feel free to provide feedback or audit our
choices, though; it'd be very much welcome. If you feel you can help in
some of these areas in some other way, feel free to say so and we'll
try to find a way to make it happen.


Regards,
Faidon


Re: [Wikitech-l] Wikimedia's anti-surveillance plans: site hardening

2013-08-23 Thread Faidon Liambotis

On Fri, Aug 23, 2013 at 06:53:29AM -0700, Bry8 Star wrote:

> At my first few small-scale implementations, i did not pay attention
> to rate-limiting techniques, then i realized its importance over time.


RRL support for gdnsd is being tracked upstream at:
https://github.com/blblack/gdnsd/issues/36
(filed by yours truly, 7 months ago; Brandon has left some really good
and lengthy responses there)


You're right that it's a prerequisite for DNSSEC support: large DNSSEC
responses -and, more importantly, the tiny queries that trigger them-
are appealing to DNS amplification attackers.


Thanks,
Faidon


Re: [Wikitech-l] Guidelines for db schema changes

2012-04-24 Thread Faidon Liambotis
On Tue, Apr 24, 2012 at 05:52:24PM -0700, Rob Lanphier wrote:
> As we do more frequent deploys, it's going to become critical that we
> get database schema changes correct, and that we do so in a way that
> gives us time to prepare for said changes and roll back to old
> versions of the software should a deploy go poorly.  This applies both
> to MediaWiki core and to WMF-deployed extensions.
> 
> I'd like to propose that we make the following standard practice:

I'm still new around here so pardon me if this sounds infeasible for us:

In other systems I've worked on before, such problems have been solved
by each schema-breaking version providing schema *and data* migrations
for both the forward *and backward* steps.

This means that the upgrade transition mechanism knew how to add or
remove columns or tables *and* how to fill them with data (say, by
concatenating two columns of the old schema). The same program would
also take care to do the exact opposite steps in the migration's
backward method, in case a rollback was needed.

The migrations themselves can be kept in the source tree, perhaps even
versioned and with the schema version kept in the database, so that
both we and external users can at any time forward a database to any
later version, automagically.

I think that both Ruby on Rails and Python/Django (with South) employ
such schemes, and I've seen them work well in practice before.
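
To illustrate the shape of such a pair, here is a hypothetical
forward/backward migration that splits a combined name column (table,
column names and the MySQL dialect are all made up for the example):

  -- 0002_split_name.up.sql (forward)
  ALTER TABLE user ADD COLUMN first_name VARCHAR(255);
  ALTER TABLE user ADD COLUMN last_name VARCHAR(255);
  UPDATE user SET first_name = SUBSTRING_INDEX(name, ' ', 1),
                  last_name  = SUBSTRING_INDEX(name, ' ', -1);
  ALTER TABLE user DROP COLUMN name;

  -- 0002_split_name.down.sql (backward: the exact inverse)
  ALTER TABLE user ADD COLUMN name VARCHAR(255);
  UPDATE user SET name = CONCAT(first_name, ' ', last_name);
  ALTER TABLE user DROP COLUMN first_name;
  ALTER TABLE user DROP COLUMN last_name;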

Regards,
Faidon



Re: [Wikitech-l] Relations with freenode and wikimedia

2012-06-21 Thread Faidon Liambotis
On Thu, Jun 21, 2012 at 09:35:56AM +0200, Petr Bena wrote:
> One developer recently complained about some freenode policies,
> specifically that wiki projects (wikipedia etc has some kind of
> exception) are no longer allowed to be hosted on freenode network,
> which is supposed to host only opensource projects. It's fact that as
> the wikimedia project is becoming more large the freenode is getting
> less and less suitable. Right now there is a page [1] where are
> discussed other options for IRC. One of the options is to leave
> freenode and set up own wikimedia IRC network, which has lot of
> benefits but also lot of issues (moving to another network is
> complicated given to number of channels and users).

Setting up and properly maintaining an IRC network is extremely
complicated. We really *really* shouldn't do that, esp. since there is
no reason for us to do so when there are other open networks around.

Even if the situation with freenode doesn't work out (which I think it
will), we could perhaps reach out to OFTC, an alternative IRC network
to which some free software projects have fled (there were discussions
in the past to merge freenode/OFTC, but those proved unfruitful).

Regards,
Faidon



Re: [Wikitech-l] Welcoming OpenHatch to organize the pre-Wikimania hackathon

2012-06-22 Thread Faidon Liambotis
On Thu, Jun 21, 2012 at 05:27:08PM -0400, Asheesh Laroia wrote:
> Excerpts from Alolita Sharma's message of Mon Jun 18 14:42:08 -0400 2012:
> > Excellent news Sumana!
> > 
> > Welcome Asheesh and OpenHatch team :-)
> 
> Thanks for the warm welcome!

Welcome! I wonder how you'll manage to do both Wikimania's hackathon and
DebConf :)

See you soon,
Faidon



Re: [Wikitech-l] Barkeep code review tool

2012-06-30 Thread Faidon Liambotis
On Sat, Jun 30, 2012 at 04:06:37PM -0700, Roan Kattouw wrote:
> On Fri, Jun 29, 2012 at 2:20 PM, Marcin Cieslak  wrote:
> > As seen on IRC:
> >
> > https://github.com/ooyala/barkeep/wiki/Comparing-Barkeep-to-other-code-review-tools
> >
> The most prominent feature of Barkeep mentioned on this page is that
> it was built for a post-commit review workflow. Given that the reason
> we moved MediaWiki to git was basically so that we could move *away*
> from post-commit review, I don't think using Barkeep as-is would work.

Well, in the ops puppet repo, we very often +2 commits ourselves and
push them, instead of waiting for someone else to review/approve them.
You could argue that it's our workflow that's wrong, but I just think
the needs of infrastructure-as-code might be different from the needs
code development has.

It's like asking for pre-execution reviews of whatever you type at a
shell prompt, and we can all agree that's just crazy :) In a perfect
world we'd stage every change in VMs, where we'd do local puppet
commits without reviews, and then push those tested changesets into the
pre-commit review system to get into production. But we're very far
from that, and being perfectionists just hurts our daily work.

Having a proper post-commit review workflow would be much better than
hacking around the system and reviewing our own commits. It'd also
allow us to actually do post-commit reviews, something that rarely
happens right now. At least I'd do them, whereas currently it's a PITA
to do so.

> Then again, from watching the demo video (see getbarkeep.org) it looks
> like their UI is a lot better than Gerrit's, and I like features like
> saved searches and most-commented-on-commits dashboards. Integrating
> Barkeep or the UI/UX ideas from it with Gerrit (or vice versa --
> integrating Gerrit's pre-commit review workflow support with Barkeep
> -- but I think that would be harder) would be cool but I have no
> concrete ideas as to how it could be done right now.

Barkeep claims to work with both post- and pre-commit workflows,
although the details elude me.

The UI is much *much* nicer than Gerrit's. They also have a demo
website, which is a pleasure to work with IMHO.

They also claim to have useful, relevant and configurable e-mail
notifications, in contrast to Gerrit's, which are basically useless. Or
maybe I'm too much of a relic, preferring to read commit diffs in my
mail client rather than in fancy web interfaces :)

All in all, I like it very much, but I don't have a broad knowledge of
how people use Gerrit right now, so I can't form an opinion on whether
it's suitable for us.

At least there's some competition in the space, and other people have
the same problems as (at least) I do; that's good :)

Regards,
Faidon



Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-19 Thread Faidon Liambotis
On Thu, Jul 19, 2012 at 06:29:58PM -0700, Roan Kattouw wrote:
> gitlab might be this, but it's written in Ruby so presumably our
> developer community would be less able to contribute to it. And I'm
> pretty sure ops is not just gonna say "sure, no problem" if/when we
> ask them to deploy a Ruby web app :)

It's not like Java is popular in the ops world either :)

Preferences vary within the team, of course, but my personal view is
that I don't mind either, as long as they don't assume or mess too much
with the system (like JBoss does, or like Rails apps with tons of
vendor/ dependencies or "gem installs").

Regards,
Faidon



Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-19 Thread Faidon Liambotis
On Thu, Jul 19, 2012 at 11:58:54AM -0700, Erik Moeller wrote:
> From what I can tell, we have essentially three choices:
> 
> * Continue to work with the heavily centralized and clunky Gerrit
> workflow, and try to beat it into shape to not cause us too much pain,
> while seeing people increasingly move into GitHub for doing
> experimental projects. Hope for Gerrit's large developer base to
> ultimately join a rewrite effort, or to incrementally make it better.
> Build Gerrit/GitHub bridges if possible.
> 
> * Drink the kool-aid and join the wonderful semi-open world of GitHub,
> with all the risks and benefits it entails.
> 
> * Help build a realistic alternative to GitHub, probably by building
> on the open source toolset that's the closest right now in terms of
> its overall architectural approach and existing functionality. (My
> impression: Gitorious is closer in functionality but at a lower
> overall quality, Phabricator/Barkeep are much more limited in
> functionality right now and therefore would take a long time to be
> usable with our current workflow assumptions.)

That's a nice summary. I (too?) like option 3. My distaste for Gerrit
and my distaste for closed platforms are probably on par right now.

Regards,
Faidon



Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-23 Thread Faidon Liambotis
On Fri, Jul 20, 2012 at 02:52:55PM -0700, Ryan Lane wrote:
> On Thu, Jul 19, 2012 at 11:55 PM, Antoine Musso  wrote:
> > Daniel Friesen wrote:
> >> The ops guys hate ruby.
> >
> > I am pretty sure they love it. Puppet itself is a DSL based on top of
> > ruby. The ops argument is we don't want to handle security updates and
> > nasty performance bug for yet another language.
> 
> No. No. We hate it. I, personally, also dislike puppet, but that's
> another discussion totally.

I think the "we" is a bit unwarranted. I don't hate Ruby and I certainly
don't hate Ruby more than Java :-) I also don't feel the same about
puppet; I do see some problems with it, but none of them have nothing to
do with the fact that it's written in Ruby.

I think this discussion is pointless though. If we find a good tool for
the job and it's clear how to install and maintain it, I don't see why
we should care about its implementation language, at least from an ops
perspective. I've seen horrible & difficult-to-operate software written
in Python and perfect software written in Ruby. I don't see how our
personal preferences in languages have any value here.

Regards,
Faidon



Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-25 Thread Faidon Liambotis
On Wed, Jul 25, 2012 at 02:21:03PM -0400, Derric Atzrott wrote:
> >As mentioned before, we can't use github enterprise at all, since it
> doesn't allow for hosting public repos. Let's ignore that it even exists.
> 
> I feel like as Wikipedia is one of the top 10 most visited sites on the
> Internet we might be able to work out something special with them though
> right?  I'm not saying we have to go down that route, nor have I even
> examined all the advantages and disadvantages of the idea.  I feel though
> that the possibility exists though and should be looked into.
> 
> If I was GitHub and the WMF approached me about potentially using GitHub
> Enterprise for MediaWiki and MediaWiki extensions and NOT for creating a
> competition service to GitHub, then I would likely entertain the idea of
> crafting a special set of terms for them.  Furthermore, I personally might
> even charge them differently given that that charging per developer would be
> crippling to an open-source project.  Of course, I am not GitHub, nor can I
> anticipate what they might do, nor their internal policies, but I can speak
> for myself and how I would run a FOSS focused company.

I think the BitKeeper story is relevant here: BitKeeper was one of the
first DVCSes. It was (is?) a proprietary, for-profit tool that gave
special free licenses to certain free/open-source software projects,
like Linux. Linux used it for years, due to it having some at-the-time
unique features (and Linus liking it), although it was a controversial
choice in the community.

At one point, due to some "reverse engineering" (basically, typing
"help" at its server and showing the output in a talk) by some
community members, the company behind BitKeeper decided to revoke this
free (as in beer) license from those members, effectively halting Linux
development. Git was first published a week after that.

Now, the situation is a bit different here. But I can certainly imagine
such a "special" license exception being revoked, GitHub Enterprise
being discontinued as a product, or whatever else. Can you imagine the
disruption that would cause to our development and operations?

Regards,
Faidon



Re: [Wikitech-l] Serious alternatives to Gerrit

2012-07-26 Thread Faidon Liambotis
On Thu, Jul 26, 2012 at 12:08:39AM -0700, Rob Lanphier wrote:
> > I can get behind the decision to use a currently substandard tool in order
> > to preserve Wikimedia's long term freedom.
> 
> Even if we accept that Gerrit is substandard (which I don't),
> preserving freedom is a motivating factor.  Moving fully to GitHub
> means not merely letting people who are more comfortable with
> proprietary tools use them.  It crosses the line to requiring it for
> everyone, which sucks.

This is not aimed at you specifically, but as a general remark: there
are multiple other alternatives besides Gerrit and GitHub that have
been mentioned, with the potential of getting the best of both worlds.
I think it's better not to polarize the discussion between the two most
popular options at this point, and to talk about all the options.

Regards,
Faidon



Re: [Wikitech-l] Criteria for "serious alternative"

2012-07-26 Thread Faidon Liambotis
On Thu, Jul 26, 2012 at 11:27:44AM -0700, Rob Lanphier wrote:
> Nothing else has been advocated with a degree of seriousness as to warrant
> consideration at this point.  That's not to say we're done with those
> options; if someone wants to put together a serious proposal, there's still
> a little time.  However, in order to practically consider the alternatives,
> we need to have the serious proposals enumerated, and a credible plan for
> addressing any deficiencies.

I don't understand. I thought we were collecting problems and *ideas* on
how to solve them, not solid plans for a migration.

You're basically comparing two options that involve little or no work
(not changing Gerrit; moving to an externally-maintained service whose
workings most of us know) vs. plans that need *time* to install, play
with and evaluate.

Has anyone been allocated to that task? I don't like either of the two
options, but I don't think I can just stop what I'm doing and spend a
week evaluating e.g. GitLab, Barkeep and Phabricator just to present my
argument (or counterargument) on the wiki page.

My understanding of the process was that we would collect a broad set of
arguments/ideas/proposals and people would be later assigned to the task
of evaluating them and proposing a viable solution and a migration path
(or not, and propose that we stay with Gerrit).

The wiki page seems to also imply something like that by saying:
   Brion Vibber will lead this evaluation, with help from Chad Horohoe
   and David Schoonover.

If it's me who misunderstood, that's fine and I'm sorry. I'll just feel
a bit silly for having argued for an impossible outcome :)

Regards,
Faidon



Re: [Wikitech-l] Criteria for "serious alternative"

2012-07-27 Thread Faidon Liambotis
On Thu, Jul 26, 2012 at 11:36:50PM -0700, Rob Lanphier wrote:
> On Thu, Jul 26, 2012 at 12:49 PM, Faidon Liambotis 
> wrote:
> > My understanding of the process was that we would collect a broad set of
> > arguments/ideas/proposals and people would be later assigned to the task
> > of evaluating them and proposing a viable solution and a migration path
> > (or not, and propose that we stay with Gerrit).
> >
> 
> Yes, we're seeking a broad range of proposals.  However, "proposals" is the
> key word.  That means looking the requirements, reading the website and
> matching against those requirements, and stitching together something that
> at least looks good on paper.  I'm not expecting anyone to set up a
> prototype, but I am asking that, given how long we've been talking about
> this, that we narrow down our options a bit to the things that we know are
> worth looking at rather than (still, a year later) having the "have you
> looked at this?" discussion again.

I think GitLab looks promising, but I'm unable to judge it against all
of our requirements just from the online demo, without spending some
amount of time on it. I've put some pros and cons as I see them on the
wiki page and hope it can be considered.

Regards,
Faidon



Re: [Wikitech-l] Can we make an acceptable behavior policy? (was: Re: Mailman archives broken?)

2012-08-17 Thread Faidon Liambotis
On Fri, Aug 17, 2012 at 07:05:24AM -0400, MZMcBride wrote:
> > "the mess you made".
> > 
> > Right there, in that phrase, you have aggressively indicated the following:
> > 
> > a) That you believe someone fucked up;
> > b) That you think they're incompetent;
> > c) That you think they're being lazy about it
> 
> I didn't intend to indicate most of that, of course. That said, system
> administrators are trusted to not break things and when they do, there's a
> moral obligation to make a good-faith effort to fix that which was broken by
> their actions. In this case, the moral culpability equation is enhanced by
> various factors previously discussed.

I've been silent because others seemed to be handling it. But I can't
stay silent anymore. Your initial mail was disturbing enough. Your
complete lack of understanding of what multiple people are saying to
you, and the lack of an apology, are even worse.

Your words hurt people, set a bad precedent of aggressive behavior and
are counter-productive. Please stop this.

Regards,
Faidon



Re: [Wikitech-l] IPv6 usage on Wikimedia?

2012-09-18 Thread Faidon Liambotis
On Tue, Sep 18, 2012 at 01:30:14PM -0700, S Page wrote:
> I imagine mobile users on IPv6 might be more aggressively cached by
> their providers, and they aren't requesting as many resources per page
> view, so Wikimedia's share of IPv6 users might be higher.

I wouldn't count on mobile devices having any significant IPv6
adoption, even at our scale.

The dual-stack (IPv4v6 PDP) 3GPP specs are relatively new and supported
by only a handful of carriers, the number of devices supporting two
PDPs is exactly zero¹, and the only smartphone operating systems
supporting IPv6 on the mobile interface are Symbian and Android >=
4.0/ICS (and the latter only partially, depending on the handset's
chipset). The situation with USB sticks and MiFis is similar, although
I was pleasantly surprised last week to see Verizon's newer generation
of MiFis supporting IPv6 out of the box.

On the WiFi interface things are significantly better (it works with
all Symbians and Android >= 2.1), but still with more hiccups than on
fixed networks (DHCPv6 and RFC 6106 support is non-existent, IPv6 is
only supported on the relatively new iOS >= 4, etc.).

Regards,
Faidon

¹: The Nokia N900 supported that, but a) it's EOL, and b) it needed a
patched kernel and manual configuration anyway.



Re: [Wikitech-l] IPv6 routing problem?

2012-10-15 Thread Faidon Liambotis
Hi,

Thanks for forwarding the report. I've chatted with the user via IRC on
Sunday and subsequently via e-mail, so we're on it. For what it's worth,
the underlying issue is still there, although restoring European traffic
via the esams (Amsterdam) cluster has significantly reduced the impact.

Regards,
Faidon

On Sun, Oct 14, 2012 at 06:16:14PM +0200, Erwin Dokter wrote:
> This was just posted on [[en:Wikepedia:Village Pump (technical)]]:
> 
> 
> Hi,
> I am the admin for an ISP and we are now deploying IPv6 to some customers
> 
> bits.wikimedia.org is failing over IPv6 from the the range 2a02:3d8::/32
> 
> the routing is broken upstream of our primary IPv6 provider between
> tele2.net and wikimedia so it may also be affecting other ip6
> address ranges
> 
> [bminish@redbox ~]$ tracepath6 bits.wikimedia.org
>  1?: [LOCALHOST]   0.017ms pmtu 1500
>  1:  gw6.mayo.lan  0.178ms
>  1:  gw6.mayo.lan  0.146ms
>  2:  2a02:3d8:1:::1:1  0.757ms
>  3:  brendan1-brendan2.westnet.ie  0.983ms
>  4:  isl-kw-brendan1.westnet.ie1.766ms
>  5:  2a02:3d8::104::1  2.549ms
>  6:  ktm12-kw.westnet.ie   4.917ms
>  7:  piglet-eth2v3006.westnet.ie   5.308ms
>  8:  mole-ge2.westnet.ie   12.328ms
>  9:  2001:978:2:60::3:111.503ms
> 10:  te3-7.ccr01.dub01.atlas.cogentco.com  27.049ms asymm 17
> 11:  te1-4.ccr01.man01.atlas.cogentco.com  26.786ms asymm 17
> 12:  te1-6.ccr02.lhr01.atlas.cogentco.com  26.927ms asymm 17
> 13:  2001:978::112 27.599ms asymm 17
> 14:  peer-as174.cbv.tele2.net  27.216ms
> 15:  cbv-core-2.gigabiteth4-4.tele2.net29.172ms
> 16:  cbv-core-3.tengige0-0-0-0.tele2.net   35.459ms !N
>  Resume: pmtu 1500
> 
> 
> The message was unsigned, but posted from 2A01:7B8:2000:A6:0:0:0:10.
> 
> -- 
> Erwin Dokter
> 
> 



Re: [Wikitech-l] Let's talk about Solr

2012-10-18 Thread Faidon Liambotis
On Thu, Oct 18, 2012 at 11:22:05AM -0700, Asher Feldman wrote:
> I think Solr is the right direction for us to go in.  Current efforts can
> pave the way for a complete refresh of WMF's article full text search as
> well as how our developers approach information retrieval.  We just need to
> make sure that these efforts are unified, with commonality around the
> client api, configuration, indexing (preferably with updates asynchronously
> pushed to Solr in near real-time), and schema definition.  This is
> important from an operational aspect as well, where it would be ideal to
> have a single distributed and redundant cluster.

I'm curious: has anyone evaluated ElasticSearch and whether it'd be
more or less suitable for us than Solr? If so, I'd be very interested
in the comparison results for our use cases.

Regards,
Faidon



Re: [Wikitech-l] SPDY?

2012-10-28 Thread Faidon Liambotis
On Sat, Oct 27, 2012 at 09:52:34PM +0300, Strainu wrote:
> Are there any plans for an SPDY [1] test on the Wikimedia servers?
> 
> I'm currently doing some speed tests on a robot and I found out (not
> quite to my surprise) that it's much quicker to get whole pages
> (hundreds at a time) than to ask the API for each page's last edit
> time. I would love to see how would the results compare when using
> SPDY.

Some of us have talked about SPDY a few times -nothing too formal, no
time allocated or work being done- and checked out the current options.
Unfortunately, a proper deployment isn't that simple:

• There aren't many implementations out there, and even fewer mature
  ones. Nginx is a good candidate for us because we already use it for
  SSL termination, but its SPDY implementation was first released ~2
  months ago -- albeit with a large website behind it. (See the sketch
  after this list for what enabling it would roughly look like.)

• SPDY is young and is being developed as a living specification; there
  are already three incompatible versions out there, with SPDY/1 no
  longer being used and SPDY/3 being supported by Firefox 15 and Chrome
  22 (released just 10 days ago!). Its security hasn't been proven yet
  either: an attack on it, called CRIME, was released a month ago. It's
  a moving target, and this doesn't make things easier.

• To really take advantage of SPDY and connection multiplexing, you
  need to undo any domain sharding that you might have (which for HTTP
  is of course a standard/best practice for everyone, incl. us). This
  is a significant change in our architecture, one that can't be done
  from one day to the next.

• SPDY as it is right now requires SSL. Our SSL cluster is not scaled up
  yet and not ready to serve a significant portion (or all) of our
  traffic. There are various problems in doing so, and the solutions to
  some of them are in direct conflict with SPDY.
  
• Finally, SPDY is a bit controversial right now. A lot of smart people
  are discussing HTTP/2.0 on IETF lists and SPDY is a big part of
  those discussions, with various proponents but also opponents.
  
  For example, it looks like Varnish —which we use a lot and plan to
  use even more— isn't ever going to support SPDY natively, but is
  instead planning architectural changes to better support HTTP/2.0,
  when that comes. More in Poul-Henning Kamp's mail¹ and his keynote at
  the recent VUG²; I was there and asked specifically about SPDY, and
  from the answer I got, it doesn't seem likely that Varnish will
  implement SPDY any time soon.

Now, none of this is to say that we won't try! We need all the help we
can get, so if there are interested people out there who are willing
to help, Labs is open :) We'll make sure to try and provide all the
support you might need!
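
Incidentally, if you want to check what a given server speaks today, NPN
makes this easy to test from the command line with a new-enough OpenSSL
(1.0.1+). Something like the following -- output trimmed, and the
hostname is just a well-known SPDY deployment, not ours:

 $ openssl s_client -connect www.google.com:443 -nextprotoneg 'spdy/3,http/1.1' < /dev/null
 ...
 Protocols advertised by server: spdy/3, spdy/2, http/1.1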

Regards,
Faidon

¹: https://www.varnish-cache.org/docs/trunk/phk/http20.html
²: https://www.varnish-cache.org/vug6/

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Should MediaWiki CSS prefer non-free fonts?

2013-10-27 Thread Faidon Liambotis

On Mon, Oct 28, 2013 at 01:32:30PM +1100, Tim Starling wrote:

Yes, we should prefer to use free software. We should also strive to
ensure that our support for users on non-free platforms is optimal, as
long as that doesn't negatively impact on users of free platforms. So
I don't think it is a problem to specify non-free fonts in font lists.


It's a bit more complicated than that. Linux distros ship with 
fontconfig (which is used by Cairo, which in turn is used by at least 
Firefox). Fontconfig aliases fonts via a set of rules and the default 
rules map popular non-free fonts to their free metric equivalents, or 
to generics, e.g.:


$ fc-match Helvetica
n019003l.pfb: "Nimbus Sans L" "Regular"

$ fc-match Arial
LiberationSans-Regular.ttf: "Liberation Sans" "Regular"

$ fc-match "Courier New"
LiberationMono-Regular.ttf: "Liberation Mono" "Regular"

$ fc-match INVALID
DejaVuSans.ttf: "DejaVu Sans" "Book"
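
fc-match can also print the full sorted fallback list that fontconfig 
would try (the output obviously varies per system; indicative only, 
assembled from the matches above):

$ fc-match -s Helvetica | head -3
n019003l.pfb: "Nimbus Sans L" "Regular"
LiberationSans-Regular.ttf: "Liberation Sans" "Regular"
DejaVuSans.ttf: "DejaVu Sans" "Book"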

This effectively means that, for Linux, having the free fonts at the end 
of the CSS font selection is probably[1] a no-op: the browser will 
never fall back via the CSS, but match the first font on the list to an 
equivalent found on the system via fontconfig's fallback mechanisms. It 
will be an educated guess and possibly do the right thing but it won't 
be what the web designer intended.


This basically strengthens your point: free fonts should be first in the 
list.


Regards,
Faidon

[1]: I say "probably", because I vaguely remember the interactions 
between Firefox & fontconfig being complicated. Maybe they're being 
smarter -- someone should test :)


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Should MediaWiki CSS prefer non-free fonts?

2013-10-28 Thread Faidon Liambotis

On Mon, Oct 28, 2013 at 02:56:42PM -0700, S Page wrote:

On Mon, Oct 28, 2013 at 10:20 AM, Erik Moeller  wrote:

Prioritizing freely licensed fonts while also explicitly naming the
preferred non-free fonts seems like an easy fix.



Again, this is already done for us by fontconfig when presented with the
ASCII sequence "H e l v e t i c a". There's no reason for our font stack to
do it unless some freely-licensed font looks better than the non-free font;
also it will work against the brave and few who follow steps like
http://www.binarytides.com/gorgeous-looking-fonts-ubuntu-linux/ to adjust
the appearance of those well-known names.


The problem with relying on fontconfig instead of being explicit about 
your choices is that it's basically undefined behavior. On my system 
fontconfig chooses Liberation Sans, on yours it might choose Roboto, 
etc.


Maybe the designer doesn't care which font is used as long as it's 
sans-serif, but in this case why would the CSS say "Helvetica" instead 
of "sans-serif"? IOW, if one wants to be explicit about their font 
choices, they might just as well be explicit across platforms.


(I don't know much about web design though, I could be wrong :)

Regards,
Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Architectural leadership in Wikimedia's technical community

2013-11-07 Thread Faidon Liambotis

On Tue, Nov 05, 2013 at 05:57:31PM -0800, Erik Moeller wrote:

So how should this role evolve going forward? Some possible paths (you
know I like to present options ;-):


The "architect" title, besides the job description that you described, 
is also a seniority level within the WMF's engineering department.  
Other organizations do e.g. "sr./staff/sr. staff" and/or numeric 
levels; we do "associate / (blank) / sr. / (lead) architect". At least 
that's my understanding of what was presented during the last all-staff 
and documented on officewiki.


What would happen to this seniority level, if any of the options you 
presented were to be adopted? You seem to hint that there would be a 
mapping with option D ("salary increases") but it's not clear to me how 
direct of a mapping or a prerequisite that would be.


I don't think it'd be reasonable to say that we have, as an organization 
with ~180 FT staff, a peer review process, an HR department, managers, 
directors & VPs *but* you can't be promoted inside the organization 
until an open community process says so (or, in case of option A, *at 
all*). It'd be even more illogical considering that currently no other 
positions exist where there is such a connection: this hasn't been the 
case for promotions to Sr. -- and, even in the leadership track, there 
have been promotions to managers, directors & VPs, with no such open 
community process.


If that's not the intention, on the other hand, I think it's useful to 
either hear WMF management's views on that or, if it is up for 
discussion, have this discussion in parallel with this one.


Whether we like it or not, the existing title has *two* meanings as it 
is —and my understanding is that the salary aspect came first, too— and 
I don't think we can have a meaningful discussion for the one without 
knowing the implications for the other.


Regards,
Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [Ops] Operations buy in on Architecture of mwlib Replacement

2013-11-13 Thread Faidon Liambotis

On Wed, Nov 13, 2013 at 03:41:33PM -0800, Matthew Walker wrote:
* Node.JS itself should be installable via apt package (we'll have to 
do a custom package so that we get Node v10)


I haven't looked at your document yet, but a quick note on that: I've 
had nodejs 0.10 backported packages ready for about 10 days now.


We typically avoid running multiple versions of the same package across 
the infrastructure (and our apt repo isn't split like that, thankfully), 
so I'd like to upgrade the existing users to 0.10. These are parsoid, 
statsd, perhaps the not-production-yet limn, and etherpad-lite; of 
these, parsoid is the one with the most impact.


As such, we've agreed with Gabriel -who needed node 0.10 for rashomon 
anyway- to test the new version under the parsoid RTT suite & 
subsequently in the Parsoid Labs instance, before we go and upgrade 
production. (The packages have been in parsoid.wmflabs.org's /root/ 
since). I haven't heard since but as they don't /need/ a new Node 
version right now, I guess this is low priority for them (and we, ops, 
don't care much either).


I think this would happen anyway before the PDF service would ever reach 
production, but I think we can prioritize it a bit more and make sure it 
will. Gabriel, what do you think?


Regards,
Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Module storage is coming

2013-12-03 Thread Faidon Liambotis

On Tue, Dec 03, 2013 at 12:30:00AM -0800, Ori Livneh wrote:

We ran a controlled test and found that module storage reduced page load
times by 156 ms, on average. Aaron has some data available at <
https://meta.wikimedia.org/wiki/Research:Module_storage_performance>, but
we still need to write several sections. The size of the effect is
substantially smaller on mobile, for some reason, which is surprising. We
hope to make the dataset public soon.


That sounds great, Ori. Nice work, from both of you :)


156ms shaved off of 90% of page views is pretty nice.
http://perspectives.mvdirona.com/2009/10/31/TheCostOfLatency.aspx is worth
reading for context and scale:

"This conclusion may be surprising -- people notice a half second delay? --
but we had a similar experience at Amazon.com. In A/B tests, we tried
delaying the page in increments of 100 milliseconds and found that even
very small delays would result in substantial and costly drops in 
revenue."


I couldn't agree more. It's widely accepted across the industry that bad 
site performance/latency is detrimental to user engagement (simply put: 
speed is a feature). It's exciting to see some much-needed good work in 
this area.


Regards,
Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] ARM servers

2014-01-13 Thread Faidon Liambotis

On Mon, Jan 13, 2014 at 05:47:12PM +0800, James Salsman wrote:

Can someone more familiar with the Foundation's server infrastructure
needs than I please create a page somewhere with a checklist of
packages, modules, tools, etc., which need to be on arm but aren't
yet?


Before we do that, we need to find a use case for ARM servers and we 
need to find quality ARM hardware (= not first generation) at 
reasonable prices for their performance.


We've thought a bit about it in the past, but couldn't come up with a 
use case that made technical or financial sense. We have dozens of x86 
servers e.g. just for MediaWiki; having thousands of ARM servers for the 
same purpose instead doesn't sound very appealing. It'll increase our 
problems (e.g. scalability), it will slow down individual requests and 
it's unlikely to provide a technical or financial benefit. Other use 
cases are similar: even our storage servers are too busy CPU-wise for 
ARM to make sense.


Maybe if ARM gets sufficiently fast (e.g. with arm64) and they've been 
proven in the field by the early adopters, it might make sense for us in the 
long run. But I don't foresee us becoming one of these early adopters 
anytime soon. I'd love to be convinced otherwise, though :)


Regards,
Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Jake requests enabling access and edit access to Wikipedia via TOR

2014-01-13 Thread Faidon Liambotis

On Mon, Jan 13, 2014 at 11:43:53AM -0500, Marc A. Pelletier wrote:

If you start with that assumption, then it is unreasonable to assume
that the endpoints aren't /also/ compromised or under surveillance.

Editing Wikipedia is an inherently public action, if your security or
life is in danger from editing it, then TOR will protect neither because
even if you had 100% confidence in every possible exit node (which is
most certainly false), it does nothing to protect the endpoints.

What TOR may be good at is to protect your privacy from casual or
economic spying; in which case going to some random Internet access
point to create an account protects you adequately.


What do you mean by "endpoints"? Based on the above, I think you've 
completely misunderstood Tor's design & mode of operation.  
https://www.torproject.org/about/overview.html.en might be a good start, 
and I'm sure you can find more technical information if you search 
around.


As for the "inherently public action" of editing Wikipedia: the content 
of edits is public, but the identity of the editor is not (or should not 
be), so your claim baffles me a bit. Plus, it sounds a bit like a 
variation of the "I have nothing to hide" argument to me, with which I 
couldn't disagree more.


Regards,
Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Tor exemption process (was: Re: Jake requests enabling access and edit access to Wikipedia via TOR)

2014-01-17 Thread Faidon Liambotis

On Fri, Jan 17, 2014 at 01:26:04PM -0800, Erik Moeller wrote:

The Board or global community could decide that protecting users'
right to anonymity is more important than having abuse prevention
tools relying on IP disclosure, but in the absence of such a
Board-level decision or community-wide vote, I don't think the
situation relative to Tor users will change. My personal view is that
we should transition away from tools relying on IP disclosure, given
the global state of Internet surveillance and censorship which makes
tools like Tor necessary.


Hear, hear. I couldn't agree more.

My own view:

This matter isn't about dissidents in oppressive regimes or suspected 
criminals. It never was, but it has become especially apparent to 
everyone this summer.


The whole world -literally everyone- is being constantly surveilled and 
our communications recorded for decades to come. Everyone is a suspect 
and everyone has a file. We'll never be sure again, for example, that 
actions that we perform today, as innocent as they are now -like a 
Wikipedia edit- won't be used against us in 5 or 10 years to link us 
with a crime or group.


All access & edits to Wikipedia being monitored isn't some paranoid 
theory anymore, we can be more than sure of it. Tor is one of the very 
few ways to resist this pervasive surveillance and work around the 
panopticon of modern states. We *must* find a way to support it as a 
first-class citizen, for exactly the same reasons Wikipedia has been 
protective of users' privacy and has a stringent privacy policy.


(I was at 30C3; I got a bazillion complaints from numerous people about 
this every time I mentioned my affiliation, even before Jake's talk)


Regards,
Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Drop support for PHP 5.3

2014-02-18 Thread Faidon Liambotis
On Tue, Feb 18, 2014 at 09:51:25AM -0800, Chad wrote:
> On Tue, Feb 18, 2014 at 9:48 AM, Trevor Parscal wrote:
> > PHP 5.4 added a few important features[1], namely traits, shorthand array
> > syntax, and function array dereferencing. I've heard that 5.3 is nearing
> > end of life.
> >
> > I propose we drop support for PHP 5.3 soon, if possible.
> >
> >
> I'm in favor of bumping to a 5.4 minimum as well since 5.3 is
> approaching its end of life upstream.
> 
> As I pointed out on IRC, the question is how quickly the distros
> will follow. Right now the current Ubuntu LTS has us stuck on
> 5.3.something. It looks like 14.04 will have 5.5.8 which is nice
> but not out until April :)

That is not actually the holdup (or if it is, it's a miscommunication
and it shouldn't be). We can backport/build/maintain PHP packages
ourselves. We, in fact, run our own 5.3 packages with some minor changes
compared to precise's.

Last time we were discussing PHP 5.4 it was quite a while ago but I
remember hearing that we'd need to do some porting work for our
extensions. Plus, there was the debate we were having about Suhosin
that I don't think ended up anywhere :)

However, last I heard, platform engineering is focusing on HHVM now
instead, so I'm not sure if it actually makes sense to spend resources
to move to PHP 5.4 right now.

Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Webfonts

2014-03-13 Thread Faidon Liambotis
On Thu, Mar 13, 2014 at 07:20:27PM -0400, MZMcBride wrote:
> I think you're mostly right, though the exact terms of the trade-offs
> aren't clear here (e.g., "some bandwidth"). We'll need more explicit
> measurements in order to reach full agreement on what user benefit vs.
> site performance trade-offs Wikimedia is willing to accept.
> 
> It's also necessary to hear from the Wikimedia operations team. In
> addition to end-users weighing the importance of a feature against its
> cost, the operations team must also make practical considerations. Some
> Wikimedia wikis get some substantial traffic. ;-)

As you say, we need to have more data before we can promise anything,
"some bandwidth" isn't enough.

As a general comment though, delivering a small amount of files such as
webfonts is a trivial task and we can easily scale the infrastructure to
handle *a lot* of additional traffic. It costs, though, both in
bandwidth (a dollar amount per mbps) and in upgrades that might be
necessary in network ports or hardware upgrades (loadbalancers, routers,
servers).

I think at this point it's more useful to focus the discussion on the
usefulness of webfonts, especially in combination with the performance
impact that they have on clients (a problem that we can't throw money
at). If the outcome is that the feature enhances the overall user
experience, we'll handle the infrastructure part.

Regards,
Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Mapping our upstream projects

2014-03-18 Thread Faidon Liambotis
Quim,

On Tue, Mar 18, 2014 at 10:47:04AM -0700, Quim Gil wrote:
> * projects we develop that we want others to use and contribute to (e.g.
> MediaWiki)
> * projects others develop and we embed in our architecture (e.g.
> Elasticsearch)
> * projects others develop and we embed in our processes (e.g. Jenkins)
> 
> These projects define our location in the free software map. The health of
> our projects depends on their own health, and also on the health of our
> common links.

I'm not sure what "embed in our architecture" or "embed in our
processes" means. Could you clarify that?

I see for example that the page has a lot of the shiny stuff (e.g. Lua
is there, but bash/PHP/Python/Ruby are not). Moreover, a few random
libraries are there but others that we take for granted are not (e.g.
librsvg is there, but no one thought of gzip or bzip2; unihan vs. all of
the dozens of fonts that we use, etc.). Not to mention all the hundreds
of absolutely essential tools that we use for system maintenance that
no one ever sees or cares about, from Linux to GNU sed, dpkg, apt etc.

I think this needs to be clarified and/or scoped a bit better, including
explaining the rationale & motivation behind doing all this work.

For what it's worth, a uniqued dpkg -l across the production
infrastructure shows 3276 software packages and personally I'd have a
very hard time filtering the list based on what fits in the above
description.
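
For the record, "uniqued dpkg -l" was nothing fancier than something
along these lines; the host list file and the ssh fan-out are
hypothetical stand-ins for whatever tooling one has at hand:

$ for h in $(cat production-hosts); do ssh "$h" "dpkg-query -W -f '\${Package}\n'"; done | sort -u | wc -l
3276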

Regards,
Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [Engineering] Update on HHVM

2014-03-21 Thread Faidon Liambotis
Thanks for the update, Ori. Exciting stuff :)

On Fri, Mar 21, 2014 at 03:42:41AM -0700, Ori Livneh wrote:
> * We need good packages. The packages provided by Facebook have some deep
> issues that need to be fixed before they meet our packaging standards.
>  This is a good opportunity to recognize Faidon's leadership on this front:
> he has been liasoning with Facebook and Debian, working to resolve the
> outstanding issues. Thanks, Faidon!

To be clear, since I saw some discussion on the IRC backlog: this isn't
about "Debian packaging standards", the "Debian way" or anything like
that.  The packages they provide are not real packages, they are
essentially just tarballs packaged in the ar format and extractable with
dpkg. It's in the form of this ugly, Facebook-specific, unmaintainable
shell script:
https://github.com/hhvm/packaging/blob/master/hhvm/deb/package

These aren't "packages" that can be improved by us or anyone else. There is
progress on making packages, and a Debian Developer that works at
Facebook has been enlisted, on company time AIUI, to also help them with
that. There is work of his & mine in progress at:
http://anonscm.debian.org/gitweb/?p=collab-maint/hhvm.git;a=shortlog
It's already buildable and probably better already than the dl.hhvm.com
packages. I'd be perfectly okay with these packages finding their way
into Wikimedia production, even *before* they are up to Debian standards
and suitable for an upload into Debian proper.
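
If anyone wants to give these a spin, building from that tree should be
the standard source-package dance -- an untested sketch, and you'll need
to install the build dependencies listed in debian/control by hand for
now, since there's no source package in the archive yet:

 $ git clone git://anonscm.debian.org/collab-maint/hhvm.git
 $ cd hhvm
 $ dpkg-buildpackage -us -uc -b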

There /are/ some interactions with the Debian project that are in our
best interests, as they might help us foster a healthy community around
HHVM, e.g. 
http://lists.alioth.debian.org/pipermail/pkg-php-maint/2014-January/012988.html
but these are nice-to-haves, not blockers for our HHVM production
deployment.

> * We need to port a bunch of C extensions to the Zend PHP interpreter to
> HHVM. The most complex by far is LuaSandbox. Tim has been working on that.
> In the process, he has made substantial improvements to the Zend extension
> compatibility layer provided by HHVM, which we are waiting to have merged
> upstream: .  Once they are
> merged, they will be in the queue for the next release.  Releases are cut
> every eight weeks.
> * I also want to recognize Max Seminik, who stepped up to port Wikidiff2,
> producing a patch in short order.

Note that there is still the outstanding issue of how to deploy these
extensions, as it's my understanding that HHVM's ABI is not stable and
hence they would need to be rebuilt with every new HHVM version using
the HHVM source tree. It's a bit messy, Tim has all the details.

> * We need to adapt our app server configuration for HHVM. This includes
> configuring HHVM itself as well as reconfiguring Apache to act as a fastcgi
> reverse-proxy.

This will also require some puppet work -- the current classes aren't
great and it won't be too easy to plug an alternative implementation
without some rework. Not too much work, but have it on your radar
nevertheless.

Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] ogv.js media player update: Flash and GPU acceleration

2014-03-30 Thread Faidon Liambotis
On Sun, Mar 30, 2014 at 09:51:31AM -0700, Brion Vibber wrote:
> I spent a little more time the last few weekends on ogv.js
> (JavaScript-based player for Ogg Theora and Vorbis media in IE and Safari)

This is just awesome work, Brion, and in amazingly little time. I'm
really excited to see this. Keep it up, and let me know if I can help in
any way! (e.g. the seeking/Swift/Varnish header stuff)

Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] GeoData now uses Elasticsearch

2014-04-10 Thread Faidon Liambotis
On Thu, Apr 10, 2014 at 05:04:38AM +0400, Max Semenik wrote:
> And finally, appreciation: this was made possible only thanks to awesome
> help from our search team, Nik Everett and Chad Horohoe. You kick ass
> guys!

Extending appreciation: thanks Max, good work! This is great :)

Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Hardening WP/WM against traffic analysis (take two)

2014-06-06 Thread Faidon Liambotis
Hi Zack,

Thanks for bringing this up again, this is a very useful discussion to
have.

On Thu, Jun 05, 2014 at 12:45:11PM -0400, Zack Weinberg wrote:
> * what page is the target reading?
> * what _sequence of pages_ is the target reading?  (This is actually
> easier, assuming the attacker knows the internal link graph.)

The former should be pretty easy too, due to the ancillary requests that
you already briefly mentioned.

Because of our domain sharding strategy that places media under a
separate domain (upload.wikimedia.org), an adversary would know for a
given page (1) the size of the encrypted text response, (2) the count
and size of responses to media that were requested immediately after the
main text response. This combination would create a pretty unique
fingerprint for a lot of pages, especially well-curated pages that would
have a fair amount of media embedded into them.

Combine this with the fact that we provide XML dumps of our content and
images, plus a live feed of changes in realtime, and it should be easy
enough for a couple of researchers (let alone state agencies with
unlimited resources) to devise an algorithm that calculates these with
great accuracy and exposes (at least) reads.
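
The sizes themselves are trivially measurable, too; the plaintext
payload size tracks what an eavesdropper sees on the wire fairly
closely, modulo TLS framing and compression. Something as simple as:

 $ curl -so /dev/null -w '%{size_download}\n' https://en.wikipedia.org/wiki/Main_Page

run for a page and then for each media file it references, gives you
the fingerprint.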

> What I would like to do, in the short term, is perform a large-scale
> crawl of one or more of the encyclopedias and measure what the above
> eavesdropper would observe.  I would do this over regular HTTPS, from
> a documented IP address, both as a logged-in user and an anonymous
> user.

I doubt you can create enough traffic to make a difference, so yes, with
my operations hat on, sure, you can go ahead. Note that all of our
software, production stack/config management and dumps of our content
are publically available and free (as in speech) to use, so you or
anyone else could even create a replica environment and do this kind of
analysis without us ever noticing.

> With that data in hand, the next phase would be to develop some sort
> of algorithm for automatically padding HTTP responses to maximize
> eavesdropper confusion while minimizing overhead.  I don't yet know
> exactly how this would work.  I imagine that it would be based on
> clustering the database into sets of pages with similar length but
> radically different contents.

I don't think it'd make sense to involve the database in this at all.
It'd make much more sense to postprocess the content (still within
MediaWiki, most likely) and pad it to fit in buckets of predefined
sizes. You'd also have to take care of padding images as well, as the
combination of count/size alone leaks too many bits of information.
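
The padding itself would be the easy part; e.g. rounding every response
up to the next 4 KiB boundary is a one-liner (lengths illustrative):

 $ awk -v len=63942 'BEGIN { b = 4096; print int((len + b - 1) / b) * b }'
 65536

The hard part is picking the buckets so that enough pages end up sharing
each one, which is where the clustering you describe would come in.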

However, as others mentioned already, this kind of attack is partially
addressed with the introduction of SPDY / HTTP/2.0, which is on our
roadmap. A full production deployment, including undoing optimizations
such as domain sharding (and SSL+SPDY by default, for everyone), is many
months ahead; however, it does make me wonder whether it makes much
sense to spend time focusing on plain HTTPS attacks right now.

Regards,
Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] What's up with upload.wikimedia.org 's crossdomain.xml

2014-07-06 Thread Faidon Liambotis
On Sat, Jul 05, 2014 at 09:09:07PM -0700, Brion Vibber wrote:
> That X-Range header was an experiment me and Faidon tried for the ogv.js
> media player I've been prototyping (Flash fallback version) . We couldn't
> get the extra header -- or the regular Range header -- to work through the
> varnish layer though, so current code doesn't use it.
> 
> Its safe to remove that part from the file.

Removed :) Thanks, Brion.

Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] Tor relay

2014-11-10 Thread Faidon Liambotis
Hi,

This is just a quick update that as of late October, WMF is officially
hosting a Tor relay:
https://atlas.torproject.org/#details/DB19E709C9EDB903F75F2E6CA95C84D637B62A02

This is not an exit node, and it's just a small contribution to the
network. Really - anyone can do it: https://www.eff.org/torchallenge/

We want to note that editing via Tor is a whole other matter, and I
would refer to previous conversations on this list if you have questions
about the current state of things. We also currently have no plans to
run an exit node or a hidden service, but we're open to suggestions for
additional ways in which WMF can support anonymity and privacy.

Sincerely,
Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Tor relay, and freenode

2014-11-10 Thread Faidon Liambotis
Hi Pine,

On Mon, Nov 10, 2014 at 01:28:54AM -0800, Pine W wrote:
> Thanks for the note. Would it be within our mission scope to host a
> Freenode server? We use Freenode a *lot* for public and private
> communications. There have been previous discussions about WMF support for
> upstream services, and WMF has been talking about offering non-monetary
> support to affiliates, so I think hosting a Freenode server could make
> sense.

We, in fact, already do: that would be dickson.freenode.net.

Unfortunately, the server has been taken offline since the compromise of
the Freenode network back in September. We are in touch with the
Freenode admins to understand the circumstances of the break-in and to
what extent dickson was affected, as well as our post-break-in strategy
(forensics, whether reimaging would be enough, etc.). This has taken
longer than expected, unfortunately.

Suffice it to say, the Freenode server is in a different administrative
realm and is sandboxed and entirely separate from WMF production;
compromises there would not have any effect on our production network.

Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Tor relay

2014-11-10 Thread Faidon Liambotis
On Mon, Nov 10, 2014 at 11:35:42PM +0100, Federico Leva (Nemo) wrote:
> * "Advertised Bandwidth 3.3 MB/s", what does it mean and can it be
> increased?

As we do not set any advertised bandwidth in our configuration, the
value in Atlas is the bandwidth observed by the network. We are still in
a ramp-up phase and will continue to be until approximately
the end of 2014. Read
https://blog.torproject.org/blog/lifecycle-of-a-new-relay for more. We
do not plan to set a bandwidth limit at this point, either in the Tor
configuration or externally, in our network.
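
(For reference, the torrc knobs we're deliberately leaving unset would
look something like the following, with the numbers being arbitrary
examples:

  RelayBandwidthRate 5 MBytes
  RelayBandwidthBurst 10 MBytes
)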

> * I see in puppet that there is at least some logging enabled. What is
> being logged and why? "The best policy is to keep no logs."
> 

What logs are you referring to? If you're referring to "Log notice
/var/log/tor/tor.log" then this just logs statistics, nothing else.
torrc's manpage says on the matter: "[w]e advise using "notice" in most
cases, since anything more verbose may provide sensitive information to
an attacker who obtains the logs". We do not keep traffic/usage logs in
any way nor are we planning to.

Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Removal of mobile.wikipedia.org and wap.wikipedia.org in support of browser HSTS preload

2015-07-16 Thread Faidon Liambotis
Hi,

If you look at https://phabricator.wikimedia.org/T104942#1436332 (linked
from this thread, before Adam posted his own data) an analysis was done
on a file called "per-domain-count" which we previously extracted from
sampled 1:1000 logs for approximately 25 days for all kinds of
domain-popularity purposes and cleanups that we've been doing as part of
the HTTPS project (more background at
https://phabricator.wikimedia.org/T102827#1429852 and also see T102826,
T102814, T102815).

Those logs above are sampled and aren't as accurate as the Hadoop data
Adam used due to other infrastructure faults that have happened in that
25-day period but they are generally okay for extracting those broad
conclusions, especially if we look at the relative popularity of e.g.
.wap. vs. .m. rather than the absolute numbers.

Finally, note that in any case there is a hard limitation of a
look-behind window of 90 days due to our data retention policy, as well
as practical considerations for extracting results from unsampled logs
for larger periods of time. You're absolutely right, though, that a
1-day sample is usually not enough, especially considering the
seasonality of data like e.g. a very different mobile-to-desktop ratio
on weekends.

Faidon

On Thu, Jul 16, 2015 at 09:55:14AM -0400, John wrote:
> Can we look at a wider sample? using a single day as judgement factor is a
> bad idea. However if the data supports your position I dont see any serious
> problems. You might want to take a look at either the UA's or refering
> sources to see if there is a primary source for the traffic and mitigate
> that.
> 
> On Thu, Jul 16, 2015 at 9:03 AM, Adam Baso  wrote:
> 
> > Looks like the user pageviews for wap.wikipedia.org and
> > mobile.wikipedia.org
> > subdomains are approximately 0.02% of the size of pageviews for
> > m.wikipedia.org subdomains based on a recent one day check.
> >
> > hive> select count(*) from
> > wmf.webrequest where
> > year = 2015 and month = 7 and day = 14
> > and access_method = 'mobile web'
> > and (uri_host like '%.wap.wikipedia.org' OR uri_host like '%.
> > mobile.wikipedia.org')
> > and is_pageview = true and agent_type = 'user';
> >
> > 35,543
> >
> > hive> select count(*) from
> > wmf.webrequest where
> > year = 2015 and month = 7 and day = 14
> > and access_method = 'mobile web'
> > and uri_host like '%.m.wikipedia.org'
> > and is_pageview = true and agent_type = 'user';
> >
> > 202,024,891
> >
> >
> > On Thu, Jul 16, 2015 at 5:41 AM, John  wrote:
> >
> > > ... Have we done any analysis on usage of those subdomains?
> > >
> > > On Thu, Jul 16, 2015 at 8:34 AM, Adam Baso  wrote:
> > >
> > > > There's a ticket for removing mobile.wikipedia.org and
> > wap.wikipedia.org
> > > > domains/subdomains, which are legacy domain names superceded by
> > > > m.wikipedia.org and its subdomains.
> > > >
> > > > https://phabricator.wikimedia.org/T104942
> > > >
> > > > The rationale for the removal of these legacy domain names is to help
> > > > support HSTS preloading in browsers with the existing TLS SAN cert.
> > > >
> > > > After review of the ticket, can anyone think of a compelling reason to
> > > keep
> > > > those old domain names?
> > > >
> > > > I'm going to open a separate thread on mobile-l about this given this
> > is
> > > > more mobile-targeted, yet some people only operate on one of wikitech-l
> > > or
> > > > mobile-l.
> > > >
> > > > -Adam
> > > > ___
> > > > Wikitech-l mailing list
> > > > Wikitech-l@lists.wikimedia.org
> > > > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> > > ___
> > > Wikitech-l mailing list
> > > Wikitech-l@lists.wikimedia.org
> > > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> > ___
> > Wikitech-l mailing list
> > Wikitech-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [WikimediaMobile] Wikipedia iOS app moving to GH

2015-07-23 Thread Faidon Liambotis
Hi Brian,

The arguments for/against GitHub etc. were discussed at length across
all of our engineering staff & community, exactly 3 years ago, and that
discussion reached consensus:
https://www.mediawiki.org/wiki/Git/Gerrit_evaluation#GitHub

In my opinion, this is not something that should be addressed on a
per-team basis (and especially not by making the decision first
internally in the team and then announcing it to the wider audience as a
done deal). Individual and team-wide preferences should be considered as
input to the wider discussion but ultimately people should yield to the
collective decision. A per-team decision for critical tooling like the
one you just announced would be inappropriate in a corporate setting,
and is even more so in a community-facing organization.

All this applies both to our Git tooling and to CI, for which it is
worth noting that there are people in the foundation working on it full
time. It's not very different than our issue tracking tooling either,
for which we already know the huge pains that we've suffered in the past
by having it fragmented across multiple different tools that each
individual team picked.

We can always revisit past decisions and reopen past discussions (to
some extent, it's a sign of health) but your way is not the way to do
this, IMHO.

Best,
Faidon

On Wed, Jul 22, 2015 at 05:43:07PM -0400, Brian Gerstle wrote:
> This isn't really about Gerrit vs. GitHub. To be clear, we're mainly doing
> this for CI (i.e. Travis).
> 
> That said, we (the iOS team) plan for our workflow to play to GitHub's
> strengths—which also happen to be our personal preferences.  In short, this
> means "amending patches" becomes "pushing another commit onto a branch."
> We've run into issues w/ rebasing & amending patches destroying our diff in
> Gerrit, and problems with multiple people collaborating on the same patch.
> We think GitHub will not only provide integrations for free CI, but, as an
> added bonus, also resolve some of the workflow deficiencies that we've
> personally encountered with Gerrit.
> 
> 
> On Wed, Jul 22, 2015 at 5:14 PM, Gergo Tisza  wrote:
> 
> > On Wed, Jul 22, 2015 at 4:39 AM, Petr Bena  wrote:
> >
> >> Good job, you aren't the only one. Huggle team is using it for quite
> >> some time. To be honest I still feel that github is far superior to
> >> our gerrit installation and don't really understand why we don't use
> >> it for other projects too.
> >>
> >
> > GitHub is focused on small projects; for a project with lots of patches
> > and committers it is problematic in many ways:
> > * poor repository management (fun fact: GitHub does not even log force
> > pushes, much less provides any ability to undo them)
> > * noisy commit histories due to poor support of amend-based workflows, and
> > also because poor message generation of the editing interface (Linus wrote
> > a famous rant
> >  on that)
> > * no way to mark patches which depend on each other
> > * diff view works poorly for large patches
> > * CR interface works poorly for large patches (no way to write draft
> > comments so you need to do two passes; discussions can be marked as
> > obsolete by unrelated code changes in their vicinity)
> > * hard to keep track of cherry-picks
> >
> >
> > ___
> > Mobile-l mailing list
> > mobil...@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/mobile-l
> >
> >
> 
> 
> -- 
> EN Wikipedia user page: https://en.wikipedia.org/wiki/User:Brian.gerstle
> IRC: bgerstle
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] CORS blocking metrics

2015-08-16 Thread Faidon Liambotis
On Sat, Aug 15, 2015 at 01:00:38AM -0700, Gergo Tisza wrote:
> That does not sound like a big deal since we are loading most Javascript
> files from our own servers, and can fully control what headers are set, but
> we ran into occasional problems in the past when using CORS (MediaViewer
> uses CORS-enabled image loading to get access to certain performance
> statistics): some people use proxies or firewalls which strip CORS headers
> from the responses as some sort of misguided security effort, causing the
> request to fail. We wanted to know how many users would be affected by this
> if we loaded ResourceLoader scripts via CORS.

For Wikimedia sites, it is now impossible for proxies or firewalls to
strip headers after the switch to HTTPS-only. Was this analysis done
before or during the HTTPS-only migration?
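
For reference, the headers in question are easy to eyeball; the URL
below is just a placeholder for any file on upload.wikimedia.org:

 $ curl -sI https://upload.wikimedia.org/wikipedia/commons/x/xx/Example.jpg | grep -i '^access-control'
 Access-Control-Allow-Origin: *

A middlebox dropping that line is what would break such requests.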

Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] HTTP/2 switch-on schedule for WMF sites?

2015-09-10 Thread Faidon Liambotis
On Thu, Sep 10, 2015 at 09:14:27PM +0100, Neil Harris wrote:
> Does anyone know if the WMF engineering team has a schedule for
> deploying HTTP/2 on its sites, preferably in the near future, and if
> so, what the progress is toward that goal?

We have no firm schedule yet. It's mostly blocked on upstream work. You
can follow the progress on the relevant task on Phabricator,
https://phabricator.wikimedia.org/T96848.

Regards,
Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Scope of ArchCom

2016-01-28 Thread Faidon Liambotis
On Fri, Jan 22, 2016 at 02:30:22PM -0800, Rob Lanphier wrote:
>  On Fri, Jan 22, 2016 at 2:08 PM, Alex Monk  wrote:
> 
> > To clarify - are you saying this ([deploying increasingly excellent
> > software on the Wikimedia production cluster in a consensus-oriented
> > manner]) is the actual current scope of ArchCom, or are you advocating for
> > a change in scope?
> 
> It's my attempt to clarify the scope, but you could argue it's a change.
> 
> Ultimately, WMF TechOps has correctly blocked a lot of software making it
> to the Wikimedia cluster that hasn't been through the RFC process, even
> though they themselves weren't entirely clear about the scope.  Wikimedia
> Foundation leadership has an (unfortunately) long history of being unclear
> about the scope.  I share the blame for this.  This is my attempt to
> clarify.

This is true, although the word "blocked" is perhaps a bit strong.

We generally prefer large architectural changes to be discussed with a
wider group across the movement than just us and the person or team
that proposed them. An architecture that grows organically without much
coordination or cohesion isn't going to be sane, but a process where
TechOps are the gatekeeper for every single architectural change is not
a healthy one either. Hence our... recommendation to move those
discussions into the RfC forum, for the lack of a better venue.

That said, there have been important deployments that have bypassed the
RfC process entirely (including proposals that resulted in staffed WMF
teams) and others that did go via the RfC process, but the resulting
feedback wasn't incorporated into the final design (for various
reasons).

It's also worth noting that the opposite has happened as well: TechOps
has blocked the production deployment of features that the MediaWiki
ArchComm has approved. The fact that an optional feature is considered
good enough for the MediaWiki architecture does not mean that it's
appropriate for Wikimedia's complex and demanding production environment
-- or for being worked on by the Wikimedia Foundation, for that matter.
This is especially true given that ArchComm really has absolutely no say
in resourcing and a given feature may not have secured funding (people,
hardware etc.)

Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Scope of ArchCom

2016-01-29 Thread Faidon Liambotis
On Thu, Jan 28, 2016 at 10:53:27AM -0800, Rob Lanphier wrote:
> > This is especially true given that ArchComm really has absolutely no say
> > in resourcing and a given feature may not have secured funding (people,
> > hardware etc.)
> 
> Awwwyou're mail was so great, and then you ended with this!  Are you
> saying that the only real power in this world belongs to people with
> control of the money?

That's kinda stretching what I said, isn't it :)

What I'm saying is that there is a (probably unavoidable) disconnect
between the ArchComm's and WMF's (or WMDE's, or other orgs' for that
matter) decision processes and cadences.

The ArchComm isn't in the path of resourcing and generally does not vet
RfCs based on whether e.g. they are backed by fully-staffed teams (or
even whether the required infrastructure for implementing them exists or
can be procured, under our constraints). My understanding is also that
as a purely technical body, it doesn't do much of a cost/benefit
analysis either. The ArchComm thus tends to judge ideas on their merits
and their merits alone -- and not unreasonably so.

This effectively means that some of the ArchComm-"approved" ideas may be
unimplementable -- at least until some organization or department
decides to foot the bill, possibly going via their budgeting process
(which can even be on an annual basis), etc.

So -- yes, I think there is a particular amount of "power" that the
ArchComm doesn't have and cannot really have; I don't think that's a
problem per se, but I do think it needs to be recognized and planned
for. This could, for example, be done by limiting the scope of the committee
(e.g. to architecture direction and not feature planning; or to
MediaWiki/software architecture and not infrastructure planning, etc.)
and/or by ensuring budget owners are attending and influencing the
decision-making process.

Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Proposal to invest in Phabricator Calendar

2016-05-16 Thread Faidon Liambotis
On Sun, May 15, 2016 at 10:59:40PM +0200, Andre Klapper wrote:
> On Sat, 2016-05-14 at 20:51 +0200, Ricordisamoa wrote:
> > If we're going to be investing money into improving Phabricator 
> > upstream, I think we should start with making Differential usable
> > (i.e. a suitable replacement for Gerrit)
> 
> If you have *specific* issues, please point them out by linking to
> tasks. "Usable" is too subjective to be a basis for discussions.

If we have spare budget for the FY, a good start, I think, would be
(properly) implementing https://secure.phabricator.com/T5000, by
implementing https://secure.phabricator.com/T8092 which in turn depends
on https://secure.phabricator.com/T8093 and possibly depends on
https://secure.phabricator.com/T4369 and
https://secure.phabricator.com/T4245.

https://secure.phabricator.com/T10691 (depending on all of the above)
could be potentially interesting for us too.

Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] eqiad->codfw datacenter switchover, weeks of Apr 17th/May 1st

2017-04-07 Thread Faidon Liambotis
Hi all,

You may have heard already that, like last year, we are planning to
switch our active datacenter from eqiad to codfw in the week of April
17th and back to eqiad two weeks later, on the week of May 1st. We do
this periodically in order to exercise our ability to run from the
backup site in case of a disaster, as well as our ability to switch
seamlessly to it with little user impact.

Switching will be a gradual, multi-step process, the most visible step
of which will be the switch of MediaWiki application servers and
associated data stores. This will happen on April 19th (eqiad->codfw)
and May 3rd (codfw->eqiad), both at 14:00 UTC. During those windows, the
sites will be placed into read-only mode, for a period that we estimate
to last approximately 20 to 30 minutes.

Furthermore, the deployment train will freeze for the weeks of April
17th and May 1st[1], but operate normally on the week of April 24th, in
order to exercise our ability to deploy code while operating from the
backup datacenter.

1: https://wikitech.wikimedia.org/wiki/Deployments

Compared to last year we have improved our processes considerably[2], in
particular by making more services operate in an active/active manner,
as well as by working on an automation and orchestration framework[3] to
perform parallel executions across the fleet. The core of the MediaWiki
switchover will be performed semi-automatically using new software[4]
that will execute all the necessary commands in sequence with little
human involvement, and thus lowering the risk of introducing errors and
delays.

2: https://wikitech.wikimedia.org/wiki/Switch_Datacenter
3: https://github.com/wikimedia/cumin
4: https://github.com/wikimedia/operations-switchdc

Improving and automating our processes means that we're not going to be
following the exact same steps as last year. Because of that, and
because of other changes introduced in our environment over the course
of the year, there is a possibility of errors creeping into the process.
We'll certainly try to fix any issues that arise during those weeks and
we'd like to ask everyone to be on high-alert and vigilant.

To report any issues, please use one of the following channels:

1. File a Phabricator issue with project #codfw-rollout
2. Report issues on IRC: Freenode channel #wikimedia-tech (if urgent, or
during the migration)
3. Send an e-mail to the Operations list: o...@lists.wikimedia.org (any time)

Thanks,
Faidon
--
Faidon Liambotis
Principal Operations Engineer
Acting Director of Technical Operations
Wikimedia Foundation

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] eqiad->codfw datacenter switchover, weeks of Apr 17th/May 1st

2017-04-19 Thread Faidon Liambotis
Hi all,

The first part of this switch completed successfully today. The total
time in which the projects were in read-only mode was approximately 17
minutes.

There were a few hiccups that were or still are being dealt with, the
most major ones identified so far being:

- The switch to codfw in combination with what looks like a
  ContentTranslation bug resulted in an overload of the x1 database
  shard which in turn affected Echo & Flow. The total outage for Echo &
  Flow was from 15:36 until 16:18 UTC.
  
  ContentTranslation had also been misbehaving since 15:36 and was disabled
  on all Wikipedias at 15:57 UTC. The Language team and Roan are trying
  to identify the root cause. It will be gradually reenabled once it's
  found and the issue gets fixed. [T163344]

- ORES Watchlist items appear duplicated. The ORES team is still
  investigating this. [T163337]

- codfw API database slaves overloaded and an additional one had to be
  pooled in order to handle the load, but it doesn't look like that was
  enough to fully alleviate this. The API is and has been available
  throughout this work, albeit with reduced performance. This is still
  being dealt with by the DBA team. [T163351]

- The IPs of the Redis servers used for locking were misconfigured in
  mediawiki-config, which resulted in file uploads (e.g. to Commons)
  and deletions not working until it was manually fixed. These were not
  working between 14:30-14:44 UTC. [T163354]

The issues above are still being worked on and the situation is
evolving, so some of the above may be inaccurate already. Phabricator
is the authoritative place for both the root cause and mitigation action
of all of those issues, with #codfw-rollout being the common tag.

Please follow-up there either on existing issues or new ones that you
may discover over the course of the next 2-3 weeks :)

Thanks to everyone in and outside of ops, both for the substantial amount
of work that has gone into preparing for this day and for all
the firefighting for the better part of today. Expect to hear more from
us when this project concludes.

Best,
Faidon
--
Faidon Liambotis
Principal Operations Engineer
Acting Director of Technical Operations
Wikimedia Foundation

On Fri, Apr 07, 2017 at 04:58:09PM +0300, Faidon Liambotis wrote:
> Hi all,
> 
> You may have heard already that, like last year, we are planning to
> switch our active datacenter from eqiad to codfw in the week of April
> 17th and back to eqiad two weeks later, on the week of May 1st. We do
> this periodically in order to exercise our ability to run from the
> backup site in case of a disaster, as well as our ability to switch
> seamlessly to it with little user impact.
> 
> Switching will be a gradual, multi-step process, the most visible step
> of which will be the switch of MediaWiki application servers and
> associated data stores. This will happen on April 19th (eqiad->codfw)
> and May 3rd (codfw->eqiad), both at 14:00 UTC. During those windows, the
> sites will be placed into read-only mode, for a period that we estimate
> to last approximately 20 to 30 minutes.
> 
> Furthermore, the deployment train will freeze for the weeks of April
> 17th and May 1st[1], but operate normally on the week of April 24th, in
> order to exercise our ability to deploy code while operating from the
> backup datacenter.
> 
> 1: https://wikitech.wikimedia.org/wiki/Deployments
> 
> Compared to last year we have improved our processes considerably[2], in
> particular by making more services operate in an active/active manner,
> as well as by working on an automation and orchestration framework[3] to
> perform parallel executions across the fleet. The core of the MediaWiki
> switchover will be performed semi-automatically using new software[4]
> that will execute all the necessary commands in sequence with little
> human involvement, and thus lowering the risk of introducing errors and
> delays.
> 
> 2: https://wikitech.wikimedia.org/wiki/Switch_Datacenter
> 3: https://github.com/wikimedia/cumin
> 4: https://github.com/wikimedia/operations-switchdc
> 
> Improving and automating our processes means that we're not going to be
> following the exact same steps as last year. Because of that, and
> because of other changes introduced in our environment over the course
> of the year, there is a possibility of errors creeping into the process.
> We'll certainly try to fix any issues that arise during those weeks and
> we'd like to ask everyone to be on high-alert and vigilant.
> 
> To report any issues, please use one of the following channels:
> 
> 1. File a Phabricator issue with project #codfw-rollout
> 2. Report issues on IRC: Freenode channel #wikimedia-tech (if urgent, or
> during the migration)
> 3. Send an e-mail to the Operations list: o...@lists.wikimedia.org (any time)

Re: [Wikitech-l] eqiad->codfw datacenter switchover, weeks of Apr 17th/May 1st

2017-05-05 Thread Faidon Liambotis
Hi all,

The switchback to eqiad was successfully completed this week. The main
read-only phase of the switch already happened on Wednesday at 14:30 UTC,
as originally scheduled.

The read-only time was approximately 13 minutes in this run (down from 17)
and was more uneventful than the switchover two weeks ago. Multiple bugs
were fixed and small features were added over the course of the past two
weeks, which explains the drop in runtime and the increased resilience.

Short summary of where we're at:

- Extension:Cognate caused a brief x1 outage; it's still unclear whether
  this was switchover-related or not, and it's still being investigated.
  [T164407]

- The job queue corruption issue that was found during the first
  switchover was worked around, but a long-term fix to the issue is
  still pending. [T163337]

- The Content Translation issues are still being worked on, but didn't
  cause problems this time. [T163344]

- We were unfortunately unable to use the new MediaWiki etcd integration
  this time either, despite Tim's herculean efforts at the last
  minute, due to stability reasons. [T156924]

The workboard for the project is still at #codfw-rollout and will
continue to be updated as we go through these issues.

This was, overall, a success. A number of issues were identified and
most of them have already been fixed -- this was the purpose of this
whole endeavour :)

Many thanks to everyone that has contributed to this goal! This was
really an effort across multiple teams and many individuals, all of
which worked hard and under strict deadlines to contribute to this
project. I am personally grateful to all of you, you know who you are :)

Work on this project will continue throughout the next fiscal year (July
2017 - June 2018) across the Technology department, with the ultimate
holy-grail goal of an active-active setup for all of our services. We'll
keep you all up-to-date on the progress.

Best regards,
Faidon
--
Faidon Liambotis
Principal Operations Engineer
Acting Director of Technical Operations
Wikimedia Foundation

On Wed, Apr 19, 2017 at 08:33:49PM +0300, Faidon Liambotis wrote:
> Hi all,
> 
> The first part of this switch completed successfully today. The total
> time in which the projects were in read-only mode was approximately 17
> minutes.
> 
> There were a few hiccups that were or still are being dealt with, the
> most major ones identified so far being:
> 
> - The switch to codfw, in combination with what looks like a
>   ContentTranslation bug, resulted in an overload of the x1 database
>   shard, which in turn affected Echo & Flow. The total outage for Echo
>   & Flow was from 15:36 until 16:18 UTC.
>   
>   ContentTranslation had also been misbehaving since 15:36 and was
>   disabled on all Wikipedias at 15:57 UTC. The Language team and Roan
>   are trying to identify the root cause. It will be gradually
>   re-enabled once the cause is found and the issue is fixed. [T163344]
> 
> - ORES Watchlist items appear duplicated. The ORES team is still
>   investigating this. [T163337]
> 
> - codfw API database slaves were overloaded and an additional one had
>   to be pooled in to handle the load, but it doesn't look like that was
>   enough to fully alleviate the problem. The API is and has been
>   available throughout this work, albeit with reduced performance. This
>   is still being dealt with by the DBA team. [T163351]
> 
> - The IPs of the Redis servers used for locking were misconfigured in
>   mediawiki-config, which resulted in file uploads (e.g. to Commons)
>   and deletions not working until it was manually fixed. This was not
>   working between 14:30 and 14:44 UTC. [T163354]
> 
> The issues above are still being worked on and the situation is
> evolving, so some of the above may be inaccurate already. Phabricator
> is the authoritative place for both the root cause and mitigation
> action of all of those issues, with #codfw-rollout being the common tag.
> 
> Please follow-up there either on existing issues or new ones that you
> may discover over the course of the next 2-3 weeks :)
> 
> Thanks to everyone in and outside of ops, both for the substantial
> amount of work that has gone into preparing for this day and for all
> the firefighting for the better part of today. Expect to hear more from
> us when this project concludes.
> 
> Best,
> Faidon
> --
> Faidon Liambotis
> Principal Operations Engineer
> Acting Director of Technical Operations
> Wikimedia Foundation
> 
> On Fri, Apr 07, 2017 at 04:58:09PM +0300, Faidon Liambotis wrote:
> > Hi all,
> > 
> > You may have heard already that, like last year, we are planning to
> > switch our active datacenter from eqiad to codfw in the week of April
> > 17th and back to eqiad two weeks later, on the week 

Re: [Wikitech-l] Try out the commit message validator

2017-11-07 Thread Faidon Liambotis
On Tue, Nov 07, 2017 at 11:19:57AM -0700, Bryan Davis wrote:
> We could probably add checks for some common ones if someone compiled a list.
> 
> Running a full spell check would be difficult because of the number of
> false positives there would be based on a "normal" dictionary. Commit
> messages often contain technical jargon (maybe something to try and
> avoid) and snippets of code (e.g. class names like
> TemplatesOnThisPageFormatter) that would not be in any traditional
> dictionary that we could count on being on the local host.

Debian's lintian (a lint tool for packages) has a check for common
typos/misspellings in its informational mode. The package ships with
/usr/bin/spellintian, a simple spellchecker that can also be run
independently.

The benefit of using spellintian over e.g. aspell is that it addresses
the issues you already identified: a) it only flags known typos, rather
than complaining about unknown words; b) it was built by observing
typos in source code and package descriptions in the wild, so it's
tailored to technical jargon and its common misspellings. It could be a
good fit for git commit messages.

That doesn't mean it's free of false positives though, so I wouldn't
recommend using it as a voting check in a CI pipeline.
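
For illustration, a non-voting commit-msg hook wrapping spellintian
could look roughly like the sketch below. To be clear, this is only a
sketch: it assumes spellintian accepts a file argument and prints one
"file: typo -> correction" line per finding -- double-check
spellintian(1) before relying on either.

    #!/usr/bin/env python3
    """Hypothetical advisory commit-msg hook wrapping spellintian."""
    import subprocess
    import sys

    def main(msg_file):
        try:
            # Assumption: spellintian reports its findings on stdout,
            # one "file: typo -> correction" line each.
            result = subprocess.run(
                ["spellintian", msg_file],
                capture_output=True, text=True, check=False,
            )
        except FileNotFoundError:
            return 0  # lintian isn't installed; skip the check silently

        for line in result.stdout.splitlines():
            # Advisory only: report the typo but never block the
            # commit, per the false-positives caveat above.
            print("commit-msg spelling: %s" % line, file=sys.stderr)
        return 0

    if __name__ == "__main__":
        # git passes the path to the commit message file as argv[1]
        sys.exit(main(sys.argv[1]))

Dropped into .git/hooks/commit-msg and made executable, it would print
advisory warnings while always letting the commit through.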

Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Allow HTML email

2020-09-23 Thread Faidon Liambotis
On Wed, Sep 23, 2020 at 02:45:37PM +1000, Tim Starling wrote:
> We still haven't heard from Faidon who, last I heard, still reads his
> emails by piping telnet into less or something. But I think he can
> make sense of multipart/alternative as long as it's not base-64
> encoded. You should send the plain text as the first part so he
> doesn't have to page down too far ;)

On behalf of the Mutt & other console email clients user club, we
approve of this change. We haven't formed consensus on it yet, but I
suspect we'd even be willing to go one step further and negotiate the
use of emojis as well (perhaps even emojis in subject lines). No
promises about responding in HTML, though; that's probably going to have
to wait another century.
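
For anyone wiring this up, a minimal sketch using Python 3's stdlib
email.message API (nothing project-specific assumed) would be:

    from email.message import EmailMessage

    msg = EmailMessage()
    msg["Subject"] = "Hello from the console club"
    # set_content() creates the text/plain part; add_alternative()
    # then converts the message to multipart/alternative and appends
    # the HTML part after it -- plain text first, as requested above.
    msg.set_content("Plain-text body, readable in Mutt.")
    msg.add_alternative("<p>HTML body, for everyone else.</p>",
                        subtype="html")

Conveniently, RFC 2046 wants alternative parts ordered from least to
most preferred, so putting the plain text first is also what the spec
prescribes.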

Faidon

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l