Re: [Wikitech-l] wikipedia lacks a 'share' button

2011-10-30 Thread Marco Schuster
On Sat, Oct 29, 2011 at 4:22 PM, Daniel Friesen
li...@nadir-seen-fire.com wrote:
 - It doesn't scale very well. If you do try to add more vendors and users
 do enable most of them, you still end up loading from each enabled vendor
 slowing things down.
With the exception of the FB Like/Recommend button, everything (even
the FB share link) is just an image paired with an HTML link. Other
sites may allow self-hosting their logos, in which case the only image
that needs to be loaded externally is the FB one.

 - Frankly the UI is pretty bad.
That's the price you have to pay for total privacy, unfortunately.

 - Once you enable a vendor we drop right back to a 3rd party script being
 injected into the page such that it can do malicious things.

 Btw, if you're a 3rd party with a script in a page you can go pretty far
 abusing XHR and history.pushState to make it look to a user like they're
 browsing the website normally when in reality they're on the same page
 with the script still running. Oh, and that includes making it look like
 you're safely visiting the login page when in reality you didn't change
 pages and the script is still running ready to catch passwords.
Do you have any links with further info on this?

Marco



Re: [Wikitech-l] wikipedia lacks a 'share' button

2011-10-29 Thread Marco Schuster
Hi,

On Sun, Oct 23, 2011 at 7:03 PM, Roan Kattouw roan.katt...@gmail.com wrote:
 This is the reason why we absolutely cannot have the
 Facebook Like button: Facebook makes you use an FB-hosted button image
 (and JS too, I think), collects data from every user that views the
 Like button even if they don't click it (this is the part that
 violates the privacy policy), and disallows self-hosting.

German IT news site heise.de solved the privacy and load-time problem:
http://www.heise.de/extras/socialshareprivacy/

Unfortunately it's in German, but the code is easy to understand.

Marco



Re: [Wikitech-l] Proposed Authentication Schema for Wikimedia projects

2011-10-20 Thread Marco Schuster
Nice idea, but most users hate entering a password or drawing a pattern
on the screen to unlock their phone. So if you lose your phone or it
gets stolen, all your credentials are lost and in the hands of an
unknown attacker. Also, phones tend to break during day-to-day usage
(beverage spills, falls from desks).

While these problems were mentioned on the design page, I have another
scenario: a colleague comes over to your desk, swaps your phone with his
so you don't notice immediately, pranks you on Facebook or Wikipedia,
and then swaps the phones back.

Marco

On Tue, Oct 18, 2011 at 4:51 AM,  packs-24...@mypacks.net wrote:
 I originally posted this idea on G+ and Arthur Richards suggested I 
 cross-post it here.  My friend Isaac Potoczny-Jones is a computer security 
 professional.  He developed a new authentication schema that layers on top of 
 existing technologies and leverages a user's smartphone and QRCodes to 
 improve authentication usability, eliminate human-generated passwords, and 
 further improve security by separating the authentication channel from the 
 login session.   He's calling this capability Animate Login and as part of 
 the proof of concept, he developed a MediaWiki implementation.   I believe 
 the Wikimedia foundation should pursue adding this technique as part of the 
 primary login options for its projects.  I would personally love to be able 
 to just point my phone at the login screen and have the system log me in to 
 Wikipedia without having to type anything or remember complex passwords.  
 Wikimedia has worked hard to consolidate logins across the many projects over 
 the last couple years and this would be a great way of providing seamless 
 login.   It should be very low overhead and relatively easy to implement.  
 Isaac is very interested in seeing his tool put to use on Wikipedia.   
 Wikimedia could lead the way to improved authentication that also vastly 
 improves the user experience!

 Isaac explains the project in some detail on this Google Plus post:
 https://plus.google.com/u/0/112702172838704084335/posts/B9UR2zzDY3f?hl=en

 His landing page for the project is here:
 http://animate-innovations.com/content/animate-login

 The website has videos, links to a MediaWiki instance where it's in use and 
 more.

 From the conversations I've had with him, I know that he has thought long and 
 hard about this application and has sought to address/understand all of the 
 potential attack vectors.  Compared to human-generated passwords, this would 
 be vastly more secure and dramatically improve the user experience of logging 
 in.  It might even entice new or old editors to login and give it a try and 
 thus re-engage them in editing.  I'm also certain it could generate a fair 
 bit of buzz as people learn they can use their smartphone to login to 
 Wikipedia.

 I hope you'll consider working with Isaac.  I'll point him to this thread so 
 he knows it is here.   I know he'd love to see this implemented in Wikipedia.

 Don


Re: [Wikitech-l] testing mobile browsers?

2011-07-09 Thread Marco Schuster
On Sat, Jul 9, 2011 at 8:51 PM, Håkon Wium Lie howc...@opera.com wrote:
 Opera comes in two flavors for mobile devices: Opera Mini and Opera
 Mobile. Opera Mobile is, indeed, close to the desktop version in the
 sense that it runs the same display, javascript engine etc. on the
 device.

The versions of Opera Mobile floating around in the wild are kinda
different. Every HTC HD2 user with Windows Mobile 6.5 is likely to still
run the ages-old buggy HTC version (8.x AFAIR, compared to the current
v10!), as the official versions STILL don't support multi-touch even
though libraries exist which abstract it -.-

Marco


Re: [Wikitech-l] Special:Search goose chase

2011-06-19 Thread Marco Schuster
On Sun, Jun 19, 2011 at 8:40 PM, MZMcBride z...@mzmcbride.com wrote:
 The main issue (to me) is that it says "Did you mean: [bold blue link]",
 which in this context I think most users would expect to be able to click
 and go directly to the article. It should be simple enough to query the page
 table for page existence, but perhaps two messages would be better here.
+1, this has confused me for ages (at least make the link some other
color than blue!)

Marco



Re: [Wikitech-l] wiki/ prefix in URL

2011-05-17 Thread Marco Schuster
On Wed, May 18, 2011 at 2:00 AM, John Vandenberg jay...@gmail.com wrote:
 Is there something else on these virtual hosts other than a few
 regexes which are extremely unlikely to be used as page names (i.e.
 \/w\/.*\.php).
Anything beginning with /w/ must be disallowed as a page title for this to work.

Marco



Re: [Wikitech-l] search=steven+tyler gets Steven_tyler

2011-05-15 Thread Marco Schuster
On Sun, May 15, 2011 at 5:02 PM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:
 You cannot fix the problem by doing accent/diacritic normalization.
 i and I are the same letter in English but different letters in
 Turkish.  You cannot get around that.  We'd need to have a separate
 case-folding algorithm for Turkish wikis, or make them use one that's
 incorrect for their language.
Actually, non-Turkish/Azerbaijani wikis have this problem too, if the
wiki has articles or redirects using these characters...

Marco


Re: [Wikitech-l] integration of MediaWiki and version control system as backend?

2011-05-10 Thread Marco Schuster
On Sat, May 7, 2011 at 12:10 AM, Neil Kandalgaonkar ne...@wikimedia.org wrote:
 I don't believe anybody has successfully done this for MediaWiki, and to
 my knowledge this would be difficult to impossible. Our backend
 storage has to implement SQL in some way.

What about LDAP? It's used for authentication anyway, and it'd open
the way to block users from editing, reading etc. certain namespaces or
articles in a totally fine-grained way... simply add a sub-element
userWriteBlock: cn=foobar,dc=en,dc=wikipedia,dc=org to an article /
NS entry you don't want a certain user to be able to edit.
Files could also be stored in an LDAP daemon, though this DOES suck for
anyone trying to edit the LDAP tree by hand.
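
A rough sketch of what such a check could look like with PHP's ldap_*
functions (the userWriteBlock attribute and the tree layout are my
invention, as above; only the ldap_* calls themselves are standard PHP):

    <?php
    // Check whether $userDn is listed in the article entry's
    // (hypothetical) userWriteBlock attribute.
    $conn = ldap_connect('ldap://localhost');
    ldap_set_option($conn, LDAP_OPT_PROTOCOL_VERSION, 3);
    ldap_bind($conn, 'cn=mediawiki,dc=en,dc=wikipedia,dc=org', 'secret');

    $articleDn = 'cn=Some_Article,ou=NS0,dc=en,dc=wikipedia,dc=org';
    $userDn    = 'cn=foobar,dc=en,dc=wikipedia,dc=org';

    // Base-scope read of the single article entry.
    $result  = ldap_read($conn, $articleDn, '(objectClass=*)', array('userWriteBlock'));
    $entries = ldap_get_entries($conn, $result); // attribute names come back lowercased
    $blocked = isset($entries[0]['userwriteblock'])
        && in_array($userDn, $entries[0]['userwriteblock'], true);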

Marco



Re: [Wikitech-l] Wiki Inter-Searchability

2011-03-21 Thread Marco Schuster
Hi,

On Mon, Mar 21, 2011 at 3:49 PM, Tod listac...@gmail.com wrote:
 Is this possible?  Is the wiki search driven by a crawler that would
 follow the links on that new wiki home page?  If not, is there an
 approach I could follow to be able to provide search capability against
 a select number of these individual wikis under one umbrella?
I've never tried it personally, but I think SphinxSearch may be worth
a check; it works directly against the database.

Marco


Re: [Wikitech-l] Highest Priority Bugs

2011-03-16 Thread Marco Schuster
On Wed, Mar 16, 2011 at 11:38 AM, Roan Kattouw roan.katt...@gmail.com wrote:
 Normal shell users can execute all but one of the steps required for
 wiki creation: root access is needed to create the DNS entry for the
 new subdomain. Previously we just had RobH handle all wiki creations,
 but he's been working almost exclusively on setting up the Virginia
 datacenter for a while now AFAIK. I previously suggested on IRC that
 we could have regular shell users like Ashar do the wiki creations at
 a scheduled time and assign a root to do the DNS stuff for them.

Why is a root user needed for changing DNS entries? A zone file is a
normal text file, after all - and if you use PowerDNS on one server
configured as supermaster with a MySQL backend and other servers running
PowerDNS as superslaves (the backend is not important there), you don't
even need shell access to manage your DNS. Not even for creating new
zones, as the supermaster makes all slaves automatically sync those
zones where the individual slave server is listed with an NS entry.
/powerdns_ad
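
For illustration, creating a new zone then boils down to a few rows in
the supermaster's database; this sketch assumes the generic PowerDNS
MySQL backend schema (domains/records tables) as I remember it, so
verify the column names against your PowerDNS version:

    <?php
    // Add a zone to the PowerDNS supermaster's MySQL backend.
    $db = new PDO('mysql:host=localhost;dbname=pdns', 'pdns', 'secret');
    $db->exec("INSERT INTO domains (name, type) VALUES ('newwiki.example.org', 'MASTER')");
    $domainId = $db->lastInsertId();

    $ins = $db->prepare('INSERT INTO records (domain_id, name, type, content, ttl)
                         VALUES (?, ?, ?, ?, 3600)');
    $ins->execute(array($domainId, 'newwiki.example.org', 'SOA',
        'ns1.example.org hostmaster.example.org 1 10800 3600 604800 3600'));
    // The NS records are what the superslaves look at to auto-provision the zone.
    $ins->execute(array($domainId, 'newwiki.example.org', 'NS', 'ns1.example.org'));
    $ins->execute(array($domainId, 'newwiki.example.org', 'NS', 'ns2.example.org'));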

Marco



Re: [Wikitech-l] How would you disrupt Wikipedia?

2011-01-10 Thread Marco Schuster
On Mon, Jan 10, 2011 at 7:25 PM, Dmitriy Sintsov ques...@rambler.ru wrote:
 Everything has been done at my primary workplace to undermine my
 MediaWiki deployment efforts: it can easily be installed via the
 Linux package, so why is he installing it manually; the markup is
 primitive and inflexible; PHP is an inferior language, use ASP.NET
 instead; and so on.
ASP.NET? Only if you want all your source code exposed.

Marco

-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de


Re: [Wikitech-l] Missing Section Headings

2011-01-03 Thread Marco Schuster
On Mon, Jan 3, 2011 at 6:59 PM, Platonides platoni...@gmail.com wrote:
 He is indeed using IE5 for Mac.

 I guess that adding
 h1, h2, h3, h4, h5, h6 {
    overflow: visible;
 }

 to his monobook.css will fix it.
 Do we have some CSS trick to target IE5 Mac?
 Removing the headers isn't too friendly, so if there's an easy fix, I
 would apply it.
There is some conditional-CSS stuff for IE5/Mac; didn't we have a CSS
fix file especially for this browser?

Marco


Re: [Wikitech-l] Commons ZIP file upload for admins

2010-11-30 Thread Marco Schuster
On Tue, Nov 30, 2010 at 8:48 AM, Dmitriy Sintsov ques...@rambler.ru wrote:
 * Bryan Tong Minh bryan.tongm...@gmail.com [Tue, 30 Nov 2010 08:44:43
 +0100]:
 I think that the most recent version should be sufficient. I don't
 think Java would break backwards compatibility: users wouldn't be
 happy if their old jar suddenly stops working on a new JVM.

 Why an outdated and inefficient ZIP format, after all? 7zip is
 incompatible with the JVM; would it be a better choice for archive
 uploads? Or is it too hard to parse on the PHP side (I guess console
 exec is required)?
You can create a zip easily on all major OSes with drag'n'drop.
Windows supports it IIRC from Win 98 SE and up, a standard Linux
desktop ships tools for it (for KDE, it once was Ark), and MacOS
also delivers ZIP out of the box.
For ZIP, there are even built-in PHP functions to handle it.
7zip, though open source, requires third-party tools both on the
client OS and on servers, and is not really widespread. RAR and ZIP are
the dominant formats in cross-platform data exchange.
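
The built-in handling I mean is the ZipArchive class; a minimal sketch
(the paths are examples):

    <?php
    // List and extract an uploaded ZIP with PHP's bundled ZipArchive class.
    $zip = new ZipArchive();
    if ($zip->open('/tmp/upload.zip') === true) {
        // Inspect the entries before extracting anything.
        for ($i = 0; $i < $zip->numFiles; $i++) {
            echo $zip->getNameIndex($i), "\n";
        }
        $zip->extractTo('/tmp/unpacked/');
        $zip->close();
    }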

Marco



Re: [Wikitech-l] Refresh of the Mediawiki logo

2010-11-11 Thread Marco Schuster
On Thu, Nov 11, 2010 at 9:07 PM, David Gerard dger...@gmail.com wrote:
 On 11 November 2010 19:55, MZMcBride z...@mzmcbride.com wrote:

 The muted colors are a nice start, but I think the yellow still really
 sticks out when the logos are presented as a family. Maybe the petal color
 could be changed?


 I think this not being a photo does not make it better than the photo
 version. It's entirely unclear there's enough of a problem to be
 solved here.
I think it does look better than the older version, nice work.

Marco



Re: [Wikitech-l] Resource Loader problem

2010-11-10 Thread Marco Schuster
On Wed, Nov 10, 2010 at 10:56 AM, Roan Kattouw roan.katt...@gmail.com wrote:
 We're not looking for a full-blown parser, just one that has a few
 basic features that we care about. The current JS parser only
 supports expansion of message parameters ($1, $2, ...), and we want
 {{PLURAL}} support too. AFAIK that's pretty much all we're gonna need.
 Michael Dale's implementation has $1 expansion and {{PLURAL}}, AFAIK,
 and maybe a few other features.

Actually PHP and JS are a bit similar. Different function names and
slight syntax differences, but I think it is possible to take the
existing PHP parser, strip out the references to MW internals and
replace the database queries with appropriate API calls.
That would also enable a true WYSIWYG editor, or at least live
preview, as a JS parser could give the resulting DOM nodes some kind
of reference attribute which can be looked at to find the wikitext
responsible for the creation of each node (and so enable inline
editing).
Actually, this seems just perfect for a GSoC project for next year:
port the MW parser to JavaScript, with a follow-up project to make a
WYSIWYG/inline editor based on it.

Marco

Re: [Wikitech-l] Cross wiki script importing

2010-11-02 Thread Marco Schuster
On Tue, Nov 2, 2010 at 1:09 AM, bawolff bawolff...@gmail.com wrote:
 May I ask how? If you're logged in to the secure server, then the
 cookies won't get transmitted to the unsecure server when loading js
 from them. At the very worst (if we really put on our tin foil hats) I
 suppose someone could intercept the non-secured js script, do a man in
 the middle type thing and replace the script with malicious js.
 However if someone actually has the ability to do that, they could
 already do that with the geoip lookup. Thus I don't see how doing the
 importScriptURI reduces security.
Firefox and IE will whine that the site attempts to load insecure
resources. Also, it is indeed possible to transmit cookies; it's
enough that the user has also logged in to the insecure servers in
the past and is now e.g. at a public WiFi hotspot and therefore uses
the secure gateway.

Marco



Re: [Wikitech-l] Firesheep

2010-10-25 Thread Marco Schuster
On Mon, Oct 25, 2010 at 7:15 PM, Hay (Husky) hus...@gmail.com wrote:
 Has anyone seen this?

 http://codebutler.com/firesheep

 A new Firefox plugin that makes it trivially easy to hijack cookies
 from a website that's using HTTP for login over an unencrypted
 wireless network. Wikipedia isn't in the standard installation as a
 site (lots of other sites, such as Facebook, Twitter, etc. are). We
 are using HTTP login by default, so i guess we're vulnerable as well
 (please say so if we're using some other kind of defensive mechanism
 i'm not aware of). Might it be a good idea to use HTTPS as the standard
 login? Gmail has been doing this since April this year.
Firesheep works by snooping cookies, not login processes, and even
without software like this it's incredibly easy to own someone. All it
takes to own a Wikipedia admin or user is being on the same network.
The admin in question doesn't even have to visit Wikipedia directly:
there are enough pages hotlinking to upload.wikimedia.org, which
should cause the browser to transmit session data.

If you need a secure login, you can use the secure webserver, but in
the past it had some load issues.

Marco

Re: [Wikitech-l] Commons ZIP file upload for admins

2010-10-25 Thread Marco Schuster
On Mon, Oct 25, 2010 at 10:09 PM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:
 On Mon, Oct 25, 2010 at 3:50 PM, Max Semenik maxsem.w...@gmail.com wrote:
 Instead of amassing social constructs around technical deficiency, I
 propose to fix bug 24230 [1] by implementing proper checking for JAR
 format.

 Does that bug even affect Wikimedia?  We have uploads segregated on
 their own domain, where we don't set cookies or do anything else
 interesting, so what would an uploaded JAR file even do?
upload.wikimedia.org could end up on Google's Safe Browsing blacklist
for hosting malicious .jars which are injected into another pwned web
site or loaded through pwned advertising brokers.
Given that Java is the 2nd biggest exploit vector in terms of exploit
count (but 1st in terms of impact - users don't update Java as often
as Adobe Reader), it should not be allowed to upload JARs (or things
that look like something else but in fact can be loaded and executed
by the Java runtime) to Wikipedia.

Marco

Re: [Wikitech-l] Convention for logged vs not-logged page requests

2010-10-19 Thread Marco Schuster
On Wed, Oct 20, 2010 at 12:49 AM, Krinkle krinklem...@gmail.com wrote:
 But the short version without /w/index.php but with direct ?parameters
 doesn't work for action=raw (ctype=text/javascript)

 See the error on:
 http://meta.wikimedia.org/wiki/User:Krinkle/global.js?action=raw

Strange. I'm sure this is to prevent people from using Wikipedia as a
host for spy JavaScript, but why does
http://meta.wikimedia.org/w/index.php?title=User:Krinkle/global.js&action=raw
work then?

Marco



Re: [Wikitech-l] About wiki links: could they point to page id?

2010-10-04 Thread Marco Schuster
On Mon, Oct 4, 2010 at 10:54 AM, Strainu strain...@gmail.com wrote:
 2010/10/4 Alex Brollo alex.bro...@gmail.com:
 It's strange (but I guess that there's a sound reason) that plain wikilinks
 point to a variable field of wiki records (the name of the page) while many
 troubles would be solved if they could point to the invariable field of
 such records: the id. The obvious result is that all links are broken (and
 need fixing) as soon as a page is moved (i.e. renamed).

 Don't redirects exist specifically for that?
Better, use permalinks, which point to a specific revision id.



 My question is: what is the sound reason for this strange thing? Is there
 some idea about fixing this?

 Err.. perhaps they decided people should be able to comprehend the
 link destination? Plus I remember something about nice URLs being a
 MUST DO in SEO a while ago...but I'm not 100% sure on that. I
 certainly hope this won't change anytime soon, on Wikipedia at least.
It'd be nice to have the page_id as an optional parameter... but I think
you can get the page's title via the API.
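
Something like this should do it (a standard api.php query; the page id
is an example):

    <?php
    // Resolve a page_id to its current title via the API.
    $url  = 'http://en.wikipedia.org/w/api.php?action=query&pageids=5043734&format=php';
    $data = unserialize(file_get_contents($url));
    $page = current($data['query']['pages']);
    echo $page['title'], "\n";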

Marco


[Wikitech-l] Static dump of German Wikipedia

2010-09-27 Thread Marco Schuster
Hi all,

just a quick status update: the dump is currently running at 2 req/s
and ignores all pages which have is_redirect set; I also changed the
storage method: the new files are appended to
/mnt/user-store/dewiki_static/articles.tar, as I noticed I was filling
up the inodes of the file system; storing everything inside a tarball
prevents this, and I don't have to waste time downloading tons of
files to my PC, only one huge tarball when it's done.
I also managed to get a totally stripped-down version of the Vector
skin file loading an article via JSON (I won't release it now though,
it's a damn hack - nothing except loading works, as I have removed
every JS file... it should be pretty by Sunday).
The current dump position is at 92927; after stripping out the
redirects, 53171 articles have actually been downloaded, resulting in
770MB of uncompressed tar (I expect gzip or bz2 compression to save
lots of space though).
For the redirects: how do I get the redirect target page (maybe even
the #section)?
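
One way that might work is the API's &redirects switch, which resolves
redirects server-side (a sketch; the title is an example, and I'm not
sure the #section fragment is exposed at all):

    <?php
    // Ask the API to resolve a redirect to its target title.
    $url  = 'http://de.wikipedia.org/w/api.php?action=query&titles='
          . urlencode('BRD') . '&redirects&format=php';
    $data = unserialize(file_get_contents($url));
    if (isset($data['query']['redirects'])) {
        echo $data['query']['redirects'][0]['to'], "\n";
    }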

Marco

PS: Are there any *fundamental* differences between the Vector skin
files of different languages except the localisation? Could this maybe
be converted to JavaScript, something like
$("#footer-info-lastmod").html("page was last changed at foobar")?



Re: [Wikitech-l] [Toolserver-l] Static dump of German Wikipedia

2010-09-24 Thread Marco Schuster
On Sat, Sep 25, 2010 at 12:56 AM, Platonides platoni...@gmail.com wrote:
 Ariel T. Glenn wrote:
 On Thu, 23-09-2010, at 21:27 -0500, Q wrote:
 Given the fact that static dumps have been broken for *years* now,
 static dumps are at the bottom of the WMF's priority list; I thought it
 would be best if I just went ahead and built something that can be
 used (and, of course, improved).

 Marco

 That's what I just said. Work with them to fix it, IE: volunteer. IE:
 you fix it.


 Actually it's not so much that they are on the bottom of the list as
 that there are two people potentially looking at them, and they are
 Tomasz (who is also doing mobile) and me (and I am doing the XML dumps
 rather than the HTML ones, until they are reliable and happy).

 However if you are interested in working on these, I am *very* happy to
 help with suggestions, testing, feedback, etc., even while I am still
 working on the XML dumps.  Do you have time and interest?

 Ariel

 Most (all?) articles should be already parsed in memcached. I think the
 bottleneck would be the compression.
 Note however that the ParserOutput would still need postprocessing, as
 would ?action=render. The first thing that comes to my mind is to remove
 the edit links (this use case alone seems enough for implementing
 editsection stripping). Sadly, we can't (easily) add the edit sections
 after the rendering.
This should be doable using a simple regex which plainly goes for
<span class="editsection">.
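
A minimal sketch, assuming the edit links really do come out as flat
<span class="editsection">...</span> blocks with no nested spans:

    <?php
    // Strip MediaWiki's section edit links from rendered HTML.
    $html = preg_replace('!<span class="editsection">.*?</span>!s', '', $html);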

Marco


[Wikitech-l] Static dump of German Wikipedia

2010-09-23 Thread Marco Schuster
Hi all,

I have made a list of all the 1.9M articles in NS0 (including
redirects / short pages) using the Toolserver; now that I have the
list I'm going to download every single one of 'em (after the trial
period tonight - I want to see how this works out, I'd like to begin
downloading the whole thing in 3 or 4 days, if no one objects) and
then publish a static dump of it. Data collection will be on the
Toolserver (/mnt/user-store/dewiki-static/articles/); the request rate
will be 1 article per second and I'll download the new files once or
twice a day to my home PC, so there should be no problem with the TS
or Wikimedia server load.
When this is finished in ~21-22 days, I'm going to compress the files
and upload them to my private server (well, if Wikimedia has an
archive server, that'd be better) as a tgz file so others can play
with it.
Furthermore, though I have no idea if I'll succeed, I plan on hacking
a static Vector skin file which will load the articles using jQuery's
excellent .load() feature, so that everyone with JS can enjoy a truly
offline Wikipedia.
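
The fetch loop itself is trivial; a sketch of what I have in mind (file
names and the UA string are placeholders):

    <?php
    // Throttled static-dump loop: one rendered article per second.
    $titles = file('titles.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    $ctx = stream_context_create(array('http' => array(
        'user_agent' => 'dewiki-static-dump/0.1 (contact address here)',
    )));
    foreach ($titles as $title) {
        $url  = 'http://de.wikipedia.org/w/index.php?title='
              . urlencode($title) . '&action=render';
        $html = file_get_contents($url, false, $ctx);
        if ($html !== false) {
            file_put_contents('articles/' . str_replace('/', '_', $title) . '.html', $html);
        }
        sleep(1); // keep the promised 1 req/s rate
    }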

Marco

PS: When trying to invoke /w/index.php?action=render with an invalid
oldid, the server returns HTTP/1.1 200 OK and an error message, but
shouldn't this be a 404 or 500?


Re: [Wikitech-l] Is the $_SESSION secure?

2010-09-23 Thread Marco Schuster
On Fri, Sep 24, 2010 at 1:36 AM, Neil Kandalgaonkar ne...@wikimedia.org wrote:
 On 9/23/10 2:24 PM, Ryan Lane wrote:

 The contents of that session on the server are unencrypted, correct?
 Depending on what the secret is, he may or may not want to use it. For
 instance, that is probably a terrible place to put credit card numbers
 temporarily.

 Good point, but in this case I'm just storing the path to a temporary file.

 The file isn't even sensitive data; it's just a user-uploaded media file
 for which the user has not yet selected a license, although we
 anticipate they will in a few minutes.
If it's user-uploaded, take care of garbage collection; actually, how
does PHP handle it if you upload a file and then don't touch it during
the script's runtime? Will it automatically be deleted after the
script is finished or after a specific time?
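
(For what it's worth, PHP deletes the temporary upload automatically
when the request ends, unless the script moves it away first - so
keeping it for a later request means persisting it explicitly. A
minimal sketch, with the form field name invented:)

    <?php
    // PHP removes $_FILES[...]['tmp_name'] at the end of the request;
    // move it somewhere stable if a later request needs it.
    if (isset($_FILES['media']) && $_FILES['media']['error'] === UPLOAD_ERR_OK) {
        $kept = '/tmp/pending-upload-' . session_id();
        move_uploaded_file($_FILES['media']['tmp_name'], $kept);
        $_SESSION['pending_upload'] = $kept; // garbage-collect this yourself later
    }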

Marco



Re: [Wikitech-l] [Toolserver-l] Static dump of German Wikipedia

2010-09-23 Thread Marco Schuster
On Fri, Sep 24, 2010 at 3:44 AM, Marcin Cieslak sa...@saper.info wrote:
 John Vandenberg jay...@gmail.com wrote:

 http://download.wikimedia.org/dewiki/

 Is there any problem with using them?

 I think they are from June 2008.

 Are they?

 http://download.wikimedia.org/dewiki/20100903/

These are the database dumps. In order to get any HTML out of them,
you need to set up either MediaWiki or a replacement parser; not to
mention the delicate things the enWP folks did with template magic,
which requires setting up ParserFunctions - these might even depend on
whatever version is currently running live.
That's why static dumps (or ?action=render output) are the thing you
need when you want to create offline versions or things like
Mobipocket Wikipedia (which is my actual goal with the static dump).

Marco


Re: [Wikitech-l] ResourceLoader, now in trunk!

2010-09-07 Thread Marco Schuster
Hi,

On Tue, Sep 7, 2010 at 9:44 AM, Roan Kattouw roan.katt...@gmail.com wrote:
 Also, it would
 be great if these high-level JS-libraries like jQuery actually were
 ported into DOM API level (native browser's implementation instead of
 extra JS layer). However, these questions are for the FF/IE/Opera
 developers...
 I definitely think this is the future, provided it's implemented
 reliably cross-browser. Also, you'd probably want to have a fallback
 library for browsers that have no or incomplete (e.g. missing a jQuery
 feature that's newer than the browser) native support.
Please, no. The various browsers all have their problems with
standards even now, and I don't expect they'd get a jQuery (or
whatever JS framework) equivalent implemented natively without all
sorts of problems.

Marco

Re: [Wikitech-l] Developing true WISIWYG editor for media wiki

2010-08-03 Thread Marco Schuster
On Tue, Aug 3, 2010 at 10:53 AM, Jacopo Corbetta
jacopo.corbe...@gmail.com wrote:
 However, the editing mode provided by browsers is a nightmare of
 incompatibilities. Basically, each browser produces a different output
 given identical commands, so currently MeanEditor is not completely up
 to the task. An external application might be an interesting solution.

I don't have the link ready, but Google solved this in Google Docs by
re-implementing editing in JavaScript... they intercept mouse
movements/clicks and keyboard events and then render the page with
JavaScript.
Given the complexity of wikitext, I fear rewriting the parser in
JavaScript is the only way to get a 100% compatible wikitext editor...

Marco


Re: [Wikitech-l] accessibility problems with vector

2010-06-20 Thread Marco Schuster
On Sun, Jun 20, 2010 at 2:12 PM, Tisza Gergo gti...@gmail.com wrote:
 Anyway, removing table-related attributes doesn't offer much advantage in
 itself. There will be a few validator warnings about it, so what? Getting rid 
 of
 table layouts would be nice, but IE6/7 do not understand display:table either,
 so until IE 6 and 7 die and 8 stops trying to be backwards-compatible, they 
 are
 here to stay, I'm afraid.
IE6 is (thank god) dying out, one problem less... but yeah, that IE7
doesn't like display: table sucks. But why does IE8 fall back to
compat mode on Wikimedia sites? Is there any way to force IE8 into
standards mode?

Marco


Re: [Wikitech-l] [GSoC] Extension management platform

2010-06-02 Thread Marco Schuster
Hey,

just one thing which makes Wordpress AutoUpdate suck and which would
be great if you'd take care of while designing: check the permissions
of ALL files you try to overwrite / update BEFORE attempting the
update - maybe include an update.xml for each version delta which
details the changed files. Then call fileperms() on each one and
check whether the www-data user is allowed to write to it.
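
A sketch of such a pre-flight check (the update.xml format is
hypothetical; a flat file list works the same way):

    <?php
    // Pre-flight: verify every file in the version delta is writable
    // by the web server user before touching anything.
    $files = array('index.php', 'includes/Setup.php'); // would come from update.xml
    $notWritable = array();
    foreach ($files as $f) {
        if (file_exists($f) && !is_writable($f)) {
            $notWritable[] = sprintf('%s (mode %o)', $f, fileperms($f) & 0777);
        }
    }
    if ($notWritable) {
        die('Refusing to update, not writable: ' . implode(', ', $notWritable));
    }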

For the third time in a row, Wordpress and the Debian package screwed
up write permissions here, either on Wordpress itself or on some PHP
module... and I had to un-mess the update by hand (and I'm not the
only one)... I don't want to see this again in MediaWiki :(

Marco

On Thu, Jun 3, 2010 at 1:55 AM, Jeroen De Dauw jeroended...@gmail.com wrote:
 Hey all,

 As a lot of you already know, I'm again doing a Google Summer of Code
 project for the WMF this year. The goal of this *awesome* project is to
 create a set of user-friendly administration interfaces via which
 administrators can manage the configuration of their wiki and extensions as
 well as manage extension installation and updating. The user experience
 should be as *awesome* as the Wordpress one, or even better.

 After doing research in existing code and talking to the relevant people, I
 created a little roadmap [0] of how I plan to proceed with my project. Any
 feedback and comments on this would be very much appreciated (especially the
 critical ones :). It'd be too bad to reinvent things already achieved by
 people, simply by me not knowing about them! I hope to start with the actual coding by
 this weekend, and will update the roadmap, and the docs page itself [1], as
 well as my blog [2], as I make progress.

 [0] http://www.mediawiki.org/wiki/Extension_Management_Platform/Roadmap
 [1] http://www.mediawiki.org/wiki/Extension_Management_Platform
 [2] http://blog.bn2vs.com/tag/extension-management/

 Cheers

 --
 Jeroen De Dauw
 * http://blog.bn2vs.com
 * http://wiki.bn2vs.com
 Don't panic. Don't be evil. 50 72 6F 67 72 61 6D 6D 69 6E 67 20 34 20 6C 69
 66 65!





Re: [Wikitech-l] a wysiwyg editor for wikipedia?

2010-06-01 Thread Marco Schuster
On Wed, Jun 2, 2010 at 1:42 AM, K. Peachey p858sn...@yahoo.com.au wrote:
 On Tue, Jun 1, 2010 at 11:22 PM, Marco Schuster
 ma...@harddisk.is-a-geek.org wrote:
 On Tue, Jun 1, 2010 at 4:09 AM, Jacopo Corbetta
 jacopo.corbe...@gmail.com wrote:
 In our experience, the biggest obstacle is to get the different
 browsers to reliably make the same changes to HTML. The editor
 interface is non-standard, and browsers sometimes disagree on encoding
 rules, escaping, choice of tags, etc.
 We could do it the really hard way, like Google did with Google Docs
 (http://googledocs.blogspot.com/2010/05/whats-different-about-new-google-docs.html):
 make *everything* via JS by capturing keystrokes and mouse movements.
 This way a consistent and reproducible user experience on all
 platforms can be achieved. And by doing it all in JS, the editor could
 also generate a wikitext-delta right away and doesn't need to transfer
 the whole page's wikitext.

 Marco
 Google Docs' interface acts like sh*t unless you're on a super-duper
 decent computer, and gives negative views on the usability of a
 service.
 -Peachey
I run a 3-year-old laptop (Intel C2D though, with next-to-no CPU load
in Firefox, even less in Chrome) and have no problems with it, except
that printing never looks anything like what I see in the browser...
hopefully Google will add proper PDF export for printing.

Marco

Re: [Wikitech-l] Reasonably efficient interwiki transclusion

2010-05-25 Thread Marco Schuster
On Tue, May 25, 2010 at 8:58 PM, Roan Kattouw roan.katt...@gmail.com wrote:

 To the point of whether parsing on the on the distant wiki makes more
 sense: I guess there are points to be made both ways. I originally
 subscribed to the idea of parsing on the home wiki so expanding the
 same template with the same arguments would always result in the same
 (preprocessed) wikitext, but I do see how parsing on the local wiki
 would help for stuff like {{SITENAME}} and {{CONTENTLANG}}.


Why not mix the two? Take other templates etc. from the source wiki, and
set magic stuff like time / contentlang to the target wiki's values.

Marco


Re: [Wikitech-l] Bugzilla Weekly Report

2010-05-17 Thread Marco Schuster
On Mon, May 17, 2010 at 10:49 AM, Roan Kattouw roan.katt...@gmail.com wrote:

 2010/5/17  repor...@isidore.wikimedia.org:
  Bugs marked FIXED  :  1622

[snip]

  Top 5 Bug Resolvers
 
  roan.kattouw [AT] gmail.com 19
  jeluf [AT] gmx.de   15
  innocentkiller [AT] gmail.com   11
  sam [AT] reedyboy.net   6
  tparscal [AT] wikimedia.org 5
 
 That can't be right.

It can, because of Chad's cleanup. Depending on how the reporter
software calculates its numbers, direct DB changes might confuse the
counting algorithms.

Marco

Re: [Wikitech-l] Technical means for tagging content (was: [Foundation-l] Statement on appropriate educational content)

2010-05-09 Thread Marco Schuster
On Sun, May 9, 2010 at 3:09 PM, K. Peachey p858sn...@yahoo.com.au wrote:

 On Sun, May 9, 2010 at 11:08 PM, K. Peachey p858sn...@yahoo.com.au
 wrote:
  On Sun, May 9, 2010 at 10:50 PM, Ævar Arnfjörð Bjarmason
  ava...@gmail.com wrote:
  It's pretty easy to do arbitrary content tagging (and filtering now).
  You just add a template or external link to the page. E.g. {{PG-13}}.
 
  Then all some third party has to do is to download
  templatelinks.sql.gz (or externallinks.sql.gz) in addition to the
  image dump.
 
  You just have to start getting people to tag things consistently. The
  good thing is that you can start now without any additional software
  support.
 Also, Do we even have image dumps anymore?


AFAIR backups exist, but I don't know if there are any public dumps. I'd
imagine they're simply too big to maintain and archive...

Marco

Re: [Wikitech-l] Broken videos

2010-03-17 Thread Marco Schuster
On Wed, Mar 17, 2010 at 5:37 PM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:
 On Wed, Mar 17, 2010 at 10:39 AM, Platonides platoni...@gmail.com wrote:
 Would it help adding a <link rel="prefetch"> to the first video in the page?

 In Firefox, yes.  Does anyone else implement that?  If it's only
 Firefox, we could just as well replace Cortado with <video autobuffer>
 to begin with as soon as the page loads.  In Chrome and Safari, we
 don't even need the autobuffer, since they don't implement it.
 (Actually, <video autobuffer preload="auto"> would be the current way
 to do it, given recent spec changes.)
I hope no one is ever insane enough to use this. Imagine those people
with cellphones and no flat-rate data plan (~70% of mobile internet
users) - their bills will skyrocket if even a single video is set to
auto-preload.

Marco


Re: [Wikitech-l] Broken videos

2010-03-17 Thread Marco Schuster
On Wed, Mar 17, 2010 at 9:11 PM, Trevor Parscal tpars...@wikimedia.org wrote:
 On 3/17/10 1:02 PM, Marco Schuster wrote:
 On Wed, Mar 17, 2010 at 5:37 PM, Aryeh Gregor
 simetrical+wikil...@gmail.com  wrote:

 On Wed, Mar 17, 2010 at 10:39 AM, Platonidesplatoni...@gmail.com  wrote:

  Would it help adding a <link rel="prefetch"> to the first video in the
  page?

  In Firefox, yes.  Does anyone else implement that?  If it's only
  Firefox, we could just as well replace Cortado with <video autobuffer>
  to begin with as soon as the page loads.  In Chrome and Safari, we
  don't even need the autobuffer, since they don't implement it.
  (Actually, <video autobuffer preload="auto"> would be the current way
  to do it, given recent spec changes.)

  I hope no one is ever insane enough to use this. Imagine those people
  with cellphones and no flat-rate data plan (~70% of mobile internet
  users) - their bills will skyrocket if even a single video is set to
  auto-preload.

 Marco


 Hopefully browsers on phones and low-bandwidth devices could just ignore
 this attribute.
What about people who use tethering? A browser can't determine whether
it is connected via a flat-rate connection (DSL, company network) or
via tethering over a mobile connection.

Marco


Re: [Wikitech-l] modernizing mediawiki

2010-03-03 Thread Marco Schuster
On Wed, Mar 3, 2010 at 4:30 PM, David Gerard dger...@gmail.com wrote:
 On 3 March 2010 15:06, Paul Houle p...@ontology2.com wrote:

    For a large-scale site,  there's going to be a lot of administration
 work to be done,  so it doesn't matter if the system is difficult to set
 up and configure.


 As it turns out, MediaWiki isn't really hard at all :-)


    Wordpress,  on the other hand,  set out with the mission of being
 the 'cheap and cheerful' program that would dominate the market for
 blogging software.  Everything about Wordpress is designed to make it
 easy to set up a Wordpress site quickly and configure it easily.
 Wordpress does scale OK to fairly large blogs and high traffic if you
 SuperCache it.


 Multi-user WordPress is a bit arsier. Comparable faff to MediaWiki setup.
apt-get install wordpress, and let dpkg handle the rest. It's really easy.

Marco


Re: [Wikitech-l] modernizing mediawiki

2010-03-02 Thread Marco Schuster
On Wed, Mar 3, 2010 at 5:57 AM, Ryan Lane rlan...@gmail.com wrote:
 I don't really find updates to be terribly difficult. You mostly just
 check out (or download) the newest version, and run update.php. This
 is probably more difficult without shell access.
With Wordpress upgrades it's even easier: two clicks and you're done
(okay, except if you run multi-user WP setups). Same for extension
updates. It even *notifies* you of updates, especially
security-critical ones - if you don't follow the -announce lists and
consequently never update, your wiki can and will be open to any
security issue that comes up.

 -I don't want to go to my ftp to download my local settings file, add a few 
 lines then reupload it. This is caveman-like behavior for the modern 
 internet.

 Get a host that supports SSH. Use VI, Emacs, nano, pico, etc.
HAHAHA, sorry, but this way of thinking is stone-age. Who are we to
require our users to get more expensive hosting AND knowledge of
vi/Emacs (a newbie most likely won't have HEARD of ssh, vi and emacs!)
just to be able to modify the core settings of a wiki without the
extra FTP work? Come on, it's so easy to make a web-based settings
editor. It might even be a lot easier to just move all settings except
the MySQL credentials into the DB.
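
A sketch of that DB-backed idea (the settings table is invented; only
the DB credentials would stay in the file):

    <?php
    // LocalSettings.php keeps only the DB credentials; everything else
    // comes from a (hypothetical) settings table that a web UI can edit.
    $db = new PDO('mysql:host=localhost;dbname=wiki', 'wiki', 'secret');
    foreach ($db->query('SELECT name, value FROM site_settings') as $row) {
        $GLOBALS[$row['name']] = unserialize($row['value']);
    }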

Marco

Re: [Wikitech-l] hiphop progress

2010-03-01 Thread Marco Schuster
The point of $IP is that it enables multisite environments: you have
just index.php and LocalSettings.php (and skin crap) in the per-vhost
directory, and keep extensions and other stuff centralized, so you can
update an extension once and all the wikis automatically have it.
However, the installer could be patched to resolve $IP automatically
if the user wishes to run a HipHop environment.

Marco

On Mon, Mar 1, 2010 at 2:59 PM, Ævar Arnfjörð Bjarmason
ava...@gmail.com wrote:
 On Mon, Mar 1, 2010 at 13:35, Domas Mituzas midom.li...@gmail.com wrote:
 Still, the decision to merge certain changes into MediaWiki codebase (e.g. 
 relative includes, rather than $IP-based absolute ones) would be quite 
 invasive.
 Also, we'd have to enforce stricter policy on how some of the dynamic PHP 
 features are used.

 I might be revealing my lack of knowledge about PHP here but why is
 that invasive and why do we use $IP in includes in the first place? I
 did some tests here:

    http://gist.github.com/310380

 Which show that as long as you set_include_path() with $IP/includes/
 at the front PHP will make exactly the same stat(), read() etc. calls
 with relative paths that it does with absolute paths.

 Maybe that's only on recent versions, I tested on php 5.2.






Re: [Wikitech-l] hiphop progress

2010-03-01 Thread Marco Schuster
On Mon, Mar 1, 2010 at 3:26 PM, Daniel Kinzler dan...@brightbyte.de wrote:
 Marco Schuster schrieb:
 The point of $IP is that you can use multisite environments by just
 having index.php and Localsettings.php (and skin crap) in the
 per-vhost directory, and have extensions and other stuff centralized
 so you can update the extension once and all the wikis automatically
 have it.

 That's a silly multi-host setup. Much easier to have a single copy of
 everything, and just use conditionals in LocalSettings, based on hostname or
 path.
Downside of this: as a provider, *you* must make the change, not the
customer, as it is one central file.

Marco

Re: [Wikitech-l] User-Agent:

2010-02-16 Thread Marco Schuster
On Tue, Feb 16, 2010 at 4:44 PM, Anthony wikim...@inbox.org wrote:
 On Tue, Feb 16, 2010 at 10:39 AM, Domas Mituzas midom.li...@gmail.comwrote:
 Been like that for ages, haven't it?


 No idea.  For ages you've been able to just go onto the Wikimedia servers
 and change whatever you feel like, and answer to nobody?  You must be
 misunderstanding my question or something.
Correct me if I'm wrong, but AFAIR Domas has been the MySQL admin guy
since pretty much the beginning, and I think the fact that he's a
sysop won't change anyway, no matter what happens.
For the record, banning everything without a UA totally sucks. Sure,
the API and other abuse-prone stuff can be blocked, but ordinary
article viewing should ALWAYS be possible, no matter what fucked-up UA
you use.

Marco

Re: [Wikitech-l] User-Agent:

2010-02-16 Thread Marco Schuster
Hi,

On Tue, Feb 16, 2010 at 8:31 PM, Domas Mituzas midom.li...@gmail.com wrote:
 You can safely assume that we need to come up with something to defend a new
 policy.
Yeah, ban no/broken-UA clients from the things that do cause CPU
load, but leave article reading untouched. Normal readers with Privoxy
or other privacy filters (you know, people DO still use them, even if
their percentage is small!) can then at least READ.

 Presumably some percentage of that 20-50% will come back as the
 spammers realize they have to supply the string.  Presumably we
 then start playing whack-a-mole.

 Yes, we will ban all IPs participating in this.
Good luck fighting a dynamic bot herder (though I do wonder, with the
spam blacklist and the captchas for URLs, what the hell can a botnet
master achieve by hitting Wikipedia?!).

 Presumably there's a plan for what to do when the spammers begin
 supplying a new, random string every time.

 Random strings are easy to identify, fixed strings are easy to verify.
The point is, what should bot writers do:
1) no UA at all - the typical newbie mistake of just supplying
GET /w/index.php?action=edit, which works against a localhost wiki and
most other web servers.
2) the default UA of the programming language (PHP's, cURL's,
Python's; some bots may even use wget and bash scripting - it's not
THAT difficult to write a wiki bot in bash script!)
3) a custom UA (stuff like HDBot v1.1 (http://xyz.tld), which I
couldn't use a while ago; see the sketch below)
4) spoof a browser UA (bad, as the site can't tell bot and browser apart)

To avoid the ban, only 3 and 4 are possible, as the default UAs are
blocked in most cases. But as 3 doesn't really work, or at least is
hard to troubleshoot, that leaves only 4, which you do not want.
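
For option 3, supplying a descriptive fixed UA is a one-liner with cURL
(the UA string itself is just an example):

    <?php
    // Announce the bot with a descriptive, fixed User-Agent string.
    $ch = curl_init('http://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&format=php');
    curl_setopt($ch, CURLOPT_USERAGENT, 'HDBot/1.1 (http://xyz.tld; operator contact here)');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $response = curl_exec($ch);
    curl_close($ch);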

Please write some doc that answers this once and for all.

Marco

PS: Oh, and please, please make the 403 message something from which
people can figure out what's wrong; it takes AGES if you are a
scripting newbie.

Re: [Wikitech-l] importing enwiki into local database

2010-02-14 Thread Marco Schuster
On Sun, Feb 14, 2010 at 1:00 PM, Robert Ullmann rlullm...@gmail.com wrote:

 Are you using $wgUseTidy? It is an HTML cleanup process that is always
 enabled on WMF projects. Since it is there, template creators often
 miss closing spans and other things, or leave in extra close tags, and
 never notice because the tidy fixes them. Would not surprise me a bit
 if dozens of en.wp templates have such errors; I find them
 occasionally on en.wikt.

What about turning wgUseTidy off for some time? Maybe during some night
hours... so that our template magicians are forced to clean up the
templates and the other crap buried deep in the wikitext.
It is not a third-party user's responsibility to install extra software
just to be able to render our content... or at least, it shouldn't be.
We should not artificially make it difficult to get our content forked,
duplicated or otherwise re-used.

Marco

Re: [Wikitech-l] importing enwiki into local database

2010-02-14 Thread Marco Schuster
On Mon, Feb 15, 2010 at 2:30 AM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:

 On Sun, Feb 14, 2010 at 7:34 PM, Marco Schuster
 ma...@harddisk.is-a-geek.org wrote:
  What about turning wgUseTidy off for some time? Maybe some night hours...
 so
  that our template magicians are forced to clean up the templates and the
  other crap which resides deep buried into the wikitext.

 They would just complain until we turned it back on.  Unless we left
 it off for good, which might be reasonable, but then we'd have to
 improve Sanitizer, I guess.  (Maybe once html5lib is more usable.)

Why? Why must software take care of the crap that users produce? Either
we force them to write proper code, or they never will. It's like with
PHP, with the difference that we still have the option to make our
template writers and coders learn, instead of having to support
stone-age stuff with crappy workarounds.
Every single extra step that has to be taken to get a WP fork up and
running is one step too many.

Marco

Re: [Wikitech-l] confirm dfe03c58a0a....

2010-02-10 Thread Marco Schuster
Hi all,

I just got this email from mailman. Apparently Google Apps has screwed
up something again, so the message itself doesn't annoy me, but why are
users' passwords still sent in CLEARTEXT these days??
Can someone (tm) of the mailman admins or tech guys please fix this?

Thanks,
Marco

On Wed, Feb 10, 2010 at 2:47 PM, wikitech-l-requ...@lists.wikimedia.org wrote:

 Your membership in the mailing list Wikitech-l has been disabled due
 to excessive bounces The last bounce received from you was dated
 10-Feb-2010.  You will not get any more messages from this list until
 you re-enable your membership.  You will receive 3 more reminders like
 this before your membership in the list is deleted.

 To re-enable your membership, you can simply respond to this message
 (leaving the Subject: line intact), or visit the confirmation page at


 https://lists.wikimedia.org/mailman/confirm/wikitech-l/dfe03c58a0a1fe2700f74689430fa846e36fbcdb


 You can also visit your membership page at


 https://lists.wikimedia.org/mailman/options/wikitech-l/marco%40harddisk.is-a-geek.org


 On your membership page, you can change various delivery options such
 as your email address and whether you get digests or not.  As a
 reminder, your membership password is

 blanked


 If you have any questions or problems, you can contact the list owner
 at

wikitech-l-ow...@lists.wikimedia.org



Re: [Wikitech-l] (no subject)

2010-01-28 Thread Marco Schuster
On Thu, Jan 28, 2010 at 5:02 PM, Tei oscar.vi...@gmail.com wrote:

 On 28 January 2010 15:06, 李琴 q...@ica.stc.sh.cn wrote:
  Hi all,
  I have built a local wiki. Now I want its data to stay consistent with
  Wikipedia, and one task I have to do is to fetch the updated data from
  Wikipedia.
  I get the URLs by analyzing the RSS feed
  (http://zh.wikipedia.org/w/index.php?title=Special:%E6%9C%80%E8%BF%91%E6%9B%B4%E6%94%B9&feed=rss)
  and get all the HTML content of the edit box by analyzing those URLs,
  opening each one and clicking 'edit this page'.

  Is that because I visit too frequently and my IP address has been
  blocked, or is the network too slow?

 李琴, well.. that's web scraping, which is a poor technique, one with
 lots of errors that generates lots of traffic.

 One thing a robot must do is read and follow the
 http://zh.wikipedia.org/robots.txt file (probably you should read it
 too).
 As a general rule of the Internet, a rude robot will be banned by the
 site admins.

 It would be a good idea to announce your bot as a bot in the user_agent
 string. Good bot behavior means reading a website like a human would. I
 don't know, like 10 requests a minute? I don't know this Wikipedia
 site's rules about it.

 What you are suffering could be automatic or manual throttling, since
 an abusive number of requests has been detected from your IP.

 Wikipedia seems to provide full dumps of its wikis, but they are
 unusable for you, since they are gigantic :-/; trying to rebuild
 Wikipedia on your PC from a snapshot would be like summoning Cthulhu
 in a teapot. But.. I don't know, maybe the zh version is smaller, or
 your resources powerful enough. One feels that what you have built has
 severe overhead (a waste of resources) and there must be better ways
 to do it...

Indeed there are. What you need:
1) the Wikimedia IRC live feed - last time I looked at it, it was at
irc://irc.wikimedia.org/ and each project had its own channel.
2) a PHP IRC bot framework - Net_SmartIRC is well-written and easy to get
started with
3) the page source, which you can EASILY get either in rendered form
(http://zh.wikipedia.org/w/index.php?title=TITLE&action=render) or in raw
form (http://zh.wikipedia.org/w/index.php?title=TITLE&action=raw - this is
the page source); see the sketch below.
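
As an illustration, a minimal sketch of step 3 in plain PHP (untested; the
bot name and contact address are placeholders). The IRC handler from steps
1/2 would call fetchRaw() for every title announced on the feed:

<?php
// Fetch the raw wikitext of one page, with a descriptive User-Agent so
// the server admins can identify (and contact) the bot if needed.
function fetchRaw($title) {
    $url = 'http://zh.wikipedia.org/w/index.php?title='
         . urlencode($title) . '&action=raw';
    $ctx = stream_context_create(array('http' => array(
        'user_agent' => 'MirrorSyncBot/0.1 (operator@example.org)',
    )));
    return file_get_contents($url, false, $ctx);
}

echo fetchRaw('維基百科');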

Marco

-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] enwiki database dump schedule

2010-01-28 Thread Marco Schuster
On Fri, Jan 29, 2010 at 2:33 AM, Anthony wikim...@inbox.org wrote:

 On Mon, Jan 25, 2010 at 6:23 PM, Tomasz Finc tf...@wikimedia.org wrote:

  New snapshot ready.
 
  http://download.wikipedia.org/enwiki/20100116
 

 And the history dump, which had run for a month and a half and looked like
 it was going to actually complete for the first time in years, is now
 broken.  Thanks a lot.

How are the old revisions backed up, by the way? Just replication to a
remote datacenter?

Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Failed to Download any pages-articles.xml.bz2 file

2009-12-15 Thread Marco Schuster
This is likely your company proxy not supporting huge downloads. Maybe its
cache partition filled up.

Marco

2009/12/15 Rob Giberson ajax...@gmail.com

 I failed several times and ended up downloading exactly the same number of
 bytes:  1,465,454KB (1.46GB), even for different versions of that file.

 Is it possible that the files themselves are corrupted? Anyone successfully
 downloaded one of these big files recently?

 Rob

 On Mon, Dec 14, 2009 at 8:11 PM, Rob Giberson ajax...@gmail.com wrote:

  You got that right. I am behind a company proxy. Any idea how to get
  around this?
 
 
  On Mon, Dec 14, 2009 at 8:09 PM, K. Peachey p858sn...@yahoo.com.au
 wrote:
 
   On Tue, Dec 15, 2009 at 12:25 PM, Rob Giberson ajax...@gmail.com
  wrote:
   Guess this problem might be asked several (many) times before...but
  
   I tried to download the pages-articles.xml.bz2 file which is
  approximately
   5.xGB. However, all versions I tried failed at about 1.5GB. I noticed
   someone posted a Python code online as a workaround, but it did not
 work
  for
   me.
  
   My machine is Windows Server 2008 64 bit. Any idea how to get this
 huge
   file?
  
   Appreciated.
   Rob
  You wouldn't happen to have proxies between you and your internet
  connection would you?
 
  -Peachey
 
  ___
  Wikitech-l mailing list
  Wikitech-l@lists.wikimedia.org
  https://lists.wikimedia.org/mailman/listinfo/wikitech-l
 
 
 
 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l




-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [WikiEN-l] Extracting main titles from enwiki-latest-all-titles-in-ns0.gz

2009-12-12 Thread Marco Schuster
Hi,

On Sat, Dec 12, 2009 at 4:35 PM, David Gerard dger...@gmail.com wrote:


 2009/12/11 Behrang Saeedzadeh behran...@gmail.com:
  Hi,
 
  I have downloaded enwiki-latest-all-titles-in-ns0.gz and I want to
 extract
  main titles and store them in another file. For example, some titles have
  meta information (e.g. disambiguation etc.) and I want these to be
 removed.
  Can I remove all the text between parentheses from the titles to achieve
  this?
 

You have to parse it by hand.
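
A rough sketch of what that could look like, assuming the gzipped list has
one title per line. Note that parentheses can be a legitimate part of a
title, so blindly stripping a trailing (...) qualifier will also mangle
some real names - which is exactly why you have to inspect the data
yourself:

<?php
$in  = gzopen('enwiki-latest-all-titles-in-ns0.gz', 'r');
$out = fopen('main-titles.txt', 'w');
while (($line = gzgets($in)) !== false) {
    $title = rtrim($line, "\r\n");
    // Strip a trailing qualifier like _(disambiguation) or _(band).
    $main = preg_replace('/_\([^)]*\)$/', '', $title);
    fwrite($out, $main . "\n");
}
gzclose($in);
fclose($out);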


  Also some titles start with the ! character. and some are enclosed
 between
  two or three of them such as !Adiso_Amigos!. What is the purpose of !
 in
  such cases?

It's part of the topic's name (in the case of
http://en.wikipedia.org/wiki/%C2%A1Adios_Amigos!, the band's name). The
inverted exclamation mark is part of the Spanish language.

  Also, why are some titles enclosed between two double quotes, such
  as "400_Years_of_Telescope"?

Same case: the quotation marks are part of the topic's name (e.g.
http://en.wikipedia.org/wiki/%22Weird_Al%22_Yankovic).

Marco

PS: Next time, please copy-paste correctly so people have a chance to see
what you mean. Both your supplied examples had to be corrected; the second
one was missing a "the":
http://en.wikipedia.org/wiki/400_Years_of_the_Telescope


-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Wikimedia IRC Group Contacts looking for a developer

2009-12-04 Thread Marco Schuster
Didn't seanw write some system for exactly that, or has it been broken for ages?

Marco

On Fri, Dec 4, 2009 at 9:25 PM, Rjd0060 rjd0060.w...@gmail.com wrote:

 Hi all -

 The IRC Group Contacts are in search of a developer to create and
 maintain a new cloak request system.  Some basic details are posted at
 http://meta.wikimedia.org/wiki/IRC/Cloaks/System .

 If you are interested (or know anybody who might be) or have any
 questions, please get in touch with us.  You can email
 irc-contacts-ow...@lists.wikimedia.org or poke one of us individually
 ( http://meta.wikimedia.org/wiki/IRC/Group_Contacts#Names ).

 Thanks in advance.

 --
 Ryan
 User:Rjd0060

 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l




-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Unicode equivalence

2009-12-01 Thread Marco Schuster
2009/12/1 Praveen Prakash me.prav...@gmail.com

 The popular transliteration tool for Malayalam typing (*Varamozhi*) and the
 popular font (*Anjali OldLipi*) currently support Unicode 5.1 on Windows.
 Recently (two or three days ago) Microsoft announced their own tool for
 Malayalam typing, which also supports 5.1. Microsoft's default Karthika
 font for Malayalam now supports 5.1 as well. But IE6 does not support
 Unicode 5.1 even with supporting fonts.

Is dynamic reverse conversion on the client side using JavaScript possible?
This way we could output Unicode 5.1 to everything supporting it, and older /
crappy browsers / OSes could still display it correctly.
Sure, it adds a JS dependency, but I do think we can require JS for that.

Marco

-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [MediaWiki] Enhancement: LaTeX images quality (eliminate white background)

2009-11-29 Thread Marco Schuster
On Sun, Nov 29, 2009 at 5:46 PM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:

 On Sun, Nov 29, 2009 at 11:45 AM, Aryeh Gregor
 simetrical+wikil...@gmail.com wrote:

But not one to make the images transparent by default,
 unless it somehow plays nicely with IE6.

Who cares about that browser?? It's been history for years! I really doubt
that ANYONE still using that bitrotten thing of a browser cares about
alpha-transparency or Wikipedia in general, and I don't think more time,
money or energy should be wasted on making anything look ok on it - as long
as a IE6 user is still able to read the text, it's ok IMO.

Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [MediaWiki] Enhancement: LaTeX images quality (eliminate white background)

2009-11-29 Thread Marco Schuster
On Sun, Nov 29, 2009 at 6:28 PM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:

 On Sun, Nov 29, 2009 at 12:19 PM, Marco Schuster
 ma...@harddisk.is-a-geek.org wrote:
  Who cares about that browser??

 ~15% of our users use it.  If our goal is to make a broadly usable
 website, we care about it.

So what? They'll see blocky images, but can make out what the content is.


  as long as a IE6 user is still able to read the text, it's ok IMO.

 We need to weigh the interests of our users without regard to
 politics.

Sometimes this is necessary though. Many people today still don't know that
IE6 is dangerous. Wikipedia should warn those users and tell them how to
upgrade.

Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] [MediaWiki] Enhancement: LaTeX images quality (eliminate white background)

2009-11-17 Thread Marco Schuster
On Tue, Nov 17, 2009 at 4:11 PM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:

 On Tue, Nov 17, 2009 at 3:53 AM, Alexander Shulgin
 alex.shul...@gmail.com wrote:
  Today, while reading my morning load of news I've come across a
  Wikipedia article[1] with some embedded LaTeX formulae, used within a
  table.  The table header on that page has background of a distinct
  color which makes formula images look ugly.

 See bug:

 https://bugzilla.wikimedia.org/show_bug.cgi?id=8

 As far as I know, the only thing actually blocking us from doing this
 was something like IE5 on Mac printing transparent images with black
 backgrounds.  That's probably not relevant anymore.  We're still stuck
 with the fact that IE6 doesn't support alpha channels, though -- we
 could make the fully-transparent parts of the background transparent,
 but I don't see how we could avoid aliasing effects on sane browsers
 without making things look extremely ugly on IE6.

Aren't there various workarounds using JS and filters for IE6?
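
For reference, the workaround I mean is IE's proprietary AlphaImageLoader
filter, served to IE6 only via a conditional comment (selector and image
path below are just placeholders):

img.tex {
    /* IE6 cannot alpha-blend PNGs natively; this proprietary filter
       can. The element's real src then has to point at a blank GIF. */
    filter: progid:DXImageTransform.Microsoft.AlphaImageLoader(
        src='formula.png', sizingMethod='image');
}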

Marco


-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] __TOC__ handling

2009-09-06 Thread Marco Schuster
On Sun, Sep 6, 2009 at 12:05 PM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:

 This is correct.  Although it's includes/parser/Parser.php (not all of
 us use Windows or Mac! :P).

AFAIR Mac OS X's Partition Manager allows you to set up an HFS partition as
case-sensitive :p
Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Wikipedia iPhone app official page?

2009-09-04 Thread Marco Schuster
On Fri, Sep 4, 2009 at 9:21 PM, Chad innocentkil...@gmail.com wrote:

 Wheee! TortoiseSVN indeed spoils us Windows users, as it's made
 version control so easy that...well...a Windows user can do it ;-)

If Windows had a decent command line / shell (has its suckiness improved in
Win7?), I bet TortoiseSVN would have far fewer downloads... it simply is the
only way to make SVN usable on Windows.

Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Version control systems (was: Wikipedia iPhone app official page?)

2009-09-04 Thread Marco Schuster
On Sat, Sep 5, 2009 at 1:34 AM, David Gerard dger...@gmail.com wrote:

 [subject changed]

 2009/9/5 Marco Schuster ma...@harddisk.is-a-geek.org:
  On Fri, Sep 4, 2009 at 9:21 PM, Chad innocentkil...@gmail.com wrote:

  Wheee! TortoiseSVN indeed spoils us Windows users, as it's made
  version control so easy that...well...a Windows user can do it ;-)

  If Windows had a decent command line / shell (has its suckyness improved
 for
  Win7?), I bet that TortoiseSVN had far less downloads... it simply is the
  only way to make SVN usable on Windows.


 That or Cygwin. (git works well in Cygwin too. At my last workplace we
 made damn sure to put Cygwin on our few Windows servers with sshd
 running.) Cygwin made even command-line CVS usable.

Yup, Cygwin is really cool... but you still need a proper GUI frontend.
Windows's cmd simply sucks compared to e.g. xterm or konsole, both of which,
of course, can be run in Cygwin. But getting X output via Cygwin is a damn
nightmare of a task.
Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Wiktionary API acceptable use policy

2009-09-03 Thread Marco Schuster
On Thu, Sep 3, 2009 at 10:26 PM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:

 I don't know how formal or authoritative that is.  You might want to
 ask someone like Brion.  I think the answer in practice is that
 nobody's going to waste time blocking you if you don't cause
 noticeable load, but I don't know if there's an official statement
 anywhere.  I vaguely recall that some sites might pay Wikimedia a fee
 to do commercial live mirroring, but I'm not sure on that.

AFAIK one of these is spiegel.de, which gets some kind of live feed; they
arranged it with WM DE.
Marco

-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Wikipedia iPhone app official page?

2009-08-29 Thread Marco Schuster
On Sat, Aug 29, 2009 at 11:52 PM, Gregory Maxwell gmaxw...@gmail.com wrote:

 I laughed at this... GIT has a number of negatives, but poor speed is
 not one of them especially if you're used to working with SVN and a
 remote server.  Maybe this is just a windows issue? GIT leaves a lot
 of work to the filesystem.

And so to the disk. If the disk or the controller sucks or is simply old
(not everyone has shiny new hardware), you're also damn slow. What should
also not be underestimated is the disk-space demand of a GIT repo - not
everyone has the money to buy new, large disks (or can upgrade at all,
especially laptop users). Of course, the disk-space issue can be partially
solved by splitting what is now one big SVN repo into multiple smaller ones:
a standard GIT pull should e.g. not include the custom WMF stuff or the
sources of the helper tools/scripts and whatever is hidden in the deep world
of svnroot.
Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Wiki not responding to patches

2009-08-13 Thread Marco Schuster
On Thu, Aug 13, 2009 at 9:10 PM, Taja Anand taja.w...@gmail.com wrote:

 3) removed all the content of index.php !! [it still runs]

This sounds bad, and shouldn't be possible.
Are you running on Windows or Linux? Which HTTP server (Apache, lighttpd)?
Any PHP cache installed / active? Any Squid proxy accidentally configured?
Cleared the browser cache? (Firefox had a similar problem for me ages ago.)

Is localhost pointing to 127.0.0.1? (Verify this in the hosts file of your
OS)
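For reference, the relevant line should look like this:

127.0.0.1   localhost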
Oh, and did you verify you use the correct path in your browser?

Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] w...@home Extension

2009-08-02 Thread Marco Schuster
On Sun, Aug 2, 2009 at 2:32 AM, Platonides platoni...@gmail.com wrote:

  I'd actually be interested how YouTube and the other video hosters
 protect
  themselves against hacker threats - did they code totally new
 de/en-coders?

 That would be even more risky than using existing, tested (de|en)coders.

Really? If they simply don't publish the source (and the binaries), then the
only possible way in for an attacker is fuzzing... and that can take a long
time.

Marco

--
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] w...@home Extension

2009-08-01 Thread Marco Schuster
On Sat, Aug 1, 2009 at 9:35 PM, Brian brian.min...@colorado.edu wrote:

  Never trust the client. Ever, ever, ever. If you have a working model
  that relies on a trusted client you're fucked already.
 
  Basically, if you want to distribute binaries to reduce hackability
  ... it won't work and you might as well be distributing source.
  Security by obscurity just isn't.
 
 
  - d.
 

 Ok, nice rant. But nobody cares if you scramble their scientific data
 before
 sending it back to the server. They will notice the statistical blip and
 ban
 you.

What about video files exploiting some new 0-day vulnerability in a video
input format? The Wikimedia transcoding servers *must* be totally separated
from the other WM servers to prevent 0wnage or a site-wide hack.

As for users who run encoding chunks - they have to get a full installation
of decoders and related tools, which also has to be kept up to date (and if
the clients run in different countries, there are patents and other legal
issues to take care of!); also, the clients must be protected from getting
infected chunks so they do not get 0wned by content Wikimedia gave to them
(imagine the press headlines)...

I'd actually be interested how YouTube and the other video hosters protect
themselves against hacker threats - did they code totally new de/en-coders?

Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Watchlistr.com, an outside site that asks for Wikimedia passwords

2009-07-23 Thread Marco Schuster
On Thu, Jul 23, 2009 at 8:50 PM, Happy-melon happy-me...@live.com wrote:



 Aryeh Gregor simetrical+wikil...@gmail.com wrote in message
 news:7c2a12e20907231051s638dd2f9v399ac2a79e185...@mail.gmail.com...
  On Thu, Jul 23, 2009 at 1:37 PM, Tim Starlingtstarl...@wikimedia.org
  wrote:
  To help in the proving trustworthy, or else process, I have released
  the source code of Watchlistr - please take a look at it. You will see
  that I take the utmost care in securing user information. The wiki
  logins are encrypted with AES in our database. The key used to encrypt
  each user's login list is their site username, which is stored as a
  SHA1 hash in our database. If a cracker were to, somehow, gain access
  to the database, they would be left with a pile of garbage.
 
  They would only have to get the site usernames to decrypt the login
  info.  They could get those the next time each user logs in, if
  they're not detected immediately.  There's no way around this; if your
  program can log in as the users, so can an attacker who's able to
  subvert your program.

 Or, since the set of registered Wikimedia users is both vastly smaller than
 the superset of all possible usernames (remember it's restricted to users
 with a global login AFAICT), and readily accessible through a
 high-throughput API, a brute-force attack would be, if not trivial,
 certainly extremely feasible.
 
  As for the other solutions that were presented - I was really trying
  to create a cross-platform, cross-browser solution that would not
  hinge on one particular technology. Javascript would be great, but
  what if someone doesn't have JS enabled? OAuth and a read-only API
  would be close-to-ideal, but they currently don't work with/don't
  exist on the Wikimedia servers. I am, however, open to other workable
  solutions that are presented - let me know.
 
  I would suggest you apply for a toolserver account:
 
  https://wiki.toolserver.org/view/Account_approval_process
 
  Once you have a toolserver account, I'd be willing to work with you to
  arrange for some form of direct access to all wikis' watchlist tables
  (I'm a toolserver root).  You then wouldn't need to possess any login
  info.

 This looks like a *much* more acceptable system.  Although how would you
 authenticate without collecting proscribed data...?


Let the user prove account ownership by a talk page edit. This is the way
Interiot did it in his old edit counter... (is that one still active?)
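
A sketch of how such a check could work (the flow is made up, but it only
uses the normal raw page view):

<?php
// Step 1: generate a one-time token and show it to the user, asking
// them to save it on their Wikipedia talk page.
$username = 'Example';               // the claimed wiki username
$token    = md5(uniqid('', true));

// Step 2 (after the user says they have edited): fetch the talk page
// raw and check that the token really appears in it.
$url = 'http://en.wikipedia.org/w/index.php?title='
     . urlencode("User_talk:$username") . '&action=raw';
$verified = strpos(file_get_contents($url), $token) !== false;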

Marco


-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] For the Germans: PHP Release Party in Munich July 17th

2009-07-10 Thread Marco Schuster
Forwarding this to vereinde-l and wikide-l.

Marco

On Thu, Jul 9, 2009 at 3:52 PM, Brion Vibber brion.vib...@gmail.com wrote:

 Thought this might be of interest to some of our folks in and around
 Germany:

 http://phpugmunich.org/dokuwiki/php_release_party

 Wouldn't hurt to have a MediaWikian or two there to represent. :) It's
 at a biergarten so you know I'd be there if I were local! ;)

 -- brion vibber (brion @ wikimedia.org)

 ___
 Wikitech-l mailing list
 Wikitech-l@lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikitech-l

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Proposal: switch to HTML 5

2009-07-08 Thread Marco Schuster
On Wed, Jul 8, 2009 at 3:46 AM, Gregory Maxwell gmaxw...@gmail.com wrote:

 There is only a short period of time remaining where a singular
 browser recommendation can be done fairly and neutrally. Chrome and
 Opera will ship production versions and then there will be options.
 Choices are bad for usability.


We should not recommend Chrome - as good as it is, it has serious privacy
problems.
Opera is not open source, so I think we'd best stay with Firefox, even if
Chrome/Opera begin to support the video tag.

Marco

-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] secure slower and slower

2009-07-08 Thread Marco Schuster
On Wed, Jul 8, 2009 at 10:04 PM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:

 On Wed, Jul 8, 2009 at 10:45 AM, Gregory Maxwellgmaxw...@gmail.com
 wrote:
  Provided your changes didn't break the site, I'd take a
  bet that you could have a malware installer running for days before it
  was discovered.

 What, on enwiki?  I'd bet ten minutes before it's noticed someone
 using NoScript configured to prompt about cross-site loads or
 something.


And if you're not a real technical expert who sees "aha, the site the JS
comes from is surely NOT Wikipedia", it doesn't help anyone. People click
away these warnings very often and don't bother to actually read them...
this is something most people forget when thinking about security.

Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] secure slower and slower

2009-07-06 Thread Marco Schuster
On Tue, Jul 7, 2009 at 4:03 AM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:

 But really -- have there been *any* confirmed incidents of MITMing an
 Internet connection in, say, the past decade?  Real malicious attacks
 in the wild, not proof-of-concepts or white-hat experimentation?  I'd
 imagine so, but for all people emphasize SSL, I can't think of any
 specific case I've heard of, ever.  It's not something normal people
 need to worry much about, least of all for Wikipedia.


Public congresses, schools without protection against ARP spoofing (I got
0wned this way myself), maybe corporate networks w/o a proper network
setup... they all allow sniffing or in-line traffic manipulation.
These attacks are not that uncommon, and when you know the colleague you
don't like is a WP admin, you simply have to wait for him to visit WP while
logged in, and you have either his password or his cookies.

Marco

-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] On templates and programming languages

2009-07-02 Thread Marco Schuster
On Fri, Jul 3, 2009 at 4:22 AM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:

 On Thu, Jul 2, 2009 at 10:18 PM, Steve Bennettstevag...@gmail.com wrote:
  So:
  1) The chosen language will support iteration over finite sets
  2) Could it support general iteration, recursion etc?
  3) If so, are there any good mechanisms for limiting the
  destrutiveness of an infinite loop?

 You don't really need an infinite loop.  DoS would work fine if you
 can have any loop.  Even with just foreach:

 foreach(array(1,2)as $x1)foreach(array(1,2)as $x2)

 A few dozen of those in a row will give you a nice short bit of code
 that may as well run forever.

You could add some kind of counter which gets incremented on every
foreach/while/for iteration. If it reaches 200 (or whatever), execution is
stopped; see the sketch below.
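
A minimal sketch of such a guard (the class name is made up; the real hook
would sit inside the template-language interpreter). Since the budget is
shared across nesting levels, the nested-foreach trick quoted above burns
through it just as fast as one long loop:

<?php
class LoopBudget {
    private $remaining;
    function __construct($limit = 200) {
        $this->remaining = $limit;
    }
    function tick() {
        if (--$this->remaining < 0) {
            throw new Exception('iteration limit exceeded');
        }
    }
}

$budget = new LoopBudget(200);
foreach (range(1, 1000) as $i) {
    $budget->tick(); // aborts execution after 200 iterations
    // ... interpreted template code ...
}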

Marco


-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Marco Schuster
On Tue, Jun 30, 2009 at 10:25 PM, Brion Vibber br...@wikimedia.org wrote:

 Aryeh Gregor wrote:
  Our
  current tarballs are 10 MB; we could easily just chuck in Lua binaries
  for Linux x86-32 and Windows without even noticing the size increase,
  and allow users to enable it with one line in LocalSettings.php.

 Hmm... it might be interesting to experiment with something like this,
 if it can _really_ be compiled standalone. (Linux binary distribution is
 a hellhole of incompatible linked library versions!)

Statically compiling it? How would this affect the binary size? (And does
static linking work across different libc versions?)

BTW, what about Mac OS / FreeBSD hosts?

Marco

-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] On templates and programming languages

2009-06-30 Thread Marco Schuster
On Tue, Jun 30, 2009 at 10:45 PM, Aryeh Gregor 
simetrical+wikil...@gmail.com wrote:

 Alternatively, is the libc ABI stable enough that we could dynamically
 link libc, and statically link everything else?  The other libraries
 required are very small.

I wouldn't count on it... at least we should provide a dynamically linked
version for those who want less storage/memory/whatever consumption.

How do statically compiled programs for x86 platforms behave on x64, btw?
And what about more exotic platforms like ARM (which can also be
multi-endian, IXP4xx is an example) / SPARC (Toolserver!!!) or PowerPC? Are
they actually supported by Lua?



  BTW, what about Mac OS / FreeBSD hosts?

 Are there any shared webhosts you know of that run Mac or BSD?  At
 worst, they can fall into the same group as the no-exec() camp, able
 to use Wikipedia content but not 100%.


The web host serving our school's homepage does, for example... They host
all schools in Munich, and I think they're a bit security-paranoid. We don't
have any issues hosting a MediaWiki there, actually. (OK, we never imported
WP content.)


Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] wikimedia, wikipedia and ipv6

2009-06-15 Thread Marco Schuster
On Mon, Jun 15, 2009 at 11:46 AM, Peter Gervai grin...@gmail.com wrote:

 On Fri, Jun 12, 2009 at 23:55, Platonidesplatoni...@gmail.com wrote:

  List archives are not searchable by google.

 Is it on purpose? Why?

Yep, so that accidentally published private data doesn't get indexed by
Google. Same for deliberately published data or personal insults using
users' real names; wikide-l has had this a few times. Dunno about such
experiences on other lists.

Marco

-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] more bugzilla components

2009-05-26 Thread Marco Schuster
On Tue, May 26, 2009 at 7:55 PM, Michael Dale md...@wikimedia.org wrote:

 *I also want to report some strangeness with Bugzilla. I sometimes get
 the error below when trying to log in (without "restrict to IP" checked)
 and I occasionally get time-outs when submitting bugs:

 Undef to trick_taint at Bugzilla/Util.pm line 67
Bugzilla::Util::trick_taint('undef') called at
 Bugzilla/Auth/Persist/Cookie.pm line 61


 Bugzilla::Auth::Persist::Cookie::persist_login('Bugzilla::Auth::Persist::Cookie=ARRAY(0xXX)',
 'Bugzilla::User=HASH(0xXX)') called at Bugzilla/Auth.pm line 147

Do you have IPv6 enabled? If yes, switch it off. Should help you.

Marco

-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Wiki editor wysiwyg

2009-05-11 Thread Marco Schuster
On Mon, May 11, 2009 at 5:05 PM, Daniel Kinzler dan...@brightbyte.de wrote:

 Basically, this is the reason why there is no really good WYSIWYG editor
 for
 MediaWiki. I don't want to discurage you, I just want to point out where
 the
 problems are. Some examples: the closing }} of a template may actually be
 contained in the definition of another template, same with thables. I have
 seen
 |} being replaced by something like {{TableEnd}}. Lots of fun there.

Also, there are people who use date/time to switch between template
contents... it's really funny what people can do with something like MW
markup - they literally end up where no man (eh, coder) has gone before.
Marco

-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Wiki editor wysiwyg

2009-05-11 Thread Marco Schuster
On Mon, May 11, 2009 at 8:50 PM, Daniel Schwen li...@schwen.de wrote:

 The simple (albeit ugly) solution would to add a parser version field to
 the
 revision table, drag the old parser along as 'legacy', make the new  parser
 the default (and only) option for all new edits, and spit out a warning
 when
 you are editing a legacy revision for the first time. The warning you be
 made
 dependent on the cases that break with the new parser.
 Cases that break could be detected by comparing tidied HTML output from
 both
 parser versions.


Sounds cool, but it'd require a formalization of MW markup first (something
that should have been done long ago).
What about migrating stuff from the old behavior to the new parser via
bots/update scripts, even for old revisions?

Marco


-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] OpenID MediaWiki Extension v.0.8.4.1 - Identity Providers UI

2009-04-19 Thread Marco Schuster
On Sun, Apr 19, 2009 at 5:55 AM, Sergey Chernyshev 
sergey.chernys...@gmail.com wrote:

  And, I can't choose the case spelling of my nick (it's "harddisk" on
  OID); normally it should be "HardDisk", but I think this is an
  OpenID-related problem - anyway, it'd be cool if you could make an
  additional field for the user to input the desired username.
 It's very possible that your provider returns lowercase nickname and
 MediaWiki user is automatically created.


Indeed, this is the error... but (also in order to avoid name collisions)
it'd be nice for people to choose their own username.

Marco

-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] OpenID MediaWiki Extension v.0.8.4.1 - Identity Providers UI

2009-04-18 Thread Marco Schuster
On Sat, Apr 18, 2009 at 9:00 AM, Sergey Chernyshev 
sergey.chernys...@gmail.com wrote:

 Hope you like it, but I'm still open to suggestions about improving the
 interface so you all finally install it on your wikis ;)

There's a double escape on the confirmation page which redirects to the OID
provider (\"continue\")... unfortunately it redirected to myopenid too fast
for me to copy and paste the page.
And, I can't choose the case spelling of my nick (it's "harddisk" on OID);
normally it should be "HardDisk", but I think this is an OpenID-related
problem - anyway, it'd be cool if you could make an additional field for the
user to input the desired username.

Besides that, it's ENORMOUSLY cool.

Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-04-18 Thread Marco Schuster
On Fri, Apr 17, 2009 at 11:55 PM, Jameson Scanlon 
jameson.scan...@googlemail.com wrote:

 Is it possible for anyone to indicate more comprehensive lists of
 torrents/trackers than these?  Are there any plans for all the
 database download files to be available in this way (I imagine that
 there would also be some PDF manual which would go along with these to
 indicate offline viewing, and potentially more info than this).

In theory, one can easily create a torrent with the Wikipedia servers as
webseeds. The question is, how many torrent clients besides Azureus support
these?

Marco


-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Skin JS cleanup and jQuery

2009-04-17 Thread Marco Schuster
On Fri, Apr 17, 2009 at 1:38 PM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:

 On Thu, Apr 16, 2009 at 6:35 PM, Marco Schuster
 ma...@harddisk.is-a-geek.org wrote:
  Are there any plans to use Google Gears for storage on clients? Okay,
 people
  have to enable it by hand, but it shoulda speed up page loads for people
  very much (at least for those who use it).

 What, specifically, would be stored in Google Gears?  Would HTML5's
 localStorage also be suitable?

Isn't GG supposed to be an implementation of localStorage for browsers that
don't support it yet (does any browser support localStorage *now*, btw?)?
What could be stored are JS bits unlikely to change THAT often, e.g. if
Wikipedia is ever going to make a WYSIWYG editor available (Wikia has it!!!)
its JS files could be cached, same for those tiny little flag icons, the
Wikipedia ball, the background of the page... maybe even some parts of the
sitewide CSS.

Actually, it could be expanded to store whole articles (then simply copy
over or enhance
http://code.google.com/intl/de-DE/apis/gears/articles/gearsmonkey.html - I'm
gonna modify it for the German Wikipedia when I've got some time).


Marco


-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Skin JS cleanup and jQuery

2009-04-17 Thread Marco Schuster
On Fri, Apr 17, 2009 at 11:42 PM, Brion Vibber br...@wikimedia.org wrote:

 * Background JavaScript worker threads

 Not super high-priority for our largely client-server site. Can be
 useful if you're doing some heavy work in JS, though, since you can have
 it run in background without freezing the user interface.


You mean... stuff like bots written in JavaScript, using the XML API?
I could also imagine sending mails via Special:Emailuser in the background
to reach multiple recipients - that's a PITA if you want to send mails to
multiple users.


 * Geolocation services

 Also available in a standardized form in upcoming Firefox 3.5. Could be
 useful for geographic-based search ('show me interesting articles on
 places near me') and 'social'-type things like letting people know about
 local meetups (like the experimental 'geonotice' that's been running
 sometimes on the watchlist page).

That sounds kinda interesting, even if the accuracy on non-GPS-enabled
devices isn't that high... can this in any way be joined with the OSM
integration?

Marco

-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Skin JS cleanup and jQuery

2009-04-16 Thread Marco Schuster
On Wed, Apr 15, 2009 at 11:05 PM, Brion Vibber br...@wikimedia.org wrote:

 Just a heads-up --

 Michael Dale is working on some cleanup of how the various JavaScript
 bits are loaded by the skins to centralize some of the currently
 horridly spread-out code and make it easier to integrate in a
 centralized loader so we can serve more JS together in a single
 compressed request.


Are there any plans to use Google Gears for storage on clients? Okay, people
have to enable it by hand, but it should speed up page loads considerably
(at least for those who use it).

Marco


-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Mailing lists problems

2009-03-30 Thread Marco Schuster
On Mon, Mar 30, 2009 at 7:32 PM, Anthony wikim...@inbox.org wrote:
 On Mon, Mar 30, 2009 at 12:57 PM, Brion Vibber br...@wikimedia.org wrote:

 If we could have it only send sorry mails on non-spam mails, that
 probably would be nice. Hopefully some day we can get there. :)


 Sending it only to SPF-verified addresses wouldn't be hard, would it?

 (I must admit I have no idea how widespread SPF use is.)
Google verifies SPF (and publishes SPF records for non-Google-Apps users
too, so their mail can be verified), for example. And it's not that
difficult to set up an SPF record if you run your own mail server.
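
For illustration, a minimal SPF record is just a single DNS TXT entry
(domain and IP below are placeholders):

example.org.  IN  TXT  "v=spf1 mx a ip4:203.0.113.5 -all"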

Marco

-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Dump processes seem to be dead

2009-02-25 Thread Marco Schuster
2009/2/25 John Doe phoenixoverr...@gmail.com:
 I'd recommend either 10m or 10% of the database, whichever is larger, for
 new dumps, to screen out a majority of the deletions. What are your
 thoughts on this process, Brion (and the rest of the tech team)?
Another idea: if $revision is deleted/oversighted/however made invisible,
then find out the block ID in the dump so that only this specific block
needs to be re-created in the next dump run. Or, better: do not recreate
the dump block, but only remove the offending revision(s) from it. Should
save a lot of dump preparation time, IMO.

Marco

-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Dump processes seem to be dead

2009-02-23 Thread Marco Schuster
2009/2/22 Robert Ullmann rlullm...@gmail.com:
 Want everyone to just dynamically crawl the live DB, with whatever
 screwy, lousy inefficiency? Fine, just continue as you are, where that
 is all that can be relied upon!

Even if you had the dumps, you'd have another problem: they're
incredibly big and thus a bit difficult to parse. So, a small suggestion
if the dumps are ever working again: please split the history and
current DB stuff by alphabet.

Marco

PS: Are there any measurements of the traffic generated by people who
download the dumps? Have there been any attempts to distribute them
via BitTorrent?
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Wikipedia giving 403 Forbidden

2009-02-21 Thread Marco Schuster
On Sat, Feb 21, 2009 at 9:00 PM, Leon Weber l...@leonweber.de wrote:
 On 22.02.2009 03:57:15, jida...@jidanni.org wrote:
 OK, can you please stop giving 403 Forbidden for HEAD on both pages
 that do and don't exist. It makes testing difficult.

 % HEAD -PS -H 'User-agent: leon' http://en.wikipedia.org/
 HEAD http://en.wikipedia.org/ -- 301 Moved Permanently

 Where does that make testing too hard?

You first have to find some dude who tells you "oh, 403 probably means a
wrong user agent". IMO the HTML content of the 403 page should state WHY
the request failed.

Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] empty parts in print preview and print

2009-02-16 Thread Marco Schuster
On Mon, Feb 16, 2009 at 3:49 PM, Uwe Baumbach u.baumb...@web.de wrote:
 The empty part (white area) seems to start right after a TOC and then ends
 at unremarkable places within the text; the rest of the article then
 shows/prints normally.
I've seen similar issues on the Star Wars Wikia sometimes... maybe it's the
same bug.

Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Help by Extensions

2009-02-10 Thread Marco Schuster
On Tue, Feb 10, 2009 at 8:50 PM, Jan Luca j...@jans-seite.de wrote:
 Can nobody help me?
The code outputs exactly what it should
(http://toolserver.org/~jan/poll/dev/main.php?page=wiki_output&id=2
returns 2!).

And as it's my code you took without a source notice (from
http://code.harddisk.is-a-geek.org/filedetails.php?repname=hd_bot&path=%2Fhttp_wrapper.inc.php&rev=38&sc=1
- don't think I didn't notice this. Please add a source note to your
code and I'll be fine with it)...
 if(isset($get_server)) {
is totally wrong - isset() is true even for an empty string, so the check
never fails. Use
 if($get_server != "") {

And where do you use main.php? It doesn't get included anywhere.

Marco
-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] war on Cite/{{cite}}

2009-01-31 Thread Marco Schuster

On Sat, Jan 31, 2009 at 2:03 PM, Domas Mituzas  wrote:
 Hello,

 I understand the need for cite, that's why it is still there :) But...
 (...)
What about converting these to <ref> tags?

 Unfortunately, {{cite}} is the only template I can profile/account for
 now; we don't have proper per-template profiling, but I wish to get
 one some day. Then we'd have more "war on ..." topics ;-D
Stub templates, for example :D

 Generally, templates are a major part of our parsing, and that's over 50%
 of our current cluster CPU load.
Wow. Can you compare that load with the load caused by solely using
<ref> tags?

Marco

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


[Wikitech-l] Crawling deWP

2009-01-27 Thread Marco Schuster

Hi all,

I want to crawl around 800.000 flagged revisions from the German
Wikipedia, in order to make a dump containing only flagged revisions.
For this, I obviously need to spider Wikipedia.
What are the limits (rate!) here, what UA should I use and what
caveats do I have to take care of?

Thanks,
Marco

PS: I already have a revision list, created on the Toolserver. I used the
following query: "select fp_stable,fp_page_id from flaggedpages where
fp_reviewed=1;". Is it correct that this gives me a list of all articles
with flagged revs, with fp_stable being the revid of the most current
flagged rev for each article?
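
A minimal crawl-loop sketch (untested; file names, bot name and contact
address are placeholders), rate-limited to roughly two requests per second
and with a descriptive User-Agent so the ops team can identify the bot:

<?php
$ctx = stream_context_create(array('http' => array(
    'user_agent' => 'FlaggedRevsDumpBot/0.1 (operator@example.org)',
)));
// revids.txt: one fp_stable revision id per line; ./revs/ must exist.
foreach (file('revids.txt', FILE_IGNORE_NEW_LINES) as $revid) {
    $url = 'http://de.wikipedia.org/w/index.php?oldid='
         . urlencode($revid) . '&action=raw';
    file_put_contents("revs/$revid.wiki",
        file_get_contents($url, false, $ctx));
    usleep(500000); // ~2 requests per second
}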

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Crawling deWP

2009-01-27 Thread Marco Schuster

On Wed, Jan 28, 2009 at 12:49 AM, Rolf Lampa  wrote:
 Marco Schuster wrote:
 I want to crawl around 800.000 flagged revisions from the German
 Wikipedia, in order to make a dump containing only flagged revisions.
 [...]
 flaggedpages where fp_reviewed=1;. Is it correct this one gives me a
 list of all articles with flagged revs,


 Don't the XML dumps contain the flag for flagged revs?

The XML dumps are of no use to me - way too much overhead (besides, they
are old, and I want to use single files; they're easier to process than
one huge XML file). And they don't contain flagged-revision flags :(

Marco

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] [Toolserver-l] Crawling deWP

2009-01-27 Thread Marco Schuster

On Wed, Jan 28, 2009 at 1:13 AM, Daniel Kinzler  wrote:
 Marco Schuster schrieb:
 Fetch them from the toolserver (there's a tool by duesentrieb for that).
 It will catch almost all of them from the toolserver cluster, and make a
 request to wikipedia only if needed.
  I highly doubt this is a legal use of the toolserver, and I pretty
  much guess that fetching 800k revisions would be a huge resource load.

 Thanks, Marco

 PS: CC-ing toolserver list.

 It's a legal use; the only problem is that the tool I wrote for it is quite
 slow. You shouldn't hit it at full speed. So it might actually be better to
 query the main server cluster, they can distribute the load more nicely.
What is the best speed, actually? 2 requests per second? Or can I go up to 4?

 One day i'll rewrite WikiProxy and everything will be better :)
:)

 But by then, I do hope we have revision flags in the dumps, because that
 would be The Right Thing to use.
Still, using the dumps would require me to get the full history dump
because I only want flagged revisions and not current revisions
without the flag.

Marco

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] 403 with content to Python?

2009-01-25 Thread Marco Schuster

On Sun, Jan 25, 2009 at 2:50 PM, Platonides  wrote:
 Marco Schuster wrote:
 I used "HDBot API x.y (PHP $phpversion)" as UA. No idea what triggered
 the filters.

 Perhaps the mention of PHP, although I'm not being blocked when using
 that UA, so I can't test.

Yeah, I'm also not blocked anymore... nice to hear that. But again, it'd
be nice if the error message showed what part of the UA triggered the
filter and why that part is blocked.
Brion, do you have a list of blocked UA (parts)?

Marco

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] 403 with content to Python?

2009-01-24 Thread Marco Schuster
On Fri, Jan 23, 2009 at 7:03 PM, Brion Vibber br...@wikimedia.org wrote:
 On 1/23/09 2:36 AM, Andre Engels wrote:
 Two questions:
 1. Why is this User Agent getting this response? If I remember
 correctly, this was installed in the early days of the pywikipediabot,
 when Brion wanted to block it because it had a programming error
 causing it to fetch each page twice (sometimes even more?). If that is
 the actual reason, I see no reason why it should still be active years
 afterward...

 This has nothing to do with pywikipediabot.

 We too frequently encountered poorly-written bots and site-scrapers
 which slammed the servers too hard and caused problems. Blocking default
 UAs of common libraries cut these incidents down dramatically, and helps
 encourage thoughtful bot writers to put specific information into their
 user-agent string, making it possible to track them down more easily if
 they are problematic.

Is there any list of those UAs or UA parts available?
I had this problem some time ago with my bot, which used a custom UA
string and got access denied, so I changed its UA to Firefox's, as I had
no patience to track down WHICH part of the UA triggered the filter.

Marco

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] 403 with content to Python?

2009-01-24 Thread Marco Schuster

On Sun, Jan 25, 2009 at 1:11 AM, Aryeh Gregor  wrote:
 On Sat, Jan 24, 2009 at 4:05 AM, Marco Schuster
  wrote:
 Is there any list of those UAs or UA parts available?
 I had this problem some time ago with my bot which used a custom UA
 string and got access denied, so I changed its UA to Firefox as I had
 no nerves to track down WHICH part of the UA triggered the filter.

  Just change it to something like "YourBotName, run by Marco Schuster".
  That will certainly avoid any filters, and provide the desired info.
I used "HDBot API x.y (PHP $phpversion)" as UA. No idea what triggered
the filters.

 I don't know why the error page doesn't give this info already.  The
 current message only confuses people and -- if they can figure out
 it's UA-based -- tempts them to mimic browser UA strings.
Anyone skilled enough to write a bot is skilled enough to find that out, IMO.
Anyway, the error message should also state which part of the UA is
forbidden.

Marco

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Drafts extension in testing

2009-01-21 Thread Marco Schuster
On Tue, Jan 20, 2009 at 11:35 PM, Aryeh Gregor
simetrical+wikil...@gmail.com wrote:
 On Tue, Jan 20, 2009 at 4:40 PM, Platonides platoni...@gmail.com wrote:
 IMHO we still need some kind of saving into firefox
 storage, for cases like a read-only db. Instead of 'You can't save, the
 site is read-only'-'Save-draft'-'No, you can't, the db is read-only',
 'You can't save, the site is read-only'-'Save-draft'-'The site is
 read-only, the draft has been saved into your browser'.

 This can be done in cutting-edge browsers using HTML5's localStorage
 and sessionStorage.

What about Google Gears? Yeah, it's Google, but GG supports a variety
of browsers and we wouldn't have to wait for M$ to support it properly
in IE 20.

Marco

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Non-latin characters broken in donation comments

2008-12-01 Thread Marco Schuster
On Mon, Dec 1, 2008 at 7:02 PM, Brion Vibber [EMAIL PROTECTED] wrote:
 Tei wrote:
 Is it me, or does the form maybe need charset=*UTF-8* added to it?

 Considering that the part that's broken isn't even *on* our form, I'm
 pretty sure it's not something on our form. :) The name gets put in at
 PayPal's forms, and is passed on to us with the payment completion data.

Can you reverse PayPal's buggy encoding (e.g. with iconv)?
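
For the common double-encoding case (UTF-8 bytes that were re-encoded as
if they were Latin-1), the repair would be a one-liner - assuming, of
course, that this is actually what happens on PayPal's side:

$fixed = iconv('UTF-8', 'ISO-8859-1', $broken); // undo one encoding layer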

Marco

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] [Commons-l] Support for Chemical Markup Language

2008-11-29 Thread Marco Schuster
On Sun, Nov 30, 2008 at 1:11 AM, Brian Salter-Duke
[EMAIL PROTECTED] wrote:
 On Sun, 30 Nov 2008 00:50:08 +0100, Platonides [EMAIL PROTECTED] wrote:
 See https://bugzilla.wikimedia.org/show_bug.cgi?id=16491
 That users can embed JavaScript makes it unacceptable to run on Wikipedia.
 Other parameters, like urlContents or signed, wouldn't be used, but at
 least they can be disabled.

 I am afraid this is all beyond my expertise. Are you saying that there
 is no way Jmol can ever be used on WMF projects?

There is - as soon as the JavaScript embedding possibility gets disabled
and the extension gets a proper review (TM).

Marco

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Upload filesize limit bumped

2008-11-22 Thread Marco Schuster
On Sat, Nov 22, 2008 at 1:43 PM, Daniel Kinzler [EMAIL PROTECTED] wrote:

 Anyway, HTTP doesn't support feedback during upload (or any feedback,
 really), and HTML does not offer a way to do multi-file uploads (which
 would also be quite handy). Any solutions I have seen for that so far are
 based either on a Java applet or on Flash.

RS.com's upload indicator apparently works via an iframe:

<form name="ul" method="post"
action="http://rs426l3.rapidshare.com/cgi-bin/upload.cgi?rsuploadid=149063132559697458"
enctype="multipart/form-data" onsubmit="return zeigeProcess();">
<div id="progbar" style="display:none;">

<iframe
src="http://rs426l3.rapidshare.com/progress.html?uploadid=149063132559697458"
name="pframe" width="100%" height="120" frameborder="0" marginwidth="0"
marginheight="0" scrolling="no"></iframe>
</div>
<div id="dateiwahl">
<input type="file" size="65" id="dateiname" name="filecontent"
onchange="zeigeUploadBtn();" />
<input type="image" id="btnupload" name="u"
src="/img2/upload_file.jpg" style="visibility:hidden;" />
</div>
</form>


The use of a unique upload ID ensures on their end that the progress-bar
iframe always gets the right data. It refreshes using AJAX:
http://pastebin.com/f56b8c8f4
I'll take a look at whether this might be applicable to MW.

Marco
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Language committee and language setup

2008-11-15 Thread Marco Schuster
On Sun, Nov 16, 2008 at 1:58 AM, Brion Vibber [EMAIL PROTECTED] wrote:
 Gerard Meijssen wrote:
 Hoi,
 While you are at it, please have a look at bug 15013... As of today it has
 been waiting for 121 days - 121 days after being registered in Bugzilla. If
 there are any issues, please let them be known.

 https://bugzilla.wikimedia.org/show_bug.cgi?id=15013

 1) The bug was improperly labeled and could not be found when searching
 specifically for the request. This likely didn't help it to receive any
 attention! :)
A bit off-topic, but are there actually any howtos on how to correctly
label bugs (e.g. for config changes, wiki creation etc.) so that they
do not get overlooked?

Marco

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l