Re: [Wikitech-l] Wikipedia lacks a 'share' button
On Sat, Oct 29, 2011 at 4:22 PM, Daniel Friesen li...@nadir-seen-fire.com wrote:

> - It doesn't scale very well. If you do try to add more vendors and users do enable most of them, you still end up loading from each enabled vendor, slowing things down.

With the exception of the FB Like/Recommend button, everything (even the FB share link) is just an image paired with an HTML link. Maybe other sites allow embedding their logos, so the only image which needs to be loaded externally is the FB one.

> - Frankly the UI is pretty bad.

That's the price you have to pay for total privacy, unfortunately.

> - Once you enable a vendor we drop right back to a 3rd-party script being injected into the page such that it can do malicious things.

Btw, if you're a third party with a script in a page, you can go pretty far abusing XHR and history.pushState to make it look to a user like they're browsing the website normally, when in reality they're still on the same page with the script still running. Oh, and that includes making it look like you're safely visiting the login page when in reality you didn't change pages and the script is still running, ready to catch passwords.

Do you have any links with further info on this?

Marco

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
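The XHR/pushState trick described above can be sketched roughly as follows. This is a hypothetical illustration, not code from any real attack: the `win` parameter stands in for the browser `window` object, and `fetchHtml` stands in for an XHR that retrieves the target page's markup.

```javascript
// Hypothetical sketch of "fake navigation": rewrite the address bar
// and swap in new content without ever leaving the compromised page.
function fakeNavigate(win, href, fetchHtml) {
  // Make the address bar show the target URL without a real page load.
  win.history.pushState({}, "", href);
  // Replace the visible content so the page *looks* navigated, while
  // this script keeps running and can read anything typed into forms,
  // e.g. a password entered on a fake login page.
  win.document.getElementById("content").innerHTML = fetchHtml(href);
}
```

An attacker would attach something like this to every link click on the page; to the user, browsing (including visiting the login page) appears completely normal.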
Re: [Wikitech-l] Wikipedia lacks a 'share' button
Hi,

On Sun, Oct 23, 2011 at 7:03 PM, Roan Kattouw roan.katt...@gmail.com wrote:

> This is the reason why we absolutely cannot have the Facebook Like button: Facebook makes you use an FB-hosted button image (and JS too, I think), collects data from every user that views the Like button even if they don't click it (this is the part that violates the privacy policy), and disallows self-hosting.

The German IT news site heise.de solved the privacy and load-time problem: http://www.heise.de/extras/socialshareprivacy/

Unfortunately the page is in German, but the code is easy to understand.

Marco
Re: [Wikitech-l] Proposed Authentication Schema for Wikimedia projects
Nice idea, but most users hate inputting a password or drawing a pattern on the screen to unlock their phone. So if you lose your phone or it gets stolen, all your credentials are lost and in the hands of an unknown attacker. Also, phones tend to break during day-to-day usage (beverage spills, falls from desks). While these problems were mentioned on the design page, I have another scenario: a colleague comes over to your desk, swaps your phone with his so you don't notice immediately, pranks you on Facebook or Wikipedia, and then swaps the phones back.

Marco

On Tue, Oct 18, 2011 at 4:51 AM, packs-24...@mypacks.net wrote:

> I originally posted this idea on G+ and Arthur Richards suggested I cross-post it here. My friend Isaac Potoczny-Jones is a computer security professional. He developed a new authentication scheme that layers on top of existing technologies and leverages a user's smartphone and QR codes to improve authentication usability, eliminate human-generated passwords, and further improve security by separating the authentication channel from the login session. He's calling this capability Animate Login, and as part of the proof of concept he developed a MediaWiki implementation. I believe the Wikimedia Foundation should pursue adding this technique as one of the primary login options for its projects. I would personally love to be able to just point my phone at the login screen and have the system log me in to Wikipedia without having to type anything or remember complex passwords. Wikimedia has worked hard to consolidate logins across the many projects over the last couple of years, and this would be a great way of providing seamless login. It should be very low overhead and relatively easy to implement. Isaac is very interested in seeing his tool put to use on Wikipedia. Wikimedia could lead the way to improved authentication that also vastly improves the user experience!
>
> Isaac explains the project in some detail in this Google Plus post: https://plus.google.com/u/0/112702172838704084335/posts/B9UR2zzDY3f?hl=en His landing page for the project is here: http://animate-innovations.com/content/animate-login The website has videos, links to a MediaWiki instance where it's in use, and more. From the conversations I've had with him, I know that he has thought long and hard about this application and has sought to address/understand all of the potential attack vectors. Compared to human-generated passwords, this would be vastly more secure and would dramatically improve the user experience of logging in. It might even entice new or old editors to log in and give it a try, and thus re-engage them in editing. I'm also certain it could generate a fair bit of buzz as people learn they can use their smartphone to log in to Wikipedia. I hope you'll consider working with Isaac. I'll point him to this thread so he knows it is here. I know he'd love to see this implemented in Wikipedia.
>
> Don
Re: [Wikitech-l] testing mobile browsers?
On Sat, Jul 9, 2011 at 8:51 PM, Håkon Wium Lie howc...@opera.com wrote:

> Opera comes in two flavors for mobile devices: Opera Mini and Opera Mobile. Opera Mobile is, indeed, close to the desktop version in the sense that it runs the same display, javascript engine etc. on the device.

The versions of Opera Mobile floating around in the wild are quite different, though. Every HTC HD2 user with Windows Mobile 6.5 is likely to still run the ages-old, buggy HTC version (8.x AFAIR, compared to the current v10!), as the official versions STILL don't support multi-touch even though libraries exist which abstract it -.-

Marco
Re: [Wikitech-l] Special:Search goose chase
On Sun, Jun 19, 2011 at 8:40 PM, MZMcBride z...@mzmcbride.com wrote:

> The main issue (to me) is that it says "Did you mean: [bold blue link]", which in this context I think most users would expect to be able to click and go directly to the article. It should be simple enough to query the page table for page existence, but perhaps two messages would be better here.

+1, this has confused me for ages (at least make the link some other color than blue!)

Marco
Re: [Wikitech-l] wiki/ prefix in URL
On Wed, May 18, 2011 at 2:00 AM, John Vandenberg jay...@gmail.com wrote:

> Is there something else on these virtual hosts other than a few regexes which are extremely unlikely to be used as page names (i.e. \/w\/.*\.php)?

Anything beginning with /w/ must be disallowed as a page title for this to work.

Marco
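The constraint above can be sketched as a tiny check (purely illustrative; MediaWiki's real title validation lives elsewhere and a function like this is hypothetical): on a wiki that serves articles from the site root instead of /wiki/, any title starting with "w/" would shadow the /w/ script directory.

```javascript
// Hypothetical helper for a root-mapped wiki (articles at /<title>):
// titles beginning with "w/" would collide with /w/index.php etc.
function collidesWithScriptPath(title) {
  return /^w\//.test(title);
}
```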
Re: [Wikitech-l] search=steven+tyler gets Steven_tyler
On Sun, May 15, 2011 at 5:02 PM, Aryeh Gregor simetrical+wikil...@gmail.com wrote:

> You cannot fix the problem by doing accent/diacritic normalization. "i" and "I" are the same letter in English but different letters in Turkish. You cannot get around that. We'd need to have a separate case-folding algorithm for Turkish wikis, or make them use one that's incorrect for their language.

Actually, non-Turkish/Azerbaijani wikis have this problem too, if the wiki has articles or redirects using these characters...

Marco
Re: [Wikitech-l] integration of MediaWiki and version control system as backend?
On Sat, May 7, 2011 at 12:10 AM, Neil Kandalgaonkar ne...@wikimedia.org wrote:

> I don't believe anybody has successfully done this for MediaWiki, and to my knowledge this would be difficult to impossible. Our backend storage has to implement SQL in some way.

What about LDAP? It's used for authentication anyway, and it'd open the way to block users from editing, reading etc. certain namespaces or articles in a totally fine-grained way... simply add a sub-element userWriteBlock: cn=foobar,dc=en,dc=wikipedia,dc=org to an article / NS entry you don't want a certain user to be able to edit. Files could also be stored in an LDAP daemon, though this DOES suck for anyone trying to edit the LDAP tree by hand.

Marco
Re: [Wikitech-l] Wiki Inter-Searchability
Hi,

On Mon, Mar 21, 2011 at 3:49 PM, Tod listac...@gmail.com wrote:

> Is this possible? Is the wiki search driven by a crawler that would follow the links on that new wiki home page? If not, is there an approach I could follow to be able to provide search capability against a select number of these individual wikis under one umbrella?

I've never tried it personally, but I think SphinxSearch may be worth a look; it works directly against the database.

Marco
Re: [Wikitech-l] Highest Priority Bugs
On Wed, Mar 16, 2011 at 11:38 AM, Roan Kattouw roan.katt...@gmail.com wrote:

> Normal shell users can execute all but one of the steps required for wiki creation: root access is needed to create the DNS entry for the new subdomain. Previously we just had RobH handle all wiki creations, but he's been working almost exclusively on setting up the Virginia datacenter for a while now AFAIK. I previously suggested on IRC that we could have regular shell users like Ashar do the wiki creations at a scheduled time and assign a root to do the DNS stuff for them.

Why is a root user needed for changing DNS entries? A zone file is a normal text file, after all - and if you use PowerDNS on one server configured as supermaster with a MySQL backend, and other servers running PowerDNS as superslaves (the backend is not important there), you don't even need shell access to manage your DNS. Not even for creating new zones, as the supermaster makes all slaves automatically sync those zones where the individual slave server is listed with an NS entry. /powerdns_ad

Marco
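For reference, the superslave mechanism described above is driven by a `supermasters` table in PowerDNS's generic SQL backends; a sketch (IP, hostname and account name below are made up for illustration):

```sql
-- On each superslave, tell PowerDNS which supermaster it may accept
-- NOTIFYs and new zones from (values are illustrative):
INSERT INTO supermasters (ip, nameserver, account)
VALUES ('192.0.2.1', 'ns1.example.org', 'wikimedia');
```

Once this row exists, a NOTIFY from that IP for a zone listing that nameserver in its NS records is enough for the slave to start serving the zone, with no shell access involved.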
Re: [Wikitech-l] How would you disrupt Wikipedia?
On Mon, Jan 10, 2011 at 7:25 PM, Dmitriy Sintsov ques...@rambler.ru wrote:

> Everything has been done at my primary work to undermine my MediaWiki deployment efforts - "it can easily be installed via the Linux package, so why is he installing it manually", "the markup is primitive and inflexible", "PHP is an inferior language, use ASP.NET instead", and so on.

ASP.NET? Only if you want all your source code exposed.

Marco

--
VMSoft GbR
Nabburger Str. 15, 81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
Re: [Wikitech-l] Missing Section Headings
On Mon, Jan 3, 2011 at 6:59 PM, Platonides platoni...@gmail.com wrote:

> He is indeed using IE5 for Mac. I guess that adding h1, h2, h3, h4, h5, h6 { overflow: visible; } to his monobook.css will fix it. Do we have some CSS trick to target IE5/Mac? Removing the headers isn't too friendly, so if there's an easy fix, I would apply it.

There is some CSS conditional stuff for IE5/Mac - didn't we have a CSS fix file especially for this browser?

Marco
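There is indeed such a trick: IE5/Mac has well-known CSS comment-parsing quirks, and the widely documented "band pass" filter exploits them so that only IE5/Mac sees the enclosed rules. Quoted from memory and worth re-testing before relying on it:

```css
/* The backslash in the comment below confuses only IE5/Mac; every
   other browser skips straight to the closing comment at the end. \*/
/*/
h1, h2, h3, h4, h5, h6 { overflow: visible; }
/**/
```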
Re: [Wikitech-l] Commons ZIP file upload for admins
On Tue, Nov 30, 2010 at 8:48 AM, Dmitriy Sintsov ques...@rambler.ru wrote:

> * Bryan Tong Minh bryan.tongm...@gmail.com [Tue, 30 Nov 2010 08:44:43 +0100]:
>> I think that the most recent version should be sufficient. I don't think Java would break backwards compatibility: users wouldn't be happy if their old jar suddenly stops working on a new JVM.
> Why an outdated and inefficient ZIP format, after all? 7zip is incompatible with the JVM - would it be a better choice for archive uploads? Or is it too hard to parse on the PHP side (I guess console exec is required)?

You can create a zip easily on all major OSes with drag and drop. Windows has supported it since Win 98 SE IIRC, a standard Linux desktop ships tools for it (for KDE, it was once Ark), and macOS also delivers ZIP support out of the box. For ZIP there are even built-in PHP functions to handle it. 7zip, though open source, requires third-party tools, both for the OS and for servers, and is not really widespread. RAR and ZIP are the dominant formats in cross-platform data exchange.

Marco
Re: [Wikitech-l] Refresh of the Mediawiki logo
On Thu, Nov 11, 2010 at 9:07 PM, David Gerard dger...@gmail.com wrote:

> On 11 November 2010 19:55, MZMcBride z...@mzmcbride.com wrote:
>> The muted colors are a nice start, but I think the yellow still really sticks out when the logos are presented as a family. Maybe the petal color could be changed? I think this not being a photo does not make it better than the photo version.
> It's entirely unclear there's enough of a problem to be solved here.

I think it does look better than the older version - nice work.

Marco
Re: [Wikitech-l] Resource Loader problem
On Wed, Nov 10, 2010 at 10:56 AM, Roan Kattouw roan.katt...@gmail.com wrote:

> We're not looking for a full-blown parser, just one that has a few basic features that we care about. The current JS parser only supports expansion of message parameters ($1, $2, ...), and we want {{PLURAL}} support too. AFAIK that's pretty much all we're gonna need. Michael Dale's implementation has $1 expansion and {{PLURAL}}, AFAIK, and maybe a few other features.

Actually, PHP and JS are a bit similar. Different function names and slight syntax differences, but I think it is possible to take the existing PHP parser, strip out the references to MW internals and replace the database queries with appropriate API calls. That would also enable a true WYSIWYG editor, or at least live preview: having a JS parser would allow the resulting DOM nodes to carry some kind of reference attribute which can be looked at to find the wikitext responsible for the creation of the node (and so enable inline editing).

Actually, this seems just perfect for a GSoC project for next year: port the MW parser to JavaScript, with a follow-up project to make a WYSIWYG/inline editor based on it.

Marco
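The message-parser subset Roan describes ($1-style parameters plus {{PLURAL}}) can be sketched in a few lines. This is purely illustrative: real MediaWiki PLURAL handling is per-language with proper plural rules, while this naive version is English-only with exactly two forms.

```javascript
// Minimal sketch: expand $1, $2, ... and a naive English-only
// {{PLURAL:$n|singular|plural}} in a message string.
function expandMessage(msg, params) {
  return msg
    // PLURAL first, since its body still contains a $n reference.
    .replace(/\{\{PLURAL:\$(\d+)\|([^|}]*)\|([^|}]*)\}\}/g,
      (m, n, one, many) => (Number(params[n - 1]) === 1 ? one : many))
    // Then plain parameter substitution.
    .replace(/\$(\d+)/g, (m, n) => params[n - 1]);
}
```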
Re: [Wikitech-l] Cross wiki script importing
On Tue, Nov 2, 2010 at 1:09 AM, bawolff bawolff...@gmail.com wrote:

> May I ask how? If you're logged in to the secure server, then the cookies won't get transmitted to the insecure server when loading JS from it. At the very worst (if we really put on our tin foil hats) I suppose someone could intercept the non-secured JS, do a man-in-the-middle type thing and replace the script with malicious JS. However, if someone actually has the ability to do that, they could already do that with the geoip lookup. Thus I don't see how doing the importScriptURI reduces security.

Firefox and IE will whine that the site attempts to load insecure resources. Also, it is indeed possible to transmit cookies: it's enough that the user has also logged in to the insecure servers in the past and is, e.g., at a public WiFi hotspot now and therefore uses the secure gateway.

Marco
Re: [Wikitech-l] Firesheep
On Mon, Oct 25, 2010 at 7:15 PM, Hay (Husky) hus...@gmail.com wrote:

> Has anyone seen this? http://codebutler.com/firesheep A new Firefox plugin that makes it trivially easy to hijack cookies from a website that uses HTTP for login, over an unencrypted wireless network. Wikipedia isn't in the standard installation as a site (lots of other sites, such as Facebook, Twitter, etc. are). We are using HTTP login by default, so I guess we're vulnerable as well (please say so if we're using some other kind of defensive mechanism I'm not aware of). Might it be a good idea to use HTTPS as the standard login? Gmail has been doing this since April this year.

Firesheep works by snooping cookies, not login processes, and even without software like this it's incredibly easy to own someone. All it takes to own a Wikipedia admin or user is being in the same network as them. The admin in question doesn't even have to visit Wikipedia directly; there are enough pages hotlinking to upload.wikimedia.org, which should cause the browser to transmit session data. If you need secure login, you can use the secure webserver, but in the past it had some load issues.

Marco
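One caveat worth spelling out (my addition, not from the thread): logging in over the secure server only defeats Firesheep-style snooping if the session cookies are also flagged so the browser never sends them over plain HTTP at all. The cookie name below is illustrative:

```
Set-Cookie: enwikiSession=...; Secure; HttpOnly
```

Without the `Secure` flag, a later plain-HTTP request (e.g. a hotlinked image) would still leak the session cookie on the open network.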
Re: [Wikitech-l] Commons ZIP file upload for admins
On Mon, Oct 25, 2010 at 10:09 PM, Aryeh Gregor simetrical+wikil...@gmail.com wrote:

> On Mon, Oct 25, 2010 at 3:50 PM, Max Semenik maxsem.w...@gmail.com wrote:
>> Instead of amassing social constructs around technical deficiency, I propose to fix bug 24230 [1] by implementing proper checking for JAR format.
> Does that bug even affect Wikimedia? We have uploads segregated on their own domain, where we don't set cookies or do anything else interesting, so what would an uploaded JAR file even do?

upload.wikimedia.org could end up on Google's Safe Browsing blacklist for hosting malicious .jars which are injected on another pwned website or loaded through pwned advertising brokers. Given that Java is the 2nd biggest exploit vector in terms of exploits (but 1st in terms of impact - users don't update Java as often as Adobe Reader), it should not be allowed to upload JARs (or things that look like something else, but can in fact be loaded and executed by the Java runtime) to Wikipedia.

Marco
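A crude sketch of the kind of check bug 24230 asks for (illustrative only; a robust implementation should parse the ZIP central directory rather than scan raw bytes): a JAR is just a ZIP archive containing a META-INF/ entry, so an upload that is a valid ZIP *and* mentions META-INF/ is suspicious.

```javascript
// Naive server-side sniff: does this uploaded ZIP look like a JAR?
// (Node Buffer used here purely for illustration.)
function looksLikeJar(buf) {
  // ZIP local file header magic "PK\x03\x04", read little-endian.
  if (buf.length < 4 || buf.readUInt32LE(0) !== 0x04034b50) {
    return false; // not a ZIP at all
  }
  // JARs conventionally carry META-INF/MANIFEST.MF.
  return buf.includes(Buffer.from("META-INF/"));
}
```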
Re: [Wikitech-l] Convention for logged vs not-logged page requests
On Wed, Oct 20, 2010 at 12:49 AM, Krinkle krinklem...@gmail.com wrote:

> But the short version without /w/index.php but with direct ?parameters doesn't work for action=raw (ctype=text/javascript). See the error on: http://meta.wikimedia.org/wiki/User:Krinkle/global.js?action=raw

Strange. I'm sure this is to prevent users from using Wikipedia as a spy-javascript hoster, but why does http://meta.wikimedia.org/w/index.php?title=User:Krinkle/global.js&action=raw work then?

Marco
Re: [Wikitech-l] About wiki links: could they point to page id?
On Mon, Oct 4, 2010 at 10:54 AM, Strainu strain...@gmail.com wrote:

> 2010/10/4 Alex Brollo alex.bro...@gmail.com:
>> It's strange (but I guess that there's a sound reason) that plain wikilinks point to a variable field of wiki records (the name of the page), while many troubles would be solved if they could point to the invariable field of such records: the id. The obvious result is that all links are broken (and need fixing) as soon as a page is moved (i.e. renamed).
> Don't redirects exist specifically for that? Better, use permalinks, which point to a specific revision id.
>> My question is: what is the sound reason for this strange thing? Is there some idea about fixing this?
> Err... perhaps they decided people should be able to comprehend the link destination? Plus I remember something about nice URLs being a MUST DO in SEO a while ago... but I'm not 100% sure on that. I certainly hope this won't change anytime soon, on Wikipedia at least.

It'd be nice to have the page_id as an optional parameter... but I think you can get the page's title via the API.

Marco
[Wikitech-l] Static dump of German Wikipedia
Hi all,

just a quick status update: the dump is currently running at 2 req/s and ignores all pages which have is_redirect set. I also changed the storage method: the new files are appended to /mnt/user-store/dewiki_static/articles.tar, as I noticed I was filling up the inodes of the file system; storing everything inside a tarball prevents this, and I don't have to waste time downloading tons of files to my PC - only one huge tarball when it's done.

I also managed to get a totally stripped-down version of the Vector skin file loading an article via JSON (I won't release it now though, it's a damn hack - nothing except loading works, as I have removed every JS file... it should be pretty by Sunday).

The current dump position is at 92927; stripping out the redirects, 53171 articles have really been downloaded, resulting in 770MB of uncompressed tar (I expect gzip or bz2 compression to save lots of space though).

For the redirects: how do I get the redirect target page (maybe even the #section)?

Marco

PS: Are there any *fundamental* differences between the Vector skin files of different languages except the localisation? Could this maybe be converted to JavaScript, e.g. $("#footer-info-lastmod").html("page was last changed at foobar")?
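The append-to-tar approach described above, sketched as shell commands (paths and file names are illustrative): `tar -r` appends members to an existing archive, so each rendered page consumes archive space rather than a filesystem inode.

```shell
# Append each rendered article to one growing tarball instead of
# writing thousands of small files (avoids inode exhaustion).
tar -cf articles.tar article_1.html         # create with first file
tar -rf articles.tar article_2.html         # append subsequent files
tar -tf articles.tar                        # list archive contents
```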
Re: [Wikitech-l] [Toolserver-l] Static dump of German Wikipedia
On Sat, Sep 25, 2010 at 12:56 AM, Platonides platoni...@gmail.com wrote:

> Ariel T. Glenn wrote:
>> On 23-09-2010 (Thu), at 21:27 -0500, Q wrote:
>>>> Given the fact that static dumps have been broken for *years* now, static dumps are on the bottom of WMF's priority list; I thought it would be best if I just went ahead and built something that can be used (and, of course, improved). Marco
>>> That's what I just said. Work with them to fix it, i.e. volunteer - i.e. you fix it.
>> Actually it's not so much that they are on the bottom of the list as that there are two people potentially looking at them, and they are Tomasz (who is also doing mobile) and me (and I am doing the XML dumps rather than the HTML ones, until they are reliable and happy). However, if you are interested in working on these, I am *very* happy to help with suggestions, testing, feedback, etc., even while I am still working on the XML dumps. Do you have time and interest? Ariel
> Most (all?) articles should already be parsed in memcached. I think the bottleneck would be the compression. Note however that the ParserOutput would still need postprocessing, as would ?action=render. The first thing that comes to my mind is to remove the edit links (this use case alone seems enough for implementing editsection stripping). Sadly, we can't (easily) add the edit sections after the rendering.

This should be doable using a simple regex which plainly looks for <span class="editsection">.

Marco
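The regex approach suggested above, as an illustrative sketch. It assumes the markup of the time, `<span class="editsection">...</span>` with no nested spans inside the edit link; if the skin ever nests spans there, a non-greedy regex like this breaks and a real HTML parser is needed.

```javascript
// Strip [edit] section links from rendered article HTML.
function stripEditSections(html) {
  // Non-greedy match up to the first closing </span>; safe only
  // because the editsection span contains no nested spans.
  return html.replace(/<span class="editsection">.*?<\/span>/g, "");
}
```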
[Wikitech-l] Static dump of German Wikipedia
Hi all,

I have made a list of all the 1.9M articles in NS0 (including redirects / short pages) using the Toolserver; now that I have the list, I'm going to download every single one of them (after the trial period tonight - I want to see how this works out. I'd like to begin downloading the whole thing in 3 or 4 days, if no one objects) and then publish a static dump of it.

Data collection will be on the Toolserver (/mnt/user-store/dewiki-static/articles/); the request rate will be 1 article per second, and I'll download the new files once or twice a day to my home PC, so there should be no problem with the TS or Wikimedia server load. When this is finished in ~21-22 days, I'm going to compress them and upload them to my private server (well, if Wikimedia has an archive server, that'd be better) as a tgz file so others can play with it.

Furthermore, though I have no idea if I'll succeed, I plan on hacking up a static Vector skin file which will load the articles using jQuery's excellent .load() feature, so that everyone with JS can enjoy a truly offline Wikipedia.

Marco

PS: When trying to invoke /w/index.php?action=render with an invalid oldid, the server returns HTTP/1.1 200 OK and an error message - shouldn't this be a 404 or 500?
Re: [Wikitech-l] Is the $_SESSION secure?
On Fri, Sep 24, 2010 at 1:36 AM, Neil Kandalgaonkar ne...@wikimedia.org wrote:

> On 9/23/10 2:24 PM, Ryan Lane wrote:
>> The contents of that session on the server are unencrypted, correct? Depending on what the secret is, he may or may not want to use it. For instance, that is probably a terrible place to put credit card numbers temporarily.
> Good point, but in this case I'm just storing the path to a temporary file. The file isn't even sensitive data; it's just a user-uploaded media file for which the user has not yet selected a license, although we anticipate they will in a few minutes.

If it's user-uploaded, take care of garbage collection. Actually, how does PHP handle it if you upload a file and then don't touch it during the script's runtime? Will it automatically be deleted after the script is finished, or after a specific time?

Marco
Re: [Wikitech-l] [Toolserver-l] Static dump of German Wikipedia
On Fri, Sep 24, 2010 at 3:44 AM, Marcin Cieslak sa...@saper.info wrote:

> John Vandenberg jay...@gmail.com wrote:
>> http://download.wikimedia.org/dewiki/ Is there any problem with using them? I think they are from June 2008.
> Are they? http://download.wikimedia.org/dewiki/20100903/

These are the database dumps. In order to get any HTML out of them, you need to set up either MediaWiki and/or a replacement parser, not to mention the delicate things enWP folks did with template magic, which requires setting up ParserFunctions - and these might even depend on whatever version is currently running live. That's why static dumps (or ?action=render output) are the thing you need when you want to create offline versions or things like Mobipocket Wikipedia (which is my actual goal with the static dump).

Marco
Re: [Wikitech-l] ResourceLoader, now in trunk!
Hi,

On Tue, Sep 7, 2010 at 9:44 AM, Roan Kattouw roan.katt...@gmail.com wrote:

>> Also, it would be great if these high-level JS libraries like jQuery were actually ported to the DOM API level (a native browser implementation instead of an extra JS layer). However, these questions are for the FF/IE/Opera developers...
> I definitely think this is the future, provided it's implemented reliably cross-browser. Also, you'd probably want to have a fallback library for browsers that have no or incomplete (e.g. missing a jQuery feature that's newer than the browser) native support.

Please, no. The various browsers all have their problems with standards even now, and I don't expect they'd get a jQuery (or whatever JS framework) implemented client-side without all sorts of problems.

Marco
Re: [Wikitech-l] Developing a true WYSIWYG editor for MediaWiki
On Tue, Aug 3, 2010 at 10:53 AM, Jacopo Corbetta jacopo.corbe...@gmail.com wrote:

> However, the editing mode provided by browsers is a nightmare of incompatibilities. Basically, each browser produces a different output given identical commands, so currently MeanEditor is not completely up to the task. An external application might be an interesting solution.

I don't have the link ready, but Google solved this in Google Docs by re-implementing everything in JavaScript: they intercept mouse movements/clicks and keyboard events and then render the page in JavaScript. Given the complexity of wikitext, I fear rewriting the parser in JavaScript is the only way to get a 100% compatible wikitext editor...

Marco
Re: [Wikitech-l] accessibility problems with vector
On Sun, Jun 20, 2010 at 2:12 PM, Tisza Gergo gti...@gmail.com wrote:

> Anyway, removing table-related attributes doesn't offer much advantage in itself. There will be a few validator warnings about it, so what? Getting rid of table layouts would be nice, but IE6/7 do not understand display:table either, so until IE6 and 7 die and 8 stops trying to be backwards-compatible, they are here to stay, I'm afraid.

IE6 is (thank God) dying out - one problem less... but yeah, it sucks that IE7 doesn't like display: table. But why is IE8 falling back to compatibility mode on Wikimedia sites? Is there any way to force IE8 into standards-conforming mode?

Marco
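On the IE8 question: IE8 can be pinned to its most standards-compliant engine with the X-UA-Compatible header or meta tag (whether a site should ship this globally is of course a separate policy question):

```html
<!-- Ask IE8+ to use its best standards mode and ignore
     compatibility-view heuristics: -->
<meta http-equiv="X-UA-Compatible" content="IE=edge">
```

The same directive can be sent as an HTTP response header (`X-UA-Compatible: IE=edge`) instead of markup.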
Re: [Wikitech-l] [GSoC] Extension management platform
Hey,

just one thing which makes WordPress auto-update suck, and which would be great if you'd take care of while designing: check the permissions of ALL files you try to overwrite / update BEFORE attempting the update - maybe include an update.xml for each version delta which lists the changed files. Then run fileperms() on each one and check whether the www-data user is allowed to write to the file. This is the third time in a row that WordPress and the Debian package have screwed up, either with WordPress itself or some PHP module, over write permissions... and I had to un-mess the update by hand (and I'm not the only one)... I don't want to see this again in MediaWiki :(

Marco

On Thu, Jun 3, 2010 at 1:55 AM, Jeroen De Dauw jeroended...@gmail.com wrote:

> Hey all,
>
> As a lot of you already know, I'm again doing a Google Summer of Code project for the WMF this year. The goal of this *awesome* project is to create a set of user-friendly administration interfaces via which administrators can manage the configuration of their wiki and extensions, as well as manage extension installation and updating. The user experience should be as *awesome* as the WordPress one, or even better. After doing research into existing code and talking to the relevant people, I created a little roadmap [0] of how I plan to proceed with my project. Any feedback and comments on this would be very much appreciated (esp. the critical ones :). It'd be too bad to reinvent things already achieved by people, simply by me not knowing about them! I hope to start with the actual coding by this weekend, and will update the roadmap, the docs page itself [1], and my blog [2] as I make progress.
>
> [0] http://www.mediawiki.org/wiki/Extension_Management_Platform/Roadmap
> [1] http://www.mediawiki.org/wiki/Extension_Management_Platform
> [2] http://blog.bn2vs.com/tag/extension-management/
>
> Cheers
>
> --
> Jeroen De Dauw * http://blog.bn2vs.com * http://wiki.bn2vs.com
> Don't panic. Don't be evil. 50 72 6F 67 72 61 6D 6D 69 6E 67 20 34 20 6C 69 66 65!
--
VMSoft GbR Nabburger Str. 15 81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] a wysiwyg editor for wikipedia?
On Wed, Jun 2, 2010 at 1:42 AM, K. Peachey p858sn...@yahoo.com.au wrote:
> On Tue, Jun 1, 2010 at 11:22 PM, Marco Schuster ma...@harddisk.is-a-geek.org wrote:
>> On Tue, Jun 1, 2010 at 4:09 AM, Jacopo Corbetta jacopo.corbe...@gmail.com wrote:
>>> In our experience, the biggest obstacle is to get the different browsers to reliably make the same changes to HTML. The editor interface is non-standard, and browsers sometimes disagree on encoding rules, escaping, choice of tags, etc.
>>
>> We could do it the really hard way, like Google did with Google Docs (http://googledocs.blogspot.com/2010/05/whats-different-about-new-google-docs.html): do *everything* in JS by capturing keystrokes and mouse movements. This way a consistent and reproducible user experience can be achieved on all platforms. And by doing it all in JS, the editor could also generate a wikitext delta right away and wouldn't need to transfer the whole page's wikitext.
>>
>> Marco
>
> Google Docs' interface acts like sh*t unless you're on a super dooper decent computer, and gives a negative view of the usability of the service.
> -Peachey

I run a 3-year-old laptop (an Intel C2D, though) with next to no CPU load in Firefox, even less in Chrome, and have no problems with it - except that printing never looks like what I see in the browser... hopefully Google will add proper PDF export for printing.

Marco
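The "wikitext delta" idea above - sending only the changed hunks instead of the whole page source - can be illustrated with a small sketch. The real editor would compute this in JavaScript client-side; this Python version using the standard difflib module just shows the shape of such a delta:

```python
import difflib

def wikitext_delta(old, new):
    """Produce a compact unified diff of two wikitext revisions,
    so only the changed hunks need to be transferred, not the
    whole page's wikitext."""
    return "".join(difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile="old", tofile="new",
        n=1))  # one line of context per hunk keeps the delta small
```

For a large article where one line changed, the delta is a few lines instead of the full page.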
Re: [Wikitech-l] Reasonably efficient interwiki transclusion
On Tue, May 25, 2010 at 8:58 PM, Roan Kattouw roan.katt...@gmail.com wrote:
> To the point of whether parsing on the distant wiki makes more sense: I guess there are points to be made both ways. I originally subscribed to the idea of parsing on the home wiki so expanding the same template with the same arguments would always result in the same (preprocessed) wikitext, but I do see how parsing on the local wiki would help for stuff like {{SITENAME}} and {{CONTENTLANG}}.

Why not mix the two? Take other templates etc. from the source wiki, and resolve magic words like time and {{CONTENTLANG}} to the target wiki's values.

Marco
Re: [Wikitech-l] Bugzilla Weekly Report
On Mon, May 17, 2010 at 10:49 AM, Roan Kattouw roan.katt...@gmail.com wrote:
> 2010/5/17 repor...@isidore.wikimedia.org:
>> Bugs marked FIXED: 1622
> [snip]
>> Top 5 Bug Resolvers
>> roan.kattouw [AT] gmail.com 19
>> jeluf [AT] gmx.de 15
>> innocentkiller [AT] gmail.com 11
>> sam [AT] reedyboy.net 6
>> tparscal [AT] wikimedia.org 5
>
> That can't be right.

It can, because of Chad's cleanup. Depending on how the reporter software counts, direct DB changes might confuse its algorithms.

Marco
Re: [Wikitech-l] Technical means for tagging content (was: [Foundation-l] Statement on appropriate educational content)
On Sun, May 9, 2010 at 3:09 PM, K. Peachey p858sn...@yahoo.com.au wrote:
> On Sun, May 9, 2010 at 11:08 PM, K. Peachey p858sn...@yahoo.com.au wrote:
>> On Sun, May 9, 2010 at 10:50 PM, Ævar Arnfjörð Bjarmason ava...@gmail.com wrote:
>>> It's pretty easy to do arbitrary content tagging (and filtering now). You just add a template or external link to the page, e.g. {{PG-13}}. Then all some third party has to do is to download templatelinks.sql.gz (or externallinks.sql.gz) in addition to the image dump. You just have to start getting people to tag things consistently. The good thing is that you can start now without any additional software support.
>
> Also, do we even have image dumps anymore?

AFAIR backups exist, but I don't know if there are any public dumps. I'd imagine they're simply too big to maintain and archive...

Marco
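Ævar's scheme - a third party filtering content against tag templates extracted from templatelinks.sql.gz - can be sketched like this. The (page, template) pair representation and the {{PG-13}} tag name follow the message above; how the pairs are parsed out of the SQL dump is left out:

```python
def pages_tagged(templatelinks, tag):
    """templatelinks: iterable of (page_title, template_name) pairs,
    as a third party would extract them from a templatelinks dump.
    Returns the set of pages carrying the given tag template,
    e.g. 'PG-13', so the corresponding images/pages can be filtered."""
    return {page for page, tmpl in templatelinks if tmpl == tag}
```

Consistent tagging is the hard part, as Ævar notes; the filtering itself is a one-liner once the dump is loaded.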
Re: [Wikitech-l] Broken videos
On Wed, Mar 17, 2010 at 5:37 PM, Aryeh Gregor simetrical+wikil...@gmail.com wrote:
> On Wed, Mar 17, 2010 at 10:39 AM, Platonides platoni...@gmail.com wrote:
>> Would it help adding a <link rel="prefetch"> to the first video in the page?
>
> In Firefox, yes. Does anyone else implement that? If it's only Firefox, we could just as well replace Cortado with <video autobuffer> to begin with as soon as the page loads. In Chrome and Safari, we don't even need the autobuffer, since they don't implement it. (Actually, <video autobuffer preload="auto"> would be the current way to do it, given recent spec changes.)

I hope no one is ever insane enough to use this. Imagine the people with cellphones and no flat-rate data plan (~70% of mobile internet users) - their bills will skyrocket if even a single video is set to auto-preload.

Marco
Re: [Wikitech-l] Broken videos
On Wed, Mar 17, 2010 at 9:11 PM, Trevor Parscal tpars...@wikimedia.org wrote:
> On 3/17/10 1:02 PM, Marco Schuster wrote:
>> I hope no one is ever insane enough to use this. Imagine the people with cellphones and no flat-rate data plan (~70% of mobile internet users) - their bills will skyrocket if even a single video is set to auto-preload.
>
> Hopefully browsers on phones and low-bandwidth devices could just ignore this attribute.

What about people who use tethering? A browser can't tell whether it is connected via a flat-rate connection (DSL, a company network) or via tethering over a mobile connection.

Marco
Re: [Wikitech-l] modernizing mediawiki
On Wed, Mar 3, 2010 at 4:30 PM, David Gerard dger...@gmail.com wrote:
> On 3 March 2010 15:06, Paul Houle p...@ontology2.com wrote:
>> For a large-scale site, there's going to be a lot of administration work to be done, so it doesn't matter if the system is difficult to set up and configure.
>
> As it turns out, MediaWiki isn't really hard at all :-)
>
>> Wordpress, on the other hand, set out with the mission of being the 'cheap and cheerful' program that would dominate the market for blogging software. Everything about Wordpress is designed to make it easy to set up a Wordpress site quickly and configure it easily.
>
> Wordpress does scale OK to fairly large blogs and high traffic if you SuperCache it. Multi-user WordPress is a bit arsier - comparable faff to a MediaWiki setup.

apt-get install wordpress, and let dpkg handle the rest. It's really easy.

Marco
Re: [Wikitech-l] modernizing mediawiki
On Wed, Mar 3, 2010 at 5:57 AM, Ryan Lane rlan...@gmail.com wrote:
> I don't really find updates to be terribly difficult. You mostly just check out (or download) the newest version, and run update.php. This is probably more difficult without shell access.

With Wordpress, upgrades are even easier: two clicks and you're done (okay, except if you run a multi-user WP setup). The same goes for extension updates. It even *notifies* you of updates, which matters especially for security-critical ones - if you don't follow the -announce lists and consequently never update, your wiki can and will be open to any security issue that comes up.

>> -I don't want to go to my ftp to download my local settings file, add a few lines then reupload it. This is caveman-like behavior for the modern internet.
>
> Get a host that supports SSH. Use VI, Emacs, nano, pico, etc.

HAHAHA, sorry, but this way of thinking is stone-age. Who are we to require our users to get more expensive hosting AND knowledge of vi/Emacs (a newbie most likely won't even have HEARD of ssh, vi or emacs!) just to be able to modify the core settings of a wiki without the extra FTP round-trip? Come on, it's easy to build a web-based settings editor. It might even be a lot easier to just move all settings except the MySQL credentials into the DB.

Marco
Re: [Wikitech-l] hiphop progress
The point of $IP is that you can run multi-site environments by keeping just index.php and LocalSettings.php (and skin crap) in the per-vhost directory, with extensions and everything else centralized, so you update an extension once and all the wikis automatically get it. However, the installer could be patched to resolve $IP automatically if the user wishes to run a HipHop environment.

Marco

On Mon, Mar 1, 2010 at 2:59 PM, Ævar Arnfjörð Bjarmason ava...@gmail.com wrote:
> On Mon, Mar 1, 2010 at 13:35, Domas Mituzas midom.li...@gmail.com wrote:
>> Still, the decision to merge certain changes into the MediaWiki codebase (e.g. relative includes, rather than $IP-based absolute ones) would be quite invasive. Also, we'd have to enforce a stricter policy on how some of the dynamic PHP features are used.
>
> I might be revealing my lack of knowledge about PHP here, but why is that invasive, and why do we use $IP in includes in the first place? I did some tests here: http://gist.github.com/310380 which show that as long as you set_include_path() with $IP/includes/ at the front, PHP will make exactly the same stat(), read() etc. calls with relative paths that it does with absolute paths. Maybe that's only on recent versions; I tested on PHP 5.2.
Re: [Wikitech-l] hiphop progress
On Mon, Mar 1, 2010 at 3:26 PM, Daniel Kinzler dan...@brightbyte.de wrote:
> Marco Schuster schrieb:
>> The point of $IP is that you can run multi-site environments by keeping just index.php and LocalSettings.php (and skin crap) in the per-vhost directory, with extensions and everything else centralized, so you update an extension once and all the wikis automatically get it.
>
> That's a silly multi-host setup. Much easier to have a single copy of everything, and just use conditionals in localsettings, based on hostname or path.

Downside of this: as a provider, *you* must make every change, not the customer, as it is one central file.

Marco
Re: [Wikitech-l] User-Agent:
On Tue, Feb 16, 2010 at 4:44 PM, Anthony wikim...@inbox.org wrote:
> On Tue, Feb 16, 2010 at 10:39 AM, Domas Mituzas midom.li...@gmail.com wrote:
>> Been like that for ages, hasn't it?
>
> No idea. For ages you've been able to just go onto the Wikimedia servers and change whatever you feel like, and answer to nobody? You must be misunderstanding my question or something.

Correct me if I'm wrong, but AFAIR Domas has been the MySQL admin guy since pretty much the beginning, and I think the fact that he's a sysop won't change anyway, no matter what happens.

For the record, the step of banning everything without a UA totally sucks. Sure, the API and other abuse-prone stuff can be blocked, but ordinary article reading should ALWAYS be possible, no matter what fucked-up UA you use.

Marco
Re: [Wikitech-l] User-Agent:
Hi,

On Tue, Feb 16, 2010 at 8:31 PM, Domas Mituzas midom.li...@gmail.com wrote:
> You can sure assume, that we need to come up with something to defend a new policy.

Yeah: ban no-UA and broken-UA clients for the things that do cause CPU load, but leave article reading unharmed. Normal readers with Privoxy or other privacy filters (you know, people DO still use them, even if their percentage is small!) can at least READ, then.

> Presumably some percentage of that 20-50% will come back as the spammers realize they have to supply the string. Presumably we then start playing whack-a-mole.

Yes, we will ban all IPs participating in this. Good luck fighting a dynamic bot herder (though I do wonder, with the spam blacklist and the captchas for URLs, what the hell a botnet master can achieve by hitting Wikipedia?!).

> Presumably there's a plan for what to do when the spammers begin supplying a new, random string every time.

Random strings are easy to identify, fixed strings are easy to verify. The point is, what should bot writers do:
1) No UA at all - the typical newbie mistake: just send GET /w/index.php?action=edit, which works with a localhost wiki and pretty much everywhere else.
2) The default UA of the programming language (PHP's, cURL's, Python's; some bots may even use wget and bash scripting - it's not THAT difficult to write a wiki bot in bash!)
3) An own UA (something like HDBot v1.1 (http://xyz.tld), which I couldn't use some time ago)
4) A spoofed browser UA (bad, as the site can't distinguish between bot and browser)

To avoid the ban, only 3 and 4 are possible, as the default UAs are blocked in most cases. But as 3 doesn't really work, or at least is hard to troubleshoot, that leaves only 4, which you do not want. Please write some doc that answers this once and for all.

Marco

PS: Oh, and please, please make the 403 message something from which people can figure out what's wrong; it takes AGES if you are new to scripting.
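The third option Marco lists - a descriptive, self-identifying User-Agent - is a one-liner in most HTTP libraries. A sketch with Python's urllib; the UA string, URLs and contact address are of course placeholders for a real bot's details:

```python
import urllib.request

def make_bot_request(url):
    """Build a request that identifies the bot with a descriptive
    User-Agent including a contact URL, instead of the library
    default or a spoofed browser string."""
    return urllib.request.Request(
        url,
        headers={
            # Placeholder identity: name/version, homepage, operator contact.
            "User-Agent": "ExampleBot/1.0 (http://example.org/bot; operator@example.org)",
        },
    )
```

The request is only constructed here, not sent; pass it to `urllib.request.urlopen` to actually fetch.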
Re: [Wikitech-l] importing enwiki into local database
On Sun, Feb 14, 2010 at 1:00 PM, Robert Ullmann rlullm...@gmail.com wrote:
> Are you using $wgUseTidy? It is an HTML cleanup process that is always enabled on WMF projects. Since it is there, template creators often miss closing spans and other things, or leave in extra close tags, and never notice, because tidy fixes them. It would not surprise me a bit if dozens of en.wp templates have such errors; I find them occasionally on en.wikt.

What about turning $wgUseTidy off for some time? Maybe during some night hours... so that our template magicians are forced to clean up the templates and the other crap buried deep in the wikitext. It is not a third-party user's responsibility to install extra software just to be able to render our content... or at least, it shouldn't be. We should not artificially make it difficult for our content to be forked, duplicated or otherwise re-used.

Marco
Re: [Wikitech-l] importing enwiki into local database
On Mon, Feb 15, 2010 at 2:30 AM, Aryeh Gregor simetrical+wikil...@gmail.com wrote:
> On Sun, Feb 14, 2010 at 7:34 PM, Marco Schuster ma...@harddisk.is-a-geek.org wrote:
>> What about turning $wgUseTidy off for some time? Maybe during some night hours... so that our template magicians are forced to clean up the templates and the other crap buried deep in the wikitext.
>
> They would just complain until we turned it back on. Unless we left it off for good, which might be reasonable, but then we'd have to improve Sanitizer, I guess. (Maybe once html5lib is more usable.)

Why? Why must software take care of the crap that users produce? Either we force them to write proper code, or they never will. It's like with PHP, with the difference that we still have the option to make our template writers and coders learn, instead of having to support stone-age stuff with crappy workarounds. Every single extra step that has to be taken to get a WP fork up and running is one step too many.

Marco
Re: [Wikitech-l] confirm dfe03c58a0a....
Hi all,

I just got this email from bugzilla. Apparently Google Apps has screwed something up again, so the message itself doesn't annoy me - but why are users' passwords still sent in CLEARTEXT these days?? Can someone (tm) of the mailman admins or tech guys please fix this?

Thanks, Marco

On Wed, Feb 10, 2010 at 2:47 PM, wikitech-l-requ...@lists.wikimedia.org wrote:
> Your membership in the mailing list Wikitech-l has been disabled due to excessive bounces. The last bounce received from you was dated 10-Feb-2010. You will not get any more messages from this list until you re-enable your membership. You will receive 3 more reminders like this before your membership in the list is deleted.
>
> To re-enable your membership, you can simply respond to this message (leaving the Subject: line intact), or visit the confirmation page at https://lists.wikimedia.org/mailman/confirm/wikitech-l/dfe03c58a0a1fe2700f74689430fa846e36fbcdb
>
> You can also visit your membership page at https://lists.wikimedia.org/mailman/options/wikitech-l/marco%40harddisk.is-a-geek.org
>
> On your membership page, you can change various delivery options such as your email address and whether you get digests or not. As a reminder, your membership password is [blanked]. If you have any questions or problems, you can contact the list owner at wikitech-l-ow...@lists.wikimedia.org
Re: [Wikitech-l] (no subject)
On Thu, Jan 28, 2010 at 5:02 PM, Tei oscar.vi...@gmail.com wrote:
> On 28 January 2010 15:06, 李琴 q...@ica.stc.sh.cn wrote:
>> Hi all, I have built a local wiki. Now I want its data to stay consistent with Wikipedia, and one thing I need to do is fetch the updates from Wikipedia. I get the URLs by analyzing the RSS feed (http://zh.wikipedia.org/w/index.php?title=Special:%E6%9C%80%E8%BF%91%E6%9B%B4%E6%94%B9&feed=rss) and then get the full HTML content of the edit box by opening each URL and clicking 'edit this page'. Is it because I visit too frequently that my IP address is blocked, or is the network just too slow?
>> 李琴
>
> Well... that's web scraping, which is a poor technique - one with lots of errors that generates lots of traffic. One thing a robot must do is read and follow the http://zh.wikipedia.org/robots.txt file (you should probably read it too). As a general rule of the Internet, a rude robot will be banned by the site admins. It would be a good idea to announce your bot as a bot in the user-agent string.
>
> Good bot behavior is to read a website like a human would. I don't know - maybe 10 requests a minute? I don't know this Wikipedia site's rules about it. What you are suffering could be automatic or manual throttling, since an abusive number of requests was detected from your IP.
>
> Wikipedia seems to provide full dumps of its wikis, but they are probably unusable for you, since they are gigantic :-/ - trying to rebuild Wikipedia on your PC from a snapshot would be like summoning Cthulhu in a teapot. But... I don't know, maybe the zh version is smaller, or your resources powerful enough. One feels that what you have built has severe overhead (wastage of resources) and that there must be better ways to do it...

Indeed there are. What you need:
1) the Wikimedia IRC live feed - last time I looked, it was at irc://irc.wikimedia.org/ and each project had its own channel.
2) a PHP IRC bot framework - Net_SmartIRC is well-written and easy to get started with.
3) the page source, which you can EASILY get either in rendered form (http://zh.wikipedia.org/w/index.php?title=TITLE&action=render) or in raw form (http://zh.wikipedia.org/w/index.php?title=TITLE&action=raw - this is the page source).

Marco
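Step 3 above - fetching the raw wikitext via action=raw, with the polite request rate Tei suggests - can be sketched as follows. Marco recommends PHP with Net_SmartIRC; this sketch uses Python purely for illustration, and only the URL construction is exercised (no network access):

```python
import time
import urllib.parse

BASE = "http://zh.wikipedia.org/w/index.php"

def raw_url(title):
    """URL of the raw wikitext of a page (action=raw).
    urlencode percent-escapes non-ASCII titles correctly."""
    return BASE + "?" + urllib.parse.urlencode(
        {"title": title, "action": "raw"})

def fetch_politely(titles, fetch, delay=6.0):
    """Fetch each title through the supplied `fetch` callable,
    sleeping between requests (~10 requests/minute by default,
    the rate suggested above)."""
    results = {}
    for i, title in enumerate(titles):
        if i:
            time.sleep(delay)
        results[title] = fetch(raw_url(title))
    return results
```

In a real bot, `fetch` would be an HTTP GET with a self-identifying User-Agent, and the list of titles would come from the IRC live feed rather than from polling the RSS.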
Re: [Wikitech-l] enwiki database dump schedule
On Fri, Jan 29, 2010 at 2:33 AM, Anthony wikim...@inbox.org wrote:
> On Mon, Jan 25, 2010 at 6:23 PM, Tomasz Finc tf...@wikimedia.org wrote:
>> New snapshot ready. http://download.wikipedia.org/enwiki/20100116
>
> And the history dump, which had run for a month and a half and looked like it was going to actually complete for the first time in years, is now broken. Thanks a lot.

How are the old revisions backed up, by the way? Just replication to a remote datacenter?

Marco
Re: [Wikitech-l] Failed to Download any pages-articles.xml.bz2 file
This is likely your company proxy not supporting huge downloads. Maybe its cache partition filled up.

Marco

2009/12/15 Rob Giberson ajax...@gmail.com:
> I failed several times and ended up downloading exactly the same number of bytes: 1,465,454 KB (1.46 GB), even for different versions of that file. Is it possible that the files themselves are corrupted? Has anyone successfully downloaded one of these big files recently?
> Rob
>
> On Mon, Dec 14, 2009 at 8:11 PM, Rob Giberson ajax...@gmail.com wrote:
>> You got that right. I am behind a company proxy. Any idea how to get around this?
>>
>> On Mon, Dec 14, 2009 at 8:09 PM, K. Peachey p858sn...@yahoo.com.au wrote:
>>> On Tue, Dec 15, 2009 at 12:25 PM, Rob Giberson ajax...@gmail.com wrote:
>>>> Guess this problem might have been asked several (many) times before... but I tried to download the pages-articles.xml.bz2 file, which is approximately 5.x GB. However, all versions I tried failed at about 1.5 GB. I noticed someone posted Python code online as a workaround, but it did not work for me. My machine is Windows Server 2008 64-bit. Any idea how to get this huge file? Appreciated.
>>>> Rob
>>>
>>> You wouldn't happen to have proxies between you and your internet connection, would you?
>>> -Peachey
Re: [Wikitech-l] [WikiEN-l] Extracting main titles from enwiki-latest-all-titles-in-ns0.gz
Hi,

On Sat, Dec 12, 2009 at 4:35 PM, David Gerard dger...@gmail.com wrote:
> 2009/12/11 Behrang Saeedzadeh behran...@gmail.com:
>> Hi, I have downloaded enwiki-latest-all-titles-in-ns0.gz and I want to extract the main titles and store them in another file. For example, some titles have meta information (e.g. disambiguation etc.) and I want these to be removed. Can I remove all the text between parentheses from the titles to achieve this?
>
> You have to parse it by hand.

>> Also some titles start with the ! character, and some are enclosed between two or three of them, such as !Adiso_Amigos!. What is the purpose of ! in such cases?

It's part of the topic's name (in the case of http://en.wikipedia.org/wiki/%C2%A1Adios_Amigos!, the band's name). The inverted exclamation mark is part of the Spanish language.

>> Also, why are some titles enclosed between two double quotes, such as "400_Years_of_Telescope"?

Same thing: the quotes are part of the topic's name (e.g. http://en.wikipedia.org/wiki/%22Weird_Al%22_Yankovic).

Marco

PS: Next time, please copy-paste correctly, so people have a chance to see what you mean. Both your examples had to be corrected; the second one was missing a "the": http://en.wikipedia.org/wiki/400_Years_of_the_Telescope
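On Behrang's original question: removing *all* text between parentheses would mangle titles where parentheses are part of the name, just as the ! in ¡Adios Amigos! is. A safer sketch strips only a single trailing parenthetical qualifier, the usual disambiguation pattern in the ns0 titles dump (underscore-separated, as in the dump format):

```python
import re

def strip_disambiguator(title):
    """Remove a trailing parenthetical qualifier, e.g.
    'Mercury_(planet)' -> 'Mercury'. Only one final (...) group is
    removed, so parentheses inside the name are left alone. Note this
    also strips non-disambiguation suffixes that happen to match."""
    return re.sub(r"_\([^()]*\)$", "", title)
```

This is heuristic, not exact: the dump does not mark which parentheticals are disambiguators, so some legitimate title suffixes will be stripped too.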
Re: [Wikitech-l] Wikimedia IRC Group Contacts looking for a developer
Didn't seanw write some system for exactly that, or has it been broken for ages?

Marco

On Fri, Dec 4, 2009 at 9:25 PM, Rjd0060 rjd0060.w...@gmail.com wrote:
> Hi all - The IRC Group Contacts are in search of a developer to create and maintain a new cloak request system. Some basic details are posted at http://meta.wikimedia.org/wiki/IRC/Cloaks/System . If you are interested (or know anybody who might be) or have any questions, please get in touch with us. You can email irc-contacts-ow...@lists.wikimedia.org or poke one of us individually (http://meta.wikimedia.org/wiki/IRC/Group_Contacts#Names). Thanks in advance.
> --
> Ryan
> User:Rjd0060
Re: [Wikitech-l] Unicode equivalence
2009/12/1 Praveen Prakash me.prav...@gmail.com:
> The popular transliteration tool for Malayalam typing (*Varamozhi*) and the popular font (*Anjali OldLipi*) currently support Unicode 5.1 on Windows. Recently (two or three days ago) Microsoft announced their own tool for Malayalam typing, which also supports 5.1. Microsoft's default Karthika font for Malayalam now also supports 5.1. But IE6 does not support Unicode 5.1, even with supporting fonts.

Is dynamic reverse conversion on the client side using JavaScript possible? That way we could serve Unicode 5.1 to everything supporting it, and older/crappy browsers and OSes could still display it correctly. Sure, it adds a JS dependency, but I do think we can require JS for that.

Marco
Re: [Wikitech-l] [MediaWiki] Enhancement: LaTeX images quality (eliminate white background)
On Sun, Nov 29, 2009 at 5:46 PM, Aryeh Gregor simetrical+wikil...@gmail.com wrote:
> On Sun, Nov 29, 2009 at 11:45 AM, Aryeh Gregor simetrical+wikil...@gmail.com wrote:
>> But not one to make the images transparent by default, unless it somehow plays nicely with IE6.

Who cares about that browser?? It's been history for years! I really doubt that ANYONE still using that bitrotten thing of a browser cares about alpha transparency or Wikipedia in general, and I don't think more time, money or energy should be wasted on making anything look OK on it - as long as an IE6 user is still able to read the text, it's OK IMO.

Marco
Re: [Wikitech-l] [MediaWiki] Enhancement: LaTeX images quality (eliminate white background)
On Sun, Nov 29, 2009 at 6:28 PM, Aryeh Gregor simetrical+wikil...@gmail.com wrote:
> On Sun, Nov 29, 2009 at 12:19 PM, Marco Schuster ma...@harddisk.is-a-geek.org wrote:
>> Who cares about that browser??
>
> ~15% of our users use it. If our goal is to make a broadly usable website, we care about it.

So what? They'll see blocky images, but they can still make out what the content is.

>> as long as an IE6 user is still able to read the text, it's OK IMO.
>
> We need to weigh the interests of our users without regard to politics.

Sometimes this is necessary, though. Many people today still don't know that IE6 is dangerous. Wikipedia should warn those users and tell them how to upgrade.

Marco
Re: [Wikitech-l] [MediaWiki] Enhancement: LaTeX images quality (eliminate white background)
On Tue, Nov 17, 2009 at 4:11 PM, Aryeh Gregor simetrical+wikil...@gmail.com wrote:
> On Tue, Nov 17, 2009 at 3:53 AM, Alexander Shulgin alex.shul...@gmail.com wrote:
>> Today, while reading my morning load of news, I came across a Wikipedia article[1] with some embedded LaTeX formulae used within a table. The table header on that page has a background of a distinct color, which makes the formula images look ugly. See bug: https://bugzilla.wikimedia.org/show_bug.cgi?id=8
>
> As far as I know, the only thing actually blocking us from doing this was something like IE5 on Mac printing transparent images with black backgrounds. That's probably not relevant anymore. We're still stuck with the fact that IE6 doesn't support alpha channels, though - we could make the fully-transparent parts of the background transparent, but I don't see how we could avoid aliasing effects on sane browsers without making things look extremely ugly on IE6.

Aren't there various workarounds using JS and filters for IE6?

Marco
Re: [Wikitech-l] __TOC__ handling
On Sun, Sep 6, 2009 at 12:05 PM, Aryeh Gregor simetrical+wikil...@gmail.com wrote:
> This is correct. Although it's includes/parser/Parser.php (not all of us use Windows or Mac! :P).

AFAIR Mac OS X's partition manager lets you set up an HFS partition as case-sensitive :p

Marco
Re: [Wikitech-l] Wikipedia iPhone app official page?
On Fri, Sep 4, 2009 at 9:21 PM, Chad innocentkil...@gmail.com wrote: Wheee! TortoiseSVN indeed spoils us Windows users, as it's made version control so easy that...well...a Windows user can do it ;-) If Windows had a decent command line / shell (has its suckiness improved in Win7?), I bet TortoiseSVN would have far fewer downloads... it simply is the only way to make SVN usable on Windows. Marco
Re: [Wikitech-l] Version control systems (was: Wikipedia iPhone app official page?)
On Sat, Sep 5, 2009 at 1:34 AM, David Gerard dger...@gmail.com wrote: [subject changed] 2009/9/5 Marco Schuster ma...@harddisk.is-a-geek.org: On Fri, Sep 4, 2009 at 9:21 PM, Chad innocentkil...@gmail.com wrote: Wheee! TortoiseSVN indeed spoils us Windows users, as it's made version control so easy that...well...a Windows user can do it ;-) If Windows had a decent command line / shell (has its suckiness improved in Win7?), I bet TortoiseSVN would have far fewer downloads... it simply is the only way to make SVN usable on Windows. That or Cygwin. (git works well in Cygwin too. At my last workplace we made damn sure to put Cygwin on our few Windows servers with sshd running.) Cygwin made even command-line CVS usable. Yup, Cygwin is really cool... but you still need a proper GUI frontend. Windows' cmd simply sucks compared to e.g. xterm or konsole, both of which, of course, can be run in Cygwin. But getting X output via Cygwin is a damn nightmare of a task. Marco
Re: [Wikitech-l] Wiktionary API acceptable use policy
On Thu, Sep 3, 2009 at 10:26 PM, Aryeh Gregor simetrical+wikil...@gmail.com wrote: I don't know how formal or authoritative that is. You might want to ask someone like Brion. I think the answer in practice is that nobody's going to waste time blocking you if you don't cause noticeable load, but I don't know if there's an official statement anywhere. I vaguely recall that some sites might pay Wikimedia a fee to do commercial live mirroring, but I'm not sure on that. AFAIK one of these is spiegel.de, which gets some kind of live feed; they arranged it with WM DE. Marco
Re: [Wikitech-l] Wikipedia iPhone app official page?
On Sat, Aug 29, 2009 at 11:52 PM, Gregory Maxwell gmaxw...@gmail.com wrote: I laughed at this... Git has a number of negatives, but poor speed is not one of them, especially if you're used to working with SVN and a remote server. Maybe this is just a Windows issue? Git leaves a lot of work to the filesystem, and so to the disk. If the disk or the controller sucks or is simply old (not everyone has shiny new hardware), you're also damn slow. What should also not be underestimated is the disk-space demand of a Git repo - not everyone has the money to buy new, larger disks (or the possibility to upgrade, especially laptop users). Of course, the disk-space issue can be solved partially by splitting what is now one big SVN repo into multiple smaller ones - a standard Git pull should e.g. not include the custom WMF stuff or the sources of the helper tools/scripts and whatever else is hidden in the deep world of svnroot. Marco
Re: [Wikitech-l] Wiki not responding to patches
On Thu, Aug 13, 2009 at 9:10 PM, Taja Anand taja.w...@gmail.com wrote: 3) removed all the content of index.php !! [it still runs] This sounds bad; it shouldn't be possible. Are you running on Windows or Linux? Which HTTP server (Apache, lighttpd)? Any PHP cache installed / active? Any Squid proxy accidentally set? Cleared the browser cache? (Firefox had a similar problem for me ages ago.) Is localhost pointing to 127.0.0.1? (Verify this in the hosts file of your OS.) Oh, and did you verify you use the correct path in your browser? Marco
Re: [Wikitech-l] w...@home Extension
On Sun, Aug 2, 2009 at 2:32 AM, Platonides platoni...@gmail.com wrote: I'd actually be interested how YouTube and the other video hosters protect themselves against hacker threats - did they code totally new de/en-coders? That would be even more risky than using existing, tested (de|en)coders. Really? If they simply don't publish the source (and the binaries), then the only possible way for an attacker is fuzzing... and that can take a long time. Marco
Re: [Wikitech-l] w...@home Extension
On Sat, Aug 1, 2009 at 9:35 PM, Brian brian.min...@colorado.edu wrote: Never trust the client. Ever, ever, ever. If you have a working model that relies on a trusted client, you're fucked already. Basically, if you want to distribute binaries to reduce hackability ... it won't work and you might as well be distributing source. Security by obscurity just isn't. - d. Ok, nice rant. But nobody cares if you scramble their scientific data before sending it back to the server. They will notice the statistical blip and ban you. What about video files exploiting some new 0-day vulnerability in a video input format? The Wikimedia transcoding servers *must* be totally separated from the other WM servers to prevent 0wnage or a site-wide hack. As for users who run encoding chunks: they have to get a full installation of decoders and the like, which also has to be kept up to date (and if the clients run in different countries, there are patents and other legal issues to take care of!); also, the clients must be protected from getting infected chunks so they don't get 0wned by content Wikimedia gave to them (imagine the press headlines)... I'd actually be interested how YouTube and the other video hosters protect themselves against hacker threats - did they code totally new de/en-coders? Marco
Re: [Wikitech-l] Watchlistr.com, an outside site that asks for Wikimedia passwords
On Thu, Jul 23, 2009 at 8:50 PM, Happy-melon happy-me...@live.com wrote: Aryeh Gregor simetrical+wikil...@gmail.com wrote in message news:7c2a12e20907231051s638dd2f9v399ac2a79e185...@mail.gmail.com... On Thu, Jul 23, 2009 at 1:37 PM, Tim Starling tstarl...@wikimedia.org wrote: To help in the proving trustworthy, or else process, I have released the source code of Watchlistr - please take a look at it. You will see that I take the utmost care in securing user information. The wiki logins are encrypted with AES in our database. The key used to encrypt each user's login list is their site username, which is stored as a SHA1 hash in our database. If a cracker were to, somehow, gain access to the database, they would be left with a pile of garbage. They would only have to get the site usernames to decrypt the login info. They could get those the next time each user logs in, if they're not detected immediately. There's no way around this; if your program can log in as the users, so can an attacker who's able to subvert your program. Or, since the set of registered Wikimedia users is both vastly smaller than the superset of all possible usernames (remember it's restricted to users with a global login AFAICT), and readily accessible through a high-throughput API, a brute-force attack would be, if not trivial, certainly extremely feasible. As for the other solutions that were presented - I was really trying to create a cross-platform, cross-browser solution that would not hinge on one particular technology. JavaScript would be great, but what if someone doesn't have JS enabled? OAuth and a read-only API would be close-to-ideal, but they currently don't work with/don't exist on the Wikimedia servers. I am, however, open to other workable solutions that are presented - let me know.
I would suggest you apply for a toolserver account: https://wiki.toolserver.org/view/Account_approval_process Once you have a toolserver account, I'd be willing to work with you to arrange for some form of direct access to all wikis' watchlist tables (I'm a toolserver root). You then wouldn't need to possess any login info. This looks like a *much* more acceptable system. Although how would you authenticate without collecting proscribed data...? Let the user prove account ownership by a talk page edit. That's how Interiot did it in his old edit counter... (is that one still active?) Marco
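Happy-melon's brute-force point above is easy to demonstrate: if the leaked table stores SHA-1 hashes of site usernames, an attacker only has to hash a harvested list of real Wikimedia usernames (e.g. pulled via the public API) and look for matches. A minimal Python sketch of that attack; the function and variable names are illustrative, not from Watchlistr's actual code:

```python
import hashlib

def recover_usernames(stored_hashes, candidate_usernames):
    """Match SHA-1 username hashes from a leaked DB against a list of
    known usernames; each hit recovers the AES key for that user's logins."""
    stored = set(stored_hashes)
    recovered = {}
    for name in candidate_usernames:
        digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
        if digest in stored:
            recovered[digest] = name
    return recovered
```

Since the candidate list is a few million names at most and SHA-1 is fast, the whole username space can be covered in seconds, which is why hashing the key material adds essentially no protection here.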
Re: [Wikitech-l] For the Germans: PHP Release Party in Munich July 17th
Forwarding this to vereinde-l and wikide-l. Marco On Thu, Jul 9, 2009 at 3:52 PM, Brion Vibber brion.vib...@gmail.com wrote: Thought this might be of interest to some of our folks in and around Germany: http://phpugmunich.org/dokuwiki/php_release_party Wouldn't hurt to have a MediaWikian or two there to represent. :) It's at a biergarten, so you know I'd be there if I were local! ;) -- brion vibber (brion @ wikimedia.org)
Re: [Wikitech-l] Proposal: switch to HTML 5
On Wed, Jul 8, 2009 at 3:46 AM, Gregory Maxwell gmaxw...@gmail.com wrote: There is only a short period of time remaining where a singular browser recommendation can be done fairly and neutrally. Chrome and Opera will ship production versions and then there will be options. Choices are bad for usability. We should not recommend Chrome - as good as it is, it has serious privacy problems. Opera is not open source, so I think we'd best stay with Firefox, even if Chrome/Opera start to support the video tag. Marco
Re: [Wikitech-l] secure slower and slower
On Wed, Jul 8, 2009 at 10:04 PM, Aryeh Gregor simetrical+wikil...@gmail.com wrote: On Wed, Jul 8, 2009 at 10:45 AM, Gregory Maxwell gmaxw...@gmail.com wrote: Provided your changes didn't break the site, I'd take a bet that you could have a malware installer running for days before it was discovered. What, on enwiki? I'd bet ten minutes before it's noticed by someone using NoScript configured to prompt about cross-site loads or something. And if you're not a real technical expert who sees "aha, the site the JS comes from is surely NOT Wikipedia", it doesn't help anyone. People click away these warnings very often and don't bother to actually read them... this is something most people forget in security discussions. Marco
Re: [Wikitech-l] secure slower and slower
On Tue, Jul 7, 2009 at 4:03 AM, Aryeh Gregor simetrical+wikil...@gmail.com wrote: But really -- have there been *any* confirmed incidents of MITMing an Internet connection in, say, the past decade? Real malicious attacks in the wild, not proof-of-concepts or white-hat experimentation? I'd imagine so, but for all people emphasize SSL, I can't think of any specific case I've heard of, ever. It's not something normal people need to worry much about, least of all for Wikipedia. Public congresses, schools without protection against ARP spoofing (I got 0wned this way myself), maybe corporate networks without a proper network setup... they all allow sniffing or in-line traffic manipulation. These attacks are not that uncommon, and when you know the colleague you don't like is a WP admin, you simply have to wait for him to visit WP logged in, and you have either his password or his cookies. Marco
Re: [Wikitech-l] On templates and programming languages
On Fri, Jul 3, 2009 at 4:22 AM, Aryeh Gregor simetrical+wikil...@gmail.com wrote: On Thu, Jul 2, 2009 at 10:18 PM, Steve Bennett stevag...@gmail.com wrote: So: 1) The chosen language will support iteration over finite sets 2) Could it support general iteration, recursion etc? 3) If so, are there any good mechanisms for limiting the destructiveness of an infinite loop? You don't really need an infinite loop. DoS would work fine if you can have any loop. Even with just foreach: foreach(array(1,2) as $x1) foreach(array(1,2) as $x2) ... A few dozen of those in a row will give you a nice short bit of code that may as well run forever. You could add some kind of counter which gets incremented on each foreach/while/for iteration. If it reaches 200 (or whatever), execution is stopped. Marco
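The counter idea can be sketched in a few lines. This is an illustrative Python sketch of an interpreter-side guard, not actual MediaWiki code; the limit of 200 is just the example figure from the mail, and all names are hypothetical. The key property is that the budget is shared by *all* loops in one render, so nested loops with exponential iteration counts hit the limit just as fast as one long loop:

```python
class LoopLimitExceeded(Exception):
    """Raised when a page render exceeds its iteration budget."""
    pass

class LoopGuard:
    """Global iteration budget shared by every loop in one render."""
    def __init__(self, limit=200):
        self.limit = limit
        self.count = 0

    def tick(self):
        self.count += 1
        if self.count > self.limit:
            raise LoopLimitExceeded("loop budget exhausted")

def bounded(iterable, guard):
    """Wrap any iterable so the interpreter charges the budget per step."""
    for item in iterable:
        guard.tick()
        yield item
```

Usage: the interpreter creates one `LoopGuard` per page render and routes every foreach/while/for through `bounded`, so a nest of dozens of loops aborts cleanly instead of running "forever".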
Re: [Wikitech-l] On templates and programming languages
On Tue, Jun 30, 2009 at 10:25 PM, Brion Vibber br...@wikimedia.org wrote: Aryeh Gregor wrote: Our current tarballs are 10 MB; we could easily just chuck in Lua binaries for Linux x86-32 and Windows without even noticing the size increase, and allow users to enable it with one line in LocalSettings.php. Hmm... it might be interesting to experiment with something like this, if it can _really_ be compiled standalone. (Linux binary distribution is a hellhole of incompatible linked library versions!) Statically compiling it? How would this affect the binary size? (And: does static linking work across different libc versions?) BTW, what about Mac OS / FreeBSD hosts? Marco
Re: [Wikitech-l] On templates and programming languages
On Tue, Jun 30, 2009 at 10:45 PM, Aryeh Gregor simetrical+wikil...@gmail.com wrote: Alternatively, is the libc ABI stable enough that we could dynamically link libc and statically link everything else? The other libraries required are very small. I wouldn't count on this... at least we should provide a dynamically linked version for those wanting less storage/memory/whatever consumption. How do statically compiled programs for x86 platforms behave on x64, btw? And what about more exotic platforms like ARM (which can also be multi-endian, IXP4xx is an example) / SPARC (Toolserver!!!) or PowerPC? Are they actually supported by Lua? BTW, what about Mac OS / FreeBSD hosts? Are there any shared webhosts you know of that run Mac or BSD? At worst, they can fall into the same group as the no-exec() camp, able to use Wikipedia content but not 100%. The webhoster hosting our school's homepage does, for example... They host all schools in Munich, and I think they're a bit security-paranoid. We don't have any issues hosting a MediaWiki there, actually. (OK, we never imported WP content.) Marco
Re: [Wikitech-l] wikimedia, wikipedia and ipv6
On Mon, Jun 15, 2009 at 11:46 AM, Peter Gervai grin...@gmail.com wrote: On Fri, Jun 12, 2009 at 23:55, Platonides platoni...@gmail.com wrote: List archives are not searchable by Google. Is it on purpose? Why? Yep, so that accidentally published private data doesn't get indexed by Google. Same for deliberately published data or personal insults using users' real names; wikide-l has had this a few times. Dunno about such experiences on other lists. Marco
Re: [Wikitech-l] more bugzilla components
On Tue, May 26, 2009 at 7:55 PM, Michael Dale md...@wikimedia.org wrote: I also want to report some strangeness with Bugzilla. I sometimes get the below error when trying to log in (without restrict to IP checked) and I occasionally get time-outs when submitting bugs: Undef to trick_taint at Bugzilla/Util.pm line 67 Bugzilla::Util::trick_taint('undef') called at Bugzilla/Auth/Persist/Cookie.pm line 61 Bugzilla::Auth::Persist::Cookie::persist_login('Bugzilla::Auth::Persist::Cookie=ARRAY(0xXX)', 'Bugzilla::User=HASH(0xXX)') called at Bugzilla/Auth.pm line 147 Do you have IPv6 enabled? If yes, switch it off. That should help. Marco
Re: [Wikitech-l] Wiki editor wysiwyg
On Mon, May 11, 2009 at 5:05 PM, Daniel Kinzler dan...@brightbyte.de wrote: Basically, this is the reason why there is no really good WYSIWYG editor for MediaWiki. I don't want to discourage you, I just want to point out where the problems are. Some examples: the closing }} of a template may actually be contained in the definition of another template; same with tables. I have seen |} being replaced by something like {{TableEnd}}. Lots of fun there. Also there are people who use date/time to switch between template contents... it's really funny how people can use something like the MW markup - they literally end up where no man, eh, coder, has gone before. Marco
Re: [Wikitech-l] Wiki editor wysiwyg
On Mon, May 11, 2009 at 8:50 PM, Daniel Schwen li...@schwen.de wrote: The simple (albeit ugly) solution would be to add a parser version field to the revision table, drag the old parser along as 'legacy', make the new parser the default (and only) option for all new edits, and spit out a warning when you are editing a legacy revision for the first time. The warning would be made dependent on the cases that break with the new parser. Cases that break could be detected by comparing tidied HTML output from both parser versions. Sounds cool, but it'd require a formalization of MW markup first (something that should have been done long ago). What about correcting stuff from the old behavior to the new parser via bots/update scripts, even for old revisions? Marco
Re: [Wikitech-l] OpenID MediaWiki Extension v.0.8.4.1 - Identity Providers UI
On Sun, Apr 19, 2009 at 5:55 AM, Sergey Chernyshev sergey.chernys...@gmail.com wrote: And, I can't choose the case spelling of my nick (it's harddisk on OID); normally it should be HardDisk, but I think this is an OpenID-related problem - anyway, it'd be cool if you could make an additional field for the user to input the desired username. It's very possible that your provider returns a lowercase nickname and the MediaWiki user is automatically created. Indeed, this is the error... but (also in order to avoid name collisions) it'd be nice for people to choose their own username. Marco
Re: [Wikitech-l] OpenID MediaWiki Extension v.0.8.4.1 - Identity Providers UI
On Sat, Apr 18, 2009 at 9:00 AM, Sergey Chernyshev sergey.chernys...@gmail.com wrote: Hope you like it, but I'm still open to suggestions about improving the interface so you all finally install it on your wikis ;) There's a double escape on the confirmation page which redirects to the OID provider ("continue")... unfortunately it redirected to myopenid too fast to copy-and-paste the page. And, I can't choose the case spelling of my nick (it's harddisk on OID); normally it should be HardDisk, but I think this is an OpenID-related problem - anyway, it'd be cool if you could make an additional field for the user to input the desired username. Besides that, it's ENORMOUSLY cool. Marco
Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List
On Fri, Apr 17, 2009 at 11:55 PM, Jameson Scanlon jameson.scan...@googlemail.com wrote: Is it possible for anyone to indicate more comprehensive lists of torrents/trackers than these? Are there any plans for all the database download files to be available in this way (I imagine that there would also be some PDF manual which would go along with these to indicate offline viewing, and potentially more info than this). In theory, one can easily create a torrent with the Wikipedia servers as webseeds. The question is, how many torrent clients besides Azureus support these? Marco
Re: [Wikitech-l] Skin JS cleanup and jQuery
On Fri, Apr 17, 2009 at 1:38 PM, Aryeh Gregor simetrical+wikil...@gmail.com wrote: On Thu, Apr 16, 2009 at 6:35 PM, Marco Schuster ma...@harddisk.is-a-geek.org wrote: Are there any plans to use Google Gears for storage on clients? Okay, people have to enable it by hand, but it should speed up page loads very much (at least for those who use it). What, specifically, would be stored in Google Gears? Would HTML5's localStorage also be suitable? Isn't GG supposed to be an implementation of localStorage for browsers which don't support it yet (does any browser support localStorage *now*, btw?)? What could be stored is JS bits unlikely to change that often, e.g. if Wikipedia is ever going to make a WYSIWYG editor available (Wikia has it!!!) its JS files could be cached; same for those tiny little flag icons, the Wikipedia ball, the background of the page... maybe even some parts of the sitewide CSS. Actually, it could be expanded to store whole articles (then simply copy over or enhance http://code.google.com/intl/de-DE/apis/gears/articles/gearsmonkey.html - I'm gonna modify it for the German Wikipedia when I've got some time). Marco
Re: [Wikitech-l] Skin JS cleanup and jQuery
On Fri, Apr 17, 2009 at 11:42 PM, Brion Vibber br...@wikimedia.org wrote: * Background JavaScript worker threads Not super high-priority for our largely client-server site. Can be useful if you're doing some heavy work in JS, though, since you can have it run in the background without freezing the user interface. You mean... stuff like bots written in JavaScript, using the XML API? I could also imagine sending mails via Special:Emailuser in the background to reach multiple recipients - that's a PITA if you want to send mails to multiple users. * Geolocation services Also available in a standardized form in the upcoming Firefox 3.5. Could be useful for geographic-based search ('show me interesting articles on places near me') and 'social'-type things like letting people know about local meetups (like the experimental 'geonotice' that has been running sometimes on the watchlist page). That sounds kind of interesting, even if the accuracy on non-GPS-enabled devices isn't that high... can this in any way be joined with the OSM integration? Marco
Re: [Wikitech-l] Skin JS cleanup and jQuery
On Wed, Apr 15, 2009 at 11:05 PM, Brion Vibber br...@wikimedia.org wrote: Just a heads-up -- Michael Dale is working on some cleanup of how the various JavaScript bits are loaded by the skins, to centralize some of the currently horridly spread-out code and make it easier to integrate a centralized loader so we can serve more JS together in a single compressed request. Are there any plans to use Google Gears for storage on clients? Okay, people have to enable it by hand, but it should speed up page loads very much (at least for those who use it). Marco
Re: [Wikitech-l] Mailing lists problems
On Mon, Mar 30, 2009 at 7:32 PM, Anthony wikim...@inbox.org wrote: On Mon, Mar 30, 2009 at 12:57 PM, Brion Vibber br...@wikimedia.org wrote: If we could have it only send sorry mails on non-spam mails, that probably would be nice. Hopefully some day we can get there. :) Sending it only to SPF-verified addresses wouldn't be hard, would it? (I must admit I have no idea how widespread SPF use is.) Google verifies SPF (and also publishes SPF records itself, so mail from non-Google-Apps users can be verified), for example. And it's not that difficult to set up an SPF record if you run your own mail server. Marco
Re: [Wikitech-l] Dump processes seem to be dead
2009/2/25 John Doe phoenixoverr...@gmail.com: I'd recommend either 10m or 10% of the database, whichever is larger, for new dumps to screen out a majority of the deletions. What are your thoughts on this process, Brion (and the rest of the tech team)? Another idea: if $revision is deleted/oversighted/whatever-made-invisible, then find out its block ID for the dump, so that only this specific block needs to be re-created in the next dump run. Or, better: do not recreate the dump block, but only remove the offending revision(s) from it. Should save a lot of dump preparation time, IMO. Marco
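The "only remove the offending revision(s) from the block" idea could look roughly like this. A Python sketch over a much-simplified page/revision XML layout, not the real dump schema or tooling; the point is just that suppressing one revision is a cheap streaming rewrite, not a full regeneration:

```python
import xml.etree.ElementTree as ET

def strip_revisions(dump_xml, suppressed_ids):
    """Drop suppressed <revision> elements from a (simplified) dump
    fragment, leaving every other revision byte-identical in content.
    Assumes a <page><revision><id>...</id>...</revision></page> layout."""
    root = ET.fromstring(dump_xml)
    for page in root.iter("page"):
        # list() so we can remove children while iterating
        for rev in list(page.findall("revision")):
            rid = rev.findtext("id")
            if rid is not None and int(rid) in suppressed_ids:
                page.remove(rev)
    return ET.tostring(root, encoding="unicode")
```

A real implementation would stream and re-compress only the affected block, but the filtering step itself is this simple.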
Re: [Wikitech-l] Dump processes seem to be dead
2009/2/22 Robert Ullmann rlullm...@gmail.com: Want everyone to just dynamically crawl the live DB, with whatever screwy lousy inefficiency? Fine, just continue as you are, where that is all that can be relied upon! Even if you had the dumps, you have another problem: they're incredibly big and thus a bit difficult to parse. So, a small suggestion if the dumps are ever workin' again: split the history and current DB stuff by alphabet, please. Marco PS: Are there any measurements of what traffic is generated by people who download the dumps? Have there been any attempts to distribute them via BitTorrent?
Re: [Wikitech-l] Wikipedia giving 403 Forbidden
On Sat, Feb 21, 2009 at 9:00 PM, Leon Weber l...@leonweber.de wrote: On 22.02.2009 03:57:15, jida...@jidanni.org wrote: OK, can you please stop giving 403 Forbidden for HEAD on both pages that do and don't exist. It makes testing difficult. % HEAD -PS -H 'User-agent: leon' http://en.wikipedia.org/ HEAD http://en.wikipedia.org/ -- 301 Moved Permanently Where does that make testing too hard? You first have to find some dude who tells you "oh, 403 probably means a wrong user agent". IMO the HTML content of the 403 page should state WHY the request failed. Marco
Re: [Wikitech-l] empty parts in print preview and print
On Mon, Feb 16, 2009 at 3:49 PM, Uwe Baumbach u.baumb...@web.de wrote: The empty part (white area) seems to start right after a TOC and then end at unremarkable places within the text, before the rest of the article is shown/printed. I've seen similar issues on the Star Wars Wikia sometimes... maybe it's the same bug. Marco
Re: [Wikitech-l] Help by Extensions
On Tue, Feb 10, 2009 at 8:50 PM, Jan Luca j...@jans-seite.de wrote: Can't someone help me? The code outputs exactly what it should (http://toolserver.org/~jan/poll/dev/main.php?page=wiki_outputid=2 returns 2!). And as it's my code you took without a source notice (from http://code.harddisk.is-a-geek.org/filedetails.php?repname=hd_botpath=%2Fhttp_wrapper.inc.phprev=38sc=1 - don't think I don't notice this. Please add a source note to your code and I'll be fine with it)... if(isset($get_server)) { is totally wrong. Use if($get_server != '') { And where do you use main.php? It doesn't get included anywhere. Marco
Re: [Wikitech-l] war on Cite/{{cite}}
On Sat, Jan 31, 2009 at 2:03 PM, Domas Mituzas wrote: Hello, I understand the need for cite, that's why it is still there :) But... (...) What about converting these to <ref> tags? Unfortunately, {{cite}} is the only template I can profile/account for now; we don't have proper per-template profiling, but I wish to get that some day. Then we'd have more "war on ..." topics ;-D Stub templates, for example :D Generally, templates are a major part of our parsing, and that's over 50% of our current cluster CPU load. Wow. Can you compare that load with the load caused by solely using <ref> tags? Marco ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
[Wikitech-l] Crawling deWP
Hi all, I want to crawl around 800,000 flagged revisions from the German Wikipedia, in order to make a dump containing only flagged revisions. For this, I obviously need to spider Wikipedia. What are the limits (rate!) here, what UA should I use, and what caveats do I have to watch out for? Thanks, Marco PS: I already have a revision list, created with the Toolserver. I used the following query: select fp_stable,fp_page_id from flaggedpages where fp_reviewed=1;. Is it correct that this one gives me a list of all articles with flagged revs, fp_stable being the revid of the most current flagged rev for this article? ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
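The (fp_stable, fp_page_id) rows from the query above map straightforwardly onto fetch URLs. A minimal sketch, assuming the standard index.php?oldid= form for retrieving a specific revision; the host is de.wikipedia.org per the thread, and the row values here are made up for illustration:

```python
def revision_urls(rows):
    """Turn (fp_stable, fp_page_id) rows from the flaggedpages query
    into URLs for the latest flagged revision of each article.

    fp_stable is the revision id of the most current flagged revision,
    so fetching it by oldid yields exactly the flagged text.
    """
    base = "https://de.wikipedia.org/w/index.php"
    return [f"{base}?oldid={fp_stable}" for fp_stable, _page_id in rows]

# Illustrative rows only -- real ids come from the Toolserver query.
urls = revision_urls([(55231175, 12345), (55198001, 67890)])
```

Each resulting URL fetches one flagged revision, which matches the "single files are easier to process than one huge XML file" approach taken later in the thread.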
Re: [Wikitech-l] Crawling deWP
On Wed, Jan 28, 2009 at 12:49 AM, Rolf Lampa wrote: Marco Schuster wrote: I want to crawl around 800,000 flagged revisions from the German Wikipedia, in order to make a dump containing only flagged revisions. [...] flaggedpages where fp_reviewed=1;. Is it correct this one gives me a list of all articles with flagged revs, Don't the XML dumps contain the flag for flagged revs? The XML dumps are no use to me: way too much overhead (especially as they are old, and I want to use single files; it's easier to process these than one huge XML file). And they don't contain flagged-revision flags :( Marco ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] [Toolserver-l] Crawling deWP
On Wed, Jan 28, 2009 at 1:13 AM, Daniel Kinzler wrote: Marco Schuster wrote: Fetch them from the toolserver (there's a tool by Duesentrieb for that). It will catch almost all of them from the toolserver cluster, and make a request to Wikipedia only if needed. I highly doubt this is legal use of the toolserver, and I pretty much guess that 800k revisions to fetch would be a huge resource load. Thanks, Marco PS: CC-ing the toolserver list. It's a legal use; the only problem is that the tool I wrote for it is quite slow. You shouldn't hit it at full speed. So it might actually be better to query the main server cluster; they can distribute the load more nicely. What is the best speed, actually? 2 requests per second? Or can I go up to 4? One day I'll rewrite WikiProxy and everything will be better :) :) But by then, I do hope we have revision flags in the dumps, because that would be The Right Thing to use. Still, using the dumps would require me to get the full history dump, because I only want flagged revisions, not current revisions without the flag. Marco ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
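Whatever per-second budget gets agreed on, it is easy to enforce client-side. A minimal sketch of a fixed-rate limiter; the 2 requests/second figure mirrors the question above, not an official limit, and the injectable clock/sleep parameters are just there to make the class testable:

```python
import time

class RateLimiter:
    """Enforce a minimum interval between requests (e.g. 2 req/s -> 0.5 s)."""

    def __init__(self, per_second: float, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / per_second
        self.clock = clock
        self.sleep = sleep
        self._last = None  # timestamp of the previous wait(), if any

    def wait(self):
        """Block just long enough to honor the configured rate, then return."""
        now = self.clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)
        self._last = self.clock()

limiter = RateLimiter(2)  # call limiter.wait() before each fetch
```

Calling `limiter.wait()` before every HTTP request caps the crawl at the chosen rate regardless of how fast individual responses come back.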
Re: [Wikitech-l] 403 with content to Python?
On Sun, Jan 25, 2009 at 2:50 PM, Platonides wrote: Marco Schuster wrote: I used HDBot API x.y (PHP $phpversion) as UA. No idea what triggered the filters. Perhaps the mention of PHP, although I'm not being blocked when using that UA, so I can't test. Yeah, I'm also not blocked anymore... nice to hear that. But again, it'd be nice to see in an error message what part of the UA triggered the filter and why that part is blocked. Brion, do you have a list of blocked UAs (or UA parts)? Marco ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] 403 with content to Python?
On Fri, Jan 23, 2009 at 7:03 PM, Brion Vibber br...@wikimedia.org wrote: On 1/23/09 2:36 AM, Andre Engels wrote: Two questions: 1. Why is this User-Agent getting this response? If I remember correctly, this was installed in the early days of the pywikipediabot, when Brion wanted to block it because it had a programming error causing it to fetch each page twice (sometimes even more?). If that is the actual reason, I see no reason why it should still be active years afterward... This has nothing to do with pywikipediabot. We too frequently encountered poorly-written bots and site-scrapers which slammed the servers too hard and caused problems. Blocking the default UAs of common libraries cut these incidents down dramatically, and helps encourage thoughtful bot writers to put specific information into their user-agent string, making it possible to track them down more easily if they are problematic. Is there any list of those UAs or UA parts available? I had this problem some time ago with my bot, which used a custom UA string and got access denied, so I changed its UA to Firefox's, as I didn't have the nerve to track down WHICH part of the UA triggered the filter. Marco ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] 403 with content to Python?
On Sun, Jan 25, 2009 at 1:11 AM, Aryeh Gregor wrote: On Sat, Jan 24, 2009 at 4:05 AM, Marco Schuster wrote: Is there any list of those UAs or UA parts available? I had this problem some time ago with my bot, which used a custom UA string and got access denied, so I changed its UA to Firefox's as I didn't have the nerve to track down WHICH part of the UA triggered the filter. Just change it to something like YourBotName, run by Marco Schuster. That will certainly avoid any filters, and provide the desired info. I used HDBot API x.y (PHP $phpversion) as UA. No idea what triggered the filters. I don't know why the error page doesn't give this info already. The current message only confuses people and -- if they can figure out it's UA-based -- tempts them to mimic browser UA strings. Anyone skilled enough to write a bot is skilled enough to find that out, IMO. Anyway, the error message should also state what part of the UA is forbidden. Marco ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Drafts extension in testing
On Tue, Jan 20, 2009 at 11:35 PM, Aryeh Gregor simetrical+wikil...@gmail.com wrote: On Tue, Jan 20, 2009 at 4:40 PM, Platonides platoni...@gmail.com wrote: IMHO we still need some kind of saving into firefox storage, for cases like a read-only db. Instead of 'You can't save, the site is read-only'-'Save-draft'-'No, you can't, the db is read-only', 'You can't save, the site is read-only'-'Save-draft'-'The site is read-only, the draft has been saved into your browser'. This can be done in cutting-edge browsers using HTML5's localStorage and sessionStorage. What about Google Gears? Yeah, it's Google, but GG supports a variety of browsers and we wouldn't have to wait for M$ to support it properly in IE 20. Marco ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Non-latin characters broken in donation comments
On Mon, Dec 1, 2008 at 7:02 PM, Brion Vibber [EMAIL PROTECTED] wrote: Tei wrote: Is it me, or does the form need charset=*UTF-8* added to it? Considering that the part that's broken isn't even *on* our form, I'm pretty sure it's not something on our form. :) The name gets put in at PayPal's forms, and is passed on to us with the payment completion data. Can you reverse PayPal's buggy encoding (e.g. with iconv)? Marco ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
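If the breakage is the classic mojibake pattern (UTF-8 bytes mis-decoded as Latin-1, so "ä" arrives as "Ã¤"), it can indeed be reversed after the fact. A minimal sketch, assuming that specific failure mode; other double-encoding variants would need different round trips:

```python
def fix_mojibake(s: str) -> str:
    """Reverse UTF-8 text that was wrongly decoded as Latin-1.

    Re-encoding as Latin-1 recovers the original UTF-8 byte sequence,
    which is then decoded correctly.
    """
    return s.encode("latin-1").decode("utf-8")

print(fix_mojibake("MÃ¼ller"))  # -> "Müller"
```

This only works if every character of the mojibake survived the trip (no lossy substitutions along the way); if any byte was replaced with "?" the original is unrecoverable.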
Re: [Wikitech-l] [Commons-l] Support for Chemical Markup Language
On Sun, Nov 30, 2008 at 1:11 AM, Brian Salter-Duke [EMAIL PROTECTED] wrote: On Sun, 30 Nov 2008 00:50:08 +0100, Platonides [EMAIL PROTECTED] wrote: See https://bugzilla.wikimedia.org/show_bug.cgi?id=16491 Letting users embed JavaScript is not acceptable if it is to run on Wikipedia. Other parameters, like urlContents or signed, wouldn't be used, but at least they can be disabled. I am afraid this is all beyond my expertise. Are you saying that there is no way Jmol can ever be used on WMF projects? There is, as soon as the JavaScript embedding capability is disabled and the extension gets a proper review (TM). Marco ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Upload filesize limit bumped
On Sat, Nov 22, 2008 at 1:43 PM, Daniel Kinzler [EMAIL PROTECTED] wrote: Anyway, HTTP doesn't support feedback during upload (or any feedback, really), and HTML does not offer a way to do multi-file uploads (which would also be quite handy). Any solutions I have seen for that so far are based either on a Java applet or on Flash. RS.com's upload indicator apparently works via an iframe:

<form name="ul" method="post" action="http://rs426l3.rapidshare.com/cgi-bin/upload.cgi?rsuploadid=149063132559697458" enctype="multipart/form-data" onsubmit="return zeigeProcess();">
  <div id="progbar" style="display:none;">
    <iframe src="http://rs426l3.rapidshare.com/progress.html?uploadid=149063132559697458" name="pframe" width="100%" height="120" frameborder="0" marginwidth="0" marginheight="0" scrolling="no"></iframe>
  </div>
  <div id="dateiwahl">
    <input type="file" size="65" id="dateiname" name="filecontent" onchange="zeigeUploadBtn();" />
    <input type="image" id="btnupload" name="u" src="/img2/upload_file.jpg" style="visibility:hidden;" />
  </div>
</form>

The use of a unique upload ID ensures at their end that the progress-bar iframe always gets the right data. It refreshes using AJAX: http://pastebin.com/f56b8c8f4 I'll take a look at whether this might be applicable to MW. Marco ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: [Wikitech-l] Language committee and language setup
On Sun, Nov 16, 2008 at 1:58 AM, Brion Vibber [EMAIL PROTECTED] wrote: Gerard Meijssen wrote: Hoi, While you are at it, please have a look at bug 15013... It has been waiting, as of today, for 121 days.. 121 days after registering in Bugzilla. If there are any issues, please let them be known. https://bugzilla.wikimedia.org/show_bug.cgi?id=15013 1) The bug was improperly labeled and could not be found when searching specifically for the request. This likely didn't help it to receive any attention! :) A bit off-topic, but are there actually any howtos on correctly labeling your bugs (e.g. for config changes, wiki creation, etc.) so that they do not get overlooked? Marco ___ Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l