[Wikitech-l] Unbreaking statistics
Hello,

I see I've created quite a stir, but so far nothing really useful has popped up. :-( There was this from Neil, though:

> Yes, modifying the http://stats.grok.se/ systems looks like the way to go.

To me it doesn't really look that way, since it seems to use an extremely dumbed-down input, which contains only page views and [unreliable] byte counters. Most probably it would require large rewrites, and a magical new data source.

> What do people actually want to see from the traffic data? Do they want
> referrers, anonymized user trails, or what?

Are you old enough to remember stats.wikipedia.org? As far as I remember it originally ran webalizer, then something else, then nothing. If you look at a webalizer report you'll see what's in it. We are using (or rather we used, until our nice fellow editors broke it) awstats, which basically provides the same with more caching. The most used and useful stats are page views (daily and hourly breakdowns are pretty useful too), referrers, visitor domain and provider stats, OS and browser stats, screen resolution stats, bot activity stats, and visit duration and depth, among probably others.

At a brief glance I could replicate the grok.se stats easily, since they seem to be built from http://dammit.lt/wikistats/, but that data is completely useless for anything beyond page hit counts.

Is there a possibility to write code which processes the raw squid data? Who do I have to bribe? :-/

--
byte-byte,
grin
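PS: for reference, the hourly files under http://dammit.lt/wikistats/ are plain text with one line per page, in the form "project page_title view_count bytes_transferred" (the byte counter being the unreliable part). A minimal sketch of summing per-page views from one such file could look like the Python below; the file name and the exact field layout are my reading of the published dumps, nothing official.

# Minimal sketch: aggregate page view counts from one hourly
# wikistats dump (assumed line format: "project title views bytes").
import gzip
from collections import Counter

def page_views(path, project="en"):
    views = Counter()
    # the published files are gzip-compressed plain text
    with gzip.open(path, mode="rt", encoding="utf-8", errors="replace") as f:
        for line in f:
            parts = line.split()
            if len(parts) != 4:
                continue  # skip malformed lines
            proj, title, count, _bytes = parts
            if proj == project:
                views[title] += int(count)
    return views

if __name__ == "__main__":
    # file name is hypothetical, just to show the idea
    for title, count in page_views("pagecounts-20090612-130000.gz").most_common(10):
        print(count, title)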
[Wikitech-l] wikimedia, wikipedia and ipv6
Ehlo,

I see that this topic has popped up from time to time since 2004, and that most of the misc servers have already been IPv6 enabled. I checked whether Google has any info on it, and found a few (really few) mails, from the original 2004 test to a comment from 2008 that squid and MediaWiki are the problem, apart from some smaller issues. (As a side note, Google doesn't seem to find anything on site:lists.wikimedia.org about "ipv6", which is interesting.)

Now, squid fully supports IPv6 as of 3.1, so I guess that's checked off. (I didn't try it myself, but others seem to have.) As for MediaWiki, http://www.mediawiki.org/wiki/IPv6_support doesn't mention any outstanding problem and the linked bug is closed, so as far as I can observe (without actually testing it) it looks okay. The database structure may require some tuning as far as I see. Right? Apache has handled it since forever, and PHP does too, I guess.

Are there any further non-v6-compatible components involved in running a Wikipedia? If not, is there any outstanding problem which would make it impossible to fire up a test interface on IPv6? I'd say use a separate host, like en.ipv6.wikipedia.org, and don't worry about cache efficiency, because I doubt the IPv6-level traffic would really measure up to the IPv4 traffic. At least it could be properly measured, and the decision on how to go on could be based on facts. Maybe there's a test host up already, but I wasn't able to find it, so I guess nobody else can either. ;-)

Are there any further problems in this topic that require solutions, or did it just not occur to anyone lately?

--
byte-byte,
grin
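PS: on the database tuning, the usual trick is to widen the IP columns and store addresses in a normalized fixed-width form, so both families stay sortable and range scans keep working. A rough Python sketch of the idea follows; this is only an illustration, not how MediaWiki actually stores IPs.

# Sketch: normalize an IP (v4 or v6) into a fixed-width hex string
# so a VARCHAR column sorts correctly and range scans work for both
# address families. Illustrative only.
import socket

def ip_to_sortable_hex(ip):
    try:
        # IPv6, including the compressed :: notation
        packed = socket.inet_pton(socket.AF_INET6, ip)
        return "v6-" + packed.hex().upper()
    except OSError:
        packed = socket.inet_pton(socket.AF_INET, ip)
        return "v4-" + packed.hex().upper()

print(ip_to_sortable_hex("2001:db8::1"))   # v6-20010DB8...0001 (32 hex digits)
print(ip_to_sortable_hex("208.80.152.2"))  # v4-D0509802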
Re: [Wikitech-l] wikimedia, wikipedia and ipv6
On Fri, Jun 12, 2009 at 13:55, Aryeh Gregor wrote:
> This might be useful, although most of the info is probably outdated:
> http://wikitech.wikimedia.org/view/Special:Search?search=ipv6&go=Go

Yep, including the dead labs link. But it mentions LVS [ipvs]; I don't know whether we use it or not, but it supports IPv6 as well. ;-)

g
Re: [Wikitech-l] wikimedia, wikipedia and ipv6
On Fri, Jun 12, 2009 at 23:55, Platonides wrote:
> List archives are not searchable by google. Is it on purpose? Why?
> I don't think so.

You quoted back three questions and answered one of them; I don't know which. :-)

> MediaWiki code may need some assumptions about it, though.
> I think this comment in the config summarises it:
> "no IPv6 support - 20051207"

I don't think so. If you look more carefully you'll see references to code cleanups regarding IPv6, actually lots of them, and generally it seems that people have been quietly working on it, so maybe it just needs checking again, since 2005 wasn't yesterday.

Any core developers with opinions? Tim? Brion?

--
byte-byte,
grin
Re: [Wikitech-l] [WikiEN-l] MediaWiki is getting a new programming language
On Wed, Jul 8, 2009 at 10:16, Gerard Meijssen wrote:
> The argument that a language should be readable and easy to learn is REALLY
> relevant and powerful. A language that is only good for geeks is detrimental
> to the use of MediaWiki. Our current templates and template syntax are
> horrible. Wikipedia is as a consequence hardly editable by everyone.

Mortals _use_ the templates, they don't _create_ them. Geeks create the templates for the mortals. The current syntax is indeed horrible, but I'd say complete readability is not the main issue; security, speed and flexibility should be, along with ease of implementation.

Peter
Re: [Wikitech-l] [WikiEN-l] MediaWiki is getting a new programming language
On Wed, Jul 8, 2009 at 12:23, Neil Harris wrote:
> {{#switch:
> {{#iferror: {{#expr: {{{1}}} + {{{2}}} }} | error | correct }}
> | error = that's an error
> | correct = {{{1}}} + {{{2}}} = {{#expr: {{{1}}} + {{{2}}} }}
> }}

{{#perl:
if ( error( eval( "$arg1+$arg2" ) ) ) {
    return "that's an error";
} else {
    return "$arg1+$arg2=" . ($arg1 + $arg2);
}
}}

:-)

But really, any of the languages cited above should work; the easier it is to parse (and to write an interpreter for) the better. I guess a Lua-like language should be easy and readable.

--
byte-byte,
grin
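PS: for comparison, here is roughly what that template reads like as an ordinary function in a general-purpose scripting language (Python standing in for whichever embedded language gets picked; the entry point and the way arguments arrive are made up):

# Sketch: the same "add two numbers, complain on bad input" template,
# written as a plain function in a scripting language.
def add_template(arg1, arg2):
    try:
        total = float(arg1) + float(arg2)
    except (TypeError, ValueError):
        return "that's an error"
    return "%s + %s = %s" % (arg1, arg2, total)

print(add_template("1", "2"))    # 1 + 2 = 3.0
print(add_template("1", "two"))  # that's an error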
Re: [Wikitech-l] [WikiEN-l] MediaWiki is getting a new programming language
On Fri, Jul 17, 2009 at 12:06, Gerard Meijssen wrote:
> Well this strength is not that great when people like myself who has commit
> right on SVN does not want to touch templates with a barge pole if I can
> help it. Wikipedia is supposed to be this thing everybody can edit.

I think you're misprioritising the whole thing. Consider it a feature, not base functionality. Most installations don't use even 10% of the features available, for lack of knowledge, time, bravery or whatever else. Writing templates with code is a rare art by my observation; most of the larger MW installations have never used it in the first place.

Would you remove TeX (math) input because the language is complex? And since this would be an extension, I guess, would that imply you want to forbid(?) creating complex extensions? That's unrealistic. Geeks need this functionality; ungeeks may or may not care about it, and it doesn't really matter. If it's easier to understand than not, that's a plus.

By the way, I wouldn't touch PHP with a ten-foot pole and rubber gloves, but I'm fine with the current template syntax. We're individuals with our own preferences. ;-)

grin
Re: [Wikitech-l] Wikitext vs. WYSIWYG (was: Proposal for editing template calls within pages)
On Thu, Sep 24, 2009 at 14:36, David Gerard wrote:
> However, impenetrable wikitext is one of *the* greatest barriers to
> new users on Wikimedia projects. And this impenetrability is not, in
> any way whatsoever or by any twists of logic, a feature.

Adding a GUI layer on top of wikitext is always okay, as long as it's possible to get rid of it, since the majority of edits don't come from new users, and trading away the power users' flexibility to get more newbies doesn't sound like a good deal to me.

At least, all the GUIs I've seen were slow and hard to use, and produced unwanted (side) effects whenever anything even barely complex was entered. And this isn't just Wikipedia's problem: Google Docs, which I guess is one of the most advanced web-based GUI systems, has plenty of usability problems which can only be fixed by messing with the Source. And many core people want to mess with the source.

So, adding a newbie layer is okay as long as you don't mess up the work of the non-newbies.

g
Re: [Wikitech-l] Bugzilla Vs other trackers.
On Thu, Jan 7, 2010 at 08:17, Robert Rohde wrote:
> On Wed, Jan 6, 2010 at 9:14 PM, Steve Bennett wrote:
>> Anyone tried FogBugz, Joel Spolsky's baby? I'm so curious... although it's
>> commercial software, who knows, you might get a discount or even a freebie.
>
> The historical position has been that absolutely nothing goes into the
> WMF software pool unless it is open source.

I deeply agree with that.

> However, my recollection is based on discussions years ago. On
> searching, I couldn't find any policy forbidding closed source
> software (is there one?). So, it is possible that closed source might
> be looked on as a more acceptable possibility for some functions now
> (though I wouldn't bet on it).

That wouldn't be nice. First, it's an attitude thing: we want to (and have to) promote open stuff. Second, it isn't nice to show the users something they cannot use themselves. It's kind of against our basic principle of "you can do what we do, you're free to do it, we just do it better". :-)

--
byte-byte,
grin
Re: [Wikitech-l] "Google phases out support for IE6"
What about creating a "monobo'oldies" theme for them? I mean, move the current stuff over to the oldies theme, and drop support for the ancient browsers from monobook itself.

--
byte-byte,
grin
Re: [Wikitech-l] "Google phases out support for IE6"
On Tue, Feb 2, 2010 at 00:44, Gregory Maxwell wrote:
> People are really bad at complaining, especially web users. We've had
> prolonged obvious glitches which must have effected hundreds of
> thousands of people and maybe we get a couple of reports.

For Average Joe and Jane it usually isn't obvious what to do when something's broken. I've watched people use really broken websites (fallen-apart layout, broken menus) and never "report" anything, just complain to their colleagues. I second that people are bad at reporting problems, and I must add that computer people are usually bad at receiving the complaints and fixing them. ;-) I guess if you have a problem and you know someone who can do something about it, then it'll get fixed; otherwise it _may_ get fixed, one day or another. [I've experienced this latter problem with email config bugs and [not] having them fixed.]

Nevertheless, I wouldn't miss any IE features, but then again I'm an anti-M$ fascist by genetics. ;-)

g
Re: [Wikitech-l] Version control
On Sun, Feb 7, 2010 at 00:38, Ævar Arnfjörð Bjarmason wrote:
> It's interesting that the #1 con against Git in that document is "Lots
> of annoying Git/Linux fanboys".

No, it's the "screaming 'hell yeah!' while having no idea what they're talking about" part. :-)

g
Re: [Wikitech-l] What is wrong with Wikia's WYSIWYG?
On Mon, May 2, 2011 at 10:02, Tim Starling wrote:
> don't think using wikitext is the best way to make things easier for
> new users.

It's always been a dilemma for me how large a share of computer-illiterate users we should wish for (or can actually tolerate). I don't feel that web-something-point-something (whatever version we call it now) is fast, reliable or useful enough to serve as the main way to input encyclopedia text. These tools are usually very slow and quite unreliable (including the Google Docs stuff, which I believe is the most advanced tech out there in this area).

And... people habitually get completely lost in DTP software (be that [open/whatever]office or something else); they can't comprehend formatting, fonts, text annotation and other advanced features. I don't see that WYSIWYG would have made them any more able to use those techniques. There are some guys who actually learned enough markup to completely screw up Wikibooks (putting flashing 80pt fonts in scrolling frames, with all kinds of, otherwise not horrible on purpose, features of CSS), and I just fear what they could do with WYSIWYG. Such texts are sometimes easier to reformat completely (reset ALL formatting to default and start over).

The Foundation's purpose is to make it easier for everyone and to invite and involve everyone, I know. I just have my doubts and worries, which I have now shared.

Peter
Re: [Wikitech-l] XKCD: Extended Mind
On Wed, May 25, 2011 at 09:24, Tim Starling wrote:
> On 25/05/11 17:05, Domas Mituzas wrote:
>> On May 25, 2011, at 9:35 AM, K. Peachey wrote:
>>
>>> http://xkcd.org/903/ -Peachey
>>
>> that error is fake! 10.0.0.242 is internal services DNS server and
>> is not used to serve en.wikipedia.org - dberror log does not have a
>> single instance of it! 10.0.6.42 on the other hand
>
> I would have thought the fact that it was hand drawn would have given
> it away.

But in this particular case, being hand drawn doesn't mean the facts are allowed to slip: these drawings are usually extremely precise. (You can see which pulldowns he usually keeps open. :-)) I second Domas's checking, because there may be a super secret conspiracy and the drawing may be correct after all. ;-)

--
byte-byte,
grin
Re: [Wikitech-l] XKCD: Extended Mind
On Wed, May 25, 2011 at 17:16, Domas Mituzas wrote:

Thanks for clearing that up. Nice work.

g
Re: [Wikitech-l] XKCD: Extended Mind
On Thu, May 26, 2011 at 17:38, Leo Koppelkamm wrote:
> http://ryanelmquist.com/cgi-bin/xkcdwiki

A nice way to see that the first sentences eventually lead to some general quantity or property, which links to [[Property (philosophy)]], which links to Philosophy itself. So far I haven't seen a chain that didn't go through 'property'.

g
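PS: for anyone who wants to reproduce the game without the CGI above, here is a crude sketch of the first-link walk against the MediaWiki API. It pulls the raw wikitext and takes the first [[link]] with a regex, skipping only templates and namespaced links, none of the finer rules (parentheses, italics), so it's an approximation of the idea rather than what the xkcdwiki tool actually does.

# Crude sketch: follow the first wikilink of each article until we
# reach Philosophy (or loop, or give up).
import re
import json
from urllib.request import Request, urlopen
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"
UA = {"User-Agent": "first-link-sketch/0.1"}  # illustrative UA

def first_link(title):
    params = urlencode({"action": "parse", "page": title,
                        "prop": "wikitext", "format": "json"})
    with urlopen(Request(API + "?" + params, headers=UA)) as resp:
        text = json.load(resp)["parse"]["wikitext"]["*"]
    for _ in range(5):  # peel off (possibly nested) templates
        text = re.sub(r"\{\{[^{}]*\}\}", "", text)
    for m in re.finditer(r"\[\[([^\]|#]+)", text):
        target = m.group(1).strip()
        if ":" not in target:  # skip File:, Category:, interwiki...
            return target
    return None

def walk(title, limit=50):
    seen = []
    while title and title not in seen and len(seen) < limit:
        seen.append(title)
        if title.lower() == "philosophy":
            break
        title = first_link(title)
    return seen

print(" -> ".join(walk("Xkcd")))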
Re: [Wikitech-l] Errors in Wikimedia Commons old files
On Thu, Mar 1, 2012 at 00:56, emijrp wrote:
> I'm trying to download Wikimedia Commons, but I have found some errors. For

There are still occasional broken files around; it would be nice to run a script against the file database... but it can usually be fixed by downloading the file (sometimes from the history) and uploading it again.

g
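PS: a sketch of what such a check could look like: the API reports a SHA-1 for each file (prop=imageinfo, iiprop=sha1), so a script can compare it against a hash of the local bytes and flag mismatches for re-download or re-upload. File names here are just placeholders.

# Sketch: verify a local copy of a Commons file against the SHA-1
# the API reports; a mismatch marks a candidate for re-upload.
import hashlib
import json
from urllib.request import Request, urlopen
from urllib.parse import urlencode

API = "https://commons.wikimedia.org/w/api.php"
UA = {"User-Agent": "file-check-sketch/0.1"}  # illustrative UA

def api_sha1(filename):
    params = urlencode({"action": "query", "titles": "File:" + filename,
                        "prop": "imageinfo", "iiprop": "sha1",
                        "format": "json"})
    with urlopen(Request(API + "?" + params, headers=UA)) as resp:
        pages = json.load(resp)["query"]["pages"]
    return next(iter(pages.values()))["imageinfo"][0]["sha1"]

def check(filename, local_path):
    h = hashlib.sha1()
    with open(local_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    ok = h.hexdigest() == api_sha1(filename)
    print(("OK  " if ok else "BAD ") + filename)
    return ok

check("Example.jpg", "Example.jpg")  # placeholder name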
[Wikitech-l] forking media files
Let me retitle one of the topics nobody seems to touch.

On Fri, Aug 12, 2011 at 13:44, Brion Vibber wrote:
> * media files -- these are freely copiable but I'm not sure the state of
> easily obtaing them in bulk. As the data set moved into TB it became
> impractical to just build .tar dumps. There are batch downloader tools
> available, and the metadata's all in dumps and api.

Right now it is basically locked up: there is no way to bulk copy the media files, not even to simply make a backup of one Wikipedia, or of Commons. I've tried, I've asked, and the answer was basically to contact a dev and arrange it. That obviously could be done (I know many of the folks), but that isn't the point.

Some explanations were offered, mostly that the media and its metadata are quite detached, which makes it hard to enforce licensing quirks like attribution, special licenses and such. I can see this is a relevant point, since the text corpus is uniformly licensed under CC/GFDL while the media files are non-homogeneous at best (like Commons, where at least everything is free in some way) and complete chaos at worst (the individual Wikipedias, where there may be anything from leftover fair use to material copyrighted by various entities to images to be deleted "soon").

Still, I don't believe making it close to impossible to bulk copy the data is a good method. I'm not sure which technical means would be best, as there are many competing ones. We could, for example, open up an API which would serve a media file together with its metadata, possibly supporting mass operations; still, that's pretty inefficient. Or we could support zsync, rsync and the like (and I again recommend examining zsync's several interesting abilities to offload the work to the client), but then there ought to be some pointer to the image metadata, at least a one-liner file for every image linking to its license page. Or we could tie bulk access to established editor accounts, so we'd have at least a bit of assurance that they know what they're doing.

--
byte-byte,
grin
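PS: to make the first option concrete, here is a rough sketch of a "file plus metadata" bulk fetch built on the existing API (generator=allimages plus prop=imageinfo). It writes a JSON sidecar next to every file with the uploader, the SHA-1 and the description page URL, which is where the licensing lives; the whole output layout is my own invention.

# Rough sketch: bulk-fetch files together with their metadata,
# writing a .json sidecar next to every downloaded file.
import json
from urllib.request import Request, urlopen
from urllib.parse import urlencode

API = "https://commons.wikimedia.org/w/api.php"
UA = {"User-Agent": "bulk-fetch-sketch/0.1"}  # illustrative UA

def fetch(url):
    with urlopen(Request(url, headers=UA)) as resp:
        return resp.read()

def bulk_fetch(batch=10):
    params = urlencode({"action": "query", "generator": "allimages",
                        "gailimit": batch, "prop": "imageinfo",
                        "iiprop": "url|sha1|user", "format": "json"})
    pages = json.loads(fetch(API + "?" + params))["query"]["pages"]
    for page in pages.values():
        info = page["imageinfo"][0]
        name = page["title"].replace("File:", "").replace("/", "_")
        with open(name, "wb") as f:
            f.write(fetch(info["url"]))
        with open(name + ".json", "w") as f:
            # sidecar: uploader, hash, and the description page,
            # which is where the license information lives
            json.dump({"uploader": info["user"], "sha1": info["sha1"],
                       "description": info["descriptionurl"]}, f)

bulk_fetch()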
Re: [Wikitech-l] forking media files
On Mon, Aug 15, 2011 at 18:40, Russell N. Nelson - rnnelson wrote:
> The problem is that 1) the files are bulky,

That's expected. :-)

> 2) there are many of them, 3) they are in constant flux,

That is not really a problem: precisely because there are so many of them, statistically they are not in flux.

> and 4) it's likely that your connection would close for whatever reason
> part-way through the download.

I don't seem to have forgotten to mention zsync/rsync. ;-)

> Even taking a snapshot of the filenames is dicey. By the time you finish,
> it's likely that there will be new ones, and possible that some will be
> deleted. Probably the best way to make this work is to 1) make a snapshot of
> files periodically,

Since I've been told they're backed up, such a snapshot should exist naturally.

> 2) create an API which returns a tarball using the snapshot of files that
> also implements Range requests.

I would very much prefer a ready-to-use format instead of a tarball, not to mention that it's pretty resource-consuming to create a tarball just for that.

> Of course, this would result in a 12-terabyte file on the recipient's host.
> That wouldn't work very well. I'm pretty sure that the recipient would need
> an http client which would 1) keep track of the place in the bytestream and
> 2) split out files and write them to disk as separate files. It's possible
> that a program like getbot already implements this.

I'd make the snapshot without tar, especially because partial transfers aren't possible the other way.

--
byte-byte,
grin
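PS: on the Range-request point, resuming an interrupted transfer is plain HTTP/1.1 and needs no tarball on either side, which is part of why per-file snapshots appeal to me. A minimal sketch of a resuming downloader (the URL is a placeholder):

# Minimal sketch: resume an interrupted download with an HTTP Range
# header, picking up where the local partial file ends.
import os
from urllib.request import Request, urlopen

def resume_download(url, path):
    have = os.path.getsize(path) if os.path.exists(path) else 0
    req = Request(url, headers={"User-Agent": "resume-sketch/0.1",
                                "Range": "bytes=%d-" % have})
    with urlopen(req) as resp:
        # 206 Partial Content: the server honoured the Range header;
        # 200: it didn't, so start over from byte zero.
        mode = "ab" if resp.status == 206 else "wb"
        with open(path, mode) as f:
            while True:
                chunk = resp.read(1 << 20)
                if not chunk:
                    break
                f.write(chunk)

resume_download("https://example.org/pool/Example.jpg", "Example.jpg")  # placeholder URL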