[Wikitech-l] How Does Wikimedia Succeed As A Non-Profit?

2011-11-10 Thread Paul Houle
Here's a crazy question. Non-profit organizations are famous for having terrible web sites. Generally they get a fixed budget and after they spend it, they have a party and announce that they succeeded. Nobody ever tells the users, or rather, the people who might have been

Re: [Wikitech-l] should we join the Unicode Consortium?

2011-07-19 Thread Paul Houle
On 7/19/2011 2:41 PM, Ryan Kaldari wrote: The Wikimedia Foundation is now an official liaison member of the Unicode Consortium: http://www.unicode.org/consortium/memblist.html#liais Rick McGowan is the Unicode representative to Wikimedia, and I'll be serving as Wikimedia's representative

Re: [Wikitech-l] Wikimedia engineering report for June 2011

2011-07-11 Thread Paul Houle
On 7/1/2011 11:19 AM, Guillaume Paumier wrote: *Academic publications authentication proxy* --- Chad Horohoe http://www.mediawiki.org/wiki/User:%5Edemon started a project whose goal is to allow selected Wikimedians to access third-party academic publishing sites to help with content

Re: [Wikitech-l] schema.org - anything here for us?

2011-06-03 Thread Paul Houle
On 6/3/2011 4:02 PM, David Gerard wrote: http://schema.org/ An initiative by Google, Yahoo and Bing to make a tag language to make things more findable in search engines. Is there anything in this for us? schema.org tags in templates? Presumably this would require software work too, and

Re: [Wikitech-l] wiki/ prefix in URL

2011-05-18 Thread Paul Houle
On 5/17/2011 8:00 PM, John Vandenberg wrote: Is there a good reason for us to always include the '/wiki/' in the URL? Removing the prefix would save five characters, and I'm guessing that it would also save a measurable amount of traffic serving 5KB 404 pages. Is there something else on

Re: [Wikitech-l] search=steven+tyler gets Steven_tyler

2011-05-13 Thread Paul Houle
On 5/13/2011 3:31 AM, M. Williamson wrote: I still don't think page titles should be case sensitive. Last time I asked how useful this really was, back in 2005 or so, I got a tersely-worded response that we need it to disambiguate certain pages. OK, but how many cases does that actually

Re: [Wikitech-l] Mediawiki Development IDE

2011-05-10 Thread Paul Houle
On 5/10/2011 2:15 AM, Ashar Voultoiz wrote: On 09/05/11 18:47, Brion Vibber wrote: On 09/05/11 15:08, Tod wrote: Is there an IDE that the MW developer community has settled on and can recommend? My take is that there are three cultures. (1) People who use IDEs that are actually useful

Re: [Wikitech-l] Categories 2.0

2011-05-10 Thread Paul Houle
On 5/10/2011 5:48 PM, Lars Aronsson wrote: One thing that I would like to do in Wikipedia is: A category that spans a scale, e.g. people by date of birth. Today we use [[Category:1823 births]] to group people born in that year, where the category page has links to the previous and next

Re: [Wikitech-l] auditcode.py - discern class structure

2011-05-03 Thread Paul Houle
On 5/2/2011 7:35 PM, Ryan Lane wrote: I totally <3 that you wrote it in python. On Mon, May 2, 2011 at 4:21 PM, Russell N. Nelson - rnnelson rnnel...@clarkson.edu wrote: Maybe there's a better tool to tell you what function is defined in what class in PHP, but I couldn't find one in the

Re: [Wikitech-l] Static HTML Dumps

2011-04-05 Thread Paul Houle
On 4/5/2011 4:00 PM, Platonides wrote: I think he is better off parsing the articles, though. For linguistic research you don't need things such as the contents of templates, so a simple wikitext stripping would do. And it will be much, much, much, much faster than parsing the whole wiki.

Re: [Wikitech-l] Fwd: Reg. Research using Wikipedia

2011-03-10 Thread Paul Houle
On 3/10/2011 3:46 AM, David Gerard wrote: feel the program takes 71 days to finish all the 3.1 million article titles. Is there any way our university IP address will be given permission, or sending an official email from our department head to Wikipedia Server administrator to consider that

Re: [Wikitech-l] Topic and category analyser

2011-03-04 Thread Paul Houle
On 3/3/2011 7:12 PM, Dávid Tóth wrote: Would it be useful to make a program that would create topic relations for each wikipedia article based on the links and the distribution of semantic structures? This would be very useful for me. I'm thinking about attacking this problem by

[Wikitech-l] Is there an easy way to extract the first N wikipedia topics in the order they were created?

2011-02-22 Thread Paul Houle
Hi, I've been thinking about the early history of Wikipedia and about which sorts of topics got written early on. I'm wondering if there is an easy way to find the first N Wikipedia topics (where N is say 100,000) in the order they were created.

Re: [Wikitech-l] How would you disrupt Wikipedia?

2010-12-30 Thread Paul Houle
On 12/29/2010 2:31 AM, Neil Kandalgaonkar wrote: Let's imagine you wanted to start a rival to Wikipedia. Assume that you are motivated by money, and that venture capitalists promise you can be paid gazillions of dollars if you can do one, or many, of the following: Ok, first of all you

Re: [Wikitech-l] Making usability part of the development process

2010-12-07 Thread Paul Houle
On 12/7/2010 2:23 AM, Daniel Friesen wrote: One thing our skin system does have is an extensive linker and system for building tooltips and accesskeys for things using our i18n system. And calls to the message system from skins are all over the place: tagline, jumpto, and basically every

Re: [Wikitech-l] Parallel computing project

2010-10-25 Thread Paul Houle
On 10/24/2010 8:42 PM, Aryeh Gregor wrote: My first thought was to write a GPU program to crack MediaWiki password hashes as quickly as possible, then use what we've studied in class about GPU architecture to design a hash function that would be as slow as possible to crack on a GPU relative

[Wikitech-l] API vs data dumps

2010-10-13 Thread Paul Houle
I know there's some discussion about what's appropriate for the Wikipedia API, and I'd just like to share my recent experience. I was trying to download the Wikipedia entries for people, of which I found about 800,000. I had a scanner already written that could do the download,

[Wikitech-l] Will the real URI stand up? [dbpedia vs wikipedia vs the world]

2010-10-13 Thread Paul Houle
I notice lines in the dbpedia dumps that look like http://dbpedia.org/resource/Boston%2C_MA http://dbpedia.org/property/redirect http://dbpedia.org/resource/Boston . Note that %2C is the URL-encoded comma. Anyhow, if I go to http://dbpedia.org/page/Boston%2C_MA I see two redirects

Re: [Wikitech-l] Bugmeister opening at Wikimedia Foundation

2010-10-11 Thread Paul Houle
On 10/7/2010 11:30 PM, MZMcBride wrote: Given that Wikimedia already employs a number of contractors who don't work full-time, I think the fact that this position would be full-time is really odd. Damn, I thought the too many Indians and not enough Chiefs problem was just something we

Re: [Wikitech-l] Parser implementation for MediaWiki syntax

2010-09-28 Thread Paul Houle
On 9/28/2010 3:53 AM, Andreas Jonsson wrote: For my own IX work I've written a wikimedia markup parser in C# based on the Irony framework. It fails to parse about 0.5% of pages in wikipedia What do you mean by fail? It assigns slightly incorrect semantics to a construction? It

[Wikitech-l] Accuracy of coordinates in dbpedia/wikipedia freebase

2010-09-27 Thread Paul Houle
I've recently put up a site that uses coordinate information from Freebase and Dbpedia, and I'm starting to think about how to clean up certain data quality problems I'm encountering, for instance, see: http://ookaboo.com/o/pictures/topic/209440/Oakville_Assembly In this particular case,

Re: [Wikitech-l] Parser implementation for MediaWiki syntax

2010-09-27 Thread Paul Houle
On 9/27/2010 2:58 PM, Chad wrote: This. Tim sums up the consensus very well with that commit summary. He also made some comments on the history of wikitext and alternative parsers on foundation-l back in Jan '09[0]. Worth a read (starting mainly at Parser is a convenient and short name for

Re: [Wikitech-l] Acceptable use of API

2010-09-24 Thread Paul Houle
On 9/24/2010 8:49 AM, Robin Ryder wrote: Hi, Thanks for the quick answers, and for the useful link. My previous e-mail was not detailed enough; sorry about that. Let me clarify: - I don't need to crawl the entire Wikipedia, only (for example) articles in a category. ~1,000 articles would

[Wikitech-l] Slow file downloads from commons...

2010-09-22 Thread Paul Houle
Right now I'm seeing that image downloads on wikimedia commons have slowed down by a factor of ten or so since the last weekend. I used to download about 700 images an hour, so it's possible that my bandwidth is being throttled, but it might be something affecting other people too.

[Wikitech-l] wikimedia commons slow lately?

2010-09-21 Thread Paul Houle
In the last few days I've noticed sporadically that API calls time out on Wikimedia Commons and also that sometimes file downloads from Wikimedia Commons are really slow. Are there any performance problems going on?

Re: [Wikitech-l] modernizing mediawiki

2010-03-08 Thread Paul Houle
Dmitriy Sintsov wrote: When one looks for educational / academic content, rich and colorful interface only distracts the reader. The following site is not mediawiki / monobook based, yet the visual design is simple: http://plato.stanford.edu/contents.html There is nothing wrong with it.

Re: [Wikitech-l] modernizing mediawiki

2010-03-03 Thread Paul Houle
Chris Lewis wrote: I hope I am emailing this to the right group. My concern was about mediawiki and its limitations, as well as its outdated methods. As someone who runs a wiki, I've gone through a lot of frustrations. For one thing, I'd say that mediawiki aims for a particular

Re: [Wikitech-l] modernizing mediawiki

2010-03-03 Thread Paul Houle
Marco Schuster wrote: On Wed, Mar 3, 2010 at 4:30 PM, David Gerard dger...@gmail.com wrote: On 3 March 2010 15:06, Paul Houle p...@ontology2.com wrote: For a large-scale site, there's going to be a lot of administration work to be done, so it doesn't matter if the system

Re: [Wikitech-l] A suitable error message for iPhones

2009-10-07 Thread Paul Houle
David Gerard wrote: Are you *sure* we can't put a narky message when iPhone users click a video? Adobe do! http://twitpic.com/kf361 (assuming it's real - can anyone with an iPhone please check?) Adobe is the most feared company on the web right now. Even though Microsoft has

[Wikitech-l] URLs that aren't cool...

2009-07-28 Thread Paul Houle
I've been looking at the id structure of dbpedia and wikipedia and finally found an example where case sensitivity issues really bite. Cases like this with a redirect are a little obnoxious, http://en.wikipedia.org/wiki/New_York_City http://en.wikipedia.org/wiki/New_york_city largely because

Re: [Wikitech-l] [Dbpedia-discussion] URLs that aren't cool...

2009-07-28 Thread Paul Houle
Georgi Kobilarov wrote: In this particular one, it's two articles about the same topic, but there could be some cases where the two articles are about something different. Yes, such as http://en.wikipedia.org/wiki/FROG and http://en.wikipedia.org/wiki/Frog I agree that this can be

Re: [Wikitech-l] Licensing update: Final steps

2009-06-10 Thread Paul Houle
Erik Moeller wrote: For multimedia, the licensing committee and the Wikimedia Commons community are still discussing the best update strategy, but it will probably involve a bot updating the existing templates. We're also hoping to run a CentralNotice to explain the process to the communities