Re: [Wikitech-l] We're not quite at Google's level
On Fri, May 22, 2009 at 12:13 PM, Thomas Dalton wrote:
> The thing that prompted me to start this thread was Google, a
> commercial organisation (although not one people pay for at the point
> of use), issuing just such a press release.

Err, yes. But people had already noticed, and been blogging rampantly
about it. So it's not like they were promoting their failure so much as
avoiding being silent on the issue. Whereas we would be actively
promoting it.

Steve
Re: [Wikitech-l] We're not quite at Google's level
2009/5/22 Steve Bennett:
> On Sat, May 16, 2009 at 2:58 AM, The Cunctator wrote:
>> We should definitely highlight real downtime as a reason for funding,
>> especially in a way that discusses practical steps that would be taken to
>> reduce the problem and how much those steps would cost.
>
> Interesting point. Commercial organisations would never issue a press
> release highlighting poor performance, because they want people to
> think they're getting good value for money. A charity, on the other
> hand... what does Wikipedia have to lose from people thinking its
> servers are unreliable due to lack of funding?

The thing that prompted me to start this thread was Google, a
commercial organisation (although not one people pay for at the point
of use), issuing just such a press release.
Re: [Wikitech-l] We're not quite at Google's level
On Sat, May 16, 2009 at 2:58 AM, The Cunctator wrote:
> We should definitely highlight real downtime as a reason for funding,
> especially in a way that discusses practical steps that would be taken to
> reduce the problem and how much those steps would cost.

Interesting point. Commercial organisations would never issue a press
release highlighting poor performance, because they want people to think
they're getting good value for money. A charity, on the other hand...
what does Wikipedia have to lose from people thinking its servers are
unreliable due to lack of funding?
Re: [Wikitech-l] Correct way to Import SQL Dumps of Wikipedia into MediaWiki in Binary
O. O. wrote:
> Hi,
> This may be a bit obvious, but I don't have much experience in this
> area. The SQL dumps provided at http://download.wikimedia.org do not
> specify the "DEFAULT CHARSET" of the respective table. When installing
> MediaWiki, it seems to be recommended to use the binary charset. I
> would like to know how to import one of these dumps into a table with
> the binary charset.
>
> Right now I import on the command line, e.g.:
>
> mysql wikidb < enwiki-20090306-pagelinks.sql
>
> This results in the corresponding table being dropped and then
> recreated. The problem is that the newly created table does not have
> its "DEFAULT CHARSET" set to binary, because the SQL dumps do not
> specify it.
>
> I first attempted to modify my my.cnf file so that new tables default
> to the binary charset, with the following changes:
>
> [client]
> default-character-set=binary
> [mysqld]
> default-character-set=binary
> default-collation=binary
> character-set-server=binary
> collation-server=binary
> init-connect='SET NAMES binary'
>
> I restarted the server, but the new table still gets created in UTF-8,
> not binary.
>
> I then attempted to edit the SQL file, i.e. replace the line
>
> ) TYPE=InnoDB;
>
> with
>
> ) TYPE=InnoDB DEFAULT CHARSET=binary;
>
> This works, in the sense that the new table now gets created in binary.
> However, I think I am making mistakes in editing the file. These files
> are rather large, so I wrote code in Perl, and again in Java, to do the
> editing. They manage to do the above substitution, but I am not
> entirely confident about their UTF-8 handling.

You can also use sed to edit it:

$ sed -i "n;n;n;n;n;n;n;n;n;n;n;n;n;n;n;n;n;s/InnoDB/InnoDB DEFAULT CHARSET=binary/" enwiki-20090306-pagelinks.sql

This will modify just that line (the 18th; adjust the number of 'n;'s
should the schema change) to be

) TYPE=InnoDB DEFAULT CHARSET=binary;

> The problem appears when I try to import these modified files: I get a
> "Duplicate entry" error. E.g. for the enwiki-20090306-pagelinks.sql
> file, I get:
>
> ERROR 1062 (23000) at line 1359: Duplicate entry
> '1198132-2-Gangleri/tests/links/�' for key 1
>
> I would like to add that importing this file as UTF-8 results in this
> "Duplicate entry" error coming much earlier in the input file.

I have looked at the entries for page 1198132
(http://en.wikipedia.org/wiki/User:%D7%9C%D7%A2%D7%A8%D7%99_%D7%A8%D7%99%D7%99%D7%A0%D7%94%D7%90%D7%A8%D7%98/tests/links/char_x00_-_xFF)
in that file and they aren't duplicated (the file is OK), but they do
stress the charset a bit (they use the full 0-255 byte range), so if
MySQL is not interpreting the data as fully binary, it will have
problems. That would explain why you hit the error earlier with UTF-8,
but I don't know exactly why it is failing in your setup. You can also
try your luck with yesterday's pagelinks.sql.

> So, what's the correct way of importing these SQL dumps, such that they
> end up in a table with the binary charset? If my description above is
> not clear, please let me know and I will try to explain again.
>
> Thanks a lot,
> O. O.
>
> P.S. I am running MediaWiki/MySQL under Ubuntu. I hope UTF-8 is handled
> correctly on the command-line bash, but I don't know how to check that.
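If you would rather not count lines for sed, a rough Python sketch of the
same substitution is below. It operates on raw bytes and never decodes the
file, so the UTF-8 concerns raised about the Perl/Java versions do not
apply. The filenames are the ones from this thread, and it assumes the
CREATE TABLE in the dump still ends with ") TYPE=InnoDB;"; this is only an
illustration, not a tested import procedure.

    #!/usr/bin/env python
    # Rough sketch: stream the dump and rewrite the table options line as
    # raw bytes, so no charset decoding is ever attempted on the row data.
    # Assumes the CREATE TABLE statement still ends with ") TYPE=InnoDB;".
    with open("enwiki-20090306-pagelinks.sql", "rb") as src, \
         open("enwiki-20090306-pagelinks.binary.sql", "wb") as dst:
        for line in src:
            # The pattern and replacement are pure ASCII, so a byte-level
            # replace is safe whatever encoding the rows themselves use.
            dst.write(line.replace(b") TYPE=InnoDB;",
                                   b") TYPE=InnoDB DEFAULT CHARSET=binary;"))

The rewritten file can then be imported as before, e.g.
"mysql wikidb < enwiki-20090306-pagelinks.binary.sql".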
Re: [Wikitech-l] flagged revisions
Bart wrote:
> I think a better solution would be to upgrade Huggle/Twinkle. As it now
> stands, the main antivandal programs are... somewhat stodgy. When you have
> a bunch of people all using them at once, you start to run into edit
> conflicts. Different people will be trying to revert the same page at the
> same time. But you know how you can set a page as "patrolled" in the "new
> revisions" section? Perhaps there should be a way to set that in
> Huggle/Twinkle, but for multiple users. There could be a flag that flips
> when there are more than X users actively running through
> Huggle/Twinkle. If the number of users is greater than X, then revisions
> are actually sent out to multiple people at once. This seems somewhat
> contradictory at first, the idea that you'll save time/resources and cover
> more pages if you have more people working on the same page, but it
> wouldn't revert as soon as you hit revert. It would just set the flag on
> that page and serve up the next page -- if a majority of reviewers reverts
> it, the vandalism is reverted and the vandal is warned.
>
> If the number of users is lower than X, then of course each person would
> instantly revert a page when they revert. But I spend a lot of time waiting
> for Huggle to revert a page and warn a user. This may not be the case for
> everyone, but I read very quickly. I read the last Harry Potter book in
> like a couple of hours, no joking. When I use Huggle, I spend the majority
> of my time waiting for Huggle to revert a page and warn a user (well, other
> than using Google to find other sites to check on factual accuracy, but
> that's another story).
>
> I just feel that the amount of edit conflicts while using Huggle and the
> amount that the same set of pages is looked over by the same set of people,
> all of whom are trying to individually revert, is just too much. There's
> far too much wasted time, in my opinion, because Huggle and Twinkle,
> although great, are just slightly inadequate to keep up with how big
> Wikipedia has become. It's so huge that it's impossible for one person to
> read it all, since it'd take a few years of continuous reading and it's
> growing faster than the fastest reader could read.

In summary, your complaint is that Huggle and Twinkle are slow. Complain
to their authors, not to the MediaWiki developers. It's up to you whether
to use them or not, or even to create a "better" tool. If two people save
the same revert, MediaWiki already keeps the first one. There could be an
addition of an "add this section if it doesn't exist" command, but that's
all. Antivandal tools are free to synchronize and distribute the load
between themselves in any way they wish. This is the wrong list to rant
about them.
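To make that last point concrete, here is a rough sketch of how clients
could split the recent-changes feed among themselves by hashing page
titles, so that the tools, rather than MediaWiki, do the load balancing.
This is not Huggle's or Twinkle's actual code; the function and parameter
names are made up for illustration.

    # Rough sketch of client-side load splitting between antivandal tools.
    # Not actual Huggle/Twinkle code; all names here are hypothetical.
    import hashlib

    def assigned_to_me(page_title, my_slot, active_reviewers):
        """True if this client should handle the given page."""
        digest = hashlib.md5(page_title.encode("utf-8")).hexdigest()
        return int(digest, 16) % active_reviewers == my_slot

    # Example: the reviewer in slot 2 of 5 only sees pages hashing to slot 2.
    for title in ("Main Page", "Harry Potter", "Vandalism"):
        if assigned_to_me(title, my_slot=2, active_reviewers=5):
            print("queue for review:", title)

A majority-vote variant would have each slot collect votes before actually
reverting, which is essentially the flag-flipping scheme described above;
either way, none of it requires changes to MediaWiki itself.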
[Wikitech-l] Bugzilla components for Extensions
Can somebody create a component in Bugzilla for the Widgets extension,
please? You can set my email (sergey.chernys...@gmail.com) as the default
assignee for it.

BTW, maybe it makes sense to have a "- other -" component in there, so
people who monitor the bugs can create a component just by seeing that it
doesn't exist? Maybe I'm wrong.

Thank you,

Sergey

--
Sergey Chernyshev
http://www.sergeychernyshev.com/