Re: [Wikitech-l] [Xmldatadumps-admin-l] 2010-03-11 01:10:08: enwiki Checksumming pages-meta-history.xml.bz2 :D

2010-03-19 Thread zh509
On Mar 19 2010, Platonides wrote:

Zeyi wrote:
 Hi,

 Firstly, congratulations on this! As far as I know, it has taken a long
 time!

 May I also ask a small question: what is the difference between the current
 dump and the history dump? I know the current one only includes current edits,
 and the history one has all edits, as the introduction says.

You have explained the difference perfectly :)

 More specifically, how does the difference
 show up for a single article? Can anyone explain it in detail, please?

It doesn't show the article. It's just a really really large bunch of 
wikitext separated by xml tags.
It is shown by a tool. If you just want to read the articles, you don't 
need histories.

What I mean is: if the current dump shows 30 edits under a 
particular article name, and the history dump shows 100 edits under 
the same article, what is the difference between these 30 and 100?

If I say that the current dump can explain how the current articles 
were built up from different edits, is that correct?

 Additionally, why do all the statistics about Wikipedia use only the history
 dump for analysis?

Because they study things like changes made to articles, number of edits 
per time...
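
As a rough illustration of the "number of edits per time" kind of statistic, here is a minimal Python sketch (not part of the original thread). It assumes the dump follows the MediaWiki XML export layout, where each <page> holds <revision> elements that each carry a <timestamp> in ISO format; the function name edits_per_month and the sample data are made up for the example.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny hand-written stand-in for a history dump (assumed layout, invented data).
SAMPLE_HISTORY = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.4/">
  <page>
    <title>Example</title>
    <revision><id>1</id><timestamp>2010-01-05T10:00:00Z</timestamp></revision>
    <revision><id>2</id><timestamp>2010-03-11T01:10:08Z</timestamp></revision>
    <revision><id>3</id><timestamp>2010-03-19T11:02:00Z</timestamp></revision>
  </page>
</mediawiki>"""


def edits_per_month(stream):
    """Count <revision> elements by the 'YYYY-MM' prefix of their timestamp."""
    months = Counter()
    for _event, elem in ET.iterparse(stream, events=("end",)):
        # ElementTree reports tags as '{namespace}name'; keep only the name.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "revision":
            for child in elem:
                if child.tag.rsplit("}", 1)[-1] == "timestamp" and child.text:
                    months[child.text[:7]] += 1
            elem.clear()  # free the revision once it has been counted
    return months


print(edits_per_month(io.BytesIO(SAMPLE_HISTORY)))
```

A current dump would give at most one count per page with this function, which is why it says little about editing activity over time.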

 Thanks very much!

You're welcome.



___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l



Re: [Wikitech-l] [Xmldatadumps-admin-l] 2010-03-11 01:10:08: enwiki Checksumming pages-meta-history.xml.bz2 :D

2010-03-19 Thread Conrad Irwin

On 03/19/2010 11:02 AM, zh...@york.ac.uk wrote:

 What I mean is: if the current dump shows 30 edits under a 
 particular article name, and the history dump shows 100 edits under 
 the same article, what is the difference between these 30 and 100?

The current dump shows 1 edit for each article, only the most recent at
the time that article was processed. The history dump shows all edits
for all articles.
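
The difference described above can be checked by counting <revision> elements per page. The following is a minimal Python sketch under stated assumptions, not a tool from this thread: it assumes the MediaWiki XML export layout (<page> containing <title> and one or more <revision> elements), and the function name revisions_per_page and the sample data are invented for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Tiny hand-written stand-in for a history dump (assumed layout, invented data).
SAMPLE_DUMP = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.4/">
  <page>
    <title>Example</title>
    <revision><id>1</id><text>first version</text></revision>
    <revision><id>2</id><text>second version</text></revision>
  </page>
  <page>
    <title>Other</title>
    <revision><id>3</id><text>only version</text></revision>
  </page>
</mediawiki>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def revisions_per_page(stream):
    """Return {page title: number of <revision> elements} for a dump stream."""
    counts = {}
    title = None
    revisions = 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "revision":
            revisions += 1
            elem.clear()  # drop the revision text to keep memory use low
        elif name == "page":
            counts[title] = revisions
            title, revisions = None, 0
            elem.clear()
    return counts


print(revisions_per_page(io.BytesIO(SAMPLE_DUMP)))
```

For a real file, the stream could presumably come from bz2.open(path, "rb") in the standard library. On a current dump every page should report 1; on a history dump the counts are the full edit totals.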

Conrad



Re: [Wikitech-l] [Xmldatadumps-admin-l] 2010-03-11 01:10:08: enwiki Checksumming pages-meta-history.xml.bz2 :D

2010-03-19 Thread zh509
On Mar 19 2010, Conrad Irwin wrote:


On 03/19/2010 11:02 AM, zh...@york.ac.uk wrote:

 What I mean is: if the current dump shows 30 edits under 
 a particular article name, and the history dump shows 100 edits 
 under the same article, what is the difference between these 30 and 100?

The current dump shows 1 edit for each article, only the most recent at
the time that article was processed. The history dump shows all edits
for all articles.

Wow, can you confirm that only the latest edit is included in the 
current dump? So the current dump isn't meaningful in terms of 
statistics?


Conrad
thanks,
Zeyi


Re: [Wikitech-l] Wikimedia Google Summer of Code Accepted!

2010-03-19 Thread Meadowlark Bradsher
W00t! So exciting!

On Thu, Mar 18, 2010 at 1:33 PM, Rob Lanphier ro...@robla.net wrote:

 Hi folks,

 We've been accepted again for another Google Summer of Code!  What this
 means:

 *  Mentors:  please go to this page to formally apply to be a mentor:
 http://socghop.appspot.com/gsoc/mentor/request/google/gsoc2010/wikimedia

 Note: you can't officially be a mentor until you do this, and we can't do
 it for you (part of it involves agreeing to the mentor agreement).

 Question for the group: how many student slots do you think we should
 request? On the advice for mentors page, it says: 

 A good rule of thumb when finding and assigning mentors is to have two
 mentors per student. It is also a good idea to have a spare mentor or two
 who can pay attention to many students and keep track of the big picture.


 Given our current list of mentors (we have 9 listed, plus 1 maybe), that
 would give us 4 as the number of slots.  Does that seem like a number
 that's both low enough that we can be reasonably confident we'll do a good
 job mentoring, but high enough that we're not selling ourselves short?

 *  Students: it's still not yet formally time to apply, but now is a really
 good time to start brainstorming ideas, and getting clarifications on
 what's already been suggested:
 http://www.mediawiki.org/wiki/Summer_of_Code_2010

 While you may be tempted (from a competitive perspective) not to reveal
 what your ideas are early, it is almost certainly going to be to your
 benefit to engage now.  By engage, I mean demonstrate that you're really
 thinking about how to improve MediaWiki and other Wikimedia project
 technologies, and have the wherewithal to do it, not merely impress us with
 what skills you have.  The more specific and thoughtful your ideas,
 questions, and suggestions are, the more comfortable we'll all feel in
 selecting you.

 You might want to take a peek at the GSoC student agreement now, since
 you'll be required to agree to it as a precondition for being part of this
 year's program:

 http://socghop.appspot.com/document/show/gsoc_program/google/gsoc2010/studentagmt

 Rob


[Wikitech-l] Can we have one for wikibooks?

2010-03-19 Thread Magnus Manske
http://www.wired.com/gadgetlab/2010/03/high-speed-camera-scans-books-in-seconds/



Re: [Wikitech-l] Can we have one for wikibooks?

2010-03-19 Thread Gerard Meijssen
Hoi,
Having one is not so much of an issue. How are you going to operate it?
Where do you locate this tool? Where do you get the books from?
Thanks,
  GerardM

On 19 March 2010 22:06, Magnus Manske magnusman...@googlemail.com wrote:


 http://www.wired.com/gadgetlab/2010/03/high-speed-camera-scans-books-in-seconds/



Re: [Wikitech-l] Can we have one for wikibooks?

2010-03-19 Thread Daniel Schwen
Cooperation with a major university library would be great. (but this
is a bit off-topic for wikitech, sorry)

 Having one is not so much of an issue. How are you going to operate it?
 Where do you locate this tool? Where do you get the books from?



Re: [Wikitech-l] Who can give me a svndump of svn.wikimedia.org?

2010-03-19 Thread Tim Landscheidt
Ævar Arnfjörð Bjarmason ava...@gmail.com wrote:

 Chad and I have been playing around with a SVN-Git conversion of
 MediaWiki. After running into some odd issues with git-svn and since
 it takes around 3 weeks to do a complete git-svn import (with
 branches) of the MediaWiki SVN repository I'd like to get access to
 `svnadmin dump' output as run on mayflower.
 [...]

 No answer to your question, but can you estimate how large
 the git repository will probably be?

 The complete conversion with all branches/tags, which goes up to
 r62638 (Feb 17), is 406MB. You can clone it at
 http://github.com/mediawiki/mediawiki-svn.

I'll wait till you have sorted out the hiccups :-).

 Of course a proper conversion to Git would involve splitting up the
 repository into logically separate chunks. Grabbing just the data you
 need (phase3 + a few extensions) would take less space and be faster
 than with SVN now.

I don't know whether I would want the former: after a few days of
intermittent problems with my Internet connection, I really came to
love git's stand-alone approach.

Tim

