Re: [Wiki-research-l] [WikiEN-l] Old Wikipedia backups discovered

2010-12-16 Thread Joseph Reagle
On Wednesday, December 15, 2010, Tim Starling wrote:
> There were some changes made to the page text that weren't represented in diff_log, specifically changing certain camel-case links to free links.
It appears my problems were related to some CR/LF issues not round-tripping between diff and

Re: [Wiki-research-l] [WikiEN-l] Old Wikipedia backups discovered

2010-12-16 Thread Tim Starling
On 16/12/10 23:10, Joseph Reagle wrote:
> On Wednesday, December 15, 2010, Tim Starling wrote:
>> There were some changes made to the page text that weren't represented in diff_log, specifically changing certain camel-case links to free links.
> It appears my problems were related to some CR/LF

Re: [Wiki-research-l] [WikiEN-l] Old Wikipedia backups discovered

2010-12-16 Thread Joseph Reagle
I have the first 10K edits up reconstructed in their various pages at: http://cyber.law.harvard.edu/~reagle/wp-redux/ ___ Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

Re: [Wiki-research-l] [WikiEN-l] Old Wikipedia backups discovered

2010-12-16 Thread lior gimel
This is amazing! Thanks for the work and effort, this reconstruction is a priceless resource for researchers. Lior On Thu, Dec 16, 2010 at 8:53 PM, Joseph Reagle joseph.2...@reagle.org wrote: I have the first 10K edits up reconstructed in their various pages at:

Re: [Wiki-research-l] [WikiEN-l] Old Wikipedia backups discovered

2010-12-16 Thread Joseph Reagle
On Thursday, December 16, 2010, lior gimel wrote:
> This is amazing!
And buggy! :-)
> Thanks for the work and effort, this reconstruction is a priceless resource for researchers.
Thanks to Tim for providing the data, and for working on a much better version that I look forward to!

[Wiki-research-l] Google ngrams

2010-12-16 Thread emijrp
Hi all; I leave this link here... http://ngrams.googlelabs.com/datasets An example http://ngrams.googlelabs.com/graph?content=collaborative&year_start=1920&year_end=&corpus=0&smoothing=3 Regards, emijrp

Re: [Wiki-research-l] Google ngrams

2010-12-16 Thread Samuel Klein
I was just playing with this... remarkable. Someone should do the same with Wikipedia's text over time, which would provide even crisper comparisons [as within categories]. http://ngrams.googlelabs.com/graph?content=art,technology,www&year_start=1950&year_end=2008&corpus=5&smoothing=4 On Thu, Dec

Re: [Wiki-research-l] Google ngrams

2010-12-16 Thread emijrp
Look at this one ; ) http://ngrams.googlelabs.com/graph?content=security%2Cfreedom&year_start=1950&year_end=2008&corpus=5&smoothing=4 2010/12/17 Samuel Klein meta...@gmail.com I was just playing with this... remarkable. Someone should do the same with Wikipedia's text over time, which would