[Wiki-research-l] very good

2011-12-13 Thread mohamad mehdi
Click here to see the attached video
___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


[Wiki-research-l] Wikipedia Literature Review - Tools and Data Sets

2011-04-20 Thread mohamad mehdi

Hi everyone,

Thank you all for your replies, we really appreciate your cooperation. Below is 
a summary of the tools and data sets recommended by Torsten, Andrew, Paolo, and 
emijrp. We would also like to know if there is any existing Wikipedia page that 
includes such a list so we can add to it. Otherwise, where do you suggest 
adding this list so it is noticeable and useful for the community?

http://code.google.com/p/jwpl/
http://wikipedia-miner.sourceforge.net/
http://code.google.com/p/wikokit/  /* Wiktionary parser and visual interface */
https://github.com/phauly/wiki-network/  /* Python scripts for parsing Wikipedia dumps with different goals */
http://www.gnuband.org/2011/04/19/wikipedia_datasets_released/  /* datasets of network extracted from User Talk pages */
http://code.google.com/p/wikiteam/
http://code.google.com/p/wikiteam/downloads/list?can=1
http://www.research.ibm.com/visual/projects/history_flow/
http://meta.wikimedia.org/wiki/WikiXRay
http://statmediawiki.forja.rediris.es/index_en.html

Best regards,
Mohamad Mehdi


[Wiki-research-l] Wikipedia Literature Review - Tools and Data Sets

2011-04-18 Thread mohamad mehdi

Hi everyone,
 
This is a follow-up on a previous thread (Wikipedia data sets) related to the 
Wikipedia literature review (Chitu Okoli). As I mentioned in my previous email, 
part of our study is to identify the data collection methods and data sets used 
for Wikipedia studies. Therefore, we searched for online tools used to extract 
Wikipedia articles and for pre-compiled data sets of Wikipedia articles; we 
identified the following list. Please let us know of any other sources 
you know about. Also, we would like to know if there is any existing Wikipedia 
page that includes such a list so we can add to it. Otherwise, where do you 
suggest adding this list so it is noticeable and useful for the community?
 
http://download.wikimedia.org/  /* official Wikipedia database dumps */
http://datamob.org/datasets/tag/wikipedia  /* Multiple data sets (English Wikipedia articles that have been transformed into XML) */
http://wiki.dbpedia.org/Datasets  /* Structured information from Wikipedia */
http://labs.systemone.at/wikipedia3  /* Wikipedia³ is a conversion of the English Wikipedia into RDF. It's a monthly updated dataset containing around 47 million triples. */
http://www.scribd.com/doc/9582/integrating-wikipediawordnet  /* article about integrating WordNet and Wikipedia with YAGO */
http://www.infochimps.com/datasets/taxobox-wikipedia-infoboxes-with-taxonomic-information-on-animal/
http://www.infochimps.com/link_frame?dataset=11043  /* Wikipedia Datasets for the Hadoop Hack | Cloudera */
http://www.infochimps.com/link_frame?dataset=11166  /* Wikipedia: Lists of common misspellings/For machines */
http://www.infochimps.com/link_frame?dataset=11028  /* Building a (fast) Wikipedia offline reader */
http://www.infochimps.com/link_frame?dataset=11004  /* Using the Wikipedia page-to-page link database */
http://www.infochimps.com/link_frame?dataset=11285  /* List of films */
http://www.infochimps.com/link_frame?dataset=11598  /* MusicBrainz Database */
http://dammit.lt/wikistats/  /* Wikitech-l page counters */
http://snap.stanford.edu/data/wiki-meta.html  /* Complete Wikipedia edit history (up to January 2008) */
http://aws.amazon.com/datasets/2596?_encoding=UTF8jiveRedirect=1  /* Wikipedia Page Traffic Statistics */
http://aws.amazon.com/datasets/2506  /* Wikipedia XML Data */
http://www-958.ibm.com/software/data/cognos/manyeyes/datasets?q=Wikipedia+  /* list of Wikipedia data sets */
Examples:
  http://www-958.ibm.com/software/data/cognos/manyeyes/datasets/top-1000-accessed-wikipedia-articl/versions/1  /* Top 1000 Accessed Wikipedia Articles */
  http://www-958.ibm.com/software/data/cognos/manyeyes/datasets/wikipedia-hits/versions/1  /* Wikipedia Hits */

Tools to extract data from Wikipedia:
http://www.evanjones.ca/software/wikipedia2text.html  /* Extracting Text from Wikipedia */
http://www.infochimps.com/link_frame?dataset=11121  /* Wikipedia article traffic statistics */
http://blog.afterthedeadline.com/2009/12/04/generating-a-plain-text-corpus-from-wikipedia/  /* Generating a Plain Text Corpus from Wikipedia */
http://www.infochimps.com/datasets/wikipedia-articles-title-autocomplete
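
For readers who want to try the official dumps directly, here is a minimal Python sketch of streaming article titles and wikitext out of a MediaWiki XML export. It is not taken from any of the tools listed above. It assumes the dump uses the usual <page>, <title> and <revision>/<text> elements; the names iter_pages, local_name and SAMPLE are made up for illustration, and the sample XML is a hand-written stand-in for a real dump.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, wikitext) for each <page> in a MediaWiki XML export.

    `source` is a filename or a binary file-like object. If a page holds
    several revisions, the text of the last one in the file is returned.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, ""
        for node in elem.iter():
            name = local_name(node.tag)
            if name == "title":
                title = node.text
            elif name == "text":
                text = node.text or ""
        yield title, text
        elem.clear()  # release the page's children to keep memory use low


# Hand-written stand-in for a real dump (assumption: same element layout).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.5/">
  <page>
    <title>Example</title>
    <revision><text>Some ''wiki'' text.</text></revision>
  </page>
  <page>
    <title>Empty</title>
    <revision><text /></revision>
  </page>
</mediawiki>"""

for title, text in iter_pages(io.BytesIO(SAMPLE)):
    print(title, len(text))
```

The real dumps are distributed compressed, so in practice the same function could be given a file object from the standard library, for example bz2.open(path, "rb"), instead of the in-memory sample.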
 
 
Thank you,
Mohamad Mehdi


[Wiki-research-l] Wikipedia Literature Review - data sets

2011-03-24 Thread mohamad mehdi

Hi everyone,
 
This is related to the Wikipedia literature review that Chitu Okoli described 
earlier. Part of our study is to identify the data collection methods and data 
sets used for Wikipedia studies. We are aware of available tools to download 
Wikipedia dumps, such as wp-download and other tools from 
https://wiki.toolserver.org. Nevertheless, we are wondering whether there is a 
list of pre-compiled data sets of Wikipedia articles that you know about. If no 
such list exists, we would also appreciate it if you could send us the names of, 
or references to, such data sets. 

Thank you,
Mohamad