[Wikitech-l] Fwd: Reg. Research using Wikipedia

2011-03-10 Thread David Gerard
-- Forwarded message -- From: ramesh kumar ramesh_ch...@hotmail.com Date: 9 March 2011 13:27 Subject: RE: Reg. Research using Wikipedia To: dger...@gmail.com Dear Mr. Gerard, Thanks for your instant response. But is there a time gap to keep between one request and another

Re: [Wikitech-l] Fwd: Reg. Research using Wikipedia

2011-03-10 Thread Paul Houle
On 3/10/2011 3:46 AM, David Gerard wrote: feel the program takes 71 days to finish all 3.1 million article titles. Is there any way our university IP address could be given permission, or could an official email from our department head be sent to the Wikipedia server administrators to consider that
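For scale, the 71-day figure is roughly what a strictly sequential request loop implies; a quick back-of-the-envelope check in Python (the ~2 seconds per round trip is an assumption, not a number from the thread):

    # 3.1 million titles, one API request each, roughly 2 s per round trip
    titles = 3_100_000
    seconds_per_request = 2
    print(titles * seconds_per_request / 86_400)   # ~71.8 days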

Re: [Wikitech-l] Fwd: Reg. Research using Wikipedia

2011-03-10 Thread Arthur Richards
On 3/10/11 6:29 AM, Paul Houle wrote: I can say, positively, that you'll get the job done faster by downloading the dump file and cracking into it directly. I've got scripts that can download and extract stuff from the XML dump in an hour or so. I still have some processes that use
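A minimal sketch of the "crack into the dump directly" approach: streaming pages out of the pages-articles XML dump with nothing but the Python standard library. The file name and export-schema namespace below are assumptions and vary by dump.

    import bz2
    import xml.etree.ElementTree as ET

    DUMP = "enwiki-latest-pages-articles.xml.bz2"          # assumed local file
    NS = "{http://www.mediawiki.org/xml/export-0.10/}"     # check your dump's header

    with bz2.open(DUMP, "rb") as f:
        context = ET.iterparse(f, events=("start", "end"))
        _, root = next(context)                            # grab the root element
        for event, elem in context:
            if event == "end" and elem.tag == NS + "page":
                title = elem.findtext(NS + "title")
                text = elem.findtext(NS + "revision/" + NS + "text") or ""
                # ... extract titles/categories here instead of hitting the live site ...
                root.clear()                               # keep memory flat while streaming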

Re: [Wikitech-l] Fwd: Reg. Research using Wikipedia

2011-03-10 Thread Ángel González
On 10/03/11 7:13 PM, Arthur Richards wrote: You're likely correct - also I've recently been exposed to the 'wikipedia offline patch' extension (http://code.google.com/p/wikipedia-offline-patch/) which I believe allows you to use a compressed dump as your db storage - saving you the

[Wikitech-l] Fwd: Reg. Research using Wikipedia

2011-03-09 Thread David Gerard
Is there a standard answer to this question - how much researchers are allowed to hammer the site? - d. -- Forwarded message -- From: ramesh kumar ramesh_ch...@hotmail.com Date: 9 March 2011 09:47 Subject: Reg. Research using Wikipedia To: wikie...@lists.wikimedia.org,

Re: [Wikitech-l] Fwd: Reg. Research using Wikipedia

2011-03-09 Thread Roan Kattouw
2011/3/9 David Gerard dger...@gmail.com: Is there a standard answer to this question - how much researchers are allowed to hammer the site? If they use the API and wait for one request to finish before they start the next one (i.e. don't make parallel requests), that's pretty much always fine.
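A minimal sketch of what that looks like in practice, assuming Python and the requests library; the User-Agent string and example titles are placeholders, and maxlag makes the client back off when the servers are under load:

    import requests

    API = "https://en.wikipedia.org/w/api.php"
    HEADERS = {"User-Agent": "ResearchBot/0.1 (contact: your-email@example.org)"}

    def categories_of(title):
        params = {"action": "query", "titles": title, "prop": "categories",
                  "cllimit": "max", "format": "json", "maxlag": 5}
        r = requests.get(API, params=params, headers=HEADERS, timeout=30)
        r.raise_for_status()
        data = r.json()
        if "error" in data:                 # maxlag hit: a real client would sleep and retry
            return []
        page = next(iter(data["query"]["pages"].values()))
        return [c["title"] for c in page.get("categories", [])]

    # strictly one request at a time -- the next call starts only after the last one returned
    for title in ["Malaysia", "Blog", "Monash University"]:
        print(title, categories_of(title))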

Re: [Wikitech-l] Fwd: Reg. Research using Wikipedia

2011-03-09 Thread Platonides
Dear Members, I am Ramesh, pursuing my PhD at Monash University, Malaysia. My research is on blog classification using Wikipedia categories. For my experiment, I use the 12 main categories of Wikipedia. I want to identify which particular article belongs to which of the 12 main categories. So I

Re: [Wikitech-l] Fwd: Reg. Research using Wikipedia

2011-03-09 Thread James Linden
Why do you need to access the live Wikipedia for this? Using categorylinks.sql and page.sql you should be able to fetch the same data. Probably faster. In my research, the answer to this question is two-fold: A) Creating a local copy of Wikipedia (using MediaWiki and various import tools) is
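For the record, once categorylinks.sql and page.sql are loaded into a local MySQL database, a single join yields every (article, category) pair; a rough sketch with pymysql, where the connection details and database name are placeholders:

    import pymysql

    conn = pymysql.connect(host="localhost", user="wiki", password="wiki",
                           database="wikidump", charset="utf8mb4",
                           cursorclass=pymysql.cursors.SSCursor)  # stream rows, don't buffer them all
    cur = conn.cursor()
    cur.execute(
        "SELECT p.page_title, cl.cl_to "
        "FROM categorylinks cl "
        "JOIN page p ON p.page_id = cl.cl_from "
        "WHERE p.page_namespace = 0"          # namespace 0 = articles
    )
    for page_title, category in cur:
        pass   # page_title/cl_to come back as bytes; feed the pairs into the classifier here
    conn.close()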

Re: [Wikitech-l] Fwd: Reg. Research using Wikipedia

2011-03-09 Thread Platonides
James Linden wrote: Why do you need to access the live Wikipedia for this? Using categorylinks.sql and page.sql you should be able to fetch the same data. Probably faster. In my research, the answer to this question is two-fold: A) Creating a local copy of Wikipedia (using MediaWiki and

Re: [Wikitech-l] Fwd: Reg. Research using Wikipedia

2011-03-09 Thread Thomas Dalton
On 9 March 2011 16:00, Platonides platoni...@gmail.com wrote: Dear Members, I am Ramesh, pursuing my PhD at Monash University, Malaysia. My research is on blog classification using Wikipedia categories. For my experiment, I use the 12 main categories of Wikipedia. I want to identify which

Re: [Wikitech-l] Fwd: Reg. Research using Wikipedia

2011-03-09 Thread Alex
On 3/9/2011 11:29 AM, James Linden wrote: Why do you need to access the live Wikipedia for this? Using categorylinks.sql and page.sql you should be able to fetch the same data. Probably faster. In my research, the answer to this question is two-fold: A) Creating a local copy of Wikipedia
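Once the article-to-category pairs are available locally, Ramesh's actual question - which of the 12 main categories an article falls under - comes down to walking the category graph upward. A toy sketch of that roll-up follows; the main-category list and the parent mapping are illustrative stand-ins for data that would be built from categorylinks, not anything stated in the thread.

    from collections import deque

    MAIN = {"Culture", "Geography", "Health", "History", "Mathematics", "Nature",
            "People", "Philosophy", "Religion", "Science", "Society", "Technology"}

    parents = {                          # child category -> parent categories (toy data)
        "Malaysian bloggers": {"Bloggers", "Malaysian people"},
        "Bloggers": {"People associated with the Internet"},
        "People associated with the Internet": {"Technology", "People"},
        "Malaysian people": {"People"},
    }

    def main_categories(start_cats, max_depth=10):
        """Breadth-first walk from an article's direct categories up to the main ones."""
        hits, seen = set(), set(start_cats)
        queue = deque((c, 0) for c in start_cats)
        while queue:
            cat, depth = queue.popleft()
            if cat in MAIN:
                hits.add(cat)
                continue
            if depth >= max_depth:
                continue
            for parent in parents.get(cat, ()):
                if parent not in seen:
                    seen.add(parent)
                    queue.append((parent, depth + 1))
        return hits

    print(main_categories({"Malaysian bloggers"}))   # {'People', 'Technology'} (set order may vary)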