Give me a few minutes and I can get you a database dump of what you need.
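
In the meantime, for the API route quoted below, here is a minimal sketch of a continuation loop for list=allpages. It is an illustration under assumptions, not a tested client: the names fetch_json and iter_titles and the User-Agent string are made up for this example, and it asks for format=json instead of xml. The idea is that every key in the reply's "continue" object (both apcontinue and continue) is copied unchanged into the next request, and the loop stops as soon as a reply has no "continue" object.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_json(params):
    """Send one GET request to the API and return the decoded JSON reply."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # Hypothetical User-Agent string; replace it with one describing your tool.
    request = urllib.request.Request(
        url, headers={"User-Agent": "title-lister-sketch/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_titles(namespace, fetch=fetch_json):
    """Yield each non-redirect title in one namespace, following continuation."""
    base = {
        "action": "query",
        "format": "json",
        "list": "allpages",
        "apnamespace": namespace,
        "apfilterredir": "nonredirects",
        "aplimit": "max",
    }
    cont = {}  # no continuation values on the first request
    while True:
        reply = fetch({**base, **cont})
        for page in reply.get("query", {}).get("allpages", []):
            yield page["title"]
        if "continue" not in reply:
            break  # last batch: the API sent no continuation values
        # Pass back every continuation key exactly as the API returned it.
        cont = reply["continue"]
```

Usage would look like `for title in iter_titles(14): print(title)` for categories, or `iter_titles(0)` for articles. The fetch argument is only there so the loop can be exercised with canned replies instead of the live site.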

On Sat, May 6, 2017 at 5:25 PM, Abdulfattah Safa <fattah.s...@gmail.com>
wrote:

> 1. I'm using max as the limit parameter.
> 2. I'm not sure if the dumps have the data I need. I need to get the titles
> of all Articles (namespace = 0), without redirects, and also the
> titles of all Categories (namespace = 14), without redirects.
>
> On Sat, May 6, 2017 at 11:39 PM Eran Rosenthal <eranro...@gmail.com>
> wrote:
>
> > 1. You can use the limit parameter to get more titles in each request.
> > 2. For getting many entries, it is recommended to extract them from the
> > dumps or from the database using Quarry.
> >
> > On May 6, 2017 22:36, "Abdulfattah Safa" <fattah.s...@gmail.com> wrote:
> >
> > > As for the & in $Continue=-||, it's a typo. It doesn't exist in the code.
> > >
> > > On Sat, May 6, 2017 at 10:12 PM Abdulfattah Safa <fattah.s...@gmail.com>
> > > wrote:
> > >
> > > > I'm trying to get all the page titles in Wikipedia in a namespace using
> > > > the API as follows:
> > > >
> > > > https://en.wikipedia.org/w/api.php?action=query&format=xml&list=allpages&apnamespace=0&apfilterredir=nonredirects&aplimit=max&$continue=-||$apcontinue=BASE_PAGE_TITLE
> > > >
> > > > I keep requesting this URL and checking whether the response contains a
> > > > continue tag. If it does, I send the same request but change
> > > > *BASE_PAGE_TITLE* to the value of the apcontinue attribute in the response.
> > > > My application has been running for 3 days and the number of retrieved
> > > > titles exceeds 30M, whereas it is about 13M in the dumps.
> > > > Any idea?
> > > >
> > > >
> > > >
> > > _______________________________________________
> > > Wikitech-l mailing list
> > > Wikitech-l@lists.wikimedia.org
> > > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
