O. O. wrote:
>       I am looking at the dump of the English Wikipedia at 
> http://download.wikimedia.org/enwiki/20081008/ There is a file called 
> “all-titles-in-ns0.gz” which is supposed to contain the List of Page 
> Titles.  If I do
> 
> cat enwiki-20081008-all-titles-in-ns0 | wc -l
> 
> I get 5716820. On the same page, a little above in 
> “pages-articles.xml.bz2” we have “enwiki 7649051 pages”.
> 
> So why are these two numbers different? Are there pages without a Title?

The description of pages-articles.xml.bz2 says "Articles, templates, 
image descriptions, and primary meta-pages."  The all-titles-in-ns0 
file lists only namespace 0 (the article namespace), so presumably the 
remaining 7649051 - 5716820 = 1932231 non-article pages are the 
"templates, image descriptions, and primary meta-pages".
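
For anyone who wants to check the figures, here is a minimal Python sketch. 
The two constants are the numbers quoted above. The helper count_lines and 
its name are only illustrative (not part of any dump tooling), and it 
assumes the titles file has one title per line.

```python
import gzip

def count_lines(path):
    """Count lines in a gzip-compressed text file (assumed: one title per line)."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return sum(1 for _ in f)

# Figures quoted in this thread for the 20081008 enwiki dump.
TOTAL_PAGES = 7649051  # "enwiki 7649051 pages" for pages-articles.xml.bz2
NS0_TITLES = 5716820   # wc -l on all-titles-in-ns0

# Pages in pages-articles that are not in namespace 0.
non_article_pages = TOTAL_PAGES - NS0_TITLES
print(non_article_pages)
```

To recount the titles straight from the compressed download, something like 
count_lines("enwiki-20081008-all-titles-in-ns0.gz") should do, assuming the 
file is saved under that name.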

-- 
Ilmari Karonen

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
