Hi!

Thanks for noticing and sharing. Another known issue with the HTML dumps
is that categories and templates do not always seem to be extracted:
https://phabricator.wikimedia.org/T300124


Mitar

On Tue, Apr 5, 2022 at 12:59 PM Jan Berkel <j...@berkel.fr> wrote:
>
> Hello,
>
> just a heads-up for anyone using HTML dumps: apart from the missing
> namespaces issue already mentioned on this list, there also seem to be entire
> pages missing, and some of the included page data is outdated and does not
> contain the latest changes. I have no idea how many pages are affected.
>
> phabricator ticket with more details: 
> https://phabricator.wikimedia.org/T305407
>
>  – Jan
> _______________________________________________
> Xmldatadumps-l mailing list -- xmldatadumps-l@lists.wikimedia.org
> To unsubscribe send an email to xmldatadumps-l-le...@lists.wikimedia.org



-- 
http://mitar.tnode.com/
https://twitter.com/mitar_m
