Re: [Wikitech-l] [Potential Spoof] Question about wikidata dump bz2 file

2017-04-07 Thread Trung Dinh
Awesome, thanks Ariel.

On 4/6/17, 2:33 AM, "Wikitech-l on behalf of Ariel Glenn WMF" wrote:

Hi Trung,

For larger wikis, there will be a collection of partial files such as
these, where the pXXXpXXX indicate the first and last page ids in the
file.  But for pages-articles, there will also be a combined file
generated, so you'll be able to download that directly.  It's listed on the
download page https://dumps.wikimedia.org/wikidatawiki/20170401/ and the
direct link is as you expect:

https://dumps.wikimedia.org/wikidatawiki/20170401/wikidatawiki-20170401-pages-articles.xml.bz2

Please do consider joining the xmldatadumps-l list; changes and updates are
announced there, among other things.

Ariel

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] [Potential Spoof] Question about wikidata dump bz2 file

2017-04-06 Thread Ariel Glenn WMF
Hi Trung,

For larger wikis, there will be a collection of partial files such as
these, where the pXXXpXXX indicate the first and last page ids in the
file.  But for pages-articles, there will also be a combined file
generated, so you'll be able to download that directly.  It's listed on the
download page https://dumps.wikimedia.org/wikidatawiki/20170401/ and the
direct link is as you expect:
https://dumps.wikimedia.org/wikidatawiki/20170401/wikidatawiki-20170401-pages-articles.xml.bz2

Please do consider joining the xmldatadumps-l list; changes and updates are
announced there, among other things.

Ariel
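
The naming scheme described in this message can be sketched in Python. This is only an illustration: the helper names `page_id_range` and `combined_dump_url` are hypothetical, the partial-file name in the usage note is made up, and the exact position of the pXXXpXXX part within a real file name is an assumption. The combined URL simply follows the pattern of the direct link quoted above.

```python
import re

# Assumption: the page-id range appears as "-p<first>p<last>" in the file name.
PART_RE = re.compile(r"-p(\d+)p(\d+)")


def page_id_range(filename):
    """Return (first_page_id, last_page_id) for a partial file, else None."""
    match = PART_RE.search(filename)
    if match is None:
        return None  # e.g. the combined pages-articles file has no range
    return int(match.group(1)), int(match.group(2))


def combined_dump_url(wiki, date):
    """Build the combined pages-articles URL, following the link pattern above."""
    return (
        f"https://dumps.wikimedia.org/{wiki}/{date}/"
        f"{wiki}-{date}-pages-articles.xml.bz2"
    )
```

For example, `page_id_range("wikidatawiki-20170401-pages-articles1.xml-p1p235321.bz2")` (a made-up name) yields the pair of page ids, while the combined file name yields `None`.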

On Thu, Apr 6, 2017 at 10:12 AM, Jaime Crespo wrote:

> Trung,
>
> If you do not get an answer on the developers' forum, there is a
> dumps-focused mailing list at
> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
>
> Cheers,
>
> --
> Jaime Crespo

Re: [Wikitech-l] [Potential Spoof] Question about wikidata dump bz2 file

2017-04-06 Thread Jaime Crespo
Trung,

If you do not get an answer on the developers' forum, there is a
dumps-focused mailing list at
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l

Cheers,

On Thu, Apr 6, 2017 at 6:59 AM, Trung Dinh wrote:

> Sorry, I hit enter early by accident.
>
> I noticed that the dump file for Wikidata is no longer published in the format
> wikidatawiki-2017-pages-articles.xml.bz2.
> It is now split into several dumps:
> https://dumps.wikimedia.org/wikidatawiki/latest/wikidatawiki-latest-md5sums.txt
>
> I am wondering when this happened and what the rationale behind it is. Will it
> be permanent, or will we switch back to the original format soon?
>
> Thank you,
>
> Best regards,
>
> Trung

-- 
Jaime Crespo


Re: [Wikitech-l] [Potential Spoof] Question about wikidata dump bz2 file

2017-04-05 Thread Trung Dinh
Sorry, I hit enter early by accident.

I noticed that the dump file for Wikidata is no longer published in the format
wikidatawiki-2017-pages-articles.xml.bz2.
It is now split into several dumps:
https://dumps.wikimedia.org/wikidatawiki/latest/wikidatawiki-latest-md5sums.txt

I am wondering when this happened and what the rationale behind it is. Will it
be permanent, or will we switch back to the original format soon?

Thank you,

Best regards,

Trung
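
One way to see how the dump is split is to parse the md5sums listing linked above and pick out the pages-articles files. The following is a minimal sketch, assuming the listing uses the usual md5sum layout of one `<checksum>  <file name>` pair per line. The function names are hypothetical, and the sample lines (checksums and file names) are made up for illustration.

```python
def parse_md5sums(text):
    """Map file name -> checksum, assuming '<md5>  <file name>' on each line."""
    checksums = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:  # skip blank or unexpected lines
            digest, name = fields
            checksums[name] = digest
    return checksums


def pages_articles_files(checksums):
    """Return the sorted file names that belong to the pages-articles dump."""
    return sorted(name for name in checksums if "pages-articles" in name)


# Made-up sample in the assumed layout; real checksums and names will differ.
SAMPLE = """\
0123456789abcdef0123456789abcdef  wikidatawiki-latest-pages-articles1.xml-p1p1000.bz2
fedcba9876543210fedcba9876543210  wikidatawiki-latest-pages-articles.xml.bz2
00112233445566778899aabbccddeeff  wikidatawiki-latest-stub-articles.xml.gz
"""

if __name__ == "__main__":
    for name in pages_articles_files(parse_md5sums(SAMPLE)):
        print(name)
```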

On 4/5/17, 9:57 PM, "Wikitech-l on behalf of Trung Dinh" wrote:

Hi everyone,

I noticed that the dump file for Wikidata is no longer published in the format
wikidatawiki-2017-pages-articles.xml.bz2.

