[Xmldatadumps-l] Wikidata project and interwiki links removed in wiki text

2013-03-04 Thread François Bonzon
Hi,

I understand from http://www.wikidata.org/wiki/Wikidata:News that
- enwiki since February 13, 2013
- hewiki and itwiki since January 30, 2013
- huwiki since January 14, 2013
have migrated to the Wikidata project, and more wikis will follow shortly.

One consequence is that wiki markup for interwiki links (cross-language
links) is gradually being removed from articles, because the MediaWiki
software can now read them from the centralized Wikidata repository.

I verified in the latest huwiki dump that some articles indeed no longer
have interwiki links. Can you confirm the statements above?

How can I now extract interwiki links from dumps? Is there a separate
Wikidata dump I should download? What attributes should I look for to join
the Wikidata dump with the separate language wiki dumps? Thanks for your help.

-François
___
Xmldatadumps-l mailing list
Xmldatadumps-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l


Re: [Xmldatadumps-l] Wikidata project and interwiki links removed in wiki text

2013-03-04 Thread Federico Leva (Nemo)

François Bonzon, 04/03/2013 16:35:

> How can I now extract interwiki links from dumps? Is there a separate
> Wikidata dump I should download? What attributes should I look for to join
> Wikidata and separate language wiki dumps? Thanks for your help.


http://dumps.wikimedia.org/huwiki/20130224/huwiki-20130224-langlinks.sql.gz
https://www.mediawiki.org/wiki/Manual:Langlinks_table
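
A minimal sketch of reading such a langlinks SQL dump without loading it into MySQL, assuming each row has the three columns ll_from (source page id), ll_lang and ll_title in that order, as described on the Manual:Langlinks_table page. The function names are hypothetical, and the unescaping is deliberately simplified; check the CREATE TABLE statement at the top of the file before relying on the column order.

```python
import gzip
import re
from typing import Iterator, Tuple

# Matches one (ll_from,'ll_lang','ll_title') tuple in an INSERT ... VALUES list.
# Assumption: three columns in this order, strings single-quoted with
# backslash escapes, which is how mysqldump normally writes them.
ROW_RE = re.compile(r"\((\d+),'((?:[^'\\]|\\.)*)','((?:[^'\\]|\\.)*)'\)")


def unescape(value: str) -> str:
    """Drop the backslash in front of escaped characters (simplified:
    handles escaped quotes and backslashes, not control codes)."""
    return re.sub(r"\\(.)", r"\1", value)


def parse_insert_line(line: str) -> Iterator[Tuple[int, str, str]]:
    """Yield (ll_from, ll_lang, ll_title) for every row on one INSERT line."""
    if not line.startswith("INSERT INTO"):
        return
    for match in ROW_RE.finditer(line):
        yield int(match.group(1)), unescape(match.group(2)), unescape(match.group(3))


def read_langlinks(path: str) -> Iterator[Tuple[int, str, str]]:
    """Stream rows from a gzip-compressed langlinks SQL dump."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            yield from parse_insert_line(line)
```

Iterating over read_langlinks("huwiki-20130224-langlinks.sql.gz") would then give one tuple per language link, with ll_from being the id of the page on the local wiki.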

Nemo



[Xmldatadumps-l] About 20130228's wikidata.org dump failure

2013-03-04 Thread Jianyong Zhang
We are trying to build a snapshot of wikidata.org to play with, so we
depend on its dump.

I have checked the site http://dumps.wikimedia.org/wikidatawiki/.
The latest Wikidata dump is for 20130228, but it failed.

I'm wondering whether it is possible to fix this failure, or whether we
have to wait for the next dump.

http://meta.wikimedia.org/wiki/Data_dumps mentions:
Failures in the dump process are generally dealt with by rerunning the
portion of the dump that failed.

Doesn't that apply to this Wikidata failure? Will someone take care of it?

Thanks to everyone who spends time making the dumps such a great
data source :-)


Re: [Xmldatadumps-l] Wikidata project and interwiki links removed in wiki text

2013-03-04 Thread François Bonzon
Thanks Nemo.

I confirm that I now see interwiki language links originating from Wikidata
in the languagewiki-date-langlinks.sql.gz dumps, in the format described in
the second link you sent. However, this is a MySQL dump, not an XML dump.

So language links are no longer available in the XML data dumps?
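
One way to combine the two kinds of dump is to join on the page id: ll_from in the langlinks table is the id of the source page, which also appears as the id element directly under each page element in the XML dump. The sketch below assumes that layout and uses hypothetical function names; it ignores the export namespace version by comparing local tag names only.

```python
import io
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, Iterator, Tuple


def local(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def page_titles(xml_stream) -> Dict[int, str]:
    """Map page id -> title by streaming over <page> elements."""
    titles: Dict[int, str] = {}
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local(elem.tag) != "page":
            continue
        title = page_id = None
        # Only direct children: the revision id is nested one level deeper.
        for child in elem:
            name = local(child.tag)
            if name == "title":
                title = child.text
            elif name == "id":
                page_id = int(child.text)
        if title is not None and page_id is not None:
            titles[page_id] = title
        elem.clear()  # drop the page text once it has been read
    return titles


def join_langlinks(
    titles: Dict[int, str], rows: Iterable[Tuple[int, str, str]]
) -> Iterator[Tuple[str, str, str]]:
    """Yield (local title, target language, target title) for known pages."""
    for ll_from, ll_lang, ll_title in rows:
        if ll_from in titles:
            yield titles[ll_from], ll_lang, ll_title


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = (
        b'<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.8/">'
        b"<page><title>Budapest</title><ns>0</ns><id>42</id>"
        b"<revision><id>7</id><text>...</text></revision></page>"
        b"</mediawiki>"
    )
    titles = page_titles(io.BytesIO(sample))
    for row in join_langlinks(titles, [(42, "en", "Budapest")]):
        print(row)
```

For a real dump the stream would come from something like bz2.open() on the pages-articles file, and the rows from whatever reads the langlinks SQL dump.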


On Mon, Mar 4, 2013 at 4:45 PM, Federico Leva (Nemo) nemow...@gmail.com wrote:

> François Bonzon, 04/03/2013 16:35:
>
>> How can I now extract interwiki links from dumps? Is there a separate
>> Wikidata dump I should download? What attributes should I look for to join
>> Wikidata and separate language wiki dumps? Thanks for your help.
>
> http://dumps.wikimedia.org/huwiki/20130224/huwiki-20130224-langlinks.sql.gz
> https://www.mediawiki.org/wiki/Manual:Langlinks_table
>
> Nemo


Re: [Xmldatadumps-l] Wikidata project and interwiki links removed in wiki text

2013-03-04 Thread Nicolas Torzec
Indeed, that will be an issue for everyone who consumes Wikipedia data
automatically, especially as more structured data (e.g. infoboxes) will
eventually move from MediaWiki to Wikidata. DBpedia will have the same
issue at some point.

Nicolas.

--
Nicolas Torzec
Yahoo! Labs.


Re: [Xmldatadumps-l] Wikidata project and interwiki links removed in wiki text

2013-03-04 Thread Federico Leva (Nemo)

François Bonzon, 04/03/2013 18:22:

> I confirm that I now see interwiki language links originating from Wikidata
> in the languagewiki-date-langlinks.sql.gz dumps, in the format
> described in the second link you sent. However, this is a MySQL dump, not
> an XML dump.
>
> So language links are no longer available in the XML data dumps?


I guess not, except – probably – in the XML data dumps for Wikidata 
itself, in whatever weird format ContentHandler makes them into.


Nemo
