On 8/12/13 12:56 PM, Nicolas Torzec wrote:
> With respect to the RDF export I'd advocate for:
> 1) an RDF format with one fact per line.
> 2) the use of a mature/proven RDF generation framework.
Yes, keep it simple, use Turtle.
The additional benefit of Turtle is that it addresses a wide data consu…
With respect to the RDF export I'd advocate for:
1) an RDF format with one fact per line.
2) the use of a mature/proven RDF generation framework.
Optimizing too early based on a limited and/or biased view of the
potential use cases may not be a good idea in the long run.
I'd rather keep it simple…
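The "one fact per line" format advocated here is essentially N-Triples. As a minimal sketch of what that looks like (illustrative Python with made-up data; this is not the wda exporter):

```python
# Illustrative sketch: emitting RDF one fact (triple) per line, N-Triples
# style. Subjects and predicates here are example IRIs, not wda output.

def ntriples_line(subject, predicate, obj):
    """Format one triple as a single N-Triples line."""
    return f"<{subject}> <{predicate}> {obj} ."

triples = [
    ("http://www.wikidata.org/entity/Q42",
     "http://www.w3.org/2000/01/rdf-schema#label",
     '"Douglas Adams"@en'),
    ("http://www.wikidata.org/entity/Q42",
     "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "<http://www.wikidata.org/ontology#Item>"),
]

for s, p, o in triples:
    print(ntriples_line(s, p, o))
```

Because each line is a complete, independent statement, such a file can be split, sorted, grepped, or diffed with ordinary text tools.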
On 11/08/13 22:29, Tom Morris wrote:
> On Sat, Aug 10, 2013 at 2:30 PM, Markus Krötzsch
> <mar...@semantic-mediawiki.org> wrote:
> Anyway, if you restrict yourself to tools that are installed by
> default on your system, then it will be difficult to do many
> interesting things with a 4…
On Sat, Aug 10, 2013 at 2:30 PM, Markus Krötzsch
<mar...@semantic-mediawiki.org> wrote:
> Anyway, if you restrict yourself to tools that are installed by default on
> your system, then it will be difficult to do many interesting things with a
> 4.5G RDF file ;-) Seriously, the RDF dump is really…
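The point about tooling cuts the other way, too: when every triple sits on its own line, even a multi-gigabyte gzipped dump can be streamed and filtered without a full RDF parser. A sketch under that assumption (the file name and predicate below are placeholders, not actual wda output names):

```python
# Sketch: why one-triple-per-line matters for big dumps. With an N-Triples-
# style file you can stream and filter line by line, never holding the whole
# 4.5G file in memory. Path and predicate IRI are hypothetical examples.

import gzip

def matching_lines(path, predicate_iri):
    """Yield triple lines mentioning a predicate, one line at a time."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            if f"<{predicate_iri}>" in line:
                yield line.rstrip("\n")
```

Used as `for t in matching_lines("wikidata-statements.nt.gz", "http://www.w3.org/2000/01/rdf-schema#label"): ...`, memory use stays constant regardless of dump size.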
Hi Tom,
On 10/08/13 15:55, Tom Morris wrote:
> Given your "educating" people about software engineering principles,
> this may fall on deaf ears, but I too have a strong preference for the
> format with an independent line per triple.
No worries. The eventual RDF export of Wikidata will most certain…
Given your "educating" people about software engineering principles, this
may fall on deaf ears, but I too have a strong preference for the format
with an independent line per triple.
On Sat, Aug 10, 2013 at 8:35 AM, Markus Krötzsch
<markus.kroetz...@cs.ox.ac.uk> wrote:
> On 10/08/13 12:18, Seb…
Dear Sebastian,
On 10/08/13 12:18, Sebastian Hellmann wrote:
> Hi Markus!
> Thank you very much.
> Regarding your last email:
> Of course, I am aware of your arguments in your last email, that the
> dump is not "official". Nevertheless, I am expecting you and others to
> code (or supervise) similar RDF dum…
Hi Markus!
Thank you very much.
Regarding your last email:
Of course, I am aware of your arguments in your last email, that the
dump is not "official". Nevertheless, I am expecting you and others to
code (or supervise) similar RDF dumping projects in the future.
Here are two really important…
Good morning. I just found a bug that was caused by an error in the
Wikidata dumps (a value that should have been a URI was not). This led
to a few dozen lines with illegal qnames of the form "w: ". The updated
script fixes this.
Cheers,
Markus
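One way to guard against the kind of bad-value bug described above is to validate each term before emitting it as a URI. A hypothetical helper for that check (not part of the wda script), using only the standard library:

```python
# Sketch of a defensive check: before serializing a term as a URI, verify it
# actually looks like an absolute IRI. This helper is hypothetical, not code
# from the wda toolkit; it only illustrates the validation idea.

from urllib.parse import urlparse

def is_absolute_iri(value):
    """Return True if value has a scheme and is plausibly an absolute IRI."""
    if not isinstance(value, str) or not value:
        return False
    parts = urlparse(value)
    return bool(parts.scheme) and (bool(parts.netloc) or bool(parts.path))

# An empty or plain-text value is rejected instead of being serialized:
assert is_absolute_iri("http://www.wikidata.org/entity/Q42")
assert not is_absolute_iri("")
```

Rejecting (or logging) such values at export time would surface the broken dump entries instead of producing unparseable output.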
Hi Sebastian,
On 09/08/13 15:44, Sebastian Hellmann wrote:
> Hi Markus,
> we just had a look at your python code and created a dump. We are still
> getting a syntax error for the turtle dump.
You mean "just" as in "at around 15:30 today" ;-)? The code is under
heavy development, so changes are quit…
From: Sebastian Hellmann
Sent: Friday, August 9, 2013 10:44 AM
To: Discussion list for the Wikidata project.
Cc: Dimitris Kontokostas; Jona Christopher Sahnwaldt
Subject: Re: [Wikidata-l] Wikidata RDF export available
Hi Markus,
we just had a look at your python code and created a dump. We are still
getting a synta…
Hi Markus,
we just had a look at your python code and created a dump. We are still
getting a syntax error for the turtle dump.
I saw that you did not use a mature framework for serializing the
Turtle. Let me explain the problem:
Over the last 4 years, I have seen about two dozen people (und…
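A classic trap for hand-rolled Turtle serializers is string-literal escaping, which is plausibly the class of bug behind the syntax errors discussed here (an assumption about the failure mode, not taken from the wda code). An illustrative escaping helper:

```python
# Sketch of the usual hand-rolled-serializer pitfall: Turtle requires at
# least backslash, double quote, newline, and carriage return to be escaped
# inside a double-quoted literal. This helper is illustrative only.

def turtle_string_literal(text):
    """Serialize a Python string as a double-quoted Turtle literal."""
    escaped = (text.replace("\\", "\\\\")   # backslash first, to avoid
               .replace('"', '\\"')          # double-escaping the others
               .replace("\n", "\\n")
               .replace("\r", "\\r")
               .replace("\t", "\\t"))
    return f'"{escaped}"'
```

A mature framework handles this (plus IRI escaping, numeric literals, language tags) for free, which is the argument being made in this message.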
Markus Krötzsch, 04/08/2013 17:35:
Are you sure? The file you linked has mappings from site ids to language
codes, not from language codes to language codes. Do you mean to say:
"If you take only the entries of the form 'XXXwiki' in the list, and
extract a language code from the XXX, then you get…
On 04/08/13 13:17, Federico Leva (Nemo) wrote:
Markus Krötzsch, 04/08/2013 12:32:
* Wikidata uses "be-x-old" as a code, but MediaWiki messages for this
language seem to use "be-tarask" as a language code. So there must be a
mapping somewhere. Where?
Where I linked it.
Are you sure? The file…
Markus Krötzsch, 04/08/2013 12:32:
* Wikidata uses "be-x-old" as a code, but MediaWiki messages for this
language seem to use "be-tarask" as a language code. So there must be a
mapping somewhere. Where?
Where I linked it.
* MediaWiki's http://www.mediawiki.org/wiki/Manual:$wgDummyLanguageCode…
Let me top-post a question to the Wikidata dev team:
Where can we find documentation on what the Wikidata internal language
codes actually mean? In particular, how do you map the language selector
to the internal codes? I noticed some puzzling details:
* Wikidata uses "be-x-old" as a code, bu…
Markus Krötzsch, 03/08/2013 15:48:
(3) Limited language support. The script uses Wikidata's internal
language codes for string literals in RDF. In some cases, this might not
be correct. It would be great if somebody could create a mapping from
Wikidata language codes to BCP47 language codes (let…
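The requested mapping could start as a simple lookup table. Only the "be-x-old" → "be-tarask" pair comes from this thread; every other entry would have to be researched, so the sketch below defaults to passing codes through unchanged:

```python
# Sketch of the Wikidata-internal -> BCP 47 mapping the thread asks for.
# Only the "be-x-old" case is taken from the discussion; a real table would
# need to cover every code Wikidata uses for its labels and descriptions.

WIKIDATA_TO_BCP47 = {
    "be-x-old": "be-tarask",  # the example raised in this thread
}

def bcp47_tag(wikidata_code):
    """Map a Wikidata-internal code to a BCP 47 tag, defaulting to itself."""
    return WIKIDATA_TO_BCP47.get(wikidata_code, wikidata_code)
```

Defaulting to the identity mapping is a design choice: most Wikidata codes already are valid BCP 47 tags, so only the known exceptions need entries.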
Update: the first bugs in the export have already been discovered -- and
fixed in the script on github. The files I uploaded will be updated on
Monday when I have a better upload again (the links file should be fine,
the statements file requires a rather tolerant Turtle string literal
parser, a…
Hi,
I am happy to report that an initial, yet fully functional RDF export
for Wikidata is now available. The exports can be created using the
wda-export-data.py script of the wda toolkit [1]. This script downloads
recent Wikidata database dumps and processes them to create RDF/Turtle
files. V…