Hi Ibu,
Thank you for your feedback.
To simplify parsing of the files, I removed the following characters from all
literals: "\" { } | , \n". Any quotes in the URIs are escaped as '""'. Also,
no URI starts with "{" and ends with "}", so there is no need to escape
"{ } |" inside the URIs.
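
As an illustration of these conventions, here is a minimal Python sketch of how
a row could be read. It assumes that multi-value cells have the form '{a|b|c}'
and that quotes are doubled in the usual CSV way; the function names
split_multi_value and parse_rows and the example.org URI are hypothetical and
not part of the tool.

```python
import csv
import io


def split_multi_value(cell):
    """Split a '{a|b|c}' cell into its values; wrap any other cell in a list."""
    if len(cell) >= 2 and cell.startswith("{") and cell.endswith("}"):
        return cell[1:-1].split("|")
    return [cell]


def parse_rows(text):
    """Parse CSV text; csv.reader already turns a doubled quote into one quote."""
    reader = csv.reader(io.StringIO(text))
    return [[split_multi_value(cell) for cell in row] for row in reader]


# Made-up sample row: a URI containing an escaped quote, a multi-value
# cell and a plain literal.
sample = '"http://example.org/a""b","{1|2|3}","plain"\n'
rows = parse_rows(sample)
```

Because the separator characters are removed from literals, a plain split on
"|" should be enough here; a URI cell is only treated as multi-valued if it
both starts with "{" and ends with "}".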
I apologize for those two incorrectly parsed files. I fixed them a couple of
days ago, so please download them again.
Regards,
Petar
-----Original Message-----
From: ibu ☉ radempa ䷰ [mailto:[email protected]]
Sent: Wednesday, November 27, 2013 10:00 PM
To: [email protected]
Subject: Re: [Dbpedia-discussion] DBpedia as Tables release
On 11/25/2013 02:18 PM, Petar Ristoski wrote:
> We are happy to announce the first version of the DBpedia as Tables
> tool [1].
> Any feedback is welcome!
> [1] http://wiki.dbpedia.org/DBpediaAsTables
Thanks Petar,
your CSV files are really helpful.
For all who want to import the data into PostgreSQL, I've written a Python
script which automatically creates the SQL corresponding to the CSV:
https://gitorious.org/dbpedia_csv2sql/dbpedia_csv2sql
The column types (often arrays) are inferred from your headers and the data
rows; indexes are also created.
(If people here find this script useful, I could also package it for PyPI and
improve the documentation a bit.)
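
To give an idea of what inferring column types from the data rows can look
like, here is a simplified sketch. It is not the code of the script linked
above: the function name infer_column_type, the regular expressions and the
choice of bigint / double precision / text are assumptions made only for this
example, and it looks at the data values only, not at the headers.

```python
import re

# Assumed patterns for this sketch; real data may need more cases
# (dates, booleans, ...).
INT_RE = re.compile(r"[+-]?[0-9]+")
FLOAT_RE = re.compile(r"[+-]?([0-9]+\.[0-9]*|\.[0-9]+|[0-9]+)([eE][+-]?[0-9]+)?")


def infer_column_type(cells):
    """Guess a PostgreSQL type for one column from its raw cell strings."""
    is_array = False
    values = []
    for cell in cells:
        if len(cell) >= 2 and cell.startswith("{") and cell.endswith("}"):
            # A multi-value cell such as '{1|2|3}' makes the column an array.
            is_array = True
            values.extend(cell[1:-1].split("|"))
        else:
            values.append(cell)
    non_empty = [v for v in values if v != ""]
    if non_empty and all(INT_RE.fullmatch(v) for v in non_empty):
        base = "bigint"
    elif non_empty and all(FLOAT_RE.fullmatch(v) for v in non_empty):
        base = "double precision"
    else:
        base = "text"
    return base + "[]" if is_array else base
```

For example, infer_column_type(["1", "{2|3}"]) gives "bigint[]", while a
column with any non-numeric value falls back to "text".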
I was assuming that your files are encoded in UTF-8, which worked, but I didn't
find either a '""' or a '\"' inside a field value, so I don't know how a '"'
would be encoded, if there were one. Also for a multi-value field (e.g.
'{1|2|3}') I don't know how '{', '|' and '}' are encoded, if they appear within
one of the values. - Maybe you could add some documentation on that.
In your data I found 2 format problems (I don't think my download went wrong,
but anyway, a checksum might be helpful):
* Film.csv seems to have no headers (it has 20004 lines for me).
* Aircraft.csv: the second-to-last row (
"http://dbpedia.org/resource/Marinens_Flyvebaatfabrikk_M.F.10"
) has too many columns.
All other files (except owl#Thing.csv and Agent.csv, which I didn't check due
to size and column number) were ok.
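
A check along these lines can be sketched in a few lines of Python. This is
only an illustration, not the exact code I ran: the function names
find_bad_rows and sha256_of are made up for this example, and SHA-256 is just
one possible choice of checksum.

```python
import csv
import hashlib
import io


def find_bad_rows(lines):
    """Return (line number, column count) for rows that differ from the header."""
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:
        return []
    expected = len(header)
    # reader.line_num is the number of the last source line read for the row.
    return [(reader.line_num, len(row)) for row in reader if len(row) != expected]


def sha256_of(data):
    """Hex SHA-256 digest of a bytes object, to compare against a published one."""
    return hashlib.sha256(data).hexdigest()


# Made-up sample: the third line has one column too many.
sample = "a,b\n1,2\n1,2,3\n"
bad = find_bad_rows(io.StringIO(sample))
```

Note that this only compares each row against the first row, so a file whose
header line is missing would need a separate check (e.g. whether the first
cell of the first row is already a resource URI).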
I also noticed another thing, not concerning your tool, where some parser
could perhaps be improved:
http://dbpedia.org/resource/Americas
has language="American (but see [[#English usage"
Regards,
ibu
------------------------------------------------------------------------------
Rapidly troubleshoot problems before they affect your business. Most IT
organizations don't have a clear picture of how application performance affects
their revenue. With AppDynamics, you get 100% visibility into your Java,.NET, &
PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro!
http://pubads.g.doubleclick.net/gampad/clk?id=84349351&iu=/4140/ostg.clktrk
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion