Hi Ibu,

Thank you for your feedback.
To simplify parsing of the files, I removed the following characters from all literals: "\" { } | , \n". If there are quotes in the URIs, they are escaped as '""'. Also, no URI starts with "{" and ends with "}", so there is no need to escape "{ } |" inside the URIs.

I apologize for those two incorrectly parsed files. I fixed them a couple of days ago, so please download them again.

Regards,
Petar

-----Original Message-----
From: ibu ☉ radempa ䷰ [mailto:i...@radempa.de]
Sent: Wednesday, November 27, 2013 10:00 PM
To: dbpedia-discussion@lists.sourceforge.net
Subject: Re: [Dbpedia-discussion] DBpedia as Tables release

On 11/25/2013 02:18 PM, Petar Ristoski wrote:
> We are happy to announce the first version of the DBpedia as Tables
> tool [1].
> Any feedback is welcome!
> [1] http://wiki.dbpedia.org/DBpediaAsTables

Thanks Petar, your CSV files are really helpful.

For all who want to import data into PostgreSQL, I've written a Python script which automatically creates the SQL corresponding to the CSV:

https://gitorious.org/dbpedia_csv2sql/dbpedia_csv2sql

The column types (often arrays) are inferred from your headers and the data rows; indexes are also created. (If people here find this script useful, I could also package it for PyPI and improve the documentation a bit.)

I was assuming that your files are encoded in UTF-8, which worked, but I didn't find either a '""' or a '\"' inside a field value, so I don't know how a '"' would be encoded if there were one. Also, for a multi-value field (e.g. '{1|2|3}') I don't know how '{', '|' and '}' are encoded if they appear within one of the values. Maybe you could add some documentation on that.

In your data I found 2 format problems (I don't think my download went wrong, but anyway, a checksum might be helpful):

* Film.csv seems to have no headers (it has 20004 lines for me).
* Aircraft.csv: the 2nd last row ( "http://dbpedia.org/resource/Marinens_Flyvebaatfabrikk_M.F.10" ) has too many columns.
All other files (except owl#Thing.csv and Agent.csv, which I didn't check due to size and column number) were ok.

I also noticed another thing, not concerning your tool, where some parser maybe could be optimized:

http://dbpedia.org/resource/Americas has language="American (but see [[#English usage"

Regards,
ibu

------------------------------------------------------------------------------
Rapidly troubleshoot problems before they affect your business. Most IT
organizations don't have a clear picture of how application performance
affects their revenue. With AppDynamics, you get 100% visibility into your
Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro!
http://pubads.g.doubleclick.net/gampad/clk?id=84349351&iu=/4140/ostg.clktrk
_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
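
As a rough illustration of the field conventions discussed in this thread (a '"' escaped as '""', and multi-value fields written like '{1|2|3}'), a reader could be sketched as below. This is only a sketch under assumptions: it relies on the statements above that literals contain none of the characters { } | and that no single URI both starts with "{" and ends with "}". The function names split_multi_value and parse_rows and the sample row are hypothetical; they are not taken from the DBpedia as Tables tool or from the dbpedia_csv2sql script.

```python
import csv
import io


def split_multi_value(field):
    """Return the values held in one CSV field as a list.

    Assumption (from the thread): a field that starts with '{' and ends
    with '}' is multi-valued, with '|' separating the individual values.
    Any other field is returned as a one-element list.
    """
    if len(field) >= 2 and field.startswith("{") and field.endswith("}"):
        return field[1:-1].split("|")
    return [field]


def parse_rows(text):
    """Parse CSV text into rows of fields, each field a list of values.

    Python's csv module treats a doubled quote ("") inside a quoted
    field as one literal quote by default, which matches the '""'
    escaping described above.
    """
    reader = csv.reader(io.StringIO(text))
    return [[split_multi_value(field) for field in row] for row in reader]


# Hypothetical sample row, not taken from the published files.
sample = '"http://example.org/a ""b""","{1|2|3}","plain"\n'
rows = parse_rows(sample)
print(rows[0])
```

The sketch does not treat header rows specially and does not convert values to typed columns; both would have to be handled separately when importing the real files.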