Hi Andrea,

there used to be encoding problems, but I think they have all been fixed since the 3.8 release. I tried very hard to make TurtleEscaper do the right thing and checked the relevant standards while doing so. Could you give an example where Jena complains about a DBpedia 3.8 file?

Cheers,
JC

On Wed, Mar 20, 2013 at 6:16 PM, Andrea Di Menna <ninn...@gmail.com> wrote:
> Hi,
>
> I have been using Stanbol [1] to process DBpedia data files and build a
> dbpedia Solr index.
> Stanbol is using Jena TDB in order to load DBpedia files into a triple
> store.
> Unfortunately, almost all the DBpedia N-Triples files must be pre-processed
> before being able to import them using Jena [2].
>
> The following sed command is launched:
>
> sed 's/\\\\/\\u005c\\u005c/g;s/\\\([^u"]\)/\\u005c\1/g'
>
> Basically the backslash is replaced with the unicode character escape
> sequence.
>
> Do you think this should/could be fixed in
> org.dbpedia.extraction.util.TurtleEscaper#escapeTurtle ?
>
> Cheers
> Andrea
>
> [1] http://stanbol.apache.org/
> [2]
> http://svn.apache.org/repos/asf/stanbol/trunk/entityhub/indexing/dbpedia/dbpedia-3.8/fetch_data_en_int.sh
>
> ------------------------------------------------------------------------------
> Everyone hates slow websites. So do we.
> Make your web apps faster with AppDynamics
> Download AppDynamics Lite for free today:
> http://p.sf.net/sfu/appdyn_d2d_mar
> _______________________________________________
> Dbpedia-discussion mailing list
> Dbpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
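
For anyone reading this thread later, here is a rough Python sketch of what the quoted sed command does, so the two substitutions are easier to follow. The function name `escape_backslashes` is made up for illustration; it is not part of Stanbol, Jena, or the DBpedia extraction framework, and it only mirrors the sed expression as quoted in this thread.

```python
import re


def escape_backslashes(line: str) -> str:
    """Mirror the two sed substitutions on a single line of text.

    Hypothetical helper, written only to illustrate the quoted sed command.
    """
    # Step 1: every pair of backslashes becomes two \u005c escapes.
    # A function is used as the replacement so that Python's re module
    # does not try to interpret "\u" inside a replacement template.
    line = re.sub(r'\\\\', lambda m: r'\u005c\u005c', line)

    # Step 2: a backslash followed by anything other than 'u' or '"'
    # becomes \u005c followed by that character. The newline is excluded
    # as well, because sed never sees the line terminator inside a line.
    line = re.sub(r'\\([^u"\n])', lambda m: r'\u005c' + m.group(1), line)
    return line


if __name__ == "__main__":
    import sys

    # Filter stdin to stdout line by line, as sed would.
    for raw in sys.stdin:
        sys.stdout.write(escape_backslashes(raw))
```

Note that `\"` and `\uXXXX` sequences pass through unchanged, while sequences such as `\n` or `\t` are rewritten so the backslash itself is expressed as `\u005c`.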