Re: Help creating a custom vocabulary

2012-10-11 Thread Rupert Westenthaler
Hi René, BTW I finished the work on STANBOL-765 today. See the first comment for the documentation on how to enable indexing of BNodes. best Rupert On Thu, Oct 11, 2012 at 10:54 PM, Rene Nederhand wrote: > Hi Rupert, > > Thank you very much for all the work. I'd expected this would take much > long

Re: Help creating a custom vocabulary

2012-10-11 Thread Rene Nederhand
Hi Rupert, Thank you very much for all the work. I'd expected this would take much longer :) Probably this weekend, I will try to get some of the CommonCrawl data imported into Stanbol and see how this works out. In addition, I will try the Apache any23 tool (thx. A. Soroka). Best, René On Wed

Re: Help creating a custom vocabulary

2012-10-10 Thread Rupert Westenthaler
Hi Rene, With STANBOL-764 the indexing tool now supports importing quads. However, you will still have problems working with the CommonCrawl data. 1. A lot of the data uses BNodes, and those are ignored by the Entityhub. As indexing of BNodes was already requested several times from I cre

Re: Help creating a custom vocabulary

2012-10-09 Thread aj...@virginia.edu
This may or may not be immediately useful to you, but the Apache Any23 tool: https://any23.apache.org/ will parse N-quads and output N-triples. I haven't used it for that purpose (haven't had to) but I've used it for other purposes and it works well. --- A. Soroka Software & Systems Engineering

Re: Help creating a custom vocabulary

2012-10-09 Thread Rene Nederhand
Hi Rupert, It would be great if we could make it possible to use CommonCrawl data, even if we would lose some information. If I remember correctly, this was one of the requests that came up quite frequently in the validation reports. Freebase is an alternative. So, if this involves importing N-quads th

Re: Help creating a custom vocabulary

2012-10-09 Thread Rupert Westenthaler
Hi Rene, The problem is that the files of this dataset use N-Quads and not N-Triples (basically SPOC (Subject, Predicate, Object, Context) instead of SPO). I can try to add support for importing N-Quads, but because the importing tool does not use named graphs you might even then lose some quad
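
To illustrate the SPOC versus SPO difference described above, here is a minimal Python sketch that turns an N-Quads line into an N-Triples line by dropping the fourth (context) term. It is not how the Stanbol indexing tool or Any23 does it; the function name quad_to_triple and the simplified term pattern are assumptions made for this example, and the context information is discarded in the process.

```python
import re

# Simplified term pattern (an assumption for this sketch, not a full
# N-Quads grammar): IRIs, blank node labels, and quoted literals with an
# optional datatype or language tag.
TERM = re.compile(
    r'<[^>]*>'                                             # IRI
    r'|_:\S+'                                              # blank node label
    r'|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?'   # literal
)


def quad_to_triple(line):
    """Return the N-Triples form of one N-Quads line, or None for
    blank lines and comments. The context (4th term) is dropped."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    if not stripped.endswith("."):
        raise ValueError("statement does not end with '.': %r" % line)
    # Remove the terminating dot, then collect the terms in order.
    terms = TERM.findall(stripped[:-1])
    if len(terms) not in (3, 4):
        raise ValueError("expected 3 or 4 terms, got %d" % len(terms))
    # Keep subject, predicate, object; ignore the context if present.
    return " ".join(terms[:3]) + " ."


if __name__ == "__main__":
    quad = '<http://example.org/s> <http://example.org/p> "hi there"@en <http://example.org/g> .'
    print(quad_to_triple(quad))
```

A line that already has only three terms passes through unchanged, so the same function can be applied to mixed input.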

Help creating a custom vocabulary

2012-10-09 Thread Rene Nederhand
Hi, I am trying to create a custom vocabulary using webdatacommonsRDFa data [1]. To do this I am following this tutorial [2]. I've installed the indexer tool without any problems, editing the config file and