Custom dbpedia indexing without Entity Scores

Alex Lopez Thu, 03 Nov 2011 04:07:02 -0700

Hi stanbolers,

I'm in the middle of creating a custom DBpedia index for Stanbol, using some 24 dumps from DBpedia 3.7 (English and Portuguese) and some custom mappings (specifically, some special handling for Portuguese text plus some additional properties I'd like to see indexed).


I'm following this file:

http://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/indexing/dbpedia/README.md

and this script for processing the broken images_en file:

http://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/indexing/dbpedia/fetch_prepare.sh

The process went well up to the point where all triples (some 80M) were loaded into TDB.


The problem is that the process stops after that and outputs:

Exception in thread "Thread-3" java.lang.IllegalStateException: The file with the Entity Scores is missing
    at org.apache.stanbol.entityhub.indexing.core.source.LineBasedEntityIterator.initialise(LineBasedEntityIterator.java:424)

...

10:12:22,077 [Thread-2] INFO solryard.SolrYardIndexingDestination - ... create SolrYard


And nothing more happens.

Of course, the file is missing because I didn't need it, since I want to index all entities. I tried to generate it anyway once, but after a long time of processing it failed with an out-of-memory exception (I think in the sorting step).

Is there a way to instruct the indexer to ignore the Entity Scores file? Or to write a simple one that says "all entities are to be indexed"?


Thanks, I can send the complete log if it is needed.
Best,
Alex
