Hi Rupert,
Thanks for the prompt reply.
When I checked incoming_links.txt, the final lines were as follows:
1 m.0___xpc
1 m.0___xk_
1 m.0___ttg
1 m.0___t6s
1 m.0___t6h
1 m.0___t5v
1 m.0___t5c
1 m.0___rw7
1 m.0___qhn
1 m.0___p3v
1 m.0___nm5
1 m.0___n4s
1 m.0___n
1 m.0___jk_
1 m.0___hv4
1 m.0___c6k
1 m.0___b4g
1 m.0___8
1 m.0___7yv
1 m.0___2fw
1 m.0____
So the entities are not preceded by a namespace prefix. Therefore, when calling
String prefix = NamespaceMappingUtils.getPrefix(entity);
(LineBasedEntityIterator.parseEntityFormLine(String line), line 425), prefix is
assigned an empty String.
Is it correct to define a mapping for the empty namespace prefix as follows in
namespaceprefix.mappings?
fb http://rdf.freebase.com/ns/
ns http://rdf.freebase.com/ns/
key http://rdf.freebase.com/key/
http://rdf.freebase.com/ns/
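
To make the question concrete, here is a minimal, self-contained sketch of the behaviour I think I am seeing. It does not use the Stanbol classes: EmptyPrefixSketch and expand are hypothetical names, and the plain Map only stands in for the namespace prefix service. It assumes that an entity without a ':' gets the empty prefix and that an unmapped prefix resolves to null, which is then passed to the StringBuilder constructor. Whether the real mappings file accepts an empty prefix is exactly what I am unsure about.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-alone illustration only; none of this is the Stanbol API. */
final class EmptyPrefixSketch {

    /**
     * Expands "prefix:local" (or a bare local name) to a full URI.
     * Assumption: an entity without ':' is treated as having the empty prefix "".
     * Full URIs such as "http://..." are not handled by this sketch.
     */
    static String expand(String entity, Map<String, String> mappings) {
        int idx = entity.indexOf(':');
        String prefix = idx < 0 ? "" : entity.substring(0, idx);
        String local = entity.substring(idx + 1); // whole string when idx == -1
        String namespace = mappings.get(prefix);  // null when the prefix is unmapped
        // new StringBuilder((String) null) throws a NullPointerException,
        // which matches the top frame of the stack trace quoted below.
        return new StringBuilder(namespace).append(local).toString();
    }

    public static void main(String[] args) {
        Map<String, String> mappings = new HashMap<>();
        mappings.put("fb", "http://rdf.freebase.com/ns/");
        mappings.put("ns", "http://rdf.freebase.com/ns/");
        mappings.put("key", "http://rdf.freebase.com/key/");

        try {
            expand("m.0___xpc", mappings);
        } catch (NullPointerException e) {
            System.out.println("no mapping for the empty prefix");
        }

        // Adding a mapping for the empty prefix avoids the null namespace.
        mappings.put("", "http://rdf.freebase.com/ns/");
        System.out.println(expand("m.0___xpc", mappings));
        // prints "http://rdf.freebase.com/ns/m.0___xpc"
    }
}
```

In this sketch the failure disappears once the empty prefix is mapped, which is why I would like to know whether the mappings file supports the same thing.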
Thanks
Regards
Amindri
On 9 February 2015 at 17:52, Rupert Westenthaler <[email protected]> wrote:
> Hi Amindri
>
> Based on the code the NPE could originate from a namespace prefix
> unknown to the namespace prefix service.
>
> Can you please check the data of the "incoming_links.txt" file against
> the mappings defined in the "indexing/config/namespaceprefix.mappings"
> file. My guess is that the "incoming_links.txt" uses a prefix that is
> not defined in the mappings file.
>
> It is recommended to explicitly define namespace prefix mappings for
> all namespaces used by the indexing process (config data and rdf
> data). For missing mappings http://prefix.cc/ is used as a fallback.
>
> best
> Rupert
>
>
> On Mon, Feb 9, 2015 at 7:33 AM, Amindri Udugala
> <[email protected]> wrote:
> > Hi All,
> >
> > I need to create an index from a Freebase data dump. So I followed the
> > instructions in the README file in entityhub\indexing\freebase.
> >
> > First I executed java -jar
> > org.apache.stanbol.entityhub.indexing.freebase-1.0.0-SNAPSHOT.jar init
> > to generate the folder structure. The folder structure was successfully
> > generated except for the following warnings:
> >
> > 16:16:20,530 [main] WARN impl.NamespacePrefixProviderImpl - Invalid Namespace Mapping: prefix 'nsogi' valid , namespace 'http://prefix.cc/nsogi:' invalid -> mapping ignored!
> > 16:16:21,279 [main] WARN impl.NamespacePrefixProviderImpl - Invalid Namespace Mapping: prefix 'category' valid , namespace 'http://dbpedia.org/resource/Category:' invalid -> mapping ignored!
> > 16:16:21,435 [main] WARN impl.NamespacePrefixProviderImpl - Invalid Namespace Mapping: prefix 'chebi' valid , namespace 'http://bio2rdf.org/chebi:' invalid -> mapping ignored!
> > 16:16:21,435 [main] WARN impl.NamespacePrefixProviderImpl - Invalid Namespace Mapping: prefix 'hgnc' valid , namespace 'http://bio2rdf.org/hgnc:' invalid -> mapping ignored!
> > 16:16:21,450 [main] WARN impl.NamespacePrefixProviderImpl - Invalid Namespace Mapping: prefix 'dbptmpl' valid , namespace 'http://dbpedia.org/resource/Template:' invalid -> mapping ignored!
> > 16:16:21,638 [main] WARN impl.NamespacePrefixProviderImpl - Invalid Namespace Mapping: prefix 'pubmed' valid , namespace 'http://bio2rdf.org/pubmed_vocabulary:' invalid -> mapping ignored!
> > 16:16:21,638 [main] WARN impl.NamespacePrefixProviderImpl - Invalid Namespace Mapping: prefix 'dbc' valid , namespace 'http://dbpedia.org/resource/Category:' invalid -> mapping ignored!
> > 16:16:21,638 [main] WARN impl.NamespacePrefixProviderImpl - Invalid Namespace Mapping: prefix 'dbt' valid , namespace 'http://dbpedia.org/resource/Template:' invalid -> mapping ignored!
> > 16:16:21,638 [main] WARN impl.NamespacePrefixProviderImpl - Invalid Namespace Mapping: prefix 'dbrc' valid , namespace 'http://dbpedia.org/resource/Category:' invalid -> mapping ignored!
> > 16:16:21,809 [main] WARN impl.NamespacePrefixProviderImpl - Invalid Namespace Mapping: prefix 'call' valid , namespace 'http://webofcode.org/wfn/call:' invalid -> mapping ignored!
> > 16:16:21,809 [main] WARN impl.NamespacePrefixProviderImpl - Invalid Namespace Mapping: prefix 'affymetrix' valid , namespace 'http://bio2rdf.org/affymetrix_vocabulary:' invalid -> mapping ignored!
> >
> > Then I copied the Freebase dump (freebase-rdf-latest.gz) to the
> > indexing/resources/rdfdata folder
> > and the incoming_links.txt file, generated by fbrankings-uri.sh to
> > indexing/resources folder and executed the indexing process. (I used all
> > the default config files)
> >
> > While executing the index process I noticed the following log.
> >
> >
> > 16:38:40,806 [main] INFO core.IndexerFactory - - EntityDataIterable: null
> > 16:38:40,806 [main] INFO core.IndexerFactory - - EntityIterator: org.apache.stanbol.entityhub.indexing.core.source.LineBasedEntityIterator@1880249c
> > 16:38:40,806 [main] INFO core.IndexerFactory - - EntityDataProvider: org.apache.stanbol.entityhub.indexing.source.jenatdb.RdfIndexingSource@4e38a55
> > 16:38:40,806 [main] INFO core.IndexerFactory - - EntityScoreProvider: null
> >
> > Finally it threw a null pointer exception as follows
> >
> > 16:38:40,837 [Thread-3] INFO source.ResourceLoader - ... 1 files imported in 0 seconds
> > 16:38:40,837 [Thread-3] INFO source.ResourceLoader - Loding 0 File ...
> > 16:38:40,837 [Thread-3] INFO source.ResourceLoader - ... 0 files imported in 0 seconds
> > 16:38:42,912 [Thread-0] INFO solryard.SolrYardIndexingDestination - ... create SolrYard
> > 16:38:42,959 [main] INFO impl.IndexerImpl - ... delete existing IndexedEntityId file C:\cygwin64\home\User\code\stanbol_indexing\indexing\destination\indexed-entities-ids.zip
> > 16:38:42,974 [main] INFO impl.IndexerImpl - Initialisation completed
> > 16:38:42,974 [main] INFO impl.IndexerImpl - ... initialisation completed
> > 16:38:42,974 [main] INFO impl.IndexerImpl - start indexing ...
> > 16:38:42,974 [main] INFO impl.IndexerImpl - Indexing started ...
> > Exception in thread "Indexing: Entity Source Reader Deamon" java.lang.NullPointerException
> >     at java.lang.StringBuilder.<init>(Unknown Source)
> >     at org.apache.stanbol.entityhub.indexing.core.source.LineBasedEntityIterator.parseEntityFormLine(LineBasedEntityIterator.java:435)
> >     at org.apache.stanbol.entityhub.indexing.core.source.LineBasedEntityIterator.getNext(LineBasedEntityIterator.java:379)
> >     at org.apache.stanbol.entityhub.indexing.core.source.LineBasedEntityIterator.hasNext(LineBasedEntityIterator.java:356)
> >     at org.apache.stanbol.entityhub.indexing.core.impl.EntityIdBasedIndexingDaemon.run(EntityIdBasedIndexingDaemon.java:55)
> >     at java.lang.Thread.run(Unknown Source)
> >
> > I'm not sure if this happens because I haven't configured an important
> > property in a configuration file. I'm pretty new to Stanbol and any help
> > would be much appreciated.
> >
> > Thanks in advance.
> > --
> > Regards
> > Amindri Udugala
>
>
>
> --
> | Rupert Westenthaler [email protected]
> | Bodenlehenstraße 11 ++43-699-11108907
> | A-5500 Bischofshofen
> | REDLINK.CO
> ..........................................................................
> | http://redlink.co/
>
--
Regards
Amindri Udugala