Hi Rafa, 

Thanks for the hint, I will have a look at it!

Best,
Kata

-----Original Message-----
From: Rafa Haro [mailto:rh...@apache.org] 
Sent: Tuesday, January 12, 2016 12:21 PM
To: Lejtovicz, Katalin <katalin.lejtov...@oeaw.ac.at>; dev@stanbol.apache.org
Subject: Re: Re: question on working with custom vocabularies

Hi Kata,

Probably there is a problem with your linking configuration. The first
thing I would check would be if you have labels for the language identified
automatically by Stanbol.
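
A minimal sketch of that check, assuming the Entityhub "find" endpoint for
referenced sites (/entityhub/site/{siteId}/find with the "name" and "lang"
parameters, which should be verified against the documentation of your Stanbol
version). The base URL and the site id "myvocab" are hypothetical
placeholders. The script only builds the request URLs, one per language, so
you can open them in a browser or pass them to curl and compare which
languages actually return labels:

```python
from urllib.parse import urlencode


def build_find_url(base_url, site_id, name, lang=None, limit=10):
    """Build an Entityhub 'find' URL for a label search.

    Assumption: the site exposes /entityhub/site/{siteId}/find and accepts
    'name', 'lang' and 'limit' as query parameters.
    """
    params = {"name": name, "limit": limit}
    if lang is not None:
        # Restrict the search to labels tagged with this language.
        params["lang"] = lang
    base = base_url.rstrip("/")
    return f"{base}/entityhub/site/{site_id}/find?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical instance and site id; replace with your own values.
    for lang in ("en", "de", None):
        print(build_find_url("http://localhost:8080", "myvocab", "Berlin", lang))
```

If the request without a language returns your entity but the one for the
detected language does not, the labels in the index probably carry a different
language tag (or none) than the one Stanbol detected for the text.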

If your instance is available at a public URL, I could try to take a look
if you want.

Cheers,
Rafa

On Tue, Jan 12, 2016 at 11:36 AM Lejtovicz, Katalin <
katalin.lejtov...@oeaw.ac.at> wrote:

> Hi Rafa,
>
>
>
> Thanks for your reply!
>
> The index file was deleted and the new one copied to the datafiles folder,
> but on another machine I also tried out a new installation of Stanbol, and
> copied the index file over, and it didn’t work there either.
>
>
>
> The EntityHub works via the API; I checked that first when I noticed the
> problem. I get back my entities.
>
> Do you have any idea what the problem could be? The index contains
> the entities, and I can also query them via EntityHub, but the Solr query
> seems to be ‘incorrect’ (the first one, which is logged by Stanbol,
> doesn’t return any results, but the second one, which I tried out, works fine).
>
>
>
> Thanks in advance!
>
>
>
> Best regards,
>
> Kata
>
>
>
> >Hi Kata,
>
> >
>
> >Have you overwritten the old solr index in the datafiles folder or have you
>
> >started from scratch after fixing the encoding of the RDF files?
>
> >
>
> >Just a hint: you can check if your entities have been indexed by querying
>
> >them with the EntityHub API in the Stanbol web interface
>
> >
>
> >Hope that helps,
>
> >Rafa
>
> >
>
> >On Mon, Jan 11, 2016 at 7:19 PM Lejtovicz, Katalin <
>
> >katalin.lejtov...@oeaw.ac.at>> wrote:
>
> >
>
> >> Dear All,
>
> >>
>
> >> I have a problem with using custom vocabularies to enhance my content.
>
> >> I created an index with Stanbol from a vocabulary, deployed the .jar file
>
> >> and copied the solr index file to the datafiles folder, and created an
>
> >> EntityHub Linking Engine, plus a weighted chain, where the following
>
> >> pipeline was configured: langdetect, opennlp-sentence, opennlp-token,
>
> >> opennlp-pos, opennlp-chunker, and then an EntityHub Linking Engine for my
>
> >> custom vocab.
>
> >>
>
> >> It worked fine: when text was pasted into this enhancement chain in the user
>
> >> interface of Stanbol, entities were found. However, we had an encoding
>
> >> problem in our RDF resource from which the index was built, so entities
>
> >> with umlauts (e.g. ö, ä) were not found. We corrected the encoding of the RDF
>
> >> and I ran the indexing process again with the same config files, but with
>
> >> the new RDF resource.
>
> >> I again deployed (.jar and solr zip), and created the EntityHub Linking
>
> >> Engine, plus the same Weighted Chain as specified above.
>
> >> Now I don't get any results when I paste text into the text field of this
>
> >> chain in Stanbol.
>
> >>
>
> >> I configured log files, so that I can see what is happening. The linkable,
>
> >> matchable tokens, etc. are defined correctly, e.g. 'Berlin' in the sentence
>
> >> 'Berlin is a big city' is defined as a linkable token:
>
> >>
>
> >> 11.01.2016 16:14:05.667 *DEBUG* [Thread-9]
>
> >> org.apache.stanbol.enhancer.engines.entitylinking.impl.SectionData     -
>
> >> TokenData: 'Berlin'[linkable=true(linkabkePos=true)|
>
> >> matchable=true(matchablePos=true)| alpha=true| seachLength=true|
>
> >> upperCase=true]
>
> >>
>
> >> Also it is sent to the solr index, but from there, no results come back:
>
> >> 11.01.2016 16:14:05.668 *DEBUG* [Thread-9]
>
> >> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker ---
>
> >> preocess Token 0: Berlin (lemma: null) linkable=true, matchable=true |
>
> >> chunk: Chunk: [0, 6] Berlin
>
> >> 11.01.2016 16:14:05.668 *DEBUG* [Thread-9]
>
> >> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker
> -
>
> >> 1:'is' (lemma: null) linkable=false, matchable=false
>
> >> 11.01.2016 16:14:05.668 *DEBUG* [Thread-9]
>
> >> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker
> -
>
> >> 2:'a' (lemma: null) linkable=false, matchable=false
>
> >> 11.01.2016 16:14:05.668 *DEBUG* [Thread-9]
>
> >> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker
> >>>>
>
> >> searchStrings [Berlin]
>
> >> 11.01.2016 16:14:05.668 *DEBUG* [Thread-9]
>
> >> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker
> >>
>
> >> request entities [0-20] entities ...
>
> >> 11.01.2016 16:14:05.669 *DEBUG* [Thread-9]
>
> >>
> org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker       <
>
> >> found 0 entities ...
>
> >>
>
> >> I also looked at the solr.log, the query looks like this:
>
> >> (((@en\/rdfs\:label\/:"Berlin")) OR ((@\/rdfs\:label\/:"Berlin")))
>
> >> hits=0 status=0 QTime=1
>
> >>
>
> >>
>
> >> I installed solr and copied the index file over to execute the above
>
> >> query. It does not return any Solr documents, but the following one does:
>
> >> (((_\!@en\/rdfs\:label\/:" Berlin ")) OR ((_\!@\/rdfs\:label\/:" Berlin
>
> >> ")))
>
> >>
>
> >> Can someone help me figure out what I am missing?
>
> >> Is it a configuration issue when I am creating the index? (The strange thing is
>
> >> that I used the same config files for the incorrectly encoded RDF resource
>
> >> file, and that index worked.)
>
> >> Or is it a Stanbol issue?
>
> >>
>
> >> Thanks for any hints/help!
>
> >>
>
> >> Best regards,
>
> >> Kata
>
> >>
>
> >>
>
>
>
