Hi

On Thu, Jun 6, 2013 at 5:26 PM, Joseph M'Bimbi-Bene <jbi...@object-ive.com> wrote:
> Hello, sorry for the late answer. Thank you for yours.
>
>
> 2013/6/3 Rupert Westenthaler <rupert.westentha...@gmail.com>
>
>> Hi Joseph
>>
>> On Mon, Jun 3, 2013 at 3:43 PM, Joseph M'Bimbi-Bene
>> <jbi...@object-ive.com> wrote:
>> [..]
>> >
>> > Now, the logs of the processing of the token "La"
>> >
>> > ProcessingState > 0: Token: [1087, 1089] La (pos:[Value [pos:
>> > ADJ(olia:Adjective)].prob=0.016871281997002517]) chunk: 'none'
>> >
>> > ProcessingState - TokenData: 'La'[linkable=true(linkabkePos=null)|
>> > matchable=true(matchablePos=null)| alpha=true| seachLength=true|
>> > upperCase=true]
>> >
>>
>> The reason why the 'La' of the last sentence of your document is
>> marked as 'linkable' is the combination of the following things:
>>
>> 1. the POS tag has a very low probability (0.017) and is therefore
>> ignored, as the configured minimum probability is higher than that.
>>
>
> Actually, I set both parameters "prop" and "pprob" to 0.01. I didn't
> make a mistake, did I? You mentioned in a previous mail something
> about a strange tokenizing behaviour; it might be the source of a new
> problem. Here is, for example, a log excerpt from the Stanbol web console
> for an integration test. I isolated the pathological case:
>
The reason is that "prop=0.01" should be "prob=0.01". There is a typo in
the default configuration; because of it, changing the value of "prop"
has no effect. I created STANBOL-1100 to fix this.

> and when i curl the text to Talismane, i get the following message:
>
> 16:49:21,166 [main] INFO server.Main - ... starting server
> 16:53:55,560 [btpool0-2] ERROR resource.AnalysisResource - Exception while
> analysing Blob
> java.lang.IllegalArgumentException: Illegal span [2199,2201] for Token
> relative to Text: [0, 2200] : Span of the contained Token MUST NOT extend
> the others!

When implementing the Talismane Stanbol integration I had a lot of
problems getting the index positions right. A Token span that extends
past the end of the document (here [2199,2201] against a text of
[0, 2200]) could indicate that some of those problems remain. If you
come across a text that reproduces this, please open an issue on
stanbol-talismane [1].

[1] https://github.com/westei/stanbol-talismane

best
Rupert

--
| Rupert Westenthaler             rupert.westentha...@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen