Hi Rupert,

Thanks for your prompt response. I too think that this problem was introduced recently, because I have only been getting these errors for the last 2 weeks. Before that I did not have this problem. In fact, I have a Stanbol jar that was compiled around 1 month ago, and it works fine on my system.

How can I get the complete source code of the previous Stanbol release (rather than using the trunk link)? I tried browsing the SVN repository but could not find a tag that fetches all the code (like trunk) in one go.

Regards,
Manish

On Tue, May 7, 2013 at 9:32 PM, Rupert Westenthaler <[email protected]> wrote:

> Hi Manish
>
> I had the same error once. In that case the error was originating from
> some uncommon UTF8 chars, where basically two chars at the same
> position were creating a single character in the text.
>
> On Tue, May 7, 2013 at 1:01 PM, Manish Aggarwal <[email protected]> wrote:
> > For example if I run the stanbol server for the text
> > "Because demand for the Figaro exceeded the 20,000 vehicles built, Nissan
> > sold the car by lottery: winners could place orders for the car. Despite
> > being a JDM-only model, the Figaro is one of the most imported models of
> > the K10 derivatives; its popularity among numerous celebrity owners helped
> > it earn cult status. The K10 ceased production on 21 December 1992."
> >
>
> I cannot reproduce the reported error when using this text. This could be
> because some special chars were removed/converted while sending the
> mail.
> Can you try if you can reproduce the error on
> "http://dev.iks-project.eu:8081/enhancer" or
> "http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-noun-linking"?
>
> > Caused by: java.lang.IllegalArgumentException: The span '1' MUST BE >=
> > the number of matched tokens '2': Cult status[m=FULL,s=1,c=1(0.875)/2]
> > score=1.53125[l=0.875,t=1.75]!
> >     at org.apache.stanbol.enhancer.engines.entitylinking.impl.LabelMatch.<init>(LabelMatch.java:96)
> >     at org.apache.stanbol.enhancer.engines.entitylinking.impl.EntityLinker.matchLabel(EntityLinker.java:762)
>
> This exception is basically a "safeguard" that warns about an unexpected
> state in the EntityLinking process. One could also just write a
> warning to the log, but then it would be much less likely to discover
> such issues.
>
> The exception message suggests that the section "cult status" is the
> reason for the exception. As the matching label "Cult status" is a FULL
> match, but the matching score is lower than 1 (0.875), my assumption is
> that the section "... earn cult status. The ..." does indeed contain some
> special UTF8 characters, because otherwise one would expect an exact
> match with a score of 1.0.
>
> Can you please check the original text for such chars? I will have a
> look at the exception. Maybe I should catch those exceptions and write
> a detailed summary (including the source text, tokens, pos tags ...)
> to the logs instead.
>
> Thanks for the report!
> best
> Rupert
>
> --
> | Rupert Westenthaler [email protected]
> | Bodenlehenstraße 11 ++43-699-11108907
> | A-5500 Bischofshofen
>
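
For checking the original text for such characters, a minimal Java sketch might look like the following. It is not part of Stanbol: the class name `CombiningCharScan` and its two methods are hypothetical names chosen for illustration, and it relies only on the standard `java.lang.Character` and `java.text.Normalizer` APIs. It lists every non-ASCII code point with its offset, flags combining marks (the "two chars at the same position" case), and reports whether the text would change under NFC normalization.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Stanbol: scans a text for characters
// that could get lost or altered when the text is pasted into a mail.
final class CombiningCharScan {

    /** Describes every non-ASCII code point in the text, with its char offset. */
    static List<String> findSuspiciousChars(String text) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp > 0x7F) {
                int type = Character.getType(cp);
                // Combining marks attach to the preceding character, so two
                // code points are rendered as a single visible character.
                boolean combining = type == Character.NON_SPACING_MARK
                        || type == Character.COMBINING_SPACING_MARK
                        || type == Character.ENCLOSING_MARK;
                hits.add(String.format("offset %d: U+%04X %s%s",
                        i, cp, Character.getName(cp),
                        combining ? " (combining mark)" : ""));
            }
            i += Character.charCount(cp);
        }
        return hits;
    }

    /** True if the text is not already in NFC, i.e. normalizing would change it. */
    static boolean changesUnderNfc(String text) {
        return !Normalizer.isNormalized(text, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // Made-up sample: a 'u' followed by a combining acute accent (U+0301).
        String sample = "it earn cu\u0301lt status. The K10 ceased production";
        for (String hit : findSuspiciousChars(sample)) {
            System.out.println(hit);
        }
        System.out.println("changes under NFC: " + changesUnderNfc(sample));
    }
}
```

Running this on the original, unmodified input (read from a file rather than copied through a mail client) should show whether the section around "cult status" contains anything other than plain ASCII.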
