> Hi all,
>
> I'm not sure if this is a bug, a problem with my local installation or
> an issue in the project. Testing our local installation in Spanish we
> are having problems with the list of stopwords. I'm almost sure that the
> list is being used properly during the indexing with Lucene's
> SpanishAnalyzer. But then, when we annotate a text in Spanish, some
> stopwords are selected as spotters and finally linked with a candidate.
> That is also happening sometimes with punctuation marks (dots,
> quotes...).
>
> Actually, I don't know if the system applies a stopwords removal process
> to the input text, but I was supposing that it should do it to prevent
> this behaviour. Am I right?
>
> Thanks. Regards
Hello,

Has there been any progress on this? I'm working on an adaptation of Spotlight to German, and I'm facing the exact same problem. (I think so, at least, especially since this is not the first time I have run into a problem that Rafa already reported here.) In my case, though, no punctuation is annotated, only stopwords.

[LingPipeSpotter,WikiMarkupSpotter], ShortSurfaceFormSelector.

I can partially eliminate this by setting a higher confidence parameter, but that seems like a waste of this functionality.

(I cut the quoted replies after the first one, because gmane made me, but I read them too; the stopwords file is specified correctly in both indexing.de.properties and server.de.properties.)

Cheers,
Hali
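A possible stopgap until the spotting itself is sorted out, as a rough sketch only: filter the returned annotations on the client side and drop any whose surface form is a stopword or consists purely of punctuation. This does not touch Spotlight's spotters at all. The function names, the `surfaceForm` key and the one-word-per-line stopword format are assumptions made for illustration, so adjust them to whatever your annotation output and stopwords file actually look like.

```python
import unicodedata


def load_stopwords(lines):
    """Build a lowercase stopword set from an iterable of lines.

    Assumes one stopword per line; blank lines are ignored.
    """
    return {line.strip().lower() for line in lines if line.strip()}


def is_noise(surface_form, stopwords):
    """Return True for empty, punctuation-only or stopword surface forms."""
    text = surface_form.strip()
    if not text:
        return True
    # Unicode categories starting with "P" cover ASCII punctuation as well
    # as marks such as the Spanish inverted question mark or guillemets.
    if all(unicodedata.category(ch).startswith("P") or ch.isspace() for ch in text):
        return True
    return text.lower() in stopwords


def filter_annotations(annotations, stopwords, key="surfaceForm"):
    """Keep only annotations whose surface form is not noise.

    `annotations` is assumed to be a list of dicts, and `key` is a
    hypothetical field name holding the annotated surface form.
    """
    return [a for a in annotations if not is_noise(a[key], stopwords)]
```

With a stopword set loaded via `load_stopwords(open(path, encoding="utf-8"))`, calling `filter_annotations(annotations, stopwords)` returns the list without the stopword and punctuation hits, which avoids raising the confidence threshold for everything else.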
