> Hi all,
> I'm not sure whether this is a bug, a problem with my local installation,
> or an issue in the project. Testing our local installation in Spanish, we
> are having problems with the list of stopwords. I'm almost sure that the
> list is being used properly during indexing with Lucene's SpanishAnalyzer.
> But when we annotate a text in Spanish, some stopwords are picked up by
> the spotter and finally linked to a candidate. That also happens sometimes
> with punctuation marks (dots, quotes, ...).
> Actually, I don't know whether the system applies a stopword removal
> process to the input text, but I was assuming that it should, to prevent
> this behaviour. Am I right?
> Thanks. Regards


Hello

Has there been any progress on this?
I'm working on an adaptation of Spotlight to German, and I'm facing the exact
same problem. (At least I think so, especially since this is not the first
time I have run into a problem that Rafa already described here.)
Actually, in my case, no punctuation is annotated, only stopwords.
My setup uses [LingPipeSpotter,WikiMarkupSpotter] with the
ShortSurfaceFormSelector.
I can partially eliminate this by setting a higher confidence parameter, but 
that seems like a waste of this functionality.
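
Another stopgap would be to filter the annotations after the fact, outside of Spotlight. Below is a minimal sketch in Python, not anything Spotlight ships: it assumes the annotations are available as a list of dicts, and the key name `surface_form` as well as the helper names are hypothetical. It also assumes a stopword list with one word per line.

```python
import unicodedata


def load_stopwords(lines):
    """Build a lowercase stopword set from an iterable of lines (one word per line)."""
    return {line.strip().lower() for line in lines if line.strip()}


def is_noise(surface_form, stopwords):
    """Return True for empty strings, stopwords, and punctuation-only strings."""
    text = surface_form.strip()
    if not text:
        return True
    if text.lower() in stopwords:
        return True
    # Unicode categories starting with "P" cover punctuation such as . " « ¿
    return all(unicodedata.category(ch).startswith("P") for ch in text)


def filter_annotations(annotations, stopwords, key="surface_form"):
    """Keep only annotations whose surface form is not noise.

    `key` is a hypothetical field name; adjust it to the actual output format.
    """
    return [a for a in annotations if not is_noise(a[key], stopwords)]


if __name__ == "__main__":
    stopwords = load_stopwords(["der\n", "die\n", "und\n"])
    annotations = [
        {"surface_form": "Der"},
        {"surface_form": "Berlin"},
        {"surface_form": "..."},
    ]
    for annotation in filter_annotations(annotations, stopwords):
        print(annotation["surface_form"])
```

This only hides the symptom on the client side; it does not explain why the spotter picks up stopwords in the first place.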

(I cut the quoted messages after the first one, because gmane made me, but I
read them too; the stopwords file is specified correctly in both
indexing.de.properties and server.de.properties.)

Cheers,
Hali



_______________________________________________
Dbp-spotlight-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users
