Hi Rafa,
The potentially confusing part is that the stopword list is used in more
than one place. The SpanishAnalyzer removes stopwords from the context index
(used in disambiguation), but that does not affect spotting. What you report
is stopwords being spotted, which points to a problem with your spotter
dictionary (and the class that created it) or with the spotter implementation.
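To make the distinction concrete, here is a toy Python sketch. It is not
Spotlight code: the function names, the stopword set and the dictionary are
all made up for illustration. It only shows that filtering stopwords out of
the context does nothing to a dictionary lookup that already contains them.

```python
# Toy illustration only; none of these names exist in DBpedia Spotlight.
STOPWORDS = {"de", "la", "el"}  # hypothetical stopword list


def context_tokens(text, stopwords=STOPWORDS):
    """Mimic an analyzer that drops stopwords before indexing the context."""
    return [t for t in text.lower().split() if t not in stopwords]


def spot(text, dictionary):
    """Mimic a dictionary spotter: report every token found in the dictionary."""
    return [t for t in text.lower().split() if t in dictionary]


# "de" has slipped into the (hypothetical) surface form dictionary.
dictionary = {"madrid", "de"}
text = "La ciudad de Madrid"

print(context_tokens(text))    # stopwords are gone from the context
print(spot(text, dictionary))  # but "de" is still spotted
```

So a clean context index tells you nothing about whether the spotter
dictionary is clean; the two have to be checked separately.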
Try this:
1) Check whether your *indexing.es.properties* configuration points to the
right stopwords file for Spanish. If it does, check whether that file
contains the undesired words you see spotted. If it does not point to the
right file, or the words are missing from it, that's your problem.
2) Check whether surfaceForms.tsv contains these spurious stopwords. If it
does, you need to double-check what's happening in IndexLingPipeSpotter.
Create a small surfaceForms.tsv and stopwords.txt and step through the code.
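If it helps, both checks can be roughly scripted. The Python below is only a
sketch under assumptions I have not verified against your files: that
stopwords.txt has one word per line, and that the surface form is the first
tab-separated column of surfaceForms.tsv. The helper names and the sample
rows are hypothetical.

```python
import os
import tempfile
import unicodedata


def load_stopwords(path):
    """Read one stopword per line (assumed format), lowercased, skipping blanks."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}


def suspicious_surface_forms(tsv_path, stopwords):
    """Yield (line_number, surface_form) for stopword or punctuation-only rows.

    Assumes the surface form is the first tab-separated column.
    """
    with open(tsv_path, encoding="utf-8") as f:
        for n, line in enumerate(f, start=1):
            form = line.rstrip("\n").split("\t")[0].strip()
            # Unicode categories starting with "P" are punctuation.
            only_punct = form != "" and all(
                unicodedata.category(c).startswith("P") for c in form
            )
            if form.lower() in stopwords or only_punct:
                yield n, form


if __name__ == "__main__":
    # Build a tiny stopwords.txt and surfaceForms.tsv with made-up rows.
    with tempfile.TemporaryDirectory() as d:
        sw_path = os.path.join(d, "stopwords.txt")
        sf_path = os.path.join(d, "surfaceForms.tsv")
        with open(sw_path, "w", encoding="utf-8") as f:
            f.write("de\nla\nel\n")
        with open(sf_path, "w", encoding="utf-8") as f:
            f.write("Madrid\tMadrid\nde\tHypothetical_URI\n.\tHypothetical_URI\n")
        for n, form in suspicious_surface_forms(sf_path, load_stopwords(sw_path)):
            print(n, repr(form))
```

Point the two functions at your real files instead of the temporary ones. Any
row it reports is a candidate the spotter can match, regardless of what the
analyzer did to the context index.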
Which spotter are you using? I am assuming it is LingPipeSpotter.
Cheers
pablo
On Dec 15, 2012 12:13 AM, "Rafa Haro" <[email protected]> wrote:
> Hi all,
>
> I'm not sure if this is a bug, a problem with my local installation or
> an issue in the project. Testing our local installation in Spanish we
> are having problems with the list of stopwords. I'm almost sure that the
> list is being used properly during the indexing with Lucene's
> SpanishAnalyzer. But then, when we annotate a text in Spanish, some
> stopwords are spotted as surface forms and end up linked to a candidate.
> This also sometimes happens with punctuation marks (dots, quotes...).
>
> I don't know whether the system removes stopwords from the input text,
> but I assumed it would, to prevent this behaviour. Am I right?
>
> Thanks. Regards
> This message should be regarded as confidential. If you have received this
> email in error please notify the sender and destroy it immediately.
> Statements of intent shall only become binding when confirmed in hard copy
> by an authorised signatory.
>
> Zaizi Ltd is registered in England and Wales with the registration number
> 6440931. The Registered Office is 222 Westbourne Studios, 242 Acklam Road,
> London W10 5JJ, UK.
>
>
>
> ------------------------------------------------------------------------------
> LogMeIn Rescue: Anywhere, Anytime Remote support for IT. Free Trial
> Remotely access PCs and mobile devices and provide instant support
> Improve your efficiency, and focus on delivering more value-add services
> Discover what IT Professionals Know. Rescue delivers
> http://p.sf.net/sfu/logmein_12329d2d
> _______________________________________________
> Dbp-spotlight-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users
>