hi pablo,
there is no flag to set whether stopwords should be ignored.
it would be simple to add stopword filtering.
i only hesitate to write my own code because i would then have to
maintain a separate branch.
if i provided a patch to filter stopwords, would you include it?
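
roughly what i have in mind, as a sketch only. the class and method
names below are made up for illustration and are not part of the
spotlight code base. it assumes the spotter hands over candidate
surface forms as plain strings and that the stopword list is already
loaded. only candidates that are themselves a stopword are dropped, so
nothing changes in the lucene index.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not an existing Spotlight class.
public class StopwordSpotFilter {
    private final Set<String> stopwords = new HashSet<>();

    public StopwordSpotFilter(Collection<String> words) {
        // Store stopwords lowercased so lookups are case-insensitive.
        for (String w : words) {
            stopwords.add(w.trim().toLowerCase(Locale.ROOT));
        }
    }

    // True if the whole surface form is a stopword.
    public boolean isStopword(String surfaceForm) {
        return stopwords.contains(surfaceForm.trim().toLowerCase(Locale.ROOT));
    }

    // Keep only the candidates that should still be annotated.
    public List<String> filter(List<String> candidates) {
        List<String> kept = new ArrayList<>();
        for (String c : candidates) {
            if (!isStopword(c)) {
                kept.add(c);
            }
        }
        return kept;
    }
}
```

multi-word candidates such as "Bank of America" pass through untouched,
because only an exact match of the whole surface form is filtered.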
best regards
reinhard
On 15.03.2012 11:00, Pablo Mendes wrote:
>
>
> Yes. However, indexing stopwords will bloat your index and may trick
> the system into believing that disambiguations are made with high
> confidence. Stopwords match anything, so they make it look like there is
> enough context in the input text. Our ICF scoring is robust to this
> bias given enough data, but it will also take very long to compute.
>
> If you want to choose which strings the system will or will not attempt
> to annotate, you can apply stopwords to the spotter dictionary.
> See IndexLingPipeSpotter for an example. This does not affect the Lucene
> index.
>
> Cheers
> Pablo
>
> On Mar 15, 2012 10:32 AM, "reinhard schwab" wrote:
>
> hi,
>
> i know it is possible to configure stopwords in server.properties and
> in indexing.properties.
> how are stopwords handled?
> is it possible to filter out stopwords only when annotating,
> so that they are still indexed but not annotated?
>
> best regards
> reinhard
>
>
> ------------------------------------------------------------------------------
> This SF email is sponsored by:
> Try Windows Azure free for 90 days Click Here
> http://p.sf.net/sfu/sfd2d-msazure
> _______________________________________________
> Dbp-spotlight-users mailing list
> [email protected]
> <mailto:[email protected]>
> https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users
>