Hi all:

I just found out that the default search analyzer shipped with an
out-of-the-box DSpace installation is English-based (not surprising).

The dspace.cfg file includes an example of switching to a Chinese
analyzer (org.apache.lucene.analysis.cn.ChineseAnalyzer). Is there an
equivalent option for Spanish? I couldn't find a Spanish analyzer in
the Lucene API documentation:
http://lucene.apache.org/java/2_2_0/api/org/apache/lucene/analysis/Analyzer.html

We have at least two problems:
- Stemming (strictly speaking, accent handling). Searching for "Energía"
and "Energia" returns different hits, and we would expect the same
results.
- Stop words. Words such as "a", "al", "de" and "del" are irrelevant for
searching, and we would like to exclude them unless the user asks for
an exact search. How can I define those stop words?
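
To make the expected behaviour for the two problems above concrete, here
is a minimal sketch in plain Java. It does not use the Lucene or DSpace
APIs: the class name SpanishSearchSketch, its methods and the four-word
stop list are hypothetical and only illustrate the normalization we would
like an analyzer to apply at both index and query time.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not part of Lucene or DSpace.
final class SpanishSearchSketch {

    // Assumed stop list: just the four words mentioned in this message.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "al", "de", "del"));

    // Decompose accented letters (NFD) and drop the combining marks,
    // so that "energía" and "energia" become the same term.
    static String foldAccents(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Lower-case, fold accents, split on non-alphanumerics, drop stop words.
    static List<String> analyze(String text) {
        String folded = foldAccents(text.toLowerCase(Locale.ROOT));
        List<String> terms = new ArrayList<>();
        for (String token : folded.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Energ\u00eda"));  // [energia]
        System.out.println(analyze("Energia"));       // [energia]
        System.out.println(analyze("Biblioteca del Congreso Nacional"));
        // [biblioteca, congreso, nacional]
    }
}
```

Note that this simple folding also turns "ñ" into "n", which may or may
not be acceptable for Spanish. An exact search would have to bypass both
steps.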

Regards,

-- 
Álvaro Sandoval
BCN, Biblioteca del Congreso Nacional
Servicios Digitales. Ingeniería y Desarrollo
Fono: (56-32) 226 3981. Fax: (56-32) 226 3973
www.bcn.cl


_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
