Hi Jaime,

Please see org.apache.lucene.analysis.custom.CustomAnalyzer.builder() to create 
custom analyzers using a builder-style API.
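
As a rough illustration, a minimal sketch of such a builder chain could look like the one below. It assumes the lucene-core and analysis-common jars are on the classpath. The class name CustomAnalyzerSketch, the field name "body" and the pattern "[-_]" are made up for the example. It also uses PatternReplaceCharFilterFactory rather than the MappingCharFilterFactory discussed in this thread, only because the former needs no mapping file; the mapping factory would be added the same way, with a "mapping" parameter naming the file.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.LowerCaseFilterFactory;
import org.apache.lucene.analysis.custom.CustomAnalyzer;
import org.apache.lucene.analysis.pattern.PatternReplaceCharFilterFactory;
import org.apache.lucene.analysis.standard.StandardTokenizerFactory;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class CustomAnalyzerSketch {

    // Builds an analyzer whose char filter runs before the tokenizer sees the text.
    static Analyzer buildAnalyzer() throws IOException {
        return CustomAnalyzer.builder()
                // Hypothetical rule: treat '-' and '_' as word separators.
                .addCharFilter(PatternReplaceCharFilterFactory.class,
                        "pattern", "[-_]", "replacement", " ")
                .withTokenizer(StandardTokenizerFactory.class)
                .addTokenFilter(LowerCaseFilterFactory.class)
                .build();
    }

    // Runs the analyzer over the text and collects the resulting terms.
    static List<String> tokenize(String text) throws IOException {
        List<String> tokens = new ArrayList<>();
        try (Analyzer analyzer = buildAnalyzer();
             TokenStream ts = analyzer.tokenStream("body", text)) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();
            while (ts.incrementToken()) {
                tokens.add(term.toString());
            }
            ts.end();
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tokenize("Foo-bar_baz"));
    }
}
```

The same char filter line can be reused in front of whichever tokenizer and token filter chain you need, which may avoid writing a filtered subclass per analyzer. Note that this rebuilds the chain from factories; it does not wrap an existing analyzer such as SpanishAnalyzer.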

Ahmet


On Friday, June 24, 2016 10:54 AM, Jaime <j.par...@estructure.es> wrote:
Thank you very much, that seems to solve my issue.

However, I find this a little cumbersome. I need to filter the text 
before any tokenizing takes place, so I have to implement a filtered 
version of every analyzer I'm using (StandardAnalyzer and 
SpanishAnalyzer and a custom analyzer right now).

If I need to support another analyzer in the future (a very plausible 
possibility) I will need to create another version of that analyzer. 
Whenever any of those analyzers changes, I will need to manually apply 
the changes.

Isn't there a better way to do this?

On 23/06/2016 at 20:28, Ahmet Arslan wrote:
> Hi,
>
> Zero or more CharFilters can be applied to manipulate text before the tokenizer.
> I think initReader is the method where you want to plug in char filters.
> https://github.com/apache/lucene-solr/blob/master/lucene/analysis/morfologik/src/java/org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.java
>
> Ahmet
>
> On Thursday, June 23, 2016 6:47 PM, Jaime <j.par...@estructure.es> wrote:
> Hello,
>
> I want to change the input text before tokenizing. I think I just need
> to use some characters as word separators, and maybe remove some others
> completely.
>
> I was planning to use MappingCharFilterFactory to replace some chars
> with " " and others with "", but I feel like I'm not on the right track.
>
> First, I've implemented a custom analyzer to use my custom tokenizer. My
> idea was to inherit from StandardTokenizer and, in setReader, call
> MappingCharFilterFactory.create(reader).
>
> However, setReader is final, so I can't override it.
>
> Is there a better way to do this?
> In any case, how should I use MappingCharFilter if I really need it?
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org

-- 
Jaime Pardos
ESTRUCTURE MEDIA SYSTEMS, S.L.
Avda. de Madrid nº 120 nave 10, 28500, Arganda del Rey, MADRID,
j.par...@estructure.es
910088429
  



