Re: ISOLatin1AccentFilter before or after Snowball?
Now you've got me wondering: which one should I prefer? I didn't even know there was an alternative. :-)

Chantal

Koji Sekiguchi wrote:
> No, ISOLatin1AccentFilterFactory is not deprecated. You can use either MappingCharFilterFactory + mapping-ISOLatin1Accent.txt or ISOLatin1AccentFilterFactory, whichever you'd like.
>
> Koji
>
> Jay Hill wrote:
>> Correct me if I'm wrong, but wasn't the ISOLatin1AccentFilterFactory deprecated in favor of:
>> <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>> in 1.4?
>>
>> -Jay
>> http://www.lucidimagination.com
Re: ISOLatin1AccentFilter before or after Snowball?
In this particular case, I don't think one is better than the other.

In general, MappingCharFilter is more flexible than specific TokenFilters such as ISOLatin1AccentFilter. For example, if you want your own character mapping rules, you can add them to mapping.txt. That should be easier than modifying a TokenFilter, since it requires no programming.

Koji

Chantal Ackermann wrote:
> Now you've got me wondering: which one should I prefer? I didn't even know there was an alternative. :-)
>
> Chantal
>
> Koji Sekiguchi wrote:
>> No, ISOLatin1AccentFilterFactory is not deprecated. You can use either MappingCharFilterFactory + mapping-ISOLatin1Accent.txt or ISOLatin1AccentFilterFactory, whichever you'd like.
>>
>> Koji
>>
>> Jay Hill wrote:
>>> Correct me if I'm wrong, but wasn't the ISOLatin1AccentFilterFactory deprecated in favor of:
>>> <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>>> in 1.4?
>>>
>>> -Jay
>>> http://www.lucidimagination.com
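Koji's point about MappingCharFilter can be sketched as a schema.xml fragment. This is only an illustration: the field type name "text_folded" and the choice of tokenizer are assumptions, not something taken from the thread. The factory class and mapping file name are the ones quoted above.

```xml
<!-- Sketch only: "text_folded" and the tokenizer are illustrative assumptions. -->
<fieldType name="text_folded" class="solr.TextField">
  <analyzer>
    <!-- A charFilter rewrites the raw character stream before tokenization. -->
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Rules in the mapping file have the form "source" => "target",
         for example: "\u00E9" => "e" -->
  </analyzer>
</fieldType>
```

Because a charFilter is applied ahead of the tokenizer, it also runs ahead of every token filter in the chain, including any stemmer. Adding a custom rule means adding one more line to the mapping file rather than changing Java code.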
Re: ISOLatin1AccentFilter before or after Snowball?
Hello,

I'm following the thread, but I think it still hasn't been answered whether the ISOLatin1AccentFilter goes before or after the stemmer. Is there a direct answer?

Koji Sekiguchi wrote:
> In this particular case, I don't think one is better than the other.
>
> In general, MappingCharFilter is more flexible than specific TokenFilters such as ISOLatin1AccentFilter. For example, if you want your own character mapping rules, you can add them to mapping.txt. That should be easier than modifying a TokenFilter, since it requires no programming.
>
> Koji
>
> Chantal Ackermann wrote:
>> Now you've got me wondering: which one should I prefer? I didn't even know there was an alternative. :-)
>>
>> Chantal
>>
>> Koji Sekiguchi wrote:
>>> No, ISOLatin1AccentFilterFactory is not deprecated. You can use either MappingCharFilterFactory + mapping-ISOLatin1Accent.txt or ISOLatin1AccentFilterFactory, whichever you'd like.
>>>
>>> Koji
>>>
>>> Jay Hill wrote:
>>>> Correct me if I'm wrong, but wasn't the ISOLatin1AccentFilterFactory deprecated in favor of:
>>>> <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>>>> in 1.4?
>>>>
>>>> -Jay
>>>> http://www.lucidimagination.com

--
Claudio Martella
Digital Technologies Unit
Research Development - Engineer
TIS innovation park
Re: ISOLatin1AccentFilter before or after Snowball?
On Tue, Oct 6, 2009 at 4:33 PM, Chantal Ackermann chantal.ackerm...@btelligent.de wrote:
> Hi all,
>
> from reading through previous posts on that subject, it seems like the accent filter has to come before the snowball filter.
>
> I'd just like to make sure this is so. If it is the case, I'm wondering whether snowball filters for, e.g., French process accented language correctly at all, or whether they remove accents anyway... Or whether accents should be removed whenever making use of snowball filters.

I'd think so, but I'm not sure. Perhaps someone else can weigh in.

> And also: it really is meant to take UTF-8 as input, even though it is named ISOLatin1AccentFilter, isn't it?

See http://markmail.org/message/hi25u5iqusfu542b

--
Regards,
Shalin Shekhar Mangar.
Re: ISOLatin1AccentFilter before or after Snowball?
Shalin Shekhar Mangar wrote:
> See http://markmail.org/message/hi25u5iqusfu542b

Thank you for the link, Shalin! Might it be worth copying that to the wiki?

Cheers!
Chantal

>> I'd just like to make sure this is so. If it is the case, I'm wondering whether snowball filters for, e.g., French process accented language correctly at all, or whether they remove accents anyway... Or whether accents should be removed whenever making use of snowball filters.
>
> I'd think so, but I'm not sure. Perhaps someone else can weigh in.
>
>> And also: it really is meant to take UTF-8 as input, even though it is named ISOLatin1AccentFilter, isn't it?
>
> See http://markmail.org/message/hi25u5iqusfu542b
>
> --
> Regards,
> Shalin Shekhar Mangar.
Re: ISOLatin1AccentFilter before or after Snowball?
Correct me if I'm wrong, but wasn't the ISOLatin1AccentFilterFactory deprecated in favor of:

<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>

in 1.4?

-Jay
http://www.lucidimagination.com

On Wed, Oct 7, 2009 at 1:44 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:
> On Tue, Oct 6, 2009 at 4:33 PM, Chantal Ackermann chantal.ackerm...@btelligent.de wrote:
>> Hi all,
>>
>> from reading through previous posts on that subject, it seems like the accent filter has to come before the snowball filter.
>>
>> I'd just like to make sure this is so. If it is the case, I'm wondering whether snowball filters for, e.g., French process accented language correctly at all, or whether they remove accents anyway... Or whether accents should be removed whenever making use of snowball filters.
>
> I'd think so, but I'm not sure. Perhaps someone else can weigh in.
>
>> And also: it really is meant to take UTF-8 as input, even though it is named ISOLatin1AccentFilter, isn't it?
>
> See http://markmail.org/message/hi25u5iqusfu542b
>
> --
> Regards,
> Shalin Shekhar Mangar.
Re: ISOLatin1AccentFilter before or after Snowball?
No, ISOLatin1AccentFilterFactory is not deprecated. You can use either MappingCharFilterFactory + mapping-ISOLatin1Accent.txt or ISOLatin1AccentFilterFactory, whichever you'd like.

Koji

Jay Hill wrote:
> Correct me if I'm wrong, but wasn't the ISOLatin1AccentFilterFactory deprecated in favor of:
> <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
> in 1.4?
>
> -Jay
> http://www.lucidimagination.com
ISOLatin1AccentFilter before or after Snowball?
Hi all,

from reading through previous posts on that subject, it seems like the accent filter has to come before the snowball filter.

I'd just like to make sure this is so. If it is the case, I'm wondering whether snowball filters for, e.g., French process accented language correctly at all, or whether they remove accents anyway... Or whether accents should be removed whenever making use of snowball filters.

And also: it really is meant to take UTF-8 as input, even though it is named ISOLatin1AccentFilter, isn't it?

Thanks in advance!
Chantal

--
Chantal Ackermann
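For reference, the ordering described in the question (accent filter before the snowball filter) might look like the following schema.xml sketch. The field type name "text_fr", the tokenizer, and the lowercase filter are assumptions for illustration. Whether this order is the right one for a given language is exactly what the question asks, so treat it as one possible arrangement rather than a recommendation.

```xml
<!-- Sketch only: "text_fr", the tokenizer and the lowercase filter are assumptions. -->
<fieldType name="text_fr" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Accent removal is listed first, so the stemmer sees unaccented tokens. -->
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="French"/>
  </analyzer>
</fieldType>
```

Token filters run in the order they are listed, so swapping the last two filter lines gives the opposite arrangement, in which the stemmer sees the accented tokens and accents are stripped from the resulting stems.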