> Hmm, for standardization of text fields, collation might be a little
> awkward.

I arrived there after using custom rules for a while (see "RuleBasedCollator" 
on http://wiki.apache.org/solr/UnicodeCollation) and then being told
"For better performance, less memory usage, and support for more locales, you 
can add the analysis-extras contrib and use ICUCollationKeyFilterFactory 
instead." (on the same page under "ICU Collation").

> For your german umlauts, what do you mean by standardize? is this to
> achieve equivalency of e.g. oe to ö in your search terms?

That is the main point, but I might also need the additional normalization of 
combined characters like
o+  ̈ = ö and probably similar constructions for other languages (like 
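
To illustrate the combining-character case: a minimal sketch using Python's standard `unicodedata` module, not anything Solr-specific. The helper name `to_nfc` is made up for this example. Within a Solr analysis chain an ICU normalization filter would presumably play this role, but that is an assumption here and not something tested in this thread.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Compose base letter + combining mark sequences into single code points (NFC)."""
    return unicodedata.normalize("NFC", text)


decomposed = "o\u0308"   # 'o' followed by U+0308 COMBINING DIAERESIS
precomposed = "\u00f6"   # 'ö' as one code point

# The two strings render identically but compare unequal until normalized.
print(decomposed == precomposed)
print(to_nfc(decomposed) == precomposed)
```

Without such a step, a term typed as o + combining diaeresis and an indexed precomposed ö would not match, even though they look the same on screen.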

> In that case, a simpler approach would be to put
> GermanNormalizationFilterFactory in your chain:
> http://lucene.apache.org/core/4_6_1/analyzers-common/org/apache/lucene/analysis/de/GermanNormalizationFilter.html

I'll see how far I get with this, but from the description
        • 'ä', 'ö', 'ü' are replaced by 'a', 'o', 'u', respectively.
        • 'ae' and 'oe' are replaced by 'a', and 'o', respectively.
this seems to be too far-reaching a reduction: while the identification "ä=ae" 
is not very serious and rarely misleading, "ä=a" might lump together words that 
shouldn't be; "Äsen" and "Asen" are quite different concepts.
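
The difference between the two identifications can be sketched as follows. This is a deliberately simplified Python illustration of the two rules quoted above, not the actual GermanNormalizationFilter implementation (which has further rules and conditions). Both function names are hypothetical.

```python
import unicodedata


def fold_to_base_vowel(token: str) -> str:
    """Simplified form of the quoted rules: umlaut -> base vowel, 'ae'/'oe' -> 'a'/'o'."""
    t = unicodedata.normalize("NFC", token).lower()
    for src, dst in (("ä", "a"), ("ö", "o"), ("ü", "u"), ("ae", "a"), ("oe", "o")):
        t = t.replace(src, dst)
    return t


def fold_to_digraph(token: str) -> str:
    """Alternative: umlaut -> vowel + 'e', so only the 'ä=ae' identification is made."""
    t = unicodedata.normalize("NFC", token).lower()
    for src, dst in (("ä", "ae"), ("ö", "oe"), ("ü", "ue")):
        t = t.replace(src, dst)
    return t


for word in ("Äsen", "Asen", "Aesen"):
    print(word, fold_to_base_vowel(word), fold_to_digraph(word))
```

Under the first folding all three words end up as the same term, so a search for "Asen" would also return "Äsen". Under the second, "Äsen" and "Aesen" are identified with each other while "Asen" stays separate.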

In general, the deprecation of ICUCollationKeyFilterFactory doesn't seem to 
have been really thought through.

Thanks anyway, best

> On Wed, Feb 19, 2014 at 9:16 AM, Thomas Fischer <fischer...@aon.at> wrote:
>> Thanks, that helps!
>> I'm trying to migrate from the now deprecated ICUCollationKeyFilterFactory
>> I used before to the ICUCollationField.
>> Is there any description how to achieve this?
>> First tries now yield
>> ICUCollationField does not support specifying an analyzer.
>> which makes it complicated since I used the ICUCollationKeyFilterFactory
>> to standardize my text fields (in particular because of German Umlauts).
>> But an ICUCollationField without LowerCaseFilter, a WhitespaceTokenizer, a
>> LetterTokenizer, etc. doesn't do me much good, I'm afraid.
>> Or is this somehow wrapped into the ICUCollationField?
>> I didn't find ICUCollationField  in the solr wiki and not much information
>> in the reference.
>> And the hint
>> "solr.ICUCollationField is included in the Solr analysis-extras contrib -
>> see solr/contrib/analysis-extras/README.txt for instructions on which jars
>> you need to add to your SOLR_HOME/lib in order to use it."
>> is misleading insofar as this README.txt doesn't mention the
>> solr-analysis-extras-4.6.1.jar in dist.
>> Best
>> Thomas
>> Am 19.02.2014 um 14:27 schrieb Robert Muir:
>>> you need the solr analysis-extras jar itself, too.
>>> On Wed, Feb 19, 2014 at 8:25 AM, Thomas Fischer <fischer...@aon.at> wrote:
>>>> Hello Robert,
>>>> I already added
>>>> contrib/analysis-extras/lib/
>>>> and
>>>> contrib/analysis-extras/lucene-libs/
>>>> via lib directives in solrconfig, this is why the classes mentioned are
>>>> loaded.
>>>> Do you know which jar is supposed to contain the ICUCollationField?
>>>> Best regards
>>>> Thomas
>>>> Am 19.02.2014 um 13:54 schrieb Robert Muir:
>>>>> you need the solr analysis-extras jar in your classpath, too.
>>>>> On Wed, Feb 19, 2014 at 6:45 AM, Thomas Fischer <fischer...@aon.at> wrote:
>>>>>> Hello,
>>>>>> I'm migrating to solr 4.6.1 and have problems with the ICUCollationField
>>>>>> (apache-solr-ref-guide-4.6.pdf, pp. 31 and 100).
>>>>>> I get consistently the error message
>>>>>> Error loading class 'solr.ICUCollationField'.
>>>>>> even after
>>>>>> INFO: Adding
>>>>>> 'file:/srv/solr4.6.1/contrib/analysis-extras/lib/icu4j-49.1.jar' to
>>>>>> classloader
>>>>>> and
>>>>>> INFO: Adding
>>>>>> 'file:/srv/solr4.6.1/contrib/analysis-extras/lucene-libs/lucene-analyzers-icu-4.6.1.jar'
>>>>>> to classloader.
>>>>>> Am I missing something?
>>>>>> In solr's subversion I found
>>>>>> /SVN/solr/contrib/analysis-extras/src/java/org/apache/solr/schema/ICUCollationField.java
>>>>>> but no corresponding class in solr4.6.1's contrib folder.
>>>>>> Best
>>>>>> Thomas
