On Apr 21, 2011, at 5:02 PM, Clemens Wyss wrote:

> I keep my search terms in a dedicated RAMDirectory (the termIndex).
> In there I place all the terms of my real index. When putting the terms
> into the termIndex I can still see [using the debugger] the umlauts (äöü).
> Unfortunately, when searching the termIndex, the documents no longer
> contain these umlauts.
> 
> Populating the termIndex:
> termIndex = new RAMDirectory();
> IndexWriterConfig config = new IndexWriterConfig( Version.LUCENE_31,
>                                                   new TermAnalyzer( locale ) );
> termIndexWriter = new IndexWriter( termIndex, config );
> TermEnum tEnum = realIndexReader.terms();
> while ( tEnum.next() )
> {
>       Term t = tEnum.term();
>       String termText = t.text();
>       Document termDocument = new Document();
>       Field field = new Field( FIELDNAME_TERM, termText, Field.Store.YES,
>                                Field.Index.ANALYZED );
>       termDocument.add( field );
>       // and add term into the index
>       termIndexWriter.addDocument( termDocument );
> }
> termIndexWriter.commit();
> termIndexWriter.optimize();
> termIndexWriter.close();
> 
> termIndexReader = IndexReader.open( termIndex, true );
> ---------- searching terms
> Query q = fuzzy
>       ? new FuzzyQuery( new Term( FIELDNAME_TERM, termFilter.toLowerCase() ) )
>       : new WildcardQuery( new Term( FIELDNAME_TERM, "*" + termFilter.toLowerCase() + "*" ) );
> TopDocs topDocs = new IndexSearcher( getTermIndexReader() ).search( q, 100 );
> for ( ScoreDoc hit : topDocs.scoreDocs )
> {
>       Document doc = getTermIndexReader().document( hit.doc );
>       String indexTerm = doc.get( FIELDNAME_TERM );
>       if ( !returnValue.contains( indexTerm  ) )
>       {
>               returnValue.add( indexTerm );
>       }
> }
> ----------
> The TermAnalyzer is the same analyzer as the main index analyzer, except
> that a LowerCaseFilter is also applied.

What is the Analyzer for the main index?  Which tokenizer and token
filters does it use?
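
The filter chain matters because some token filters rewrite characters
rather than just splitting or lowercasing text. As a minimal sketch in plain
Java (not Lucene code), assuming purely for illustration that a chain
contains an accent-folding step: the class UmlautFoldingDemo and the method
fold are hypothetical names, and nothing here is taken from the actual
TermAnalyzer.

```java
import java.text.Normalizer;

// Hypothetical stand-in for an accent-folding token filter. This is NOT
// Lucene code; it only illustrates what such a step does to a token.
class UmlautFoldingDemo {

    static String fold(String token) {
        // NFD decomposition turns "\u00fc" (u with diaeresis) into
        // "u" followed by U+0308 (combining diaeresis).
        String decomposed = Normalizer.normalize(token, Normalizer.Form.NFD);
        // Dropping all combining marks leaves only the base letters.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // "M\u00fcller" is "Mueller" written with a u-umlaut.
        System.out.println(fold("M\u00fcller"));
    }
}
```

Whether anything like this applies depends on the actual tokenizer and
filters in use, which is why the exact chain is worth knowing.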

Out of curiosity, what is the problem you are trying to solve?

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

