Re: Lucene full text search

Erick Erickson Thu, 28 Jan 2010 05:35:58 -0800

Well, there are a couple of approaches:


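1> enable leading wildcards (QueryParser.setAllowLeadingWildcard(true))
     and search for *arabic*. You probably don't want to do this: every
     term in the field has to be scanned, so it's really, really expensive.
2> use the n-gram tokenizers (NGramTokenizer for matches anywhere in a
     word; EdgeNGramTokenizer only covers one end of it). This'll cost
     you some index space, but that may be acceptable.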
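
To make approach 2 a bit more concrete, here is a minimal plain-Java
sketch of the n-gram idea. It is not Lucene's tokenizer and uses no
Lucene classes; the class name NGramSketch, the gram size of 3, and the
add/search methods are all made up for illustration. Each word is broken
into overlapping 3-character grams at index time, and a search intersects
the posting sets of the query's grams, then verifies the survivors.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of Lucene.
class NGramSketch {
    static final int N = 3; // assumed gram size

    // gram -> phrases that contain a word producing that gram
    private final Map<String, Set<String>> index = new HashMap<>();

    // All overlapping substrings of length n ("arabic" -> ara, rab, abi, bic).
    public static List<String> grams(String s, int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + n <= s.length(); i++) {
            out.add(s.substring(i, i + n));
        }
        return out;
    }

    // Index a phrase: lowercase it, split on whitespace, store every gram.
    public void add(String phrase) {
        for (String word : phrase.toLowerCase().split("\\s+")) {
            for (String g : grams(word, N)) {
                index.computeIfAbsent(g, k -> new LinkedHashSet<>()).add(phrase);
            }
        }
    }

    // Find phrases containing the term anywhere inside a word.
    public Set<String> search(String term) {
        String t = term.toLowerCase();
        Set<String> result = null;
        for (String g : grams(t, N)) {
            Set<String> postings = index.getOrDefault(g, Collections.<String>emptySet());
            if (result == null) {
                result = new LinkedHashSet<>(postings);
            } else {
                result.retainAll(postings); // keep phrases that have every gram
            }
        }
        if (result == null) {
            return new LinkedHashSet<>(); // term shorter than N: no grams to look up
        }
        // Sharing all grams does not guarantee they are contiguous, so verify.
        result.removeIf(p -> !p.toLowerCase().contains(t));
        return result;
    }

    public static void main(String[] args) {
        NGramSketch idx = new NGramSketch();
        String[] phrases = {"arabic", "arabic numerals", "gum arabic", "mozarabic", "latin"};
        for (String p : phrases) {
            idx.add(p);
        }
        System.out.println(idx.search("arabic"));
    }
}
```

The extra index space shows up directly here: every word of length k
contributes k - 2 grams instead of a single term.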

HTH
Erick

2010/1/28 Lutischán Ferenc <[email protected]>

> Hi,
>
> I have a problem with Lucene:
> I indexed an English phrase list with Lucene:
>            doc.add(new Field("r1", r1.toLowerCase(), Field.Store.NO,
>                    Field.Index.ANALYZED));
>
> I searched for the word 'arabic':
>
>            Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
>            QueryParser parser = new QueryParser(Version.LUCENE_CURRENT,
>                    this.searchedField, analyzer);
>            Query query = parser.parse(searchedStr);
>            TopScoreDocCollector collector =
>                    TopScoreDocCollector.create(10, true);
>            this.memDict.isearcher.search(query, collector);
>            foundCnt=collector.getTotalHits();
>            System.out.println(searchedStr + ":" + foundCnt);
>
>            // Iterate through the results:
>            ScoreDoc[] hits = collector.topDocs().scoreDocs;
>            for (int i = 0; i < hits.length; i++) {
>                Document hitDoc = this.memDict.isearcher.doc(hits[i].doc);
>                System.out.println("\"r1\"=" + hitDoc.get("r1"));
>            }
>
> The result list is:
> arabic
> arabic numerals
> gum arabic
>
> But this is not in the result list:
> mozarabic
>
> How can I use Lucene to find all the words that contain 'arabic'?
>
> Regards,
>    Ferenc
>
