Re: lucene farsi problem

Grant Ingersoll Wed, 30 Apr 2008 05:00:17 -0700

What Analyzer are you using? You might try looking in Luke to see what is in your index, etc. It also isn't clear to me what your documents look like.

As for a Farsi analyzer, I would Google "Farsi analyzer Lucene" and see if you can find anything. Otherwise, you will have to write your own (and donate it????)


-Grant

On Apr 30, 2008, at 3:21 AM, esra wrote:

Hi,
I am using Lucene's "IndexSearcher" to search the given XML by keyword;
the XML contains Farsi information.
While searching I use ranges like

آ-ث  |  ج-خ  |  د-ژ  |  س-ظ  |  ع-ق  |  ک-ل  |  م-ی
When I search the "د-ژ" range the results are wrong: they are the
results of the "س-ظ" range.
For example, when I search for "د-ژ", one of the results is "ساب ووفر".
This result is also shown in the "س-ظ" range's result list, which is the
correct range.
As IndexSearcher uses the "compareTo" method, and this method compares
by Unicode code point, I looked up the code points of the characters.

د = U+062F
ژ = U+0698
and the first letter of "ساب ووفر" is س = U+0633
Do you have any idea how to solve this problem? There are analyzers for
different languages. Would one of them be useful here, and if so, do you
know where to find a Farsi analyzer?

I would be glad if you could help.

thanks ,

Esra
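
The code points listed above already account for the result: U+062F (د) < U+0633 (س) < U+0698 (ژ), so a comparison by code point places س inside the د-ژ range, even though ژ comes before س in the Persian alphabet. The following is a minimal sketch in plain Java (no Lucene calls) that reproduces this and shows one possible workaround. The class FarsiRangeDemo, its method names, the ALPHABET string, and the 0x0100 key offset are all hypothetical and chosen only for illustration; the alphabet is assumed to follow the usual Persian dictionary order with آ first.

```java
final class FarsiRangeDemo {
    // Assumption: Persian alphabet in dictionary order, with آ placed first.
    static final String ALPHABET = "آابپتثجچحخدذرزژسشصضطظعغفقکگلمنوهی";

    // Range test on the first letter by raw UTF-16 value, which is the order
    // String.compareTo sees. Assumes a non-empty term.
    static boolean inRangeByCodePoint(String term, char lower, char upper) {
        char first = term.charAt(0);
        return first >= lower && first <= upper;
    }

    // Range test on the first letter by position in ALPHABET instead.
    // Returns false if any of the three letters is not in ALPHABET.
    static boolean inRangeByAlphabet(String term, char lower, char upper) {
        int first = ALPHABET.indexOf(term.charAt(0));
        int lo = ALPHABET.indexOf(lower);
        int hi = ALPHABET.indexOf(upper);
        return first >= 0 && lo >= 0 && hi >= 0 && first >= lo && first <= hi;
    }

    // Builds a key whose plain string order follows ALPHABET order.
    // Each known letter is replaced by (0x0100 + its alphabet position);
    // any other character (space, digits, ...) is copied unchanged.
    static String sortKey(String term) {
        StringBuilder key = new StringBuilder(term.length());
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            int pos = ALPHABET.indexOf(c);
            key.append(pos >= 0 ? (char) (0x0100 + pos) : c);
        }
        return key.toString();
    }

    public static void main(String[] args) {
        String title = "ساب ووفر";
        // By code point, س (U+0633) lies between د (U+062F) and ژ (U+0698).
        System.out.println(inRangeByCodePoint(title, 'د', 'ژ')); // true
        // By alphabet position, س comes after ژ, so the title is outside د-ژ.
        System.out.println(inRangeByAlphabet(title, 'د', 'ژ'));  // false
        System.out.println(inRangeByAlphabet(title, 'س', 'ظ'));  // true
    }
}
```

One possible use, not tested against any particular Lucene version: store sortKey(title) in a separate untokenized field and run the range searches against that field, so that the string order being compared matches the alphabet order.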

--
View this message in context: 
http://www.nabble.com/lucene-farsi-problem-tp16977096p16977096.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


--------------------------
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ






