Lucene ancient greek normalization

paolo anghileri Fri, 21 Nov 2014 11:17:29 -0800

For development purposes I need the ability in Lucene to normalize Ancient Greek characters in all cases of grammatical detail, such as accents, diacritics and so on.

My need is to retrieve Ancient Greek words that carry accents and other grammatical details when the input string has no accents.


For example, the input οργανον (organon) should also retrieve Ὄργανον.
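
To make the example concrete, here is a minimal sketch of the folding I have in mind. It uses only the JDK's java.text.Normalizer and is not Lucene code. The class name GreekFold and the method fold are hypothetical names chosen for illustration. It assumes that decomposing to NFD, dropping the combining marks and lowercasing is enough for this use case.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Lucene.
final class GreekFold {

    // Decompose to NFD so each accent or breathing mark becomes a separate
    // combining character, remove all combining marks, then lowercase.
    static String fold(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        return stripped.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // "Organon" with psili and oxia on the capital omicron (U+1F4C).
        String accented = "\u1F4C\u03C1\u03B3\u03B1\u03BD\u03BF\u03BD";
        // The plain lowercase query string without any diacritics.
        String plain = "\u03BF\u03C1\u03B3\u03B1\u03BD\u03BF\u03BD";
        System.out.println(fold(accented).equals(fold(plain)));
    }
}
```

With this folding applied to both the indexed text and the query, Ὄργανον and οργανον reduce to the same string, οργανον, so the unaccented input would match.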

I am not a Lucene committer and I am new to this, so my question is about the best practice for implementing this in Lucene, and possibly submitting a commit proposal to the Lucene project management committee.


I did some searching and found this file in lucene-solr:


It contains normalization for some characters.

My thought would be to add extra normalization here, covering all Unicode Ancient Greek characters with all grammatical details. I already have all the Unicode values for those characters, so it should not be difficult for me to include them.

If my understanding is correct, this should add the features described above to Lucene.



As I am new to this, my needs are:

1. To be sure that this is the correct place in Lucene for doing the normalization
2. To know how to post a commit proposal


Any help is appreciated.

Kind regards

Paolo
