For development purposes I need Lucene to be able to normalize
ancient Greek characters for all the cases of grammatical details such as
accents, diacritics and so on.
I need to retrieve ancient Greek words that carry accents and other
grammatical details when the input string is typed without accents.
For example, the input οργανον (organon) should also retrieve Ὄργανον.
I am not a Lucene committer and I am new to this, so my question is about
the best practice for implementing this in Lucene, and possibly submitting a
commit proposal to the Lucene project management committee.
I did some searching and found this file in Lucene-Solr:
It contains normalization for some chars.
My thought would be to add extra normalization here, covering all
Unicode ancient Greek characters with all grammatical details.
I already have all the Unicode values for those characters, so it should not be
difficult for me to include them.
If my understanding is correct, this should add to Lucene the features
described above.
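
To make the intended behaviour concrete, here is a minimal sketch in plain
Java. It does not use any Lucene API and is not the code of the file mentioned
above; the class name AncientGreekFolding and the method fold are hypothetical
names chosen for illustration. It assumes that decomposing the text (Unicode
NFD) and dropping the combining marks is an acceptable way to remove accents,
breathings and the iota subscript, instead of listing every accented
character by hand.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of Lucene.
final class AncientGreekFolding {
    private AncientGreekFolding() {}

    // Folds a word to a lowercase form without diacritics.
    static String fold(String input) {
        // NFD splits each precomposed character into a base letter
        // followed by its combining marks (accents, breathings, ...).
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // \p{M} matches the Unicode combining marks; remove them all.
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // Lowercase so that indexed text and query compare equal.
        return stripped.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String indexed = fold("Ὄργανον"); // form found in the text
        String query = fold("οργανον");   // form typed by the user
        System.out.println(indexed.equals(query));
    }
}
```

Inside Lucene the same folding would have to be applied in the analysis chain
at both index time and query time, so that both sides produce the same token.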
As I am new to this, my needs are:
1. To be sure that this is the correct place in Lucene for doing
normalization
2. To know how to submit a commit proposal
Any help is appreciated.
Kind regards
Paolo