OK, I opened https://issues.apache.org/jira/browse/LUCENE-1068 and attached the patch files. I don't know whether, or how, part of a JFlex grammar can be deprecated, though.

On Nov 27, 2007 1:43 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:

> Yes, please open a JIRA issue and submit your patches.
>
> I wonder if there is anyway to deprecate functionality in a JFlex
> grammar? That is, is there anyway we can communicate to people that
> both will be supported through 2.9 and then the correct way will be
> supported in 3.x?
>
> -Grant
>
> On Nov 27, 2007, at 2:18 AM, Shai Erera wrote:
>
> > I understand it would change the behavior of existing search solutions,
> > however the current behavior is just wrong. An ACRONYM cannot be ABC.DEF.
> > If you look up acronym in Wikipedia, you find only examples of I.B.M. /
> > U.S.A. like, or NATO, IBM, USA, but nothing of the form StandardAnalyzer
> > currently recognizes.
> >
> > There are several ways to solve this change:
> > 1. Create a new analyzer that fixes the problem - that way, applications
> > that don't want to use it will not have to, if they feel ok with the
> > current behavior. However, for those who would like to get a correct
> > behavior, they'll be able to. This is not my favorite solution, but I
> > think it would be preferable than simply not fixing it.
> > 2. Fix it in the new version (2.3) and specifically mention that in the
> > release notes. Aren't there releases where applications need to re-build
> > the index because of fundamental changes?
> >
> > Am I the only one who thinks that?
> >
> > BTW, I changed the definition in the jflex file and recompiled using jflex
> > and it indeed solved the problem. It now recognizes www.abc.com. and
> > www.abc.com as hosts. I can attach the 'patch' files if you'd like to
> > compare.
> >
> > On Nov 27, 2007 9:07 AM, Chris Hostetter <[EMAIL PROTECTED]> wrote:
> >
> >> : If you pass "www.abc.com", the output is (www.abc.com,0,11,type=<HOST>)
> >> : (which is correct in my opinion).
> >> : However, if you pass "www.abc.com." (notice the extra '.' at the end),
> >> : the output is (wwwabccom,0,12,type=<ACRONYM>).
> >>
> >> see also...
> >>
> >> http://www.nabble.com/Inconsistent-StandardTokenizer-behaviour-tf596059.html#a1593383
> >>
> >> http://www.nabble.com/Standard-Analyzer---Host-and-Acronym-tf3620533.html#a10109926
> >>
> >> one hitch which potentially changing this now is that it would break
> >> some searches in applications that have existing indexes built using
> >> previous versions.
> >>
> >> -Hoss
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [EMAIL PROTECTED]
> >> For additional commands, e-mail: [EMAIL PROTECTED]
> >
> > --
> > Regards,
> >
> > Shai Erera
>
> --------------------------
> Grant Ingersoll
> http://lucene.grantingersoll.com
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ

--
Regards,
Shai Erera
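
For readers following the thread, the distinction being argued (an acronym is a run of single letters each followed by a dot, while a host is a run of dot-separated labels) can be illustrated with a rough Java sketch. This is not the StandardTokenizer JFlex grammar and not the patch attached to LUCENE-1068; the class name AcronymSketch, the method classify, and both regular expressions are hypothetical, made up purely for illustration.

```java
import java.util.regex.Pattern;

// Illustrative only: a simplified stand-in for the ACRONYM/HOST distinction
// discussed in the thread, NOT Lucene's actual grammar or the LUCENE-1068 patch.
class AcronymSketch {
    // Assumed rule: single letters, each followed by a dot, at least twice (I.B.M., U.S.A.).
    private static final Pattern ACRONYM =
            Pattern.compile("[A-Za-z]\\.(?:[A-Za-z]\\.)+");

    // Assumed rule: two or more alphanumeric labels joined by dots,
    // tolerating one trailing dot (www.abc.com and www.abc.com.).
    private static final Pattern HOST =
            Pattern.compile("[A-Za-z0-9]+(?:\\.[A-Za-z0-9]+)+\\.?");

    static String classify(String token) {
        // ACRONYM is checked first because the HOST pattern would also accept "I.B.M.".
        if (ACRONYM.matcher(token).matches()) {
            return "ACRONYM";
        }
        if (HOST.matcher(token).matches()) {
            return "HOST";
        }
        return "OTHER";
    }

    public static void main(String[] args) {
        String[] samples = {"I.B.M.", "U.S.A.", "www.abc.com", "www.abc.com.", "ABC.DEF", "NATO"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + classify(sample));
        }
    }
}
```

Under these assumed rules, "www.abc.com" and "www.abc.com." both classify as HOST, "ABC.DEF" is never treated as an ACRONYM, and dotless forms such as "NATO" fall through to OTHER because this sketch only looks at dotted tokens.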