Hello,
I think some compatibility breaks really should be accepted; otherwise these requirements are going to kill technological advancement: the effort spent on backwards compatibility will keep growing, becoming more time-consuming and harder every day.
A major release won't happen every day, likely not even every year, so it seems acceptable to have milestones defining compatibility boundaries: you need to be able to "reset" the complexity curve occasionally.

Backporting a feature would benefit from being merged into the correct test suite, and would avoid the explosion of this matrix-like backwards-compatibility test suite. BTW, the current test suite is likely covering all kinds of combinations which nobody is actually using or caring about.

Also, if I were to discover a nice improvement in an Analyzer, and you were to tell me that to contribute it I would have to face this amount of complexity... I would think twice before trying; honestly, the current requirements are scary.

+1

Sanne

2010/4/15 Earwin Burrfoot <ear...@gmail.com>:
> I'd like to remind that Mike's proposal has stable branches.
>
> We can branch off preflex trunk right now and wrap it up as 3.1.
> Current trunk is declared as future 4.0 and all backcompat cruft is
> removed from it.
> If some new features/bugfixes appear in trunk, and they don't break
> stuff - we backport them to 3.x branch, eventually releasing 3.2, 3.3,
> etc
>
> Thus, devs are free to work without back-compat burden, bleeding edge
> users get their blood, conservative users get their stability + a
> subset of new features from stable branches.
>
> On Thu, Apr 15, 2010 at 22:02, DM Smith <dmsmith...@gmail.com> wrote:
>> On 04/15/2010 01:50 PM, Earwin Burrfoot wrote:
>>>>
>>>> First, the index format. IMHO, it is a good thing for a major release
>>>> to be able to read the prior major release's index. And the ability to
>>>> convert it to the current format via optimize is also good. Whatever is
>>>> decided on this thread should take this seriously.
>>>>
>>>
>>> Optimize is a bad way to convert to current.
>>> 1. conversion is not guaranteed, optimizing already optimized index is a
>>> noop
>>> 2. it merges all your segments. if you use BalancedSegmentMergePolicy,
>>> that destroys your segment size distribution
>>>
>>> Dedicated upgrade tool (available both from command-line and
>>> programmatically) is a good way to convert to current.
>>> 1. conversion happens exactly when you need it, conversion happens for
>>> sure, no additional checks needed
>>> 2. it should leave all your segments as is, only changing their format
>>>
>>>>
>>>> It is my observation, though possibly not correct, that core only has
>>>> rudimentary analysis capabilities, handling English very well. To handle
>>>> other languages well "contrib/analyzers" is required. Until recently it
>>>> did not get much love. There have been many bw compat breaking changes
>>>> (though w/ version one can probably get the prior behavior). IMHO, most
>>>> of contrib/analyzers should be core. My guess is that most non-trivial
>>>> applications will use contrib/analyzers.
>>>>
>>>
>>> I counter - most non-trivial applications will use their own analyzers.
>>> The more modules - the merrier. You can choose precisely what you need.
>>>
>>
>> By and large an analyzer is a simple wrapper for a tokenizer and some
>> filters. Are you suggesting that most non-trivial apps write their own
>> tokenizers and filters?
>>
>> I'd find that hard to believe. For example, I don't know enough Chinese,
>> Farsi, Arabic, Polish, ... to come up with anything better than what Lucene
>> has to tokenize, stem or filter these.
>>
>>>>
>>>> Our user base are those with ancient,
>>>> underpowered laptops in 3-rd world countries. On those machines it might
>>>> take 10 minutes to create an index and during that time the machine is
>>>> fairly unresponsive. There is no opportunity to "do it in the
>>>> background."
>>>>
>>>
>>> Major Lucene releases (feature-wise, not version-wise) happen like
>>> once in a year, or year-and-a-half.
>>> Is it that hard for your users to wait ten minutes once a year?
>>>
>>
>> I said that was for one index. Multiply that times the number of books
>> available (300+) and yes, it is too much to ask. Even if a small subset is
>> indexed, say 30, that's around 5 hours of waiting.
>>
>> Under consideration is the frequency of breakage. Some are suggesting a
>> greater frequency than yearly.
>>
>> DM
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-dev-h...@lucene.apache.org
>>
>
> --
> Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)