I'd like to remind everyone that Mike's proposal includes stable branches.

We can branch off the pre-flex trunk right now and wrap it up as 3.1.
Current trunk is then declared the future 4.0, and all back-compat
cruft is removed from it.
If new features/bugfixes appear in trunk and they don't break
anything, we backport them to the 3.x branch, eventually releasing
3.2, 3.3, etc.

Thus, devs are free to work without the back-compat burden,
bleeding-edge users get their blood, and conservative users get their
stability plus a subset of new features from the stable branches.


On Thu, Apr 15, 2010 at 22:02, DM Smith <dmsmith...@gmail.com> wrote:
> On 04/15/2010 01:50 PM, Earwin Burrfoot wrote:
>>>
>>> First, the index format. IMHO, it is a good thing for a major
>>> release to be able to read the prior major release's index. And the
>>> ability to convert it to the current format via optimize is also
>>> good. Whatever is decided on this thread should take this seriously.
>>>
>>
>> Optimize is a bad way to convert to the current format (see the
>> snippet below):
>> 1. Conversion is not guaranteed; optimizing an already-optimized
>> index is a no-op.
>> 2. It merges all your segments. If you use
>> BalancedSegmentMergePolicy, that destroys your segment size
>> distribution.
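
For illustration, conversion-via-optimize amounts to the following
(stock Lucene 3.0 API; the analyzer choice is irrelevant here since
nothing new gets indexed):

    import java.io.File;

    import org.apache.lucene.analysis.WhitespaceAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class OptimizeToConvert {
      public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(new File(args[0]));
        IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(),
                                             IndexWriter.MaxFieldLength.UNLIMITED);
        writer.optimize(); // no-op if the index is already optimized (point 1),
                           // otherwise collapses all segments into one (point 2)
        writer.close();
      }
    }
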
>>
>> A dedicated upgrade tool (available both from the command line and
>> programmatically) is a good way to convert to the current format
>> (sketched below):
>> 1. Conversion happens exactly when you need it, and it is guaranteed
>> to happen; no additional checks are needed.
>> 2. It should leave all your segments as-is, only changing their
>> format.
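
A minimal sketch of what such a tool's two entry points could look
like. Nothing here is an existing Lucene API: IndexFormatUpgrader and
rewriteInCurrentFormat are hypothetical names, and the actual
per-segment rewrite is left unimplemented:

    import java.io.File;
    import java.io.IOException;

    import org.apache.lucene.index.SegmentInfo;
    import org.apache.lucene.index.SegmentInfos;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class IndexFormatUpgrader {
      // Command-line entry point: java IndexFormatUpgrader /path/to/index
      public static void main(String[] args) throws IOException {
        upgrade(FSDirectory.open(new File(args[0])));
      }

      // Programmatic entry point: rewrites each segment in the current
      // format, one segment at a time, never merging, so the segment
      // size distribution is preserved.
      public static void upgrade(Directory dir) throws IOException {
        SegmentInfos infos = new SegmentInfos();
        infos.read(dir);
        for (int i = 0; i < infos.size(); i++) {
          rewriteInCurrentFormat(dir, infos.info(i));
        }
      }

      // Hypothetical: a real implementation would rewrite this segment's
      // files in the current format and atomically swap them in.
      private static void rewriteInCurrentFormat(Directory dir, SegmentInfo si)
          throws IOException {
        throw new UnsupportedOperationException("sketch only");
      }
    }
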
>>
>>
>>>
>>> It is my observation, though possibly not correct, that core only
>>> has rudimentary analysis capabilities, handling English very well.
>>> To handle other languages well, "contrib/analyzers" is required.
>>> Until recently it did not get much love. There have been many
>>> back-compat-breaking changes (though w/ Version one can probably get
>>> the prior behavior; see the snippet below). IMHO, most of
>>> contrib/analyzers should be core. My guess is that most non-trivial
>>> applications will use contrib/analyzers.
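
The Version mechanism referred to above, concretely (real Lucene 3.0
API; exactly which behaviors are gated on which constants is
documented per class, so the comments here are only indicative):

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.util.Version;

    public class VersionedBehavior {
      // Keeps the tokenization quirks an app indexed with pre-2.4:
      static final Analyzer legacy = new StandardAnalyzer(Version.LUCENE_23);
      // Opts in to all current fixes and behavior changes:
      static final Analyzer current = new StandardAnalyzer(Version.LUCENE_30);
    }
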
>>>
>>
>> I counter: most non-trivial applications will use their own
>> analyzers. The more modules, the merrier - you can choose precisely
>> what you need.
>>
>
> By and large, an analyzer is a simple wrapper for a tokenizer and some
> filters. Are you suggesting that most non-trivial apps write their own
> tokenizers and filters?
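
(Such a wrapper really is tiny in 3.x; a sketch built only from stock
core classes, with the class name being illustrative:)

    import java.io.Reader;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.LowerCaseFilter;
    import org.apache.lucene.analysis.StopAnalyzer;
    import org.apache.lucene.analysis.StopFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.standard.StandardTokenizer;
    import org.apache.lucene.util.Version;

    public final class SimpleEnglishAnalyzer extends Analyzer {
      @Override
      public TokenStream tokenStream(String fieldName, Reader reader) {
        // One tokenizer plus a chain of filters - that's the whole analyzer.
        TokenStream ts = new StandardTokenizer(Version.LUCENE_30, reader);
        ts = new LowerCaseFilter(ts);
        ts = new StopFilter(true, ts, StopAnalyzer.ENGLISH_STOP_WORDS_SET);
        return ts;
      }
    }
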
>
> I'd find that hard to believe. For example, I don't know enough
> Chinese, Farsi, Arabic, Polish, ... to come up with anything better
> than what Lucene already has for tokenizing, stemming, or filtering
> these languages.
>
>>
>>>
>>> Our user base is people with ancient, underpowered laptops in
>>> third-world countries. On those machines it might take 10 minutes
>>> to create an index, and during that time the machine is fairly
>>> unresponsive. There is no opportunity to "do it in the background."
>>>
>>
>> Major Lucene releases (feature-wise, not version-wise) happen about
>> once a year, or once every year and a half.
>> Is it that hard for your users to wait ten minutes once a year?
>>
>
> I said that was for one index. Multiply that by the number of books
> available (300+) and yes, it is too much to ask. Even if only a small
> subset is indexed, say 30, that's 30 x 10 minutes - around 5 hours of
> waiting.
>
> Under consideration is the frequency of breakage. Some are suggesting a
> greater frequency than yearly.
>
> DM
>



-- 
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785

