Having read the entire thread as it's come in, my head is spinning. It is hard 
to keep up with the ideas and proposals. Throughout this thread my mindset has 
changed more than once, and it may change again ;)

To that end I'd like to make some end-user observations and thoughts:
(I had thought that this would be a quick few bullets, but it isn't. Sorry for 
that...)

Most of this thread deals with implementation not requirement. As I see it the 
bw compat policy is the implementation of what I think the real requirement is: 
that user upgrades be straightforward, simple, obvious, ..... but I don't think 
the requirement should be that it be painless.

I think it is important to keep the cost of doing an upgrade fairly low. If 
not, some might abandon Lucene. I've seen people abandon Windows Vista for Mac, 
InterViews 2.0 for other GUI toolkits, ..... I just think there are other ways 
to keep that cost low w/o maintaining the current bw compat policy.

There is a proposal of an index upgrade tool. I think that satisfies the bw 
compat for the index structure.

The practice of drop-in jar compatibility is nice, but I think it is 
potentially dangerous. It might lull end-users into a false sense of security 
and without a deep understanding of what or how to test it might yield 
surprising results. I think that this is especially true with the 
tokenizers/analyzers/filters.

And the "drop-in" compat has a tendency to move us away from the best class 
names. My druthers is to keep the best class names with the best 
implementation, renaming the existing class to XxxYyyOld (maybe replace Old 
with a version number, e.g. XxxYyy3_0). Currently, we use @deprecated as an 
upgrade advisor. Most deprecations give the suggested replacement. So in a 
way, it is merely a mapping from old signature/pattern to new 
signature/pattern. I don't think it would be too hard to create an upgrade 
advisor in perl or php that, given a source tree, looks through the Java code 
for old signatures, identifies potential problems and suggests new signatures. 
(I'm not volunteering;)
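
To make the "mapping from old signature to new signature" idea concrete, here is a minimal sketch of such an advisor. It is written in Java rather than perl or php only to keep it in Lucene's own language. The class name UpgradeAdvisor, the rule XxxYyyOld -> XxxYyy and the file name Demo.java are all hypothetical placeholders; a real table would have to be built from the actual @deprecated notes, and plain substring matching would miss or over-report many cases.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of an "upgrade advisor": not part of Lucene.
class UpgradeAdvisor {
    // Old signature -> suggested replacement. The entries are supplied by the caller.
    private final Map<String, String> replacements = new LinkedHashMap<>();

    // Registers one mapping and returns this advisor so calls can be chained.
    UpgradeAdvisor rule(String oldSignature, String suggestion) {
        replacements.put(oldSignature, suggestion);
        return this;
    }

    // Returns one advisory line for every old signature found on every source line.
    // Matching is a plain substring test, so this is only a first approximation.
    List<String> advise(String fileName, List<String> sourceLines) {
        List<String> findings = new ArrayList<>();
        for (int i = 0; i < sourceLines.size(); i++) {
            for (Map.Entry<String, String> rule : replacements.entrySet()) {
                if (sourceLines.get(i).contains(rule.getKey())) {
                    findings.add(fileName + ":" + (i + 1) + ": found "
                            + rule.getKey() + ", consider " + rule.getValue());
                }
            }
        }
        return findings;
    }

    public static void main(String[] args) {
        // Placeholder rule and placeholder source, for illustration only.
        UpgradeAdvisor advisor = new UpgradeAdvisor().rule("XxxYyyOld", "XxxYyy");
        List<String> source = Arrays.asList(
                "int x = 1;",
                "Object t = new XxxYyyOld(reader);");
        for (String finding : advisor.advise("Demo.java", source)) {
            System.out.println(finding);
        }
    }
}
```

Running main reports the second line of the placeholder source and suggests XxxYyy. Walking a real source tree and parsing signatures properly is the part that would take the actual work.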

Also, having watched the changes Robert et al. have made to the analyzers, I'm 
now of the mindset that one cannot maintain bw compat of analyzers (either 
core or contrib). There are too many conditions for it to be true. Several 
things conspire together to create a "token stream": the version of the OS, 
the version of Java (i.e. the version/implementation of Unicode) and the 
implementation of the tokenizers, stemmers, stop-words and filters and the 
order in which they are applied in an analyzer. If any of these change, all 
bets are off. It's likely to work for most situations, but not for others. A 
prime example of this is that if the machine's Locale changes, then every 
place that calls String.toLowerCase/toUpperCase is at risk. Fixing the bug is 
the right thing to do. An end-user upgrading without reindexing may be the 
wrong thing to do.
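
The Locale risk is easy to reproduce with the JDK alone. The sketch below lowercases the same term under Locale.ROOT and under a Turkish locale. The class name LocaleLowercaseDemo and the term "TITLE" are chosen only for illustration. String.toLowerCase() with no argument uses the machine's default Locale, which is why the same code can produce different tokens on different machines.

```java
import java.util.Locale;

// Illustration only: shows that lowercasing depends on the Locale in effect.
class LocaleLowercaseDemo {
    // Lowercases with an explicit Locale, so the result does not depend on the machine default.
    static String lower(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    public static void main(String[] args) {
        String term = "TITLE";
        Locale turkish = Locale.forLanguageTag("tr-TR");

        String root = lower(term, Locale.ROOT); // locale-neutral rules
        String tr = lower(term, turkish);       // Turkish rules: 'I' maps to dotless U+0131

        System.out.println(root);
        System.out.println(tr);
        System.out.println(root.equals(tr));
    }
}
```

Under Locale.ROOT the term becomes "title". Under the Turkish locale the capital I becomes the dotless ı (U+0131), so the two strings are not equal and a term indexed under one setting would not match a query lowercased under the other.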

I think the suggestion that an end user mix and match a particular "analyzer" 
from a prior release with the current release is viable, as long as the API 
doesn't change. (BTW, I like the suggestion that the API be minimized and 
stabilized as an iterator over tokens with attributes. Something that the 
current bw compat policy prevented or made too hard.)
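
For readers who want a picture of what "an iterator over tokens with attributes" could look like, here is a deliberately tiny sketch. It is not Lucene's API: the names TokenIteratorSketch, Token, tokenize and the "position" attribute are all hypothetical, and the whitespace splitting stands in for whatever a real analyzer would do.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Lucene's API: the whole contract is Iterator<Token>.
class TokenIteratorSketch {
    // A token is a term plus a read-only bag of named attributes.
    static final class Token {
        final String term;
        final Map<String, Object> attributes;

        Token(String term, Map<String, Object> attributes) {
            this.term = term;
            this.attributes = Collections.unmodifiableMap(attributes);
        }
    }

    // Splits on whitespace, builds all tokens up front, and returns an iterator over them.
    // Each token carries a zero-based "position" attribute.
    static Iterator<Token> tokenize(String text) {
        List<Token> tokens = new ArrayList<>();
        int position = 0;
        for (String part : text.trim().split("\\s+")) {
            if (part.isEmpty()) {
                continue; // blank input yields a single empty piece
            }
            Map<String, Object> attrs = new LinkedHashMap<>();
            attrs.put("position", position++);
            tokens.add(new Token(part, attrs));
        }
        return tokens.iterator();
    }

    public static void main(String[] args) {
        Iterator<Token> it = tokenize("quick brown fox");
        while (it.hasNext()) {
            Token t = it.next();
            System.out.println(t.term + " " + t.attributes);
        }
    }
}
```

The point of the sketch is only that a consumer depending on nothing but the iterator and the attribute names could, in principle, run an analyzer built against a prior release.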

I don't think it matters to an end-user whether trunk or a branch is used for 
the "best" code. Most will wait until a release. I don't think it matters much 
what pattern other projects use, other than as a learning experience. A 
development roadmap should be sufficient to explain what the choice is. I 
think anyone who gets pre-release Lucene from source code control is smart 
enough to figure it out.

The concern that "stable" and "unstable" will drift is probably over-blown. (I 
don't like these labels, and would rather have "next/best" and "current" as 
descriptions.) I'm really impressed with the caliber of commitment the 
committers have to keeping Lucene the best and most useful search engine. The 
practice of consensus via patches in Jira just reinforces this. It is a simple 
matter for someone to say, please don't commit this patch to "best" until a 
corresponding "current" patch is created. I guess I expect that important 
changes to "best" will be wanted in "current".

I think that it may be best to have a 4.0 release to cause this to happen.

That is to say:
o Keep a clear, well-defined migration path that a competent engineer can 
perform in a short period of time.
o Provide tools, as needed, to assist in the migration (e.g. an index structure 
upgrade tool, an "upgrade advisor")
o Maintain index structure compatibility w/in a major release cycle.

-- DM

On Apr 25, 2010, at 3:01 PM, Mark Miller wrote:

> On 4/25/10 1:43 PM, Michael McCandless wrote:
> 
>> Changes that go into stable need to be merged to unstable, maybe
>> periodically sweeped or maybe merged up by the original committer or
>> likely some combination (like flex).
>> 
>> (And, yes we'll still use other branches for big new features that are
>> in active development).
>> 
>> Mike
>> 
> 
> I'm still +1 on all the proposals you have made. And still -1 to most of the 
> attempted tweaks on them that have been proposed :)
> 
> -- 
> - Mark
> 
> http://www.lucidimagination.com



