Having read the entire thread as it's come in, my head is spinning. It is hard to keep up with the ideas and proposals. Throughout this thread my mindset has changed more than once, and it may change again ;)
To that end I'd like to offer some end-user observations and thoughts. (I had thought that this would be a quick few bullets, but it isn't. Sorry for that...)

Most of this thread deals with implementation, not requirement. As I see it, the bw compat policy is an implementation of what I think the real requirement is: that user upgrades be straightforward, simple, obvious, ... but I don't think the requirement should be that they be painless.

I think it is important to keep the cost of doing an upgrade fairly low. If not, some might abandon Lucene. I've seen people abandon Windows Vista for Mac, InterViews 2.0 for other GUI toolkits, ... I just think there are other ways to do it w/o maintaining the current bw compat policy.

There is a proposal for an index upgrade tool. I think that satisfies bw compat for the index structure.

The practice of drop-in jar compatibility is nice, but I think it is potentially dangerous. It might lull end users into a false sense of security, and without a deep understanding of what or how to test, it might yield surprising results. I think this is especially true of the tokenizers/analyzers/filters.

"Drop-in" compat also has a tendency to move away from the best class names. (My druthers is to keep the best class names with the best implementation, renaming the existing class to XxxYyyOld, or maybe replacing Old with a version number, e.g. XxxYyy3_0.)

Currently, we use @deprecated as an upgrade advisor. Most deprecations give the suggested replacement, so in a way it is merely a mapping from old signature/pattern to new signature/pattern. I don't think it would be too hard to create an upgrade advisor in perl or php that, given a source tree, looks through the Java code for old signatures, identifies potential problems and suggests new signatures.
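
To make the idea concrete, here is a minimal sketch of what such an advisor could look like, written in Java rather than perl or php. Everything in it is an assumption for illustration: the class name UpgradeAdvisorSketch, the two table entries (ExampleOldFilter, oldCount) and their replacements are made up and are not real Lucene signatures. A real table would have to be derived from the actual @deprecated notes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: scans Java source text for "old" signature fragments and
// reports a suggested replacement. All names in the table are hypothetical.
final class UpgradeAdvisorSketch {

    // Hypothetical mapping: old signature fragment -> advice text.
    static final Map<String, String> ADVICE = new LinkedHashMap<>();
    static {
        ADVICE.put("ExampleOldFilter", "consider ExampleNewFilter (hypothetical)");
        ADVICE.put(".oldCount(", "consider .newCount( (hypothetical)");
    }

    // Returns one finding per (line, fragment) match, prefixed with the
    // 1-based line number. Plain substring matching, no Java parsing.
    static List<String> advise(String source) {
        List<String> findings = new ArrayList<>();
        String[] lines = source.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            for (Map.Entry<String, String> e : ADVICE.entrySet()) {
                if (lines[i].contains(e.getKey())) {
                    findings.add("line " + (i + 1) + ": found '"
                            + e.getKey() + "', " + e.getValue());
                }
            }
        }
        return findings;
    }

    public static void main(String[] args) {
        String source = "ExampleOldFilter f = make();\nint n = f.oldCount();\n";
        for (String finding : advise(source)) {
            System.out.println(finding);
        }
    }
}
```

Substring matching like this would produce false positives (comments, similarly named methods), so it could only ever flag potential problems, not prove them. Covering a whole source tree would mean wrapping advise in a directory walk, for instance with java.nio.file.Files.walk.
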
(I'm not volunteering ;)

Also, having watched the changes Robert et al. have made to the analyzers, I'm now of the mindset that one cannot maintain bw compat of analyzers (either core or contrib). There are too many conditions for it to hold. Several things conspire together to create a "token stream": the version of the OS, the version of Java (i.e. the version/implementation of Unicode), the implementation of the tokenizers, stemmers, stop-words and filters, and the order in which they are applied in an analyzer. If any of these change, all bets are off. It's likely to work for most situations, but not for others. A prime example is that if the machine's Locale changes, then every place that uses String.toLowerCase/toUpperCase is at risk. Fixing the bug is the right thing to do. An end user upgrading without reindexing may be the wrong thing to do.

I think the suggestion that an end user mix and match a particular "analyzer" from a prior release with the current release is viable, as long as the API doesn't change. (BTW, I like the suggestion that the API be minimized and stabilized as an iterator over tokens with attributes. That is something the current bw compat policy prevented or made too hard.)

I don't think it matters to an end user whether trunk or a branch is used for the "best" code. Most will wait until a release.

I don't think it matters much what pattern other projects use, other than as a learning experience. A development roadmap should be sufficient to explain what the choice is. I think anyone who gets pre-release Lucene from source code control is smart enough to figure it out.

The concern that "stable" and "unstable" will drift is probably overblown. (I don't like these labels, and would rather have "next/best" and "current" as descriptions.) I'm really impressed with the caliber of commitment the committers have to keeping Lucene the best and most useful search engine. The practice of consensus via patches in Jira just reinforces this.
It is a simple matter for someone to say, please don't commit this patch to "best" until a corresponding "current" patch is created. I guess I expect that important changes to "best" will be wanted in "current".

I think that it may be best to have a 4.0 release to cause this to happen. That is to say:

o Keep a clear, well-defined migration path that a competent engineer can perform in a short period of time.
o Provide tools, as needed, to assist in the migration (e.g. an index structure upgrade tool, an "upgrade advisor").
o Maintain index structure compatibility w/in a major release cycle.

-- DM

On Apr 25, 2010, at 3:01 PM, Mark Miller wrote:

> On 4/25/10 1:43 PM, Michael McCandless wrote:
>
>> Changes that go into stable need to be merged to unstable, maybe
>> periodically sweeped or maybe merged up by the original committer or
>> likely some combination (like flex).
>>
>> (And, yes we'll still use other branches for big new features that are
>> in active development).
>>
>> Mike
>>
>
> I'm still +1 on all the proposals you have made. And still -1 to most of the
> attempted tweaks on them that have been proposed :)
>
> --
> - Mark
>
> http://www.lucidimagination.com