> > Your example confused me.
You're right. I wrote it with one eye closed already. I meant to say that if I'm a 2.4 user and something gets deprecated in trunk (afterwards), it is carried through 2.4.X and 2.5 and then removed in 2.6. So only 1 full minor release.

> It's somewhat crazy, but what if we deprecate stuff and rename it?

I absolutely love that idea! But it means that:

1) We cannot support jar drop-in ability in those cases (which I'm fine with, because people can upgrade to 2.4.X to get bug fixes), not just because the API does something different, but because their code may not compile. For example, the changes I'm doing in 1614 would have changed the signatures of next() and skipTo(), and so someone who wrote a DISI which has a next() that returns boolean will fail to compile.
2) We give the deprecated API the mediocre names. (A funny thought: we can give those methods/classes really stupid/nasty names, to emphasize the beauty of the existing API, to encourage people to stick with the better API :) ).
3) We document clearly what needs to be done in order to use the deprecated API.

One thing we didn't fully address here is methods added to interfaces/abstract classes. When we add a method to an abstract class with a default impl, that's ok. But what if we need to make it abstract (like we had to do in 1575 for the Collector versions)?

I guess for interfaces we should first move all of them to abstract classes. I like interfaces, but abstract classes give us slightly more freedom when we face back-compat issues. Maybe to support Earwin's idea, we use the name for a new abstract class, and give the interface a different name? That way, to upgrade, people just need to change implements to extends (I hope that won't cause any problems if their classes already extend something else). But if we apply this policy to interfaces, I think more users will need to touch their code when upgrading even minor releases.

So Mike, about actsAsVersion ... I think I'm starting to get used to it.
I do relate to what Marvin writes though, about two different apps running in the same JVM with different settings. We have such a case - two teams develop two search solutions (for two back-ends). They live in the same JVM but have different development plans/schedules. So it's not just a hypothetical problem to me.

If we could have the app say something like Version.getInstance(appId).actAsVersion(2.4), that would solve it, because each app will have its own Id, and the Version class would maintain a map between the Id and an instance. But I have yet to resolve (in my mind) how the Lucene code will use it, since the same code runs in two apps with different IDs, and so won't know which appId to pass. Oh well .. we're going to change the way those two teams work anyway, so for me at least, this problem will be gone soon :)

I also agree that actsAsVersion breaks the locality principle, by which when you see a bug you should be able to look in the surroundings of where the bug happened, and not discover that the bug stems from files away. But I don't like passing version information in the constructors either ...

What if we continue with Marvin's proposal of saving that information in the index? I think, Mike, that I asked you a similar question a while ago, about whether Lucene has the ability to store index versions. Index versions are important and can save us some of the problems here - not just with storing the stopword list, but also with code that manipulates the index, or makes decisions about scoring etc. For the two apps in the same JVM it should solve the problem, since I think we can safely assume each operates on its own index.

Arggh .. but again we face the same problem - how do we pass that information to the different classes? How is a TokenStream expected to read that info? I think we may have to settle on the static Version class, even if it will read the information from the index (by doing some Version.init(File indexDir)).
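To make the Version.getInstance(appId).actAsVersion(2.4) idea a bit more concrete, here is a rough sketch of what I have in mind. Nothing like this exists in Lucene today: the class, its methods, the "current" default, and passing the version as a String are all assumptions for illustration only.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only; not an existing Lucene class.
final class Version {
    // One Version instance per application id, shared across the JVM.
    private static final Map<String, Version> INSTANCES = new ConcurrentHashMap<>();

    // Assumed default for apps that never call actAsVersion().
    private volatile String actsAs = "current";

    private Version() {}

    // Returns the instance registered for appId, creating it on first use.
    static Version getInstance(String appId) {
        return INSTANCES.computeIfAbsent(appId, id -> new Version());
    }

    // Records which release this application wants Lucene to behave like.
    // The version is a String here ("2.4"); that is an assumption of the sketch.
    Version actAsVersion(String version) {
        this.actsAs = version;
        return this;
    }

    String actsAs() {
        return actsAs;
    }
}
```

Each team would call, say, Version.getInstance("teamA").actAsVersion("2.4") once at startup. Note that this only covers the bookkeeping. It does not answer the open question above: shared Lucene code still has no way of knowing which appId to look up.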
Shai

On Fri, May 22, 2009 at 1:53 AM, Marvin Humphrey <mar...@rectangular.com> wrote:

> On Thu, May 21, 2009 at 05:19:43PM -0400, Michael McCandless wrote:
>
> > Marvin, which solution would you prefer?
>
> Between the two, I'd prefer settings constructor arguments, though I would
> be inclined to have settings classes that are specific to individual
> classes rather than Lucene-wide.
>
> At least that scheme gets locality right. The global actsAsVersion variable
> violates that principle and has the potential to saddle a small number of
> users who have done absolutely nothing wrong with bugs that are very, very
> hard to hunt down. That's unfair.
>
> As far as analyzers and token streams, the theoretical answer is making
> indexes self-describing via serializable schemas, as discussed on the Lucy
> dev list, and as implemented in KinoSearch svn trunk. With versioning
> metadata attached to the index, there is no longer any worry about
> upgrading analysis modules provided that those modules handle their own
> versioning correctly.
>
> For instance, in KS the Stopalizer always embeds the complete stoplist in
> the schema file, so even if we update the "English" stoplist, we don't get
> invalid search results for indexes which were created with the old
> stoplist. Similarly, it may not be possible to keep around multiple
> variants of Snowball, but at least we can fail catastrophically instead of
> subtly if we detect that the Snowball version has changed.
>
> Full-on schema serialization isn't feasible for Lucene, but attaching an
> actsAsVersion variable to an index and feeding that to your analyzers would
> be a decent start.
>
> Lastly, I think a major java Lucene release is justified already. Won't
> this discussion die down somewhat if you can get 3.0 out? If there are
> issues that are half done, how about rolling back whatever's in the way?
>
> Marvin Humphrey