On Thu, Nov 11, 2010 at 09:22:00AM -0600, Peter Karman wrote:
> > As the index format changes, we accumulate cruft in our codebase to
> > support old indexes and old segments. At some point, we need to purge
> > such cruft and abandon support for old indexes. But if you are a user,
> > it's hard to know whether your index has old segments in it, and whether
> > you can upgrade safely to a given version of the library.
>
> You're describing the back compat path for KS users switching to Lucy, yes?

The "modernizer" approach addresses a general problem for the Lucy/Lucene
segmented index design, and it will be useful at every major index format
break going forward. But yes, I'm thinking that the first use case would be
dropping support for segments originally written under KinoSearch.

I've recently whipped up a patch that allows Lucy to read KinoSearch indexes.
All that we need to do is alias a few class names so that deserializing a
Schema works properly -- i.e. when the Schema JSON file contains a serialized
"KinoSearch::Analysis::CaseFolder", the object that emerges from the
deserializer is a Lucy::Analysis::CaseFolder.

However, the Lucy codebase currently supports a number of obsolete KinoSearch
segment formats, and it would be nice to drop that support and clean out the
cruft at some time in the future. For whatever Lucy release we decide to do
that on, we would announce that users who had migrated indexes from KinoSearch
need to run the modernizer. (Indexes initialized under Lucy would not need
modernization, as we could guarantee that they were written in a recent
format.)

Providing a clean migration path for KinoSearch users allows us to put
KinoSearch into maintenance mode and focus exclusively on developing Lucy. At
the same time, having the modernizer in reserve holds the promise that we
won't be burdened forever by the backwards compatibility requirements of old
index formats.

> Maybe write the cookbook recipe for upgrading KS to Lucy, and then we can
> see if it needs to be formalized into a part of the core?

OK, sounds like the cookbook approach is feasible and prudent. We don't
technically *need* the modernizer until we decide to drop support for an old
format, though.

Marvin Humphrey
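
For illustration only, the class-name aliasing described above might look
roughly like the sketch below. It is written in Python rather than Lucy's own
C/Perl, and it is not the actual patch: the prefix mapping, the "_class" key,
and the layout of the schema fragment are all assumptions made for the
example.

```python
import json

# Hypothetical alias prefixes; the real patch may alias class names differently.
OLD_PREFIX = "KinoSearch::"
NEW_PREFIX = "Lucy::"

def alias_class_names(node):
    """Recursively rewrite string values that start with the old prefix."""
    if isinstance(node, dict):
        # Only values are rewritten; keys are left untouched.
        return {key: alias_class_names(value) for key, value in node.items()}
    if isinstance(node, list):
        return [alias_class_names(item) for item in node]
    if isinstance(node, str) and node.startswith(OLD_PREFIX):
        return NEW_PREFIX + node[len(OLD_PREFIX):]
    return node

# Illustrative schema fragment; the "_class" key and layout are assumptions.
schema_json = '{"analyzer": {"_class": "KinoSearch::Analysis::CaseFolder"}}'
schema = alias_class_names(json.loads(schema_json))
print(schema["analyzer"]["_class"])
```

The idea is that the rename happens once, at deserialization time, so the
rest of the code only ever sees Lucy class names.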
