I agree with Doug that the burden of proof is on keeping the codebases together instead of the reverse. I liken it to a marriage; it has to work well for both parties. It seems to be mostly beneficial for Solr but much less so for Lucene.
BTW an even better example than the huge FuzzyQuery case was the loss of an entire postings format that Solr was using -- LUCENE-9116 <https://issues.apache.org/jira/browse/LUCENE-9116>. That one was caught thanks to Solr tests, which prevented it from being released. The huge FuzzyQuery, on the other hand, was released. I hope that with a split project, we're able to run Solr-side tests quickly enough before Lucene does releases. Does Elasticsearch try to do this on their side too?

An idea just occurred to me that may make a split nicer for Solr than the situation today. Solr could build against its own branch of the Lucene project. That's just impossible today due to the single codebase. This affords the possibility of changes that are not endorsed on the Lucene side (i.e. that would not make it into a real Lucene release). Examples of this are API changes like LUCENE-8159 <https://issues.apache.org/jira/browse/LUCENE-8159>, or perhaps making some classes public so that Solr can access them without awkward hacks. Put differently, just as some companies maintain forks of Lucene/Solr, in the future Solr should likewise be able to have its own fork of Lucene.

Should this approach be adopted, Solr would want to keep such changes to a minimum to keep upkeep of the branch low, and the branch _would_ need upkeep (e.g. running tests), so it's not a total panacea. On the other hand, if Solr strictly only releases with released Lucene versions, then this is way nicer from a versioning and artifact management (i.e. publishing to Maven) point of view. It's nice to have options.

~ David

On Thu, May 7, 2020 at 1:07 PM Bram Van Dam <[email protected]> wrote:

> > The big question is this: “Is this the right time to split Solr and
> > Lucene into two independent projects?”.
>
> Sounds like there are quite a few tasks to complete to get this done.
> Splitting the build and codebase. Presumably a bunch of administration
> within Apache/the PMC. Setting up infrastructure etc.
>
> These are the costs, to be paid up front in the currency of someone's
> time. The benefits are less clear. Faster build times and easier
> maintenance sound attractive, but when will those benefits be visible?
> Next month? Or in a year?
>
> Whoever will be doing this work should probably ask themselves the
> questions: is this the best use of their time?
>
> - Bram
