Would this not be eased to some extent if the initial committer base of both the projects was the same?
On Wed, May 13, 2020 at 10:44 PM Jason Gerlowski <gerlowsk...@gmail.com> wrote:
>
> There's nothing wrong with a harsh "sink or swim" approach if the
> risks are bearable. If the worst case risk here is that we have a few
> rough releases as we smooth out the process, I'm all on board with
> "sink or swim". But by the same token - "sink or swim" gets less
> appealing as the risks increase. No sane person would toss their PFD
> after a shipwreck because they always meant to learn to backstroke.
> So maybe we just disagree on what the worst case harm to Solr looks
> like. I see the harm being pretty serious: if Solr stagnates its
> Lucene version relative to other offerings users could go elsewhere
> and the project would lose out on adoption and community. A Very Bad
> Thing. But if you don't see this as even a remote possibility, well
> then "sink or swim" makes sense.
>
> > I'd be OK with a stable, robust Solr that got 1-2 major versions behind
> > Lucene, but was rock-solid with a lower barrier to entry...
>
> If that's an option, I might be too. But I'm not sure how a
> Lucene-Solr split (or an older Lucene version) does anything to make
> Solr more solid, lower its barrier to entry, etc. Anecdotally, Solr
> bugs rooted in Lucene seem the minority by far. And Solr committers
> can put effort into stability/barrier-to-entry as easily now as they
> can in a post-split world. Is there some connection between the split
> and those -ilities that I'm missing?
>
> > I choose to be more optimistic wrt «Solr committers» ability to integrate
> > new and changed Lucene APIs in Solr
>
> I agree that Solr committers _can_ do this work, and that there are
> some awesome committers who straddle the fence and know Lucene very
> well. I wasn't trying to impugn anyone's efforts, interest or
> expertise. My point was just that at the end of the day a split
> leaves fewer people around Solr with knowledge of the Lucene APIs and
> their perf implications. And a split is going to burden those
> remaining people heavily until the roster of Lucene-literate Solr
> committers re-populates.
>
> On Wed, May 13, 2020 at 10:29 AM Jan Høydahl <jan....@cominvent.com> wrote:
> >
> > I choose to be more optimistic wrt «Solr committers» ability to integrate
> > new and changed Lucene APIs in Solr. You do not need to be a Lucene
> > committer in order to learn how to USE the Lucene APIs, and I believe there
> > are several «Solr committers» who already possess those skills and are
> > pretty deep in Lucene already. Hopefully they are interested in doing
> > Lucene upgrades for Solr, even if that sometimes includes implementing
> > support for a new fieldType (points vs trie), getting rid of
> > index-time-boost features etc. I may even attempt some of those tasks
> > myself for the areas of Lucene API I am comfortable with.
> >
> > Jan
> >
> > On 13 May 2020 at 16:24, Doug Turnbull
> > <dturnb...@opensourceconnections.com> wrote:
> >
> > Jason, I hear your arguments and think of them FOR a split
> >
> > This might sound a bit harsh, but maybe Lucene devs helping with Solr has
> > let Solr off the hook a bit too much? I actually like the fact that the
> > split causes Solr to figure out its own situation and focus on its
> > problems.
> >
> > Regardless of the split or not, Solr is going to sink or swim based on the
> > efforts of Solr committers, not Lucene committers. I don't think Lucene
> > committers are going to be the ones to really address the systemic issues
> > with Solr. If anything, I imagine they are "let me fix this so the code
> > compiles" level of maintenance.
> >
> > "Falling behind Lucene" is counterbalanced to me with "Should Solr be on
> > cutting-edge Lucene?"
> >
> > I'd be OK with a stable, robust Solr that got 1-2 major versions behind
> > Lucene, but was rock-solid with a lower barrier to entry...
> >
> > On Wed, May 13, 2020 at 10:07 AM Jason Gerlowski <gerlowsk...@gmail.com> wrote:
> >>
> >> Wanted to add my two cents to the mix, though I'm a little late as the
> >> vote has already progressed pretty far.
> >>
> >> I'm against a split. From the points raised, I agree that Lucene has
> >> much to gain. But Solr has a lot to lose.
> >>
> >> Lucene devs would be freed from keeping Solr usage up to date. That's
> >> a great improvement for Lucene itself. But that burden doesn't
> >> disappear - it's just being moved to a different (smaller) group of
> >> committers - who by definition don't know Lucene as well, and are less
> >> suited to the task. (Lucene devs still might help post-split, but
> >> given that avoiding this burden is one of the arguments made above for
> >> a split, it seems unwise to assume how much this generosity will
> >> continue.)
> >>
> >> One likely result is that Solr will fall behind Lucene. Possibly
> >> permanently behind. Lucene folks are doing great work to improve
> >> perf, add features etc. so falling behind is a Very Bad Thing. To
> >> Solr, Lucene is not the same as Jetty or Jackson which Solr can fall
> >> behind on without significant detriment. Lucene and the core search
> >> functionality it offers is what brings people to Solr (or Elastic).
> >> Putting ourselves in a position to fall behind on Lucene does a huge
> >> disservice to our users, and loses Solr one of its greatest
> >> advantages.
> >>
> >> I hope that in the case of a split, the Solr community would rise to
> >> the occasion and prevent this. But my personal judgement is that it's
> >> unlikely. I hate to be negative, and I hope to be proven wrong, but
> >> that's how things look to me. We (Solr folks) have a bad track record
> >> of addressing things with less-tangible, less-sellable benefits. Take
> >> our ongoing test flakiness woes and SolrCloud instability issues as
> >> examples: both are serious threats to the project, both have been
> >> around for years, and both are here to stay for the foreseeable
> >> future.
> >>
> >> If conditions were different in a way that made "falling behind" less
> >> likely, I'd be all for a split. But given (1) our recent track record
> >> of addressing these sorts of issues, (2) our test flakiness which will
> >> make identifying "Lucene snapshot upgrade" bugs exceedingly difficult,
> >> and (3) the current economic conditions which may make it harder for
> >> committers to negotiate time from their employers to work on Lucene
> >> updates...now seems like a bad time to attempt a split. It will harm
> >> Solr more than it helps Lucene.
> >>
> >> On Tue, May 12, 2020 at 3:37 PM Namgyu Kim <kng0...@gmail.com> wrote:
> >> >
> >> > It's hard to make a decision because it seems to have pros and cons.
> >> > Basically, I agree to separate but there are some questions.
> >> > So I will not vote right now.
> >> >
> >> > 1) Release version
> >> > Currently, versions of Lucene and Solr are aligned; how will they be
> >> > managed in the future?
> >> > Other people took Elasticsearch as an example... But it was an
> >> > independent project from the beginning.
> >> > So there is no problem with the Lucene version. (Elasticsearch 7.7 and
> >> > Lucene 8.5.1)
> >> > I'm sure if we make Solr an independent project, it will cause cracks
> >> > in the version structure. (like Lucene 8.6.2 and Solr 8.9.1)
> >> > But it's also strange to suddenly start a new version of Solr. (Solr
> >> > 1.0)
> >> > Of course it's a matter of adaption, but it's likely to cause some
> >> > confusion for existing users.
> >> >
> >> > 2) Complementary relationship
> >> > When Lucene and Solr are built together, Solr can always maintain the
> >> > latest Lucene.
> >> > In my personal opinion, it's a great advantage of Solr.
> >> > Because Solr doesn't have to suffer from Lucene API changes and has
> >> > the latest library.
> >> > But it will be difficult if Solr becomes independent.
> >> > If Solr tracks the master branch of Lucene in a separate
> >> > repository (project), can it always check and reflect Lucene's API
> >> > changes?
> >> >
> >> > On Tue, May 12, 2020 at 10:12 PM Doug Turnbull
> >> > <dturnb...@opensourceconnections.com> wrote:
> >> >>
> >> >> I'll give a perspective that comes more from the user's / "market"
> >> >> point of view as at OSC we onboard lots of new organizations into Solr.
> >> >>
> >> >> - Most new users incorrectly think of Solr as an independent Apache
> >> >> project, and many will have little knowledge or awareness of Lucene
> >> >> itself until given the full history of Lucene, Solr, Elasticsearch...
> >> >> or they have to dive into the code/write a plugin
> >> >>
> >> >> - Most orgs / managers think in terms of "Solr" (as in "Solr" vs
> >> >> "Elasticsearch" vs "Vespa", etc). So the starting point for new devs /
> >> >> folks is from the Solr angle
> >> >>
> >> >> - Lucene, when discussed, is understood more colloquially as a Solr
> >> >> dependency
> >> >>
> >> >> - If someone brings down the code to do some kind of work or
> >> >> investigation, there's typically surprise that Lucene and Solr are
> >> >> bundled together.
> >> >>
> >> >> - There's further surprise as the projects are indeed so different:
> >> >> Lucene and Solr tests, for example, look little alike. They seem to have
> >> >> different coding styles / practices. One has more server-like and
> >> >> distributed system concerns; the other is clearly a low-level library
> >> >> for doing search work...
> >> >>
> >> >> I personally have a hard time explaining to new users the rationale for
> >> >> keeping these together, and it only increases the barrier to entry (to
> >> >> both projects) to have this added complexity of two very different code
> >> >> bases munged together.
> >> >>
> >> >> Just my 2 cents...
> >> >> -Doug
> >> >>
> >> >> On Tue, May 12, 2020 at 7:30 AM Alan Woodward <romseyg...@gmail.com>
> >> >> wrote:
> >> >>>
> >> >>> One advantage I find with the way Elasticsearch and Lucene interact is
> >> >>> that ES doesn’t depend on the master branch. We upgrade our master
> >> >>> branch frequently to keep up to date with the latest release branch,
> >> >>> and that lets us find regressions or API problems pretty quickly, but
> >> >>> it also insulates us from having to make big changes immediately. I
> >> >>> find this really useful for things like deprecations. Let’s say we
> >> >>> deprecate a particular API in the release branch, and remove it
> >> >>> entirely in master. Currently, that means Solr needs to immediately
> >> >>> switch over to the new API in its master branch. But the whole point
> >> >>> of doing deprecations first is that it gives users time to find issues
> >> >>> with the replacements - if we find that the replacement API doesn’t
> >> >>> quite fit in ES, we have time to work out either how to change our
> >> >>> code, or to improve the new API, but because the deprecated version is
> >> >>> still there we’re not blocked from upgrading and getting other
> >> >>> improvements. Solr, meanwhile, may end up with a hacky workaround
> >> >>> because that’s what got tests passing for the Lucene developer; or
> >> >>> worse, we end up just copying the deprecated API wholesale into Solr
> >> >>> and abandoning it there - witness TrieField or UninvertingReader.
> >> >>>
> >> >>> > On 11 May 2020, at 19:05, Atri Sharma <a...@apache.org> wrote:
> >> >>> >
> >> >>> > My two cents:
> >> >>> >
> >> >>> > As a Lucene-heavy developer, I have several times found maintaining Solr
> >> >>> > dependencies while making large changes a bit cumbersome. I believe
> >> >>> > Lucene and Solr should exist in a symbiotic relationship but not be
> >> >>> > tightly coupled with each other.
> >> >>> >
> >> >>> >
> >> >>> > On Mon, May 11, 2020 at 7:22 PM Erik Hatcher
> >> >>> > <erik.hatc...@gmail.com> wrote:
> >> >>> >>
> >> >>> >> Without reading much or replying to any specific points made on
> >> >>> >> this thread, here are my raw thoughts on this age-old topic....
> >> >>> >> (finally coming out of my cocoon after taking things in for a bit)
> >> >>> >>
> >> >>> >> Solr is a search -server- with distributed capabilities, that
> >> >>> >> leverages the magic of Lucene underneath. Solr depends on Lucene,
> >> >>> >> is a consumer of it. Lucene is a tight search -library- with
> >> >>> >> little to no external dependencies. Their purposes and end-users
> >> >>> >> are different.
> >> >>> >>
> >> >>> >> I was never really for the grand unification of Lucene and Solr
> >> >>> >> back in the day because:
> >> >>> >>
> >> >>> >> - Solr's developer experience would be greatly streamlined, faster,
> >> >>> >> cleaner, leaner, and focused
> >> >>> >> - Having Lucene change when Solr doesn't (yet) adapt to those
> >> >>> >> changes leads to confusion and inconsistency, loose wires hanging
> >> >>> >> out of the wall unconnected or duct taped together
> >> >>> >> - It simply makes sense to keep Lucene versioned and tightly
> >> >>> >> controlled for upgrades, various testing configurations varying
> >> >>> >> Lucene versions, within Solr
> >> >>> >> - Solr could have a very concerted upgrade effort for Lucene
> >> >>> >> capability jumps, with a focused upgrade effort at the
> >> >>> >> changed/improved/added touch points just like other dependencies
> >> >>> >> within Solr (like Tika and Jetty)
> >> >>> >>
> >> >>> >> Those points all kinda say the same thing.... Solr depends on
> >> >>> >> "lucene.jar" and I'm in the camp that thinks Solr and Lucene
> >> >>> >> development, communities, and end-users/consumers would all greatly
> >> >>> >> benefit from a fancy new TLP and focused community for
> >> >>> >> solr.apache.org and a tight(er) relationship with the Lucene
> >> >>> >> community as an involved and vested consumer.
> >> >>> >>
> >> >>> >> Erik
> >> >>> >>
> >> >>> >
> >> >>> > --
> >> >>> > Regards,
> >> >>> >
> >> >>> > Atri
> >> >>> > Apache Concerted
> >> >>> >
> >> >>> > ---------------------------------------------------------------------
> >> >>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> >> >>> > For additional commands, e-mail: dev-h...@lucene.apache.org
> >> >>
> >> >> --
> >> >> Doug Turnbull | CTO | OpenSource Connections, LLC | 240.476.9983
> >> >> Author: Relevant Search; Contributor: AI Powered Search
> >> >> This e-mail and all contents, including attachments, is considered to
> >> >> be Company Confidential unless explicitly stated otherwise, regardless
> >> >> of whether attachments are marked as such.
--
Regards,

Atri
Apache Concerted