+1 to the developer-doc, very neat. Thanks for adding it.

On Sun, Jan 16, 2022 at 10:52 PM Eric Pugh <[email protected]> wrote:
> I found this a great summary!
>
> On Jan 16, 2022, at 9:48 AM, Jan Høydahl <[email protected]> wrote:
>
> I added a developer-doc draft for modules and packages in
> https://github.com/apache/solr/pull/531 (HTML preview
> <https://github.com/apache/solr/blob/7eeaba318a79ed62678ab3ac5f1d403733d88e5f/dev-docs/plugins-modules-packages.adoc>).
> Let me know if it is useful.
>
> Jan
>
> On 14 Jan 2022, at 18:13, David Smiley <[email protected]> wrote:
>
> Fair points. I might take a stab at this on the weekend to see.
>
> I propose no change to the SOLR_HOME detection logic, which will naturally
> end up being SOLR_INSTALL/server/solr (where solr.xml is). Docker stuff
> won't need to set it / play games as it does now.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
> On Fri, Jan 14, 2022 at 9:08 AM Jan Høydahl <[email protected]> wrote:
>
>> Hmm, yea it's always been a bit odd how SOLR_HOME does not point to where
>> you untarred solr, i.e. /opt/solr, like for every other software out there.
>> So I support such a change.
>> Will SOLR_VAR be exactly what the old SOLR_HOME was, i.e. /var/solr/data,
>> or will it point to /var/solr? It's also a bit odd how we don't (I think)
>> have a var pointing to /var/solr as laid out by the install script and in
>> Dockerfile.
>>
>> Such a change will have to happen either in 9.0 or 10.0. Sounds a tad too
>> large for 9.0, since it's not even started. But a JIRA is a good start.
>> Perhaps it is easier than we imagine, and suddenly someone has put up a
>> PR? :)
>>
>> I did not quite get where you wanted the "new" SOLR_HOME to point to. I
>> think if we should change anything, it should point to the root of the Solr
>> installation?
>>
>> Jan
>>
>> On 14 Jan 2022, at 14:47, David Smiley <[email protected]> wrote:
>>
>> I believe the root cause here is fixed by my "Immutable Infrastructure"
>> adherence proposal relating to a new SOLR_VAR:
>> https://lists.apache.org/thread/3vvld3xnndtthtl7sfgdbsgkbtpm55b0
>> Thus SOLR_HOME stays with the solr installation; mutable data like the
>> indexes go in a new SOLR_VAR -- ultimately the same path to the data that
>> exists today. But since SOLR_HOME stays with Solr, so does the lib and
>> thus it's easy to mount in some other path or whatever.
>>
>> I didn't create a JIRA issue... I've been extremely busy. But before I
>> do, WDYT about this?
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>> On Fri, Jan 14, 2022 at 4:20 AM Jan Høydahl <[email protected]>
>> wrote:
>>
>>> Yep, have also been using SOLR_HOME/lib for years. But a recent client
>>> needed to package up 2-3 plugin jars into the docker image, so we tried
>>> $SOLR_HOME/lib. But since /var/solr/data is defined as a Docker volume in
>>> our Dockerfile, it doesn't help to copy libs to that location in a custom
>>> Dockerfile: at runtime the volume location is used instead, and whatever
>>> old jars live there would be loaded. So we added the
>>> libs to some /opt/foo/lib folder, and made an init-script in
>>> "/docker-entrypoint-initdb.d/" that on container startup would do a "rm
>>> /var/solr/data/lib/*.jar && cp /opt/foo/lib/*.jar /var/solr/data/lib/",
>>> i.e. clean up existing jars from the docker-host's existing volume and copy
>>> in the fresh plugin jars from the newest image. Phew. And the same with
>>> solr.xml initialization...
>>>
>>> Of course we could have used export SOLR_OPTS=$SOLR_OPTS
>>> -Dsolr.sharedLib=/opt/foo/lib or something, but it is still not super easy.
>>> So that's what the new standard location tries to solve - you load code
>>> from a stable path, not together with your data.
>>>
>>> Jan
>>>
>>> On 13 Jan 2022, at 19:04, David Smiley <[email protected]> wrote:
>>>
>>> +1 to your phasing.
>>>
>>>> Another minor improvement for users is if we pre-add $SOLR_TIP/lib to
>>>> the classloader
>>>>
>>>> I'll create a JIRA :)
>>>
>>> SOLR_HOME/lib is already supported --
>>> https://nightlies.apache.org/solr/draft-guides/solr-reference-guide-main/libs.html
>>> This is what I recommend people use in general.
>>>
>>> ~ David Smiley
>>> Apache Lucene/Solr Search Developer
>>> http://www.linkedin.com/in/davidwsmiley
>>>
>>> On Thu, Jan 13, 2022 at 10:59 AM Houston Putman <[email protected]>
>>> wrote:
>>>
>>>>> It could very well be worth shipping two docker images in the meantime.
>>>>> Or maybe a zip of each module could be a separate artifact that is
>>>>> published? I'm not sure what freedoms we have to do this in the ASF.
>>>>
>>>> I think for 9.0 we could realistically shoot for 2 binary releases and
>>>> 2 docker images, slim (without the modules) and full-featured (with the
>>>> modules), having the full-featured be the default.
>>>>
>>>> Starting in the 9.x line, we could start packaging the modules as
>>>> separate binary artifacts for the solr release. Then in 10.x we can make
>>>> the slim release be the default (still having the fat tgz available as
>>>> well, as solr-extended-10.0.0.tgz or something like that).
>>>>
>>>>> Phase 1 (9.0): Modularize Solr by extracting obvious low-hanging-fruit
>>>>> plugins into contribs/modules. Make it super easy to launch solr with
>>>>> any of these on class-path (SOLR-15914
>>>>> <https://issues.apache.org/jira/browse/SOLR-15914>).
>>>>> Phase 2 (9.x): Evolve package manager and make it possible to
>>>>> optionally install the modules as 1st party packages instead (still fat
>>>>> distro)
>>>>> Phase 3 (10.0?): Extract even more features as modules, and publish
>>>>> all modules as separate delivery artifacts on DLCDN
>>>>
>>>> I really like this plan.
>>>> I agree for 9.x we really don't have an option but to keep publishing
>>>> the fat tgz as the default. Even in 10.x I think we want to offer both a
>>>> full-featured download and a slim download, but with first-party packages
>>>> we can make slim the "default".
>>>>
>>>>> Another minor improvement for users is if we pre-add $SOLR_TIP/lib to
>>>>> the classloader
>>>>>
>>>>> I'll create a JIRA :)
>>>>
>>>> Yes please. That would be a lovely improvement! People currently
>>>> bend over backward to add custom libs.
>>>>
>>>> - Houston
>>>>
>>>> On Thu, Jan 13, 2022 at 8:09 AM Jan Høydahl <[email protected]>
>>>> wrote:
>>>>
>>>>> Another minor improvement for users is if we pre-add $SOLR_TIP/lib to
>>>>> the classloader, similar to what we have with $SOLR_HOME/lib today. The
>>>>> disadvantage of $SOLR_HOME/lib is that it can be anywhere, perhaps on a
>>>>> Docker volume or a different disk, so you cannot e.g. make a Dockerfile
>>>>> like
>>>>>
>>>>> FROM solr:9.0
>>>>> ADD foo.jar /var/solr/data/lib/foo.jar
>>>>>
>>>>> ...since /var/solr/data is a volume and will resolve to the volume
>>>>> partition of the user, not the content from the image. So if we instead
>>>>> allow users to do
>>>>>
>>>>> FROM solr:9.0
>>>>> ADD foo.jar /opt/solr/lib/
>>>>>
>>>>> That is both logical and beautiful, and would always work.
>>>>>
>>>>> I'll create a JIRA :)
>>>>>
>>>>> Jan
>>>>>
>>>>> On 13 Jan 2022, at 13:57, Jan Høydahl <[email protected]> wrote:
>>>>>
>>>>> There is not a lack of vision for future local and remote package
>>>>> repositories, but the story is that package mgmt development has stalled,
>>>>> and is out of reach for 1st party pkgs in the 9.0.0 timeframe.
>>>>> So we have to think progress over perfection - once again
>>>>>
>>>>> Phase 1 (9.0): Modularize Solr by extracting obvious low-hanging-fruit
>>>>> plugins into contribs/modules.
>>>>> Make it super easy to launch solr with
>>>>> any of these on class-path (SOLR-15914
>>>>> <https://issues.apache.org/jira/browse/SOLR-15914>).
>>>>> Phase 2 (9.x): Evolve package manager and make it possible to
>>>>> optionally install the modules as 1st party packages instead (still fat
>>>>> distro)
>>>>> Phase 3 (10.0?): Extract even more features as modules, and publish
>>>>> all modules as separate delivery artifacts on DLCDN
>>>>>
>>>>> Regarding phase 2 in 9.x: we cannot really extract a feature into a
>>>>> module in e.g. 9.1, because users upgrading from 9.0 would then get
>>>>> ClassNotFoundException. That breaks back-compat. But perhaps we could
>>>>> continue modularization efforts in 9.x if we make sure that all new
>>>>> modules extracted in a minor release are automatically added to the
>>>>> classloader? The classes would still disappear from solr-core.jar, which
>>>>> could break someone's custom embedded use case, but 99% of users would
>>>>> be unaffected. Wdyt?
>>>>>
>>>>> In any case, I think for 9.x the realistic route is to keep our fat
>>>>> tgz, but make it slimmer by removing redundancy and pruning down the
>>>>> number of overlapping dependencies. That can get us a long way.
>>>>>
>>>>> Jan
>>>>>
>>>>> On 13 Jan 2022, at 03:15, David Smiley <[email protected]> wrote:
>>>>>
>>>>> Shawn:
>>>>> * RE redundancies of stuff in /dist/, see
>>>>>   https://issues.apache.org/jira/browse/SOLR-15916
>>>>> * RE "contrib" vs "module" vs "package", see:
>>>>>   https://issues.apache.org/jira/browse/SOLR-15917
>>>>> * RE not shipping these extras with the Solr distribution, see: "slim
>>>>>   distro" mention in the document "Solr first party packages"
>>>>>   https://docs.google.com/document/d/1n7gB2JAdZhlJKFrCd4Txcw4HDkdk7hlULyAZBS-wXrE/edit?usp=sharing
>>>>>
>>>>> It could very well be worth shipping two docker images in the meantime.
>>>>> Or maybe a zip of each module could be a separate artifact that is
>>>>> published?
>>>>> I'm not sure what freedoms we have to do this in the ASF.
>>>>>
>>>>> ~ David Smiley
>>>>> Apache Lucene/Solr Search Developer
>>>>> http://www.linkedin.com/in/davidwsmiley
>>>>>
>>>>> On Wed, Jan 12, 2022 at 8:21 PM Shawn Heisey <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> On 1/12/2022 8:31 AM, Jan Høydahl wrote:
>>>>>> > I think there are lots of pieces of code in solr-core that can
>>>>>> > easily be extracted the same way.
>>>>>> > Some perhaps even for 9.0.0, as it slims down the core and reduces
>>>>>> > attack surface for most users as well.
>>>>>>
>>>>>> I think it would be really awesome if we had a core download that only
>>>>>> included basic functionality, and all the other fancy things that Solr
>>>>>> does now out of the box (as well as those that are contrib) could be
>>>>>> added after download via package scripting or just additional
>>>>>> downloads.
>>>>>>
>>>>>> The size of solr-8.11.1.tgz is 207MiB, or 218076598 bytes. The .zip
>>>>>> version is slightly larger. 8.0.0 was 163MiB, 7.0.0 was 142MiB, 6.0.0
>>>>>> was 131MiB, and 1.4.1 was 53.7MiB. I think it's insane that the
>>>>>> download is so big ... and a lot of what makes it big are things that
>>>>>> the vast majority of our users will never use.
>>>>>>
>>>>>> Large reductions in the overall size of the main download would be
>>>>>> possible by putting hadoop, calcite, some of the really large lucene
>>>>>> analysis components, and the contrib stuff into packages. The
>>>>>> extraction contrib alone is 43.5MiB compressed in zip format.
>>>>>>
>>>>>> I would suggest moving zookeeper and its dependencies as well, but I
>>>>>> think we probably want SolrCloud to be part of base functionality.
>>>>>>
>>>>>> Some of the large jars are included for what are probably insignificant
>>>>>> usages, and I wonder if that functionality could be replaced by newer
>>>>>> native functions available in Java 8 and later.
>>>>>> I am eyeballing things like guava and the commons-* jars here, but I am
>>>>>> sure there are other things in this category. I'd like to eliminate as
>>>>>> many dependencies as we can.
>>>>>>
>>>>>> Extracting some things from the solr-core jar into other jars sounds
>>>>>> like a really awesome idea.
>>>>>>
>>>>>> I don't think the solr-core jar should be in the dist directory. It's
>>>>>> useless by itself, because it will still have a LOT of dependencies even
>>>>>> if we shrink it. And there are likely other things in the dist
>>>>>> directory that fall into that category. The test framework and its
>>>>>> dependencies are a good candidate for removal.
>>>>>>
>>>>>> By removing some of the low-hanging fruit that I am SURE isn't needed
>>>>>> for base binary functionality on the 8.11.1 download, I was able to end
>>>>>> up with a .zip file sized in at 60.4MiB, and I am sure at least a little
>>>>>> bit of further reduction is possible if we can fully map out
>>>>>> dependencies. I think we can leverage gradle to provide some dependency
>>>>>> info.
>>>>>>
>>>>>> Exactly how to organize the code repo to create divided artifacts is
>>>>>> something that we would need to think about. My initial idea is
>>>>>> changing "contrib" to "package" and then making some new directories
>>>>>> under package.
>>>>>>
>>>>>> Thanks,
>>>>>> Shawn
