Fair points.  I might take a stab at this on the weekend to see.

I propose no change to the SOLR_HOME detection logic, which will naturally
end up being SOLR_INSTALL/server/solr (where solr.xml is).  Docker stuff
won't need to set it / play games as it does now.
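
A minimal shell sketch of that detection, for illustration only. SOLR_VAR is the proposed variable and does not exist yet; the function name resolve_solr_dirs and the fallback of SOLR_VAR to SOLR_HOME are assumptions, not what bin/solr does today.

```shell
#!/bin/sh
# Sketch only. resolve_solr_dirs INSTALL_DIR prints the two resolved paths.
resolve_solr_dirs() {
  install_dir="$1"
  # SOLR_HOME: keep an explicit setting, else the directory that holds solr.xml.
  home="${SOLR_HOME:-$install_dir/server/solr}"
  # SOLR_VAR: hypothetical variable for mutable data (indexes etc.).
  # Assumption: fall back to SOLR_HOME so existing layouts keep working.
  var="${SOLR_VAR:-$home}"
  printf 'SOLR_HOME=%s\nSOLR_VAR=%s\n' "$home" "$var"
}
```

With this sketch, a Docker image would only set SOLR_VAR (e.g. to /var/solr/data) and leave SOLR_HOME alone.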

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jan 14, 2022 at 9:08 AM Jan Høydahl <[email protected]> wrote:

> Hmm, yea it's always been a bit odd how SOLR_HOME does not point to where
> you untarred solr, i.e. /opt/solr, like for every other software out there.
> So I support such a change.
> Will SOLR_VAR be exactly what the old SOLR_HOME was, i.e. /var/solr/data,
> or will it point to /var/solr? It's also a bit odd how we don't (I think)
> have a var pointing to /var/solr as laid out by the install script and in
> Dockerfile.
>
> Such a change will have to happen either in 9.0 or 10.0. Sounds a tad too
> large for 9.0, since it's not even started. But a JIRA is a good start.
> Perhaps it is easier than we imagine, and suddenly someone has put up a
> PR? :)
>
> I did not quite get where you wanted the "new" SOLR_HOME to point to. I
> think if we should change anything, it should point to the root of the Solr
> installation?
>
> Jan
>
> On 14 Jan 2022, at 14:47, David Smiley <[email protected]> wrote:
>
> I believe the root cause here is fixed by my "Immutable Infrastructure"
> adherence proposal relating to a new SOLR_VAR:
> https://lists.apache.org/thread/3vvld3xnndtthtl7sfgdbsgkbtpm55b0
> Thus SOLR_HOME stays with the solr installation; mutable data like the
> indexes go in a new SOLR_VAR -- ultimately the same path to the data that
> exists today.  But since SOLR_HOME stays with Solr, so does the lib and
> thus it's easy to mount in some other path or whatever.
>
> I didn't create a JIRA issue... I've been extremely busy.  But before I
> do, WDYT about this?
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Fri, Jan 14, 2022 at 4:20 AM Jan Høydahl <[email protected]> wrote:
>
>> Yep, I have also been using SOLR_HOME/lib for years. But a recent client
>> needed to package 2-3 plugin jars into the docker image, so we tried
>> $SOLR_HOME/lib. Since /var/solr/data is defined as a Docker volume in our
>> Dockerfile, copying libs to that location in a custom Dockerfile does not
>> help: at runtime the volume is used instead, and it may hold old jars.
>> So we added the libs to some /opt/foo/lib folder and made an init-script
>> in "/docker-entrypoint-initdb.d/" that on container startup would do a
>> "rm /var/solr/data/lib/*.jar && cp /opt/foo/lib/*.jar /var/solr/data/lib/",
>> i.e. clean up existing jars from the docker-host's existing volume and copy
>> in the fresh plugin jars from the newest image. Phew. And the same with
>> solr.xml initialization...
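
The init-script described above could be sketched roughly as follows. The function name refresh_plugin_jars and its two arguments are hypothetical; the paths in the usage comment are the ones mentioned in the message.

```shell
#!/bin/sh
# Sketch only: replace the jars on the volume with the jars baked into the image.
refresh_plugin_jars() {
  src="$1"   # jars shipped in the image, e.g. /opt/foo/lib
  dest="$2"  # lib dir on the volume, e.g. /var/solr/data/lib
  mkdir -p "$dest"
  # Remove stale jars left on the volume; -f stays quiet when none exist.
  rm -f "$dest"/*.jar
  # Copy in the jars from the current image.
  cp "$src"/*.jar "$dest"/
}

# Usage inside an init-script:
#   refresh_plugin_jars /opt/foo/lib /var/solr/data/lib
```

Using rm -f rather than plain rm means a first start with an empty volume does not fail before the copy.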
>>
>> Of course we could have used
>> export SOLR_OPTS="$SOLR_OPTS -Dsolr.sharedLib=/opt/foo/lib" or something,
>> but it is still not super easy. So that's what the new standard location
>> tries to solve - you load code from a stable path, not together with your
>> data.
>>
>> Jan
>>
>> On 13 Jan 2022, at 19:04, David Smiley <[email protected]> wrote:
>>
>> +1 to your phasing.
>>
>>
>>> Another minor improvement for users is if we pre-add $SOLR_TIP/lib to
>>> the classloader
>>
>>> I'll create a JIRA :)
>>
>>
>> SOLR_HOME/lib is already supported --
>> https://nightlies.apache.org/solr/draft-guides/solr-reference-guide-main/libs.html
>> This is what I recommend people use in general.
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Thu, Jan 13, 2022 at 10:59 AM Houston Putman <[email protected]>
>> wrote:
>>
>>>> It could very well be worth shipping two docker images in the meantime.
>>>> Or maybe a zip of each module could be a separate artifact that is
>>>> published?  I'm not sure what freedoms we have to do this in the ASF.
>>>>
>>>
>>> I think for 9.0 we could realistically shoot for 2 binary releases and 2
>>> docker images, slim (without the modules) and full-featured (with the
>>> modules), having the full-featured be the default.
>>>
>>> Starting in the 9.x line, we could start packaging the modules as
>>> separate binary artifacts for the solr release. Then in 10.x we can make
>>> the slim release be the default (still having the fat tgz available as
>>> well, as solr-extended-10.0.0.tgz or something like that).
>>>
>>>
>>>> Phase 1 (9.0): Modularize Solr by extracting obvious low-hanging-fruit
>>>> plugins into contribs/modules. Make it super easy to launch Solr with
>>>> any of these on the class-path (SOLR-15914
>>>> <https://issues.apache.org/jira/browse/SOLR-15914>).
>>>> Phase 2 (9.x): Evolve the package manager and make it possible to
>>>> optionally install the modules as 1st party packages instead (still fat
>>>> distro).
>>>> Phase 3 (10.0?): Extract even more features as modules, and publish all
>>>> modules as separate delivery artifacts on DLCDN.
>>>>
>>>
>>> I really like this plan. I agree that for 9.x we really don't have an
>>> option but to keep publishing the fat tgz as the default. Even in 10.x I
>>> think we want to offer both a full-featured download and a slim download,
>>> but with first-party packages we can make slim the "default".
>>>
>>>> Another minor improvement for users is if we pre-add $SOLR_TIP/lib to
>>>> the classloader
>>>>
>>>> I'll create a JIRA :)
>>>
>>>
>>> Yes please. That would be a lovely improvement! People currently
>>> bend over backward to add custom libs.
>>>
>>> - Houston
>>>
>>> On Thu, Jan 13, 2022 at 8:09 AM Jan Høydahl <[email protected]>
>>> wrote:
>>>
>>>> Another minor improvement for users is if we pre-add $SOLR_TIP/lib to
>>>> the classloader, similar to what we have with $SOLR_HOME/lib today. The
>>>> disadvantage of $SOLR_HOME/lib is that it can be anywhere, perhaps on a
>>>> Docker volume or a different disk, so you cannot e.g. make a Dockerfile like
>>>>
>>>> FROM solr:9.0
>>>> ADD foo.jar /var/solr/data/lib/foo.jar
>>>>
>>>> ...since /var/solr/data is a volume and will resolve to the volume
>>>> partition of the user, not the content from the image. So if we instead
>>>> allow users to do
>>>>
>>>> FROM solr:9.0
>>>> ADD foo.jar /opt/solr/lib/
>>>>
>>>> That is both logical and beautiful, and would always work.
>>>>
>>>> I'll create a JIRA :)
>>>>
>>>> Jan
>>>>
>>>> On 13 Jan 2022, at 13:57, Jan Høydahl <[email protected]> wrote:
>>>>
>>>> There is not a lack of vision for future local and remote package
>>>> repositories, but the story is that package mgmt development has stalled,
>>>> and is out of reach for 1st party pkgs in the 9.0.0 timeframe.
>>>> So we have to think progress over perfection - once again
>>>>
>>>> Phase 1 (9.0): Modularize Solr by extracting obvious low-hanging-fruit
>>>> plugins into contribs/modules. Make it super easy to launch Solr with
>>>> any of these on the class-path (SOLR-15914
>>>> <https://issues.apache.org/jira/browse/SOLR-15914>).
>>>> Phase 2 (9.x): Evolve the package manager and make it possible to
>>>> optionally install the modules as 1st party packages instead (still fat
>>>> distro).
>>>> Phase 3 (10.0?): Extract even more features as modules, and publish all
>>>> modules as separate delivery artifacts on DLCDN.
>>>>
>>>> Regarding phase 2 in 9.x: we cannot really extract a feature into a
>>>> module in e.g. 9.1, because users upgrading from 9.0 would then get a
>>>> ClassNotFoundException. That breaks back-compat. But perhaps we could
>>>> continue modularization efforts in 9.x if we make sure that all new modules
>>>> extracted in a minor release are automatically added to the classloader?
>>>> The classes would still disappear from solr-core.jar, which could break
>>>> someone's custom embedded use case, but 99% of users would be unaffected.
>>>> WDYT?
>>>>
>>>> In any case, I think for 9.x the realistic route is to keep our fat
>>>> tgz, but make it slimmer by removing redundancy and pruning the
>>>> number of overlapping dependencies. That can get us a long way.
>>>>
>>>> Jan
>>>>
>>>> On 13 Jan 2022, at 03:15, David Smiley <[email protected]> wrote:
>>>>
>>>> Shawn:
>>>> * RE redundancies of stuff in /dist/, see
>>>> https://issues.apache.org/jira/browse/SOLR-15916
>>>> * RE "contrib" vs "module" vs "package", see:
>>>> https://issues.apache.org/jira/browse/SOLR-15917
>>>> * RE not shipping these extras with the Solr distribution, see: "slim
>>>> distro" mention in the document "Solr first party packages"
>>>> https://docs.google.com/document/d/1n7gB2JAdZhlJKFrCd4Txcw4HDkdk7hlULyAZBS-wXrE/edit?usp=sharing
>>>>
>>>> It could very well be worth shipping two docker images in the meantime.
>>>> Or maybe a zip of each module could be a separate artifact that is
>>>> published?  I'm not sure what freedoms we have to do this in the ASF.
>>>>
>>>> ~ David Smiley
>>>> Apache Lucene/Solr Search Developer
>>>> http://www.linkedin.com/in/davidwsmiley
>>>>
>>>>
>>>> On Wed, Jan 12, 2022 at 8:21 PM Shawn Heisey <[email protected]>
>>>> wrote:
>>>>
>>>>> On 1/12/2022 8:31 AM, Jan Høydahl wrote:
>>>>> > I think there are lots of pieces of code in solr-core that can
>>>>> > easily be extracted the same way.
>>>>> > Some perhaps even for 9.0.0, as it slims down the core and reduces
>>>>> > attack surface for most users as well.
>>>>>
>>>>> I think it would be really awesome if we had a core download that only
>>>>> included basic functionality, and all the other fancy things that Solr
>>>>> does now out of the box (as well as those that are contrib) could be
>>>>> added after download via package scripting or just additional
>>>>> downloads.
>>>>>
>>>>> The size of solr-8.11.1.tgz is 207MiB, or 218076598 bytes.  The .zip
>>>>> version is slightly larger.  8.0.0 was 163MiB, 7.0.0 was 142MiB, 6.0.0
>>>>> was 131MiB, and 1.4.1 was 53.7MiB.  I think it's insane that the
>>>>> download is so big ... and a lot of what makes it big are things that
>>>>> the vast majority of our users will never use.
>>>>>
>>>>> Large reductions in the overall size of the main download would be
>>>>> possible by putting hadoop, calcite, some of the really large lucene
>>>>> analysis components, and the contrib stuff into packages.  The
>>>>> extraction contrib alone is 43.5MiB compressed in zip format.
>>>>>
>>>>> I would suggest moving zookeeper and its dependencies as well, but I
>>>>> think we probably want SolrCloud to be part of base functionality.
>>>>>
>>>>> Some of the large jars are included for what are probably insignificant
>>>>> usages, and I wonder if that functionality could be replaced by newer
>>>>> native functions available in Java 8 and later.  I am eyeballing things
>>>>> like guava and the commons-* jars here, but I am sure there are other
>>>>> things in this category.  I'd like to eliminate as many dependencies as
>>>>> we can.
>>>>>
>>>>> Extracting some things from the solr-core jar into other jars sounds
>>>>> like a really awesome idea.
>>>>>
>>>>> I don't think the solr-core jar should be in the dist directory.  It's
>>>>> useless by itself, because it will still have a LOT of dependencies even
>>>>> if we shrink it.  And there are likely other things in the dist
>>>>> directory that fall into that category.  The test framework and its
>>>>> dependencies are a good candidate for removal.
>>>>>
>>>>> By removing some of the low-hanging fruit that I am SURE isn't needed
>>>>> for base binary functionality on the 8.11.1 download, I was able to end
>>>>> up with a .zip file sized at 60.4MiB, and I am sure at least a little
>>>>> bit of further reduction is possible if we can fully map out
>>>>> dependencies.  I think we can leverage gradle to provide some dependency
>>>>> info.
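
As a rough worked example of the figures above, this shell sketch computes the percentage reduction between two sizes. The helper percent_smaller is hypothetical, and the comparison is approximate because 207MiB is the .tgz size while 60.4MiB is a .zip.

```shell
#!/bin/sh
# Sketch only: percent_smaller OLD NEW prints how much smaller NEW is than OLD,
# as a percentage with one decimal place.
percent_smaller() {
  awk -v old="$1" -v new="$2" \
    'BEGIN { printf "%.1f\n", (old - new) / old * 100 }'
}

# Usage with the sizes quoted in the message (in MiB):
#   percent_smaller 207 60.4
```

With those two numbers the result is 70.8, i.e. the trimmed download would be roughly 71% smaller.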
>>>>>
>>>>> Exactly how to organize the code repo to create divided artifacts is
>>>>> something that we would need to think about.  My initial idea is
>>>>> changing "contrib" to "package" and then making some new directories
>>>>> under package.
>>>>>
>>>>> Thanks,
>>>>> Shawn
>>>>>
