Shawn:
* RE redundancies of stuff in /dist/, see
https://issues.apache.org/jira/browse/SOLR-15916
* RE "contrib" vs "module" vs "package", see:
https://issues.apache.org/jira/browse/SOLR-15917
* RE not shipping these extras with the Solr distribution, see: "slim
distro" mention in the document "Solr first party packages"
https://docs.google.com/document/d/1n7gB2JAdZhlJKFrCd4Txcw4HDkdk7hlULyAZBS-wXrE/edit?usp=sharing

It could very well be worth shipping two docker images in the meantime.
Or maybe a zip of each module could be a separate artifact that is
published?  I'm not sure what freedoms we have to do this in the ASF.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Jan 12, 2022 at 8:21 PM Shawn Heisey wrote:

> On 1/12/2022 8:31 AM, Jan Høydahl wrote:
> > I think there are lots of pieces of code in solr-core that can easily be
> > extracted the same way.
> > Some perhaps even for 9.0.0, as it slims down the core and reduces
> > attack surface for most users as well.
>
> I think it would be really awesome if we had a core download that only
> included basic functionality, and all the other fancy things that Solr
> does now out of the box (as well as those that are contrib) could be
> added after download via package scripting or just additional downloads.
>
> The size of solr-8.11.1.tgz is 207MiB, or 218076598 bytes.  The .zip
> version is slightly larger.  8.0.0 was 163MiB, 7.0.0 was 142MiB, 6.0.0
> was 131MiB, and 1.4.1 was 53.7MiB.  I think it's insane that the
> download is so big ... and a lot of what makes it big are things that
> the vast majority of our users will never use.
>
> Large reductions in the overall size of the main download would be
> possible by putting hadoop, calcite, some of the really large lucene
> analysis components, and the contrib stuff into packages.  The
> extraction contrib alone is 43.5MiB compressed in zip format.
>
> I would suggest moving zookeeper and its dependencies as well, but I
> think we probably want SolrCloud to be part of base functionality.
>
> Some of the large jars are included for what are probably insignificant
> usages, and I wonder if that functionality could be replaced by newer
> native functions available in Java 8 and later.  I am eyeballing things
> like guava and the commons-* jars here, but I am sure there are other
> things in this category.  I'd like to eliminate as many dependencies as
> we can.
>
> Extracting some things from the solr-core jar into other jars sounds
> like a really awesome idea.
>
> I don't think the solr-core jar should be in the dist directory.  It's
> useless by itself, because it will still have a LOT of dependencies even
> if we shrink it.  And there are likely other things in the dist
> directory that fall into that category.  The test framework and its
> dependencies are a good candidate for removal.
>
> By removing some of the low-hanging fruit that I am SURE isn't needed
> for base binary functionality on the 8.11.1 download, I was able to end
> up with a .zip file sized in at 60.4MiB, and I am sure at least a little
> bit of further reduction is possible if we can fully map out
> dependencies.  I think we can leverage gradle to provide some dependency
> info.
>
> Exactly how to organize the code repo to create divided artifacts is
> something that we would need to think about.  My initial idea is
> changing "contrib" to "package" and then making some new directories
> under package.
>
> Thanks,
> Shawn
>
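
The size figures quoted above can be checked with a short script. What
follows is a minimal sketch, not something from the thread: it assumes the
release archive has already been extracted to a local directory, and the
function names (to_mib, size_by_top_level, largest_files, report) are made
up for illustration. It sums file sizes per top-level directory and lists
the largest files, which is one way to spot candidates for moving into
separate packages.

```python
import os
from collections import defaultdict
from pathlib import Path

MIB = 1024 * 1024  # one mebibyte, the unit used in the quoted message


def to_mib(num_bytes):
    """Convert a byte count to MiB as a float."""
    return num_bytes / MIB


def size_by_top_level(root):
    """Sum file sizes (in bytes) per top-level entry under root.

    Files that sit directly in root are counted under the key ".".
    Symlinks are skipped so that nothing is counted twice.
    """
    root = Path(root)
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = Path(dirpath).relative_to(root)
        top = rel.parts[0] if rel.parts else "."
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                continue
            totals[top] += path.stat().st_size
    return dict(totals)


def largest_files(root, count=10):
    """Return up to count (size_in_bytes, relative_path) pairs, largest first."""
    root = Path(root)
    files = [
        (p.stat().st_size, str(p.relative_to(root)))
        for p in root.rglob("*")
        if p.is_file() and not p.is_symlink()
    ]
    return sorted(files, reverse=True)[:count]


def report(root, count=10):
    """Build a plain-text report as a list of lines."""
    lines = ["Size per top-level directory:"]
    totals = size_by_top_level(root)
    for name, size in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        lines.append(f"  {to_mib(size):8.1f} MiB  {name}")
    lines.append(f"Largest {count} files:")
    for size, rel_path in largest_files(root, count):
        lines.append(f"  {to_mib(size):8.1f} MiB  {rel_path}")
    return lines
```

For example, print("\n".join(report("solr-8.11.1"))) would print the
breakdown for a directory named solr-8.11.1 (a hypothetical local path).
As a sanity check on the units, to_mib(218076598) comes out just under
208, consistent with the roughly 207MiB quoted for the .tgz.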
