I would also add a tangential question (rather than answers at this point):
What makes a module (contrib) a module (contrib)?
*From now on I'll use 'module' where I mean a package under contrib.*
I am referring to first-party modules such as ltr or langid.
My initial understanding was that a module in contrib is an integration with some external dependency (like langid with OpenNLP, Tika or langdetect).
But then, why is *ltr* a module? It doesn't really integrate with any external dependency. It's additional query parsers and components for a key Solr functionality.
Is it just a legacy consequence of the fact that Bloomberg initially contributed the module?
Maybe this applies to other modules as well (analytics?).
Then, should this be fixed and brought inside the Solr core?

And what about first-party/third-party modules?
I don't think there's any visible difference right now, but in case we want to make a difference, should we create a sort of official "Solr Plugin Marketplace"?
(I proposed the idea to Lucidworks many years ago when I was working for a partner, and for a certain amount of time, I think there was a Solr Plugin Marketplace, but it was proprietary.)

I am curious to understand what you think about this, so we can then reason about the naming convention.

Cheers
--------------------------
Alessandro Benedetti
Apache Lucene/Solr PMC member and Committer
Director, R&D Software Engineer, Search Consultant

www.sease.io

On Fri, 21 Jan 2022 at 10:47, Jan Høydahl <jan....@cominvent.com> wrote:

> Hi,
>
> In
> https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming
> I suggested standardizing contrib/module names. We did not discuss it in
> yesterday's committer meeting, and it may be a bit too much for 9.0. But
> I'd like to discuss it, since we are anyway renaming everything in
> SOLR-15917 "contrib->module".
> With as few contribs as we had so far it has not really been an issue. But
> the reason I suggested it is because I anticipate a huge growth in number
> of modules/packages during 9.x, and it can get messy.
> Another reason for
> having a convention is that it forces the module/package creator to think
> through whether the proposed module has the right granularity. Take for
> instance the new "HDFS" or "Hadoop" module. It won't fit into either of my
> proposed types, as it contains both a directoryFactory, one or two
> authentication plugins and one backup repository. That of course suggests
> that the module is too big and should be divided. Another reason is that
> when we have 50 modules / packages it would be far better for users to be
> able to find all backup repositories by looking for backup-* rather than
> guess from naming what it is. Perhaps a bad example since both repo
> contribs have a suffix "-repository" today. But then "-repository" is not
> as user friendly as "backup-".
>
> So I guess I'd like your opinion on
>
> 1) Do we even want a convention (at least for our own code?)
> 2) If yes, should we rename the contribs/modules for 9.0 when we throw
> them around anyway?
> 3) When we start adding package manifests to the modules, should there be
> a 1:1 between module name and package name?
>
> Regarding the last point, we could apply such standardized naming
> convention for the packages only and leave module names as-is, i.e. you'd
> do "solr package install update-extraction" even if the module name is
> "extraction".
>
> Jan
>