Makes sense, Bram. I note that it's been over a month with no response. Just a suggestion: try commenting on the pertinent JIRA issue, because that will get the attention of the last committer (and other interested parties). BTW, we could cap the initial ArrayList capacity at, say, Math.min(1024, n).
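A minimal sketch of that cap, for illustration only. The class and helper names (CappedCapacity, initialCapacity, newVersionList) are hypothetical and not part of Solr, and 1024 is just the figure floated above, not a tuned value. The sketch covers only the ArrayList; it does not address the LongSet allocation mentioned in the quoted message.

```java
import java.util.ArrayList;
import java.util.List;

public class CappedCapacity {
    // Hypothetical upper bound on the pre-allocated capacity (assumption, not a Solr constant).
    static final int MAX_INITIAL_CAPACITY = 1024;

    // Pre-allocate at most MAX_INITIAL_CAPACITY slots, however large n is.
    static int initialCapacity(int n) {
        return Math.min(MAX_INITIAL_CAPACITY, n);
    }

    // The list still grows on demand once more than the capped number of entries are added.
    static List<Long> newVersionList(int n) {
        return new ArrayList<>(initialCapacity(n));
    }

    public static void main(String[] args) {
        System.out.println(initialCapacity(100));        // small n is used as-is
        System.out.println(initialCapacity(10_000_000)); // large n is capped

        List<Long> versions = newVersionList(10_000_000);
        for (long v = 0; v < 5_000; v++) {
            versions.add(v); // exceeds the cap; ArrayList resizes transparently
        }
        System.out.println(versions.size());
    }
}
```

With a cap like this, shards with small or empty logs pay for at most 1024 slots up front, while shards with large logs fall back to ArrayList's normal growth.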
~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley

On Thu, Jun 9, 2022 at 11:34 AM Bram Van Dam <bram.van...@intix.eu> wrote:

> Howdy,
>
> We've noticed that enabling larger transaction logs causes the memory
> requirements for Solr to increase: Solr consumes large amounts of memory
> at startup.
>
> After procuring a heap dump, this seems to be because Solr initializes
> ArrayLists in UpdateLog::getVersions with size
> maxNbTransactionLogEntries. While this may be more efficient if your
> actual log files are close to maximum size, this wastes memory when the
> actual logs are small. This is something that occurs frequently when you
> have a small number of shards which receive a lot of writes, and a lot
> of shards which receive few (or no) writes.
>
> We've seen cases where Solr needs an additional 10GiB of memory during
> startup. It gets freed afterwards, but it does make startup painful.
>
> The fix for SOLR-15676 further increased the memory footprint by
> allocating a LongSet of the same size.
>
>   public List<Long> getVersions(int n, long maxVersion) {
>     List<Long> ret = new ArrayList<>(n);
>     LongSet set = new LongSet(n);
>
> The naïve fix would be to simply replace this init of new ArrayList<>(n)
> with new ArrayList<>(). ArrayList grows its capacity by 50% every time
> it's full, resulting in some extra garbage overhead and extra calls to
> array copy.
>
> A quick bit of napkin math shows that for 10M entries, the array will
> have to be reallocated 35 times. In our case, this is worth the extra
> overhead. In the general case, it might not be?
>
> Does anyone have any further insights?
>
> Thanks,
>
>  - Bram
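
The napkin math in the quoted message can be checked with a short simulation. This is a sketch under two assumptions that are OpenJDK implementation details rather than guarantees of the ArrayList contract: the default capacity is 10, and each resize sets the new capacity to old + (old >> 1). The class and method names are hypothetical.

```java
public class GrowthCount {
    // Count how many resizes are needed before the capacity can hold `entries` elements,
    // assuming growth by 50% per resize (old + old / 2, rounded down).
    static int countGrowths(int startCapacity, int entries) {
        long capacity = startCapacity; // long avoids overflow for large inputs
        int growths = 0;
        while (capacity < entries) {
            capacity += capacity >> 1;
            growths++;
        }
        return growths;
    }

    public static void main(String[] args) {
        // Default-sized list (capacity 10) filled to 10M entries.
        System.out.println(countGrowths(10, 10_000_000));   // prints 35
        // List pre-sized to 1024 and filled to 10M entries.
        System.out.println(countGrowths(1024, 10_000_000)); // prints 23
    }
}
```

Under those assumptions the figure of 35 reallocations holds for a default-sized list, and starting at a capacity of 1024 brings it down to 23 for the same 10M entries.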