The migration you speak of is in Lucene, not Solr. It would be noticed by "Watchers".
~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Jul 19, 2022 at 12:29 PM Ishan Chattopadhyaya <ichattopadhy...@gmail.com> wrote:

> Great find! Let's have it committed.
>
> On Tue, Jul 19, 2022 at 9:49 PM Bram Van Dam <bram.van...@intix.eu> wrote:
>
> > Thanks for your reply, David. Given the apparent migration from
> > Jira->Github, I didn't think that would get more of a response than
> > the mailing list 😅
> >
> > We've been running a patched version of 7.7 with a smaller Versions
> > ArrayList for a while now, without any ill effects.
> >
> > - Bram
> >
> > On 13/07/2022 23.54, David Smiley wrote:
> > > Makes sense, Bram.
> > >
> > > I note that it's been over a month with no response. Just a suggestion --
> > > try commenting on the pertinent JIRA because it will get the attention of
> > > the last committer (and interested parties).
> > > BTW, we could cap the initial ArrayList size to, say, Math.min(1024, n).
> > >
> > > ~ David Smiley
> > > Apache Lucene/Solr Search Developer
> > > http://www.linkedin.com/in/davidwsmiley
> > >
> > > On Thu, Jun 9, 2022 at 11:34 AM Bram Van Dam <bram.van...@intix.eu> wrote:
> > >
> > >> Howdy,
> > >>
> > >> We've noticed that enabling larger transaction logs causes the memory
> > >> requirements for Solr to increase: Solr consumes large amounts of
> > >> memory at startup.
> > >>
> > >> After procuring a heap dump, this seems to be because Solr initializes
> > >> ArrayLists in UpdateLog::getVersions with size
> > >> maxNbTransactionLogEntries. While this may be more efficient if your
> > >> actual log files are close to their maximum size, it wastes memory
> > >> when the actual logs are small. This occurs frequently when you have
> > >> a small number of shards which receive a lot of writes, and a lot of
> > >> shards which receive few (or no) writes.
> > >>
> > >> We've seen cases where Solr needs an additional 10 GiB of memory
> > >> during startup. It gets freed afterwards, but it does make startup
> > >> painful.
> > >>
> > >> The fix for SOLR-15676 further increased the memory footprint by
> > >> allocating a LongSet of the same size:
> > >>
> > >>   public List<Long> getVersions(int n, long maxVersion) {
> > >>     List<Long> ret = new ArrayList<>(n);
> > >>     LongSet set = new LongSet(n);
> > >>
> > >> The naïve fix would be to simply replace this init of new ArrayList<>(n)
> > >> with new ArrayList<>(). ArrayList grows its capacity by 50% every time
> > >> it's full, resulting in some extra garbage overhead and extra calls
> > >> to array copy.
> > >>
> > >> A quick bit of napkin math shows that for 10M entries, the array will
> > >> have to be reallocated 35 times. In our case, this is worth the extra
> > >> overhead. In the general case, it might not be?
> > >>
> > >> Does anyone have any further insights?
> > >>
> > >> Thanks,
> > >>
> > >> - Bram
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> > >> For additional commands, e-mail: dev-h...@solr.apache.org
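[Editor's note: both David's Math.min(1024, n) cap and Bram's "35 reallocations for 10M entries" figure are easy to check. The sketch below is illustrative only, not Solr code; the class and method names (GrowthMath, growCount, newCappedList) are invented for this example, and it assumes ArrayList's documented behavior of a default capacity of 10 and growth by oldCapacity >> 1.]

```java
import java.util.ArrayList;
import java.util.List;

/** Napkin-math check for the thread above. Illustrative names, not Solr code. */
public class GrowthMath {

    /**
     * Number of times ArrayList's backing array must grow -- by
     * newCapacity = oldCapacity + (oldCapacity >> 1), i.e. 50% -- from the
     * default capacity of 10 until it can hold n elements.
     */
    static int growCount(int n) {
        int capacity = 10; // ArrayList's default initial capacity
        int grows = 0;
        while (capacity < n) {
            capacity += capacity >> 1; // grow by 50%
            grows++;
        }
        return grows;
    }

    /**
     * David's suggested compromise: pre-size small lists exactly, but never
     * pre-allocate more than a fixed cap up front for huge n.
     */
    static <T> List<T> newCappedList(int n) {
        return new ArrayList<>(Math.min(1024, n));
    }

    public static void main(String[] args) {
        // Bram's figure: with new ArrayList<>(), 10M adds trigger 35 grows.
        System.out.println("grows for 10M entries: " + growCount(10_000_000)); // prints 35
        // With the cap, the up-front allocation is at most 1024 slots.
        System.out.println("pre-sized slots for n=10M: " + Math.min(1024, 10_000_000)); // prints 1024
    }
}
```

This confirms the trade-off discussed above: the uncapped default-sized list costs 35 array copies for 10M entries, while the capped version bounds the startup allocation without penalizing small logs.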