Konstantin Ryabitsev <konstan...@linuxfoundation.org> wrote:
> On Thu, Jul 29, 2021 at 10:06:29PM +0000, Eric Wong wrote:
> > My gut says 1g batch-size seems too high (Xapian has extra
> > overhead) and could still eat too much into the kernel cache
> > (and slow down reads). 100m might be a more reasonable limit
> > for jobs=4 and 128G RAM.
> 
> Okay, I have things up and running on one of the 4 edge nodes. You can access
> it and kick the tires at https://x-lore.kernel.org/. Initial observations:
> 
> - I can't give any kind of reliable numbers for initial importing/indexing, as
>   I was doing it piecemeal for a while to make sure that the indexer hooks
>   were doing the right thing. Besides, this is a live system serving a lot of
>   (static) content from the same partition where the indexing was done, so I/O
>   was routinely under high and unpredictable load. Final import/index took 40+
>   hours, but I'll have more reliable numbers once I do it on 3 other systems.

40 hours seems reasonable.

> - Performance in /all/ seems laggy at times, probably depending on whether
>   lvmcache has Xapian DBs in SSD cache or not. After a period of laggy
>   performance, speed seems to dramatically improve, which is probably when
>   most of the backend is in cache.

Yeah, SSDs still make a huge difference.  I moved most of my
worktrees and personal mail back to HDDs and it's been an
eye-opening and painful experience on a cold cache.

Try as I might, physics can't be beat :<  (And most of the
stuff that makes us faster on HDDs makes us faster on SSD, too)

As you get more inboxes, git 2.33.0-rc0+ dramatically reduces
memory use with many alternates (and also makes startup time
tolerable).
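
A quick way to check on each node (just a sketch; `ALL_GIT` is a
hypothetical path for the extindex repo, adjust for your layout):

```shell
# Hypothetical path to the extindex repo whose objects/info/alternates
# file lists every inbox's object store; adjust for the real layout.
ALL_GIT=${ALL_GIT:-/path/to/all.git}

git --version    # want 2.33.0-rc0 or newer for the alternates improvements

# Count alternates entries if the file exists (skip quietly if not):
if [ -f "$ALL_GIT/objects/info/alternates" ]; then
	wc -l < "$ALL_GIT/objects/info/alternates"
fi
```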

Reducing loose objects with more frequent packing will probably
be good, too.
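
Something like this could run from cron on each node (a sketch; the
inbox path and the `git/*.git` epoch glob are assumptions for a v2
inbox layout, adjust as needed):

```shell
# Hypothetical inbox path; v2 inboxes keep epochs under git/*.git
INBOX=${INBOX:-/path/to/inbox}

for d in "$INBOX"/git/*.git; do
	[ -d "$d" ] || continue		# glob didn't match, nothing to do
	git -C "$d" count-objects -v	# show loose-object count first
	git -C "$d" repack -d -q	# pack loose objects, drop redundant packs
done
```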

> - I will bring up the rest of the nodes throughout the week, so
>   x-lore.kernel.org will become more geoip-balanced. I will share any other
>   observations once I have more data. Once all 4 nodes are up, I will share
>   this more widely with kernel devs so they can kick some tires and report
>   whether they are seeing decreased performance compared to current
>   lore.kernel.org. It's entirely possible that my plan to use
>   mirrors.edge.kernel.org nodes for this isn't one of my brightest ideas, in
>   which case I may bring up several dedicated instances in multiple clouds
>   instead.

Increasing the size of the SSD caches would net the most
dramatic improvement (or going SSD-only).  Even consumer grade
SSDs (MLC/TLC) leave enterprise HDDs in the dust.

Once the initial index is done, the workload is not especially
write-intensive, either, if SSD wear is a concern (I always
mount everything with noatime).
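
For reference, that's just a mount option (an illustrative /etc/fstab
line; the device and mountpoint are placeholders):

```
# placeholders for device and mountpoint; only noatime is the point here
/dev/mapper/vg0-lore  /srv/lore  ext4  defaults,noatime  0  2
```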

> Thanks for all your work, Eric.

You're welcome and thanks for the support.  It's been a very
rough time, especially with the pandemic still dragging on :<
--
unsubscribe: one-click, see List-Unsubscribe header
archive: https://public-inbox.org/meta/
