Konstantin Ryabitsev <konstan...@linuxfoundation.org> wrote:
> Hello:
> 
> Is there any specific logic for mixing --batch-size and --jobs? On a system
> with plenty of CPUs and lots of RAM, does it make sense to have more --jobs,
> larger --batch-size, or some balance of both?

--jobs will be bound by I/O capability in your case.  SATA-2 vs
SATA-3 vs NVMe makes a notable difference, as does the
quality of the device (MLC, TLC, QLC; cache/controller).

Xapian seems to do better with bigger batch sizes, up to a point.
I'm not sure I have enough RAM to accurately test >8m batch
sizes (since we also need to account for kernel caching).

batch-size * (jobs - 1) = rough total batch size
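
As a rough worked example of the formula above (a sketch only: the
helper names are made up for illustration, and the k/m/g suffix
handling is an assumption about how sizes are written, not code taken
from public-inbox):

```python
# Hypothetical helpers illustrating: batch-size * (jobs - 1)
# Assumption: sizes use binary k/m/g suffixes (e.g. "8m" = 8 * 1024**2 bytes).
SUFFIXES = {"k": 1024, "m": 1024**2, "g": 1024**3}


def parse_size(text):
    """Convert a size string such as '8m' or '4096' to a byte count."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


def rough_total_batch(batch_size, jobs):
    """Rough total batch size in bytes: batch-size * (jobs - 1)."""
    return parse_size(batch_size) * (jobs - 1)


if __name__ == "__main__":
    total = rough_total_batch("8m", 4)
    print(f"{total} bytes (~{total // 1024**2} MiB)")
```

With --batch-size=8m and --jobs=4 this works out to 3 * 8m = 24m, which
needs to fit in RAM alongside whatever the kernel wants for caching.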

If it's the initial index creation, I would definitely use
--no-fsync, too.  Perhaps that should be the default for new
indices.

Also note: the recent RFC for --sequential-commit doesn't seem to
be working out performance-wise on my SATA-2 system; but I'm also
not sure about SSD life/degradation.
--
unsubscribe: one-click, see List-Unsubscribe header
archive: https://public-inbox.org/meta/