On Thu, Nov 21, 2013 at 08:15:08AM +0100, Ulf Samuelsson wrote:
> 2013-11-21 01:19, Martin Jansa skrev:
> > On Wed, Nov 20, 2013 at 11:43:13PM +0100, Ulf Samuelsson wrote:
> >> 2013-11-20 22:29, Richard Purdie skrev:
> >> Another idea:
> >>
> >> I suspect that there is a lot of unpacking and patching of target
> >> recipes going on while the native stuff is built.
> >> Does it make sense to have multiple threads reading the disk for the
> >> target recipes during the native build, or will we just lose out due
> >> to seek time?
> >>
> >> Having multiple threads accessing the disk might force the disk to
> >> spend most of its time seeking.
> >> I found an application which measures seek time performance: my WD
> >> Black will do 83 seeks per second, and my SAS disk will do twice that.
> >> The RAID of two SAS disks provides close to SSD throughput (380 MB/s),
> >> but its seek time is no better than a single SAS disk's.
> >>
> >> Since there is "empty time" at the end of the native build, does it
> >> make sense to hold back the unpack/patch of target stuff until we
> >> reach that point, and then let loose?
> >
> > In my benchmarks, increasing PARALLEL_MAKE up to the number of cores
> > significantly improved build time, but BB_NUMBER_THREADS had minimal
> > influence above 6 or 8 (tested on various systems; even 4 was the
> > optimum on my older RAID-0, and 2 on a single disk).
> > Of course it was quite different for a clean build without sstate
> > prepopulated versus a build where most of the stuff was reused from
> > sstate.
> >
> > see http://wiki.webos-ports.org/wiki/OE_benchmark
>
> How many cores do you have in your build machine?
The one used in OE_benchmark has 8, my local builder also 8, and I got the
same results on machines with 32 and 48 cores.

My experience (which may differ from what you see) is that PARALLEL_MAKE
scales well with the number of cores, but BB_NUMBER_THREADS is more or less
limited by I/O performance, so even when the machine has 48 cores, that
doesn't mean you can run 48 do_populate or do_package tasks at the same time
without causing an avalanche of seeks. The other extreme is when all 48 BB
threads are in do_compile and you can get 48x48 gcc processes, which again
doesn't work well on a machine with 48 cores.

With

PARALLEL_MAKE = "-j32"
BB_NUMBER_THREADS = "6"

and a very big image build, I see all cores well used most of the time.

> I started a build, and after 20 minutes it had completed 1500 tasks using:
>
> PARALLEL_MAKE = "-j24"
> BB_NUMBER_THREADS = "6"
>
> Then I decided to kill it.
>
> When I did
> PARALLEL_MAKE = "-j12"
> BB_NUMBER_THREADS = "24"
>
> it completed 2000 tasks in less than half the time.

You should have let it finish the whole image: you can get to 2000 tasks
sooner (tasks like fetch/unpack/patch), but then you're still waiting for
the rest. With a smaller BB_NUMBER_THREADS it seems to spread tasks more
evenly (doing more fetch/unpack/patch tasks later, when the CPUs are busy
compiling something, which is good for I/O).

> This does not use tmpfs though.
> Do you have any comparison between tmpfs builds and RAID builds?

I sent one to the ML a few months ago; I cannot find it now.

> I currently do not use INHERIT += "rm_work"
> since I want to be able to do changes on some packages.
> Is there a way to define rm_work on a package basis?
> Then the majority of the packages can be removed.
>
> I use 75 GB without "rm_work".

Understood; in my scenario I want to build world as soon as possible, keep
sstate, record the issues and forget about BUILDDIR.
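FWIW, on the per-package rm_work question: if I remember correctly,
rm_work.bbclass honours an exclusion list, so you can keep rm_work inherited
globally and only preserve the workdirs of the recipes you are actively
changing. Roughly (variable name from memory, and the recipe names below are
just placeholders, so double-check against the class), in local.conf:

INHERIT += "rm_work"
# keep work directories only for the recipes I am currently hacking on
RM_WORK_EXCLUDE += "busybox linux-yocto"

And if you want to try the tmpfs comparison yourself, a minimal sketch would
be a tmpfs mount with TMPDIR pointed at it; the size below is just a guess
and has to be large enough for your image (rm_work helps a lot there):

# as root; 32G is only an example, adjust to your build
mount -t tmpfs -o size=32G tmpfs /mnt/oe-tmpfs

and in local.conf:

TMPDIR = "/mnt/oe-tmpfs/tmp"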
--
Martin 'JaMa' Jansa     jabber: martin.ja...@gmail.com