Re: [FOSSology] How can I get the scheduler to use more CPU time?

Gobeille, Robert Mon, 29 Mar 2010 07:22:08 -0700

Mike,
Something is pretty wrong.  
Check your /var/log/fossology/fossology.log.  I wonder if something is failing 
and you are spending all your time writing to the log.
Also check your Postgresql log file.


bsam-engine should be sucking up nearly 100% of each core per thread.
Restarting the scheduler may make bsam restart the license analysis from 
scratch.  When it does so the items counter does not get reset.  It should 
restart from 0 but does not.
In your top display you are using 26.1% of your cpus.  I'm guessing from your 
starting 4 license threads that you have 4 cores.  If so, what process(es) are 
responsible for that 26%?  Is it postgres?
Do you have the PostgreSQL conf set to do autovacuum?  If not, you might want 
to do a vacuum analyze.
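
One way to answer the "which processes own that 26%?" question is to total %CPU per command name. The following is only a rough sketch, not part of FOSSology: the helper names (`read_ps`, `cpu_by_command`, `top_consumers`) are made up for illustration, and it assumes a Linux `ps` that accepts `-eo pcpu,comm`. Note that `ps` reports CPU usage averaged over each process's lifetime, so the numbers will not match the instantaneous figures in `top` exactly.

```python
import subprocess
from collections import defaultdict


def read_ps():
    """Return the raw text of `ps -eo pcpu,comm` (assumes a procps-style ps)."""
    result = subprocess.run(
        ["ps", "-eo", "pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def cpu_by_command(ps_output):
    """Sum the %CPU column per command name; the first line is the header."""
    totals = defaultdict(float)
    for line in ps_output.strip().splitlines()[1:]:
        parts = line.split(None, 1)  # "%CPU" value, then the command name
        if len(parts) != 2:
            continue
        pcpu, comm = parts
        try:
            totals[comm.strip()] += float(pcpu)
        except ValueError:
            continue  # skip any line whose first field is not a number
    return dict(totals)


def top_consumers(ps_output, n=5):
    """Return the n commands with the highest summed %CPU, largest first."""
    totals = cpu_by_command(ps_output)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Calling `top_consumers(read_ps())` on the affected machine should show at a glance whether postgres, bsam-engine, or something unrelated to fossology is using the CPU.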

I don't see where to download your source without getting it piece by piece 
from your Mercurial repository index.  Is there a single source iso or tar or 
???  I could run this on our public fossology 1.1 server and on 1.2.  To give 
you an example of how long things take on 1.2, I just ran a very small Red Hat 
tar file with 17,209 files.  Unpacking took 5:58 (min:sec), and the new license 
scanner took 20 minutes.
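
To put those 1.2 numbers next to the Symbian kernel tarball (7497 files), here is a back-of-the-envelope calculation. It is a sketch under a loud assumption: that scan time scales linearly with file count and that the two trees have comparable file sizes, neither of which this thread establishes. The function names are illustrative only.

```python
# Figures quoted in this thread.
REDHAT_FILES = 17209      # files in the small Red Hat tar scanned on 1.2
REDHAT_SCAN_MINUTES = 20  # license scan time for that tar on 1.2
SYMBIAN_FILES = 7497      # files in the Symbian kernel tarball


def files_per_minute(files, minutes):
    """Observed scan throughput."""
    return files / minutes


def estimated_minutes(files, rate):
    """Scan time for `files` at `rate` files/minute (ASSUMES linear scaling)."""
    return files / rate


rate = files_per_minute(REDHAT_FILES, REDHAT_SCAN_MINUTES)
estimate = estimated_minutes(SYMBIAN_FILES, rate)
print(f"1.2 throughput: {rate:.0f} files/min")
print(f"Estimated Symbian kernel scan on 1.2: {estimate:.1f} min")
```

If the linear assumption is anywhere near right, the kernel tarball would be a matter of minutes on 1.2, not days.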

Bob Gobeille
b...@fossology.org


On Mar 29, 2010, at 7:17 AM, Mike Kinghan wrote:

> Hi fossologists,
> 
> Well, the foss v1.1 licence analysis job that I wrote about at the start of 
> this thread to analyse the Symbian kernel source, 7497 files, is still 
> running after ~40 days elapsed with the machine up about 2/3 of the total 
> time. 
> 
> In that period, it has clocked up only 04:06 hrs scheduled time and 03:54 hrs 
> running time. There is no evidence of the scheduler "getting stuck". The job 
> is always clocking up more "items" processed  - just extremely slowly - and 
> `top -u fossy` always shows the scheduler, the fo-watchdog and the bsam 
> engine accumulating CPU time - just extremely slowly. At no time is the 
> system overloaded. Both CPUs are normally idling and < 30% of the 3GB memory 
> is normally in use.
> 
> The bsam engine gets less than 1/6 of the CPU time that the scheduler itself 
> gets. E.g. right now, top -u fossy shows:
> 
> Tasks: 197 total,   1 running, 196 sleeping,   0 stopped,   0 zombie
> Cpu(s): 26.1%us,  6.5%sy,  0.0%ni, 66.7%id,  0.0%wa,  0.5%hi,  0.2%si,  0.0%st
> Mem:   3092484k total,  1504736k used,  1587748k free,    88888k buffers
> Swap:  1767108k total,        0k used,  1767108k free,   548452k cached
> 
>   PID USER      PR  NI  VIRT   RES   SHR S %CPU %MEM    TIME+  COMMAND
>  1740 fossy     20   0 74780   68m  1756 S    0  2.3  0:06.40  fossology-sched
>  1742 fossy     20   0  7660  2812  1672 S    0  0.1  0:00.02  fo_watchdog
>  2258 fossy     20   0  9280  3256  2124 S    0  0.1  0:00.52  bsam-engine
> 
> I have experimented with the scheduler.conf file to try to coax the job to 
> consume more cycles (restarting the scheduler each time).  I tried changing 
> the default: "%Host localhost 1 1" to "%Host localhost 4 1" and in 
> conjunction with the latter, replicated the "agent=licence..." line 
> 4 times, so that 4 bsam engine processes were allowed to run concurrently. I 
> also tried removing the -E parameter from the bsam engine commandline to 
> disable exhaustive matching. I also tried changing the "nice" priority of all 
> the fossy and postgresql processes from 0 to -5.
> 
> None of these changes made any perceptible difference.
> 
> I have allowed the job to carry on this long because the "items" counter of 
> the license stage seemed to give me a way of perceiving progress toward the 
> end. I assume that the "items" recorded by the license job-stage are 
> comparisons of license+file pairs. Then since there are 359 licenses in the 
> database and 7497 files in the tarball, there can only be 2,691,423 
> comparisons.
> 
> But the items counter surpassed that total sometime over the weekend and now 
> stands at 2,844,760, so I don't know what the items are or how many more 
> there might be.
> 
> I know that v1.2 offers very much faster licence analysis than 1.1, but the 
> interminable runtime of this job is apparently not governed by the speed of 
> license analysis. It is due to the fact that I cannot get fossology to employ 
> more than a minute fraction of the available processor bandwidth. Is there 
> reason to suppose v1.2 would do better in this respect?
> 
> Rather than start that experiment myself, I'd be very grateful if someone on 
> the project would download my kernel tarball - and one other that we'd like 
> to benchmark even more - run the license analysis on them on a single 2-4 
> core Debian or Ubuntu system, and show it can be done in say < 48 hrs per 
> package; then tell me exactly how!  Symbian is prepared to invest in heftier 
> kit to run fossology, but only if the software makes decently efficient use 
> of it.
> 
> Are there any takers for this? The entire ~35M LOC of the Symbian 
> distribution is open source and you are heartily welcome to use it as 
> experimental fodder for fossology. 

_______________________________________________
fossology mailing list
fossology@fossology.org
http://fossology.org/mailman/listinfo/fossology
