Mike,

Something is pretty wrong. Check your /var/log/fossology/fossology.log. I wonder if something is failing and you are spending all your time writing to the log. Also check your PostgreSQL log file.
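One quick way to see whether a single failure is flooding the log is to tally the most frequent messages. The following is only a minimal sketch: the function names `top_messages` and `report` are made up for illustration, and it assumes nothing about the fossology log format beyond "one message per line". Digit runs are collapsed to `#` so that lines differing only in timestamps or PIDs are counted together.

```python
import re
from collections import Counter


def top_messages(lines, n=10):
    """Return the n most common non-blank lines, with digit runs replaced by '#'."""
    counts = Counter(
        re.sub(r"\d+", "#", line.strip())  # collapse timestamps, PIDs, counters
        for line in lines
        if line.strip()                    # skip blank lines
    )
    return counts.most_common(n)


def report(path, n=10):
    """Print the n most common normalized messages found in the file at path."""
    with open(path, errors="replace") as log:
        for message, count in top_messages(log, n):
            print(f"{count:8d}  {message}")
```

Calling `report("/var/log/fossology/fossology.log")` as a user with read access to the file prints the counts. If one message dominates by a wide margin, that is the failure worth chasing first.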
bsam-engine should be sucking up nearly 100% of each core per thread. Restarting the scheduler may make bsam restart the license analysis from scratch. When it does so, the items counter does not get reset: it should restart from 0 but does not.

In your top display you are using 26.1% of your CPUs. I'm guessing from your starting 4 license threads that you have 4 cores. If so, what process(es) are responsible for that 26%? Is it postgres? Do you have the PostgreSQL conf set to do autovacuum? If not, you might want to do a VACUUM ANALYZE.

I don't see where to download your source without getting it piece by piece from your Mercurial repository index. Is there a single source iso or tar or ??? I could run this on our public fossology 1.1 server and on 1.2.

To give you an example of how long things take on 1.2, I just ran a very small Red Hat tar file with 17,209 files. Unpacking took 5:58 (min:sec), and the new license scanner took 20 minutes.

Bob Gobeille
b...@fossology.org

On Mar 29, 2010, at 7:17 AM, Mike Kinghan wrote:

> Hi fossologists,
>
> Well, the foss v1.1 licence analysis job that I wrote about at the start of
> this thread to analyse the Symbian kernel source, 7497 files, is still
> running after ~40 days elapsed with the machine up about 2/3 of the total
> time.
>
> In that period, it has clocked up only 04:06 hrs scheduled time and 03:54 hrs
> running time. There is no evidence of the scheduler "getting stuck". The job
> is always clocking up more "items" processed - just extremely slowly - and
> `top -u fossy` always shows the scheduler, the fo-watchdog and the bsam
> engine accumulating CPU time - just extremely slowly. At no time is the
> system overloaded. Both CPUs are normally idling and < 30% of the 3GB memory
> is normally in use.
>
> The bsam engine gets less than 1/6 of the CPU time that the scheduler itself
> gets. E.g.
> right now, top -u fossy shows:
>
> Tasks: 197 total,   1 running, 196 sleeping,   0 stopped,   0 zombie
> Cpu(s): 26.1%us,  6.5%sy,  0.0%ni, 66.7%id,  0.0%wa,  0.5%hi,  0.2%si,  0.0%st
> Mem:   3092484k total,  1504736k used,  1587748k free,    88888k buffers
> Swap:  1767108k total,        0k used,  1767108k free,   548452k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  1740 fossy     20   0 74780  68m 1756 S    0  2.3   0:06.40 fossology-sched
>  1742 fossy     20   0  7660 2812 1672 S    0  0.1   0:00.02 fo_watchdog
>  2258 fossy     20   0  9280 3256 2124 S    0  0.1   0:00.52 bsam-engine
>
> I have experimented with the scheduler.conf file to try to coax the job to
> consume more cycles (restarting the scheduler each time). I tried changing
> the default: "%Host localhost 1 1" to "%Host localhost 4 1" and in
> conjunction with the latter, replicated the "agent=licence..." line
> 4 times, so that 4 bsam engine processes were allowed to run concurrently. I
> also tried removing the -E parameter from the bsam engine commandline to
> disable exhaustive matching. I also tried changing the "nice" priority of all
> the fossy and postgresql processes from 0 to -5.
>
> None of these changes made any perceptible difference.
>
> I have allowed the job to carry on this long because the "items" counter of
> the license stage seemed to give me a way of perceiving progress toward the
> end. I assume that the "items" recorded by the license job-stage are
> comparisons of license+file pairs. Then since there are 359 licenses in the
> database and 7497 files in the tarball, there can only be 2,691,423
> comparisons.
>
> But the items counter surpassed that total sometime over the weekend and now
> stands at 2,844,760, so I don't know what the items are or how many more
> there might be.
>
> I know that v1.2 offers very much faster licence analysis than 1.1, but the
> interminable runtime of this job is apparently not governed by the speed of
> license analysis.
> It is due to the fact that I cannot get fossology to employ
> more than a minute fraction of the available processor bandwidth. Is there
> reason to suppose v1.2 would do better in this respect?
>
> Rather than start that experiment myself, I'd be very grateful if someone on
> the project would download my kernel tarball - and one other that we'd like
> to benchmark even more - run the license analysis on them on a single 2-4
> core Debian or Ubuntu system, and show it can be done in say < 48 hrs per
> package; then tell me exactly how! Symbian is prepared to invest in heftier
> kit to run fossology, but only if the software makes decently efficient use
> of it.
>
> Are there any takers for this? The entire ~35M LOC of the Symbian
> distribution is open source and you are heartily welcome to use it as
> experimental fodder for fossology.

_______________________________________________
fossology mailing list
fossology@fossology.org
http://fossology.org/mailman/listinfo/fossology