On Fri, Nov 12, 2010 at 10:05:28AM +1100, Balint Seeber wrote:
> Dear Eric,
>
> I realised I was actually getting ahead of myself regarding scenario (1),
> because - of course - the sample rate means nothing in terms of timing if it
> is not a synchronous graph, and as I stated I didn't use Throttle. So the
> behaviour in (1) is expected. Would you agree?

Yes.

> Still not sure about (3) though. Did the graph make it through okay?
>
> Thanks very much once again,
>
> Balint
>

Using the single graph (the one you sent me):

Running case (1):

htop shows it burning 95% of one core and 25% of another. Seems
reasonable to me. (On my 8-core Xeon)

I started oprofile, ran the flow graph for a while (> 10s), then looked at the
output of opreport:

  $ opreport --long-filenames --symbols -t 0.5 >/tmp/report

It gives the report below, which isn't surprising. That is, 57% of the
samples are in ccomplex_dotprod_sse (the inner loop of
gr_fir_ccc_simd::filter, used by the resampler), and 16% are in
gr_sig_source_c::work (generating the complex sinusoid). The cycles
chargeable to the resampler include ccomplex_dotprod_sse,
gr_fir_ccc_simd::filter, and gr_rational_resampler_base_ccc::general_work,
which comes out to ~69%. (It's normalized to total samples counted)

CPU: Core 2, speed 3000.07 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples   %        app name                                             symbol name
17535154  57.4244  /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  ccomplex_dotprod_sse
4966060   16.2629  /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  gr_sig_source_c::work(int, std::vector<void const*, std::allocator<void const*> >&, std::vector<void*, std::allocator<void*> >&)
2909663    9.5286  /no-vmlinux                                          /no-vmlinux
2490431    8.1557  /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  gr_fir_ccc_simd::filter(std::complex<float> const*)
1094391    3.5839  /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  gr_rational_resampler_base_ccc::general_work(int, std::vector<int, std::allocator<int> >&, std::vector<void const*, std::allocator<void const*> >&, std::vector<void*, std::allocator<void*> >&)
235207     0.7703  /lib64/libpthread-2.12.1.so                          pthread_mutex_lock


Running case (3):

htop shows it burning 95% of TWO cores and 25% of another. Also seems
reasonable to me.
One core for each of the two rational resamplers, and 25% for the rest.

  $ opreport --long-filenames --symbols -t 0.5 >/tmp/report3

CPU: Core 2, speed 3000.07 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples   %        app name                                             symbol name
3931690   63.0917  /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  ccomplex_dotprod_sse
611059     9.8056  /no-vmlinux                                          /no-vmlinux
557861     8.9520  /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  gr_fir_ccc_simd::filter(std::complex<float> const*)
550223     8.8294  /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  gr_sig_source_c::work(int, std::vector<void const*, std::allocator<void const*> >&, std::vector<void*, std::allocator<void*> >&)
248420     3.9864  /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  gr_rational_resampler_base_ccc::general_work(int, std::vector<int, std::allocator<int> >&, std::vector<void const*, std::allocator<void const*> >&, std::vector<void*, std::allocator<void*> >&)
55851      0.8962  /lib64/libpthread-2.12.1.so                          pthread_mutex_lock
31423      0.5042  /usr/local/lib64/libgnuradio-core-3.3.1git.so.0.0.0  gr_tpb_detail::notify_upstream(gr_block_detail*)

In this case, it's about 76% from the two rational resamplers, 9% for
the sig gen, and 1.5% scheduler overhead (pthread_mutex_lock and
notify_upstream). In reality, the ticks in the kernel should be
charged towards overhead too.

Is there any chance that you had some kind of power control or frequency
scaling going on? If it's a laptop, be sure that it's in "performance mode"
and not "I want the battery to last a long time mode".

Remember that Amdahl's Law gives the maximum speedup within a given graph.

  https://secure.wikimedia.org/wikipedia/en/wiki/Amdahl%27s_law

In any case, I think that you'll find a combination of htop and
oprofile should help shed some light on where the cycles are being
burned.
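
If you want to redo the arithmetic yourself, here is a rough sketch in Python. The percentages are copied from the two opreport listings. The dictionary names, the grouping of symbols into "resampler", "sig_source", "scheduler" and "kernel", and the helper functions are just my own illustration, not anything provided by oprofile or GNU Radio. The Amdahl's Law helper uses the textbook formula; the parallel fraction you feed it is something you would have to estimate for your own graph.

```python
# Percentages copied from the opreport output for case (1) and case (3).
CASE1 = {
    "ccomplex_dotprod_sse": 57.4244,
    "gr_sig_source_c::work": 16.2629,
    "/no-vmlinux": 9.5286,
    "gr_fir_ccc_simd::filter": 8.1557,
    "gr_rational_resampler_base_ccc::general_work": 3.5839,
    "pthread_mutex_lock": 0.7703,
}
CASE3 = {
    "ccomplex_dotprod_sse": 63.0917,
    "/no-vmlinux": 9.8056,
    "gr_fir_ccc_simd::filter": 8.9520,
    "gr_sig_source_c::work": 8.8294,
    "gr_rational_resampler_base_ccc::general_work": 3.9864,
    "pthread_mutex_lock": 0.8962,
    "gr_tpb_detail::notify_upstream": 0.5042,
}

# Hypothetical grouping of symbols into categories (an assumption
# made for this sketch, based on the discussion in this mail).
GROUPS = {
    "resampler": (
        "ccomplex_dotprod_sse",
        "gr_fir_ccc_simd::filter",
        "gr_rational_resampler_base_ccc::general_work",
    ),
    "sig_source": ("gr_sig_source_c::work",),
    "scheduler": ("pthread_mutex_lock", "gr_tpb_detail::notify_upstream"),
    "kernel": ("/no-vmlinux",),
}


def tally(report):
    """Sum the sample percentages of each group; missing symbols count as 0."""
    return {
        group: round(sum(report.get(sym, 0.0) for sym in symbols), 2)
        for group, symbols in GROUPS.items()
    }


def amdahl_speedup(parallel_fraction, n_workers):
    """Textbook Amdahl's Law: upper bound on speedup with n_workers."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_workers)


if __name__ == "__main__":
    for name, report in (("case (1)", CASE1), ("case (3)", CASE3)):
        print(name, tally(report))
    # Illustrative only: 0.5 is a made-up parallel fraction.
    print("speedup bound:", round(amdahl_speedup(0.5, 2), 3))
```

The resampler totals should agree with the ~69% and ~76% figures quoted above. Note that only symbols above the 0.5% threshold appear in the report, so the groups will not add up to 100%.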

Eric

_______________________________________________
Discuss-gnuradio mailing list
Discuss-gnuradio@gnu.org
http://lists.gnu.org/mailman/listinfo/discuss-gnuradio