> I should add that running DSP tasks will move the CPU frequency to 330MHz,
> so this is probably not the answer to everyone's prayers with regard to
> freeing the CPU to do Xvid decoding or the like. There is a kernel patch to
> not force the CPU to 330MHz (the DSP runs slower) and I'll do some testing
> to see if the DSP task can run in real-time at the lower DSP clock speed.
> Then it will be significantly more useful.

Right, I've tested the SBC encoder task with the ARM running at 400MHz (and therefore the DSP running at 133MHz, rather than its top speed of 220MHz with the ARM running at 330MHz). Thanks qwerty for the link to the patch.

Anyway, the task runs and plays music, but there are far too many drop-outs, and the sound gets progressively deeper on the run up to each drop-out (due to the encoder being too slow). So it certainly needs more optimisation before it could be considered for this role.

> The change which has allowed it to encode an entire song rather than just a
> few seconds was to move the input and output buffers from SDRAM (OMAP main
> memory) to SRAM (DSP fast single access memory). There are probably other
> things which would benefit from being moved, the sbc->priv data (or parts
> thereof) for one. This structure is pretty big so I allocated it in SDRAM,
> but at least parts of it might be better off in faster local memory. This is
> something to look at.

I looked at this yesterday evening (thanks to derf, crashanddie, and others for answering my C questions), trying to move some parts of the priv structure to SARAM (sorry for the SRAM typo above). Unfortunately, even moving the bare minimum (the X array) won't work, as there's not enough SARAM (so dsp_dld tells me). I don't know where it's all gone; does anyone have any ideas?

I currently have a fast_in[] array in SARAM, to which I copy part of the data from the slow (SDRAM) X[] array in the sbc_analyze_eight/four() fns before it's used in the _sbc_analyze_eight/four() fns. These two fns are inlined, so this memcpy is performed on every pass through the loop (called something like 150,000 times in total for my test file, iirc). I'm not sure if the faster manipulation of the data makes up for the copy overhead (it is a faster 32-bit copy at least). There are no clocks available to time it with, so I'll try removing this "optimisation" and testing what it sounds like.

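The staging copy described above can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the actual encoder code: WINDOW, analyze_window(), analyze_staged() and staging_gain_cycles() are hypothetical names, the window length is made up, and fast_in[] is an ordinary static array here (on the DSP it would have to be placed in SARAM by the linker). The cycle costs passed to staging_gain_cycles() are placeholders to be filled in by whoever has real numbers, not measured values.

```c
#include <stdint.h>
#include <string.h>

#define WINDOW 80  /* hypothetical window length, not taken from the encoder */

/* Stand-in for the SARAM buffer. On the host this is just a static array. */
static int32_t fast_in[WINDOW];

/* Hypothetical stand-in for _sbc_analyze_eight(): it only sums the window,
 * so that the sketch has some work to do on the fast copy. */
static int32_t analyze_window(const int32_t *in, size_t n)
{
    int32_t acc = 0;
    size_t i;

    for (i = 0; i < n; i++)
        acc += in[i];
    return acc;
}

/* Copy one window out of the slow array, then touch only the fast copy. */
int32_t analyze_staged(const int32_t *slow_x, size_t offset)
{
    memcpy(fast_in, slow_x + offset, sizeof fast_in);
    return analyze_window(fast_in, WINDOW);
}

/* Very rough break-even model (an assumption, not a measurement):
 * the copy costs one slow read plus one fast write per word, and the
 * analysis then reads every word reads_per_word times.
 * A positive result means the staged version is cheaper. */
long staging_gain_cycles(long words, long reads_per_word,
                         long slow_cost, long fast_cost)
{
    long copy_cost = words * (slow_cost + fast_cost);
    long direct    = words * reads_per_word * slow_cost;
    long staged    = copy_cost + words * reads_per_word * fast_cost;

    return direct - staged;
}
```

Under this simple model the copy can never pay for itself if each word is only read once afterwards, because the copy already includes one slow read per word. It only starts to win when the analysis reads each staged word several times, which is the thing to check in the inner loop before deciding whether to keep the memcpy.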
More importantly, if the whole X array could be placed in SARAM, there'd be no need for my memcpy anyway and I'd have the benefits of faster access. I'm not too sure how to analyse the code to work out how much data is allocated in SARAM (to work out if I'm close to fitting it or have no chance).

Talking about SARAM, the input and output buffers (which the DSP task uses for bulk transfers) are in SARAM; this is what I changed to make the task play in real-time, so it obviously makes a difference. It would be good if I could avoid having to copy from the input buffer into one of the priv structure arrays (which holds the PCM data). This is probably not a big saving compared to optimising the main loop, as the read fn is not called all that often (~5000 times for my test file), but every little helps and obviously did before.

The input buffer is currently read into a 2D array. I need to check the array dimensions and see whether I could write the data into it directly (and place it in SARAM rather than the input buffer). The output array has data packed into it, so I'm not sure I'll get any savings from fiddling with this.

There may yet be other little bits of code which would benefit from being moved to faster memory (or intrinsic-ised). It's just a bit hard to quantify the memcpy slowdown vs. any possible memory access speedup without any way of timing individual parts of the code :(

I'm currently revisiting my attempt to re-write the inner loop to use lots of DSP intrinsics and the like, in the hope that this will provide some sort of speed-up. Again, to be tested with the mk1 ear ;)

Anyway, that's about where I am. If anyone wants to take a look at the code and suggest possible locations for optimisations, I'm all ears :)

Thanks for reading,

Cheers,

Simon
_______________________________________________
maemo-developers mailing list
maemo-developers@maemo.org
https://lists.maemo.org/mailman/listinfo/maemo-developers