RE: [maemo-developers] Xvideo support for Nokia 770?
> >2. Reliable information that is detailed enough for performing graphics
> >and audio output from DSP, see
> >http://maemo.org/pipermail/maemo-developers/2007-February/007949.html
>
> Again throwing some options in the air
> 1. Dump DSP. If I remember the driver for the audio chip is
> in the omap tree. So (probably never tried) is to remove all
> dsp related components from the image, and then have
> everything purely on ARM [maybe worth a shot]
> 2. use ALSA DSP

It seems a shame to drop use of the DSP. It may be generally hard to program well, but even off-loading some parts (of otherwise ARM code) to the DSP will free up the ARM CPU to do more. Add to this the interest involved in hacking/writing code for the DSP and this is something I certainly want to pursue.

As a final point, if the hardware is there, don't you find it frustrating not being able to use it? DSP, IVA, 2D/3D acceleration - all just sat there waiting to be exploited (if we can find out how to do so)!

It would just be nice if the DSP learning curve could be made a little less steep (and less opaque). With that said I'll keep fiddling with the DSP tools in the hope I find the right combination.

Cheers,
Simon
___
maemo-developers mailing list
maemo-developers@maemo.org
https://maemo.org/mailman/listinfo/maemo-developers
RE: [maemo-developers] Xvideo support for Nokia 770?
>-Original Message- >From: [EMAIL PROTECTED] >[mailto:[EMAIL PROTECTED] On Behalf Of ext >Siarhei Siamashka >Sent: 08 February, 2007 09:16 >To: maemo-developers@maemo.org >Subject: Re: [maemo-developers] Xvideo support for Nokia 770? > >Hello, > >It would be probably a good idea to discuss different >possibilities for improving multimedia support on 770/N800. > >Now we have a fast JIT scaler that runs on ARM core, it solves >all the video resolution related performance problems. I'm >going to work on improving quality, performance and its >inclusion into upstream ffmpeg library, this task is in my >nearest plans: >http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/2007-January/0 >51209.html > >As for the ways of improving multimedia support on Nokia 770, >it may be done in the following ways (in no particular order): >1. Continue ffmpeg optimizations (motion compensation >functions, finetune idct, have a look at the possibilities to >optimize codecs other than mpeg4 and its variants) 2. >Implement Xvideo extension support for Nokia 770 (using >scaling done on ARM core) 3. Implement XvMC in some way (using >C55x DSP for it as it is supposedly good for IDCT and motion >compensation stuff) 4. Improve GStreamer plugins (replacements >for dspfbsink and dspmpeg4sink running on ARM core, it could >probably improve mpeg4 playback performance a lot and allow >using higher video bitrates and resolutions that are currently >available in MPlayer) 5. Try to relay color format conversion >and scaling to DSP. If it works as expected, video scaling can >be done with almost zero overhead for ARM core. Theoretically >the same trick could probably also work for GStreamer if video >output sink can provide its own buffer (::buffer_alloc). The >first step would be to try just doing nonscaled color format >conversion. If it is successful, some more advanced stuff can >be tried such as JIT dynamic code generation on C55x. >6. Try porting vorbis decoder (tremor) to DSP 7. 
Try porting >libmpeg2 to DSP. With audio decoding and scaling done on ARM >core, it might improve overall mpeg2 playback performance, I >wonder if nonconverted DVD video playback is even >theoretically possible on Nokia 770. > >That's quite a big list and it contains some things that might >be generally nice to have, but have relatively low practical >value and are actually not worth efforts implementing :) > >There are two issues that need to be solved for this all to >become reality: >1. We need some way of applying community developed upgrades >for core system components such as xserver and xlib (if we go >after Xvideo support on Nokia 770). They must be easy to >install by end users, otherwise this all development does not >make much sense. It would be also nice to integrate these >improvements into official firmware later, but I wonder if >Nokia has spare resources for doing this integration and its >quality assurance. >2. Reliable information that is detailed enough for performing >graphics and audio output from DSP, see >http://maemo.org/pipermail/maemo-developers/2007-February/007949.html Again throwing some options in the air 1. Dump DSP. If I remember the driver for the audio chip is in the omap tree. So (probably never tried) is to remove all dsp related components from the image, and then have everything purely on ARM [maybe worth a shot] 2. use ALSA DSP Devesh >___ >maemo-developers mailing list >maemo-developers@maemo.org >https://maemo.org/mailman/listinfo/maemo-developers > ___ maemo-developers mailing list maemo-developers@maemo.org https://maemo.org/mailman/listinfo/maemo-developers
Re: [maemo-developers] Xvideo support for Nokia 770?
Siarhei Siamashka wrote:
> 6. Try porting vorbis decoder (tremor) to DSP

Thanks to Johannes Sandvall and Erik Montnémery this was already done:
http://fanoush.wz.cz/maemo/sandvall-thesis.pdf
http://fanoush.wz.cz/maemo/sandvall-tremor.patch

We just need a way to output audio from a DSP task and to compile this thing.

Frantisek
Re: [maemo-developers] Xvideo support for Nokia 770?
Hello,

It would probably be a good idea to discuss different possibilities for improving multimedia support on 770/N800.

Now we have a fast JIT scaler that runs on the ARM core; it solves all the video resolution related performance problems. I'm going to work on improving its quality and performance and on its inclusion into the upstream ffmpeg library; this task is in my nearest plans:
http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/2007-January/051209.html

As for the ways of improving multimedia support on Nokia 770, it may be done in the following ways (in no particular order):

1. Continue ffmpeg optimizations (motion compensation functions, fine-tune idct, have a look at the possibilities to optimize codecs other than mpeg4 and its variants)
2. Implement Xvideo extension support for Nokia 770 (using scaling done on ARM core)
3. Implement XvMC in some way (using C55x DSP for it as it is supposedly good for IDCT and motion compensation stuff)
4. Improve GStreamer plugins (replacements for dspfbsink and dspmpeg4sink running on ARM core; it could probably improve mpeg4 playback performance a lot and allow using higher video bitrates and resolutions than are currently available in MPlayer)
5. Try to relay color format conversion and scaling to DSP. If it works as expected, video scaling can be done with almost zero overhead for the ARM core. Theoretically the same trick could probably also work for GStreamer if the video output sink can provide its own buffer (::buffer_alloc). The first step would be to try just doing nonscaled color format conversion. If it is successful, some more advanced stuff can be tried, such as JIT dynamic code generation on C55x.
6. Try porting vorbis decoder (tremor) to DSP
7. Try porting libmpeg2 to DSP. With audio decoding and scaling done on ARM core, it might improve overall mpeg2 playback performance. I wonder if nonconverted DVD video playback is even theoretically possible on Nokia 770.

That's quite a big list and it contains some things that might be generally nice to have, but have relatively low practical value and are actually not worth the effort of implementing :)

There are two issues that need to be solved for all this to become reality:

1. We need some way of applying community developed upgrades for core system components such as xserver and xlib (if we go after Xvideo support on Nokia 770). They must be easy to install by end users, otherwise all this development does not make much sense. It would also be nice to integrate these improvements into official firmware later, but I wonder if Nokia has spare resources for doing this integration and its quality assurance.
2. Reliable information that is detailed enough for performing graphics and audio output from DSP, see
http://maemo.org/pipermail/maemo-developers/2007-February/007949.html
Re: [maemo-developers] Xvideo support for Nokia 770?
On Wednesday 10 January 2007 01:51, Charles 'Buck' Krasic wrote: > Siarhei Siamashka wrote: > > Actually I have been thinking about trying to implement Xvideo > > support on 770 for some time already. Now as N800 has Xvideo > > support, it would be nice to have it on 770 as well for better > > consistency and software compatibility. > > As you may recall, I was considering this back in August/September. > I tried a few things, and reported some of my findings to this list. > The code for all that is still available here: > http://qstream.org/~krasic/770/dsp/ Yes, sure I remember. Thanks for doing these experiments and making the results available. It really helps to have more information around. > > I see the following possible options: > > > > 1. Implement it just using ARM core and optimize it as much as > > possible (using dynamically generated code for scaling to get the > > best performance). Is quite a straightforward solution and only > > needs time to implement it. > > It is my impression that this might be the most attractive option. > I noticed that TCPMP which seems to be the most performant player for > the ARM uses this approach, and it is available under GPL, so it may > be possible to adapt some of its code. > > In the long run, I would hope that integrating TCPMP scaling code into > libswscale of the ffmpeg project might be the most elegant approach, > since that seems to be the most performant/featureful/widel adopted > open-source scaling code (but not yet on ARM). For mplayer, it works > out of the box, since libswcale actually originated from mplayer, and > only recently migrated to ffmpeg. I see, thanks for the information (I checked TCPMP sources some time ago, but was interested in runtime cpu capabilities detection code and did not look at the scaler that time). Using TCPMP code may be an interesting option. But I also still may try to make my own scaler implementation for two reasons: 1. 
TCPMP is covered by the GPL license, and most parts of ffmpeg are LGPL, so it probably makes sense to make a clean room implementation of a JIT powered scaler for ARM under the LGPL license
2. I'm worried about the performance. Knowing how the cache and write buffer work on the arm926 core, it is possible to tune generated code for it and get the best performance possible. So the results can be better than for TCPMP.

I have just committed some initial assembly optimizations for the unscaled yuv420p -> yuyv422 color format convertor to maemo mplayer SVN. It already provides some performance improvement, for example on my test video file (640x480 resolution, 24 fps) I get the following results now:

BENCHMARKs: VC: 114.526s VO: 21.055s A: 0.000s Sys: 1.582s = 137.163s
BENCHMARK%: VC: 83.4962% VO: 15.3503% A: 0.0000% Sys: 1.1535% = 100.0000%

We can compare it with the older results (decoding time was also improved a bit since that time because of recent assembly optimizations for the dequantizer):
http://maemo.org/pipermail/maemo-developers/2006-December/006646.html

BENCHMARKs: VC: 121.282s VO: 31.538s A: 0.000s Sys: 1.577s = 154.397s
BENCHMARK%: VC: 78.5517% VO: 20.4267% A: 0.0000% Sys: 1.0216% = 100.0000%

Most of the speed improvement in color conversion and video output (the VO: part) is gained just from loop unrolling and from avoiding some extra instructions that gcc emits when compiling C code, but using the STMD instruction to store 16 bytes at once at an aligned location [1] provides at least a 10% performance improvement here.

If we estimate memory copy speed here with the additional colorspace conversion applied, it is about 70MB/s now for 640x480 24 fps video (though we need to read a bit less data than we write here, so it is a bit different from memcpy). And I have observed peak memcpy performance of about 110MB/s on Nokia 770. So this color convertor is quite close to the memory bandwidth limit now.
This code can be optimized more by processing two image lines at once, so we can get rid of some data read instructions and improve performance. Also experimenting with prefetch reads may provide some improvement.

JIT generated code should have a bit worse performance, but not much. If we decide to do 'nearest neighbour' scaling, the result should probably be as fast as this nonscaled conversion. But I want to try a simplified variation of bilinear scaling: each pixel in the destination buffer is either a copy of some pixel in the source buffer or an average value of two pixels. This way it should introduce at most two extra instructions for each byte of output: an addition of two pixel color components and a right shift.

> > 2. Try using dsp tasks that already exist on the device and are
> > used for dspfbsink. But the sources of gst plugins contain code
> > that limits video resolution for dspfbsink. I wonder if this check
> > was introduced artificially or it is the limitation of DSP scaler
> > and it can't handle anything larger than that. Also I wonder if
> > existing video scaler DSP task can support direct re
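The "copy or average of two pixels" scaling idea described above can be illustrated in plain C. This is only a minimal sketch under stated assumptions, not the JIT scaler from the thread: the function name `scale_row_copy_or_avg`, the 16.16 fixed-point stepping and the quarter-pixel thresholds used to choose between copying and averaging are all hypothetical choices made for illustration. A JIT version would make the copy/average decision once per output column at code generation time, whereas this sketch decides at run time.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Scale one row of 8-bit samples from src_w to dst_w samples.
 * Every output sample is either a copy of one source sample or the
 * average of two neighbouring source samples (one add, one shift).
 * Hypothetical helper: name and thresholds are illustrative only.
 */
static void scale_row_copy_or_avg(const uint8_t *src, size_t src_w,
                                  uint8_t *dst, size_t dst_w)
{
    if (src_w == 0 || dst_w == 0)
        return;

    /* Source step per output sample in 16.16 fixed point. */
    uint64_t step = ((uint64_t)src_w << 16) / dst_w;
    uint64_t pos = 0;

    for (size_t x = 0; x < dst_w; x++, pos += step) {
        size_t i = (size_t)(pos >> 16);            /* integer source index */
        uint32_t frac = (uint32_t)(pos & 0xFFFF);  /* fractional position  */

        if (i + 1 >= src_w || frac < 0x4000)
            dst[x] = src[i];                       /* closest to left sample  */
        else if (frac < 0xC000)
            dst[x] = (uint8_t)((src[i] + src[i + 1]) >> 1); /* in between */
        else
            dst[x] = src[i + 1];                   /* closest to right sample */
    }
}
```

For a packed yuyv422 destination the same stepping would have to be applied separately to the luma and chroma samples, which this sketch leaves out.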
RE: [maemo-developers] Xvideo support for Nokia 770?
> As you may recall, I was considering this back in August/September. > I tried a few things, and reported some of my findings to this list. > The code for all that is still available here: > http://qstream.org/~krasic/770/dsp/ > > 2. Try using dsp tasks that already exist on the device and > are used > > for dspfbsink. But the sources of gst plugins contain code > that limits > > video resolution for dspfbsink. I wonder if this check was > introduced > > artificially or it is the limitation of DSP scaler and it > can't handle > > anything larger than that. Also I wonder if existing video > scaler DSP > > task can support direct rendering [2]. > > I tried direct rendering in the above mentioned > experimentation. I never got it to work exactly correctly, > i.e. I could get images fragments on the screen, but they > were not the whole image, and never > in exactly the correct screen position. I suspected this was tied to > the baroque memory addressing constraints of the DSP (e.g. 16bit data > item limitations). I tried very hard to work around them but was not > successful. Was this the demo_fb task, or something different? I see that demo_console has been compiled (in dspgw-3.3-dsp/apps/demo_mod), but I can't see demo_fb having been compiled in situ (dspgw-3.3-dsp/apps/demo). If it was something different, could you point me to the code please? I ask as I'm trying to get the demo_fb code to work. Demo_console works fine and outputs the message to the screen, but demo_fb complains with the following message: # ./demo_fb fbadr=30 open: Device or resource busy Anyone have any ideas why this might be? I assume this is caused by the open() call in the arm-side demo_fb app (see dspgw-3.3-arm/apps/demo): fd = open("/dev/dsptask/demo_fb", O_RDWR); I'm just not sure what would cause the busy message when the demo_console runs fine before and after I try demo_fb. 
I altered the demo_fb.c code slightly to add an if defined() statement for the Nokia 770, which I hope should set the screen dimensions correctly. I must add that I've not tried it without this modification, but will do so this evening to check.

I also pulled the framebuffer address out of /lib/dsp/avs_kernelcfg.cmd on the 770. Is this the address I should use?

> > 3. Try implementing a new DSP based scaler from scratch. The most
> > important thing to know is how to access framebuffer directly from DSP
> > and move data to it from mapped buffer without any overhead.
> > The first test implementation can just perform nonscaled planar
> > YV12 -> packed YUV422 conversion, if it proves to be fast and useful,
> > it could be extended to also support scaling.
>
> This is what I did in August. I did YUV -> YUV scaling plus RGB
> conversion on the DSP. I think I did YUV->YUV scaling later. The
> results (performance) were abysmal. Maybe I committed some mortal
> DSP programming sins that dragged the performance down, but it was soo
> slow I gave up even hoping. I think my DSP code was maxed out on the
> DSP at like 20 fps, where the ARM was able to do 24fps with
> about 10-20% cpu.
>
> Anyway, my code is still there which may be a start if you want to
> attempt it. However, I think your first option is probably the most
> fruitful option. My little project made me very cynical of the
> value of the DSP. ;-)

Again, could you give me a pointer to the directory under which to find this code?

Thanks,
Simon
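When chasing the "Device or resource busy" failure from the open() call quoted above, it can help to report errno explicitly instead of relying on a bare perror(). The following is a minimal sketch, assuming only the standard POSIX open() interface; the wrapper name `open_dsp_task` is hypothetical and is not part of the dspgw sources. What EBUSY means for this particular driver (another process holding the task device, or the task being in a state that cannot be opened) is not established in this thread, so the hint printed below is only a suggestion of things to check.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/*
 * Try to open a DSP task device node for reading and writing.
 * Returns a file descriptor (>= 0) on success or -errno on failure.
 * Hypothetical helper, not taken from dspgw.
 */
static int open_dsp_task(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd < 0) {
        int err = errno;  /* save errno before any other library call */
        fprintf(stderr, "open(%s) failed: %s (errno %d)\n",
                path, strerror(err), err);
        if (err == EBUSY)
            fprintf(stderr, "hint: check whether another process or task "
                            "already holds this device\n");
        return -err;
    }
    return fd;
}
```

Calling `open_dsp_task("/dev/dsptask/demo_fb")` and comparing the result with `open_dsp_task("/dev/dsptask/demo_console")` shows whether the two nodes fail with the same error code.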
Re: [maemo-developers] Xvideo support for Nokia 770?
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Siarhei Siamashka wrote: > On Tuesday 09 January 2007 20:59, Charles 'Buck' Krasic wrote: > >> Any chance the Xvideo support in the Bora 3.0 will turn up in a >> 770 OS? > > > I asked the same question on #maemo irc channel and daniels > explained that video scaling is done by gpu on N800, so probably > the same code can't be reused on 770: > https://mg.pov.lt/maemo-irclog/%23maemo.2007-01-08.log.html > > Actually I have been thinking about trying to implement Xvideo > support on 770 for some time already. Now as N800 has Xvideo > support, it would be nice to have it on 770 as well for better > consistency and software compatibility. As you may recall, I was considering this back in August/September. I tried a few things, and reported some of my findings to this list. The code for all that is still available here: http://qstream.org/~krasic/770/dsp/ > > I see the following possible options: > > 1. Implement it just using ARM core and optimize it as much as > possible (using dynamically generated code for scaling to get the > best performance). Is quite a straightforward solution and only > needs time to implement it. It is my impression that this might be the most attractive option. I noticed that TCPMP which seems to be the most performant player for the ARM uses this approach, and it is available under GPL, so it may be possible to adapt some of its code. In the long run, I would hope that integrating TCPMP scaling code into libswscale of the ffmpeg project might be the most elegant approach, since that seems to be the most performant/featureful/widel adopted open-source scaling code (but not yet on ARM). For mplayer, it works out of the box, since libswcale actually originated from mplayer, and only recently migrated to ffmpeg. > > 2. Try using dsp tasks that already exist on the device and are > used for dspfbsink. But the sources of gst plugins contain code > that limits video resolution for dspfbsink. 
I wonder if this check > was introduced artificially or it is the limitation of DSP scaler > and it can't handle anything larger than that. Also I wonder if > existing video scaler DSP task can support direct rendering [2]. I tried direct rendering in the above mentioned experimentation. I never got it to work exactly correctly, i.e. I could get images fragments on the screen, but they were not the whole image, and never in exactly the correct screen position. I suspected this was tied to the baroque memory addressing constraints of the DSP (e.g. 16bit data item limitations). I tried very hard to work around them but was not successful. I think the benefits of direct rendering may be a false temptation on the DSP anyway.My impression was that the DSP access to framebuffer memory slowed down the scaling algorithm tremendously, so it was actually faster to scale into DSP local memory, and then do a fast bulk copy to the FB, or to SDRAM on the ARM side.Plus you have all the AV synchronization headaches. I think these gains pale compared to the gain from just using the fb in YUV mode, and doing all the video stuff on the ARM side. Hence, option 1 seems to sound very attractive. > It would need to support arbitrary number of memory mapped buffers > for video output in order to avoid unnecessary memcpy, otherwise > performance will suffer. > > Maybe we can ask Nokia developers to provide some information about > the internals of these plugins. The most important questions are: * > What are the real capabilities of DSP based scaler, can it be used > for resolutions let's say up to 800x480? I doubt 800x480. The added quality benefit over 400x240 with pixel doubling in the fb is probably way to marginal to justify the effort. The DSP hardware doesn't seem to have any meaningful support for general scaling (beyond doubling). > * Where is the screen update performed after dsp has finished > scaling/converting video from mapped buffer to framebuffer? 
Is it > done on ARM side, or probably screen update can be also triggered > from DSP directly? I seem to have the rough impression from inspecting X code that ARM side does the final update (copy) to fb memory. I'm not 100% sure on that right now though. > * Is it possible to get direct rendering [2] support with existing > dsp tasks on 770? If not, would it be too hard to implement this > feature? * How are timestamps handled in dsp? Is it possible to > just send a one shot signal to dsp task for rendering video frame > from a mapped buffer as fast as possible? > > A brief dsp interface description would be welcome. Maybe some > questions may be trivial, but unfortunately I did not have much > time for a detailed walk through the sources in order to figure out > how this all works. If any Nokia developer finds time for some > short answers, it would really help a lot. Agreed. > > 3. Try implementing a new DSP based scaler from scratch. The most > important thing to know is how
Re: [maemo-developers] Xvideo support for Nokia 770?
On Tuesday 09 January 2007 20:59, Charles 'Buck' Krasic wrote: > Any chance the Xvideo support in the Bora 3.0 will turn up in a 770 OS? I asked the same question on #maemo irc channel and daniels explained that video scaling is done by gpu on N800, so probably the same code can't be reused on 770: https://mg.pov.lt/maemo-irclog/%23maemo.2007-01-08.log.html Actually I have been thinking about trying to implement Xvideo support on 770 for some time already. Now as N800 has Xvideo support, it would be nice to have it on 770 as well for better consistency and software compatibility. I see the following possible options: 1. Implement it just using ARM core and optimize it as much as possible (using dynamically generated code for scaling to get the best performance). Is quite a straightforward solution and only needs time to implement it. 2. Try using dsp tasks that already exist on the device and are used for dspfbsink. But the sources of gst plugins contain code that limits video resolution for dspfbsink. I wonder if this check was introduced artificially or it is the limitation of DSP scaler and it can't handle anything larger than that. Also I wonder if existing video scaler DSP task can support direct rendering [2]. It would need to support arbitrary number of memory mapped buffers for video output in order to avoid unnecessary memcpy, otherwise performance will suffer. Maybe we can ask Nokia developers to provide some information about the internals of these plugins. The most important questions are: * What are the real capabilities of DSP based scaler, can it be used for resolutions let's say up to 800x480? * Where is the screen update performed after dsp has finished scaling/converting video from mapped buffer to framebuffer? Is it done on ARM side, or probably screen update can be also triggered from DSP directly? * Is it possible to get direct rendering [2] support with existing dsp tasks on 770? If not, would it be too hard to implement this feature? 
* How are timestamps handled in dsp? Is it possible to just send a one shot signal to dsp task for rendering video frame from a mapped buffer as fast as possible?

A brief dsp interface description would be welcome. Maybe some questions may be trivial, but unfortunately I did not have much time for a detailed walk through the sources in order to figure out how this all works. If any Nokia developer finds time for some short answers, it would really help a lot.

3. Try implementing a new DSP based scaler from scratch. The most important thing to know is how to access framebuffer directly from DSP and move data to it from mapped buffer without any overhead. The first test implementation can just perform nonscaled planar YV12 -> packed YUV422 conversion, if it proves to be fast and useful, it could be extended to also support scaling.

PS. This is unrelated to Xvideo support development, but also it would be nice to have more or less detailed description of dsp based gstreamer elements and their properties. While the sources of these plugins are available (with a hidden dsp part), some docs are needed to know how they are supposed to work in order to use them efficiently and probably improve.

[1] http://repository.maemo.org/pool/scirocco/free/source/g/gst-plugins-dsp0.10/gst-plugins-dsp0.10_0.32.1-1.tar.gz
[2] http://www.mplayerhq.hu/DOCS/tech/dr-methods.txt
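For reference, the nonscaled planar 4:2:0 -> packed YUV422 conversion mentioned in option 3 amounts to the following in plain C. This is a straightforward, unoptimized sketch, not DSP code and not the code from any of the projects discussed; the function name `yuv420p_to_yuyv422` and its parameters are hypothetical. It assumes YUYV byte order (Y0 U Y1 V) in the output, an even width, and separate pointers for the U and V planes so that it works for both YV12 and I420 plane orders.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Convert a planar 4:2:0 image to packed YUYV 4:2:2 without scaling.
 * Each chroma row is reused for two consecutive luma rows.
 * Hypothetical reference helper; assumes width is even.
 */
static void yuv420p_to_yuyv422(const uint8_t *y, size_t y_stride,
                               const uint8_t *u, const uint8_t *v,
                               size_t c_stride,
                               uint8_t *dst, size_t dst_stride,
                               size_t width, size_t height)
{
    for (size_t row = 0; row < height; row++) {
        const uint8_t *yr = y + row * y_stride;
        const uint8_t *ur = u + (row / 2) * c_stride;
        const uint8_t *vr = v + (row / 2) * c_stride;
        uint8_t *out = dst + row * dst_stride;

        /* Two luma samples share one U and one V sample. */
        for (size_t x = 0; x + 1 < width; x += 2) {
            out[0] = yr[x];
            out[1] = ur[x / 2];
            out[2] = yr[x + 1];
            out[3] = vr[x / 2];
            out += 4;
        }
    }
}
```

The output takes 2 bytes per pixel while the input takes 1.5, which is why the conversion reads somewhat less data than it writes.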