Re: [hugin-ptx] Re: hugin Fast Panorama preview code for a lot faster "stitching"?
2010/12/18 Lukáš Jirkovský:
> On 18 December 2010 16:22, Jan Martin wrote:
>> Bruno,
>>
>> care to estimate what kind of speed improvement optimizing the code would be
>> good for?
>> What would it need to get it done?
>> Who is qualified to do it?
>>
>> Jan
>
> I think it would be quite difficult. I remember I looked at the nona
> code quite some time ago (but it hasn't really changed since then).
> Most of the code is generated at compile time from templates, so it is
> sometimes difficult to understand what it really does, and especially
> how to optimize it. But I guess that a big slowdown is caused by vigra.
> Yes, it lets you write completely generic code like the nona code, but
> at a price.
>
> I know this from my experiments with hugin deghosting. Vigra's types are
> slow due to their extensive use of polymorphism, even the simple ones
> like RGBsomethingsomething (I had too much wine to remember the name) or
> vector types like tinyvector. Operator overloading is another reason for
> the slowdown.
>
> As I said, it allows you to write generic code but at a price. For
> example you can write a + b and it translates to a + b for
> one-component data (e.g. a grayscale image) or to (a[0] + b[0],
> a[1] + b[1], a[2] + b[2]) for RGB or Lab or whatever type is a
> 3-component vector. The problem is the code generated by the compiler
> when you chain more operations together. For example a * b + c is, for
> three-component data, implemented as something like
>
> for (i = 0; i < 3; i++)
>     tmpres[i] = a[i] * b[i];
> for (i = 0; i < 3; i++)
>     res[i] = tmpres[i] + c[i];
>
> You can see that these loops can be merged into one, but the compiler
> won't do this. There are some ways to make the compiler do it in one
> loop, but it's black magic. If you implement all this manually (i.e.
> you write different code for vector types and scalar types) you can
> gain quite some speed, but I wouldn't like to mutilate nona - this code
> is just too sexy.
>
> ciao,
> Lukas

After looking at the vigra code I found out I was probably completely wrong. It is possible to evaluate an operator chain in one loop by building a parse tree. I thought this was not used in vigra, but I just found UnrollLoop in tinyvector.hxx, which at first glance looks like it builds such a tree.

Lukas

--
You received this message because you are subscribed to the Google Groups "Hugin and other free panoramic software" group.
A list of frequently asked questions is available at: http://wiki.panotools.org/Hugin_FAQ
To post to this group, send email to hugin-ptx@googlegroups.com
To unsubscribe from this group, send email to hugin-ptx+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/hugin-ptx
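For readers who have not met the "parse tree" trick Lukas mentions: in C++ it is usually called expression templates. What follows is only a minimal sketch of the general technique. `Expr`, `Vec3`, `Mul`, `Add` and `mul_add` are hypothetical names made up for illustration; this is not vigra's TinyVector, and it makes no claim about what UnrollLoop actually does.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types for illustration only; this is not vigra's TinyVector.

// CRTP base: lets the operators below accept any expression node.
template <class E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

struct Vec3 : Expr<Vec3> {
    float v[3];
    Vec3(float x, float y, float z) : v{x, y, z} {}
    float operator[](std::size_t i) const {
        assert(i < 3);
        return v[i];
    }

    // Building a Vec3 from an expression walks the whole tree once per
    // component, so a * b + c is evaluated in a single loop.
    template <class E>
    Vec3(const Expr<E>& e) {
        for (std::size_t i = 0; i < 3; ++i) v[i] = e.self()[i];
    }
};

// Tree node for component-wise multiplication; stores references only.
template <class L, class R>
struct Mul : Expr<Mul<L, R>> {
    const L& lhs;
    const R& rhs;
    Mul(const L& l, const R& r) : lhs(l), rhs(r) {}
    float operator[](std::size_t i) const { return lhs[i] * rhs[i]; }
};

// Tree node for component-wise addition; stores references only.
template <class L, class R>
struct Add : Expr<Add<L, R>> {
    const L& lhs;
    const R& rhs;
    Add(const L& l, const R& r) : lhs(l), rhs(r) {}
    float operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
};

// The overloaded operators do no arithmetic; they only build tree nodes.
template <class L, class R>
Mul<L, R> operator*(const Expr<L>& l, const Expr<R>& r) {
    return Mul<L, R>(l.self(), r.self());
}

template <class L, class R>
Add<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Add<L, R>(l.self(), r.self());
}

// Usage: the tree is consumed inside the same full expression that
// built it, so the stored references stay valid.
inline Vec3 mul_add(const Vec3& a, const Vec3& b, const Vec3& c) {
    return Vec3(a * b + c);
}
```

The nodes hold references to their operands, so a tree must not outlive the statement that created it; storing `a * b + c` in an `auto` variable and evaluating it later would be a bug in this sketch.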
Re: [hugin-ptx] Re: hugin Fast Panorama preview code for a lot faster "stitching"?
On 18 December 2010 16:22, Jan Martin wrote:
> Bruno,
>
> care to estimate what kind of speed improvement optimizing the code would be
> good for?
> What would it need to get it done?
> Who is qualified to do it?
>
> Jan

I think it would be quite difficult. I remember I looked at the nona code quite some time ago (but it hasn't really changed since then). Most of the code is generated at compile time from templates, so it is sometimes difficult to understand what it really does, and especially how to optimize it. But I guess that a big slowdown is caused by vigra. Yes, it lets you write completely generic code like the nona code, but at a price.

I know this from my experiments with hugin deghosting. Vigra's types are slow due to their extensive use of polymorphism, even the simple ones like RGBsomethingsomething (I had too much wine to remember the name) or vector types like tinyvector. Operator overloading is another reason for the slowdown.

As I said, it allows you to write generic code but at a price. For example you can write a + b and it translates to a + b for one-component data (e.g. a grayscale image) or to (a[0] + b[0], a[1] + b[1], a[2] + b[2]) for RGB or Lab or whatever type is a 3-component vector. The problem is the code generated by the compiler when you chain more operations together. For example a * b + c is, for three-component data, implemented as something like

for (i = 0; i < 3; i++)
    tmpres[i] = a[i] * b[i];
for (i = 0; i < 3; i++)
    res[i] = tmpres[i] + c[i];

You can see that these loops can be merged into one, but the compiler won't do this. There are some ways to make the compiler do it in one loop, but it's black magic. If you implement all this manually (i.e. you write different code for vector types and scalar types) you can gain quite some speed, but I wouldn't like to mutilate nona - this code is just too sexy.

ciao,
Lukas
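Spelled out as compilable C++, the two variants of a * b + c discussed in this message look roughly like the sketch below. `Pixel`, `mul_add_two_pass` and `mul_add_fused` are hypothetical names and none of this is taken from vigra or nona. Whether a given compiler merges the two loops on its own depends on inlining and optimisation settings, so it is worth measuring rather than assuming either way.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 3-component pixel type (RGB, Lab, ...).
using Pixel = std::array<float, 3>;

// What a naive chain of overloaded operators amounts to:
// one loop and one temporary per operator.
Pixel mul_add_two_pass(const Pixel& a, const Pixel& b, const Pixel& c) {
    Pixel tmp{};
    for (std::size_t i = 0; i < 3; ++i) tmp[i] = a[i] * b[i];
    Pixel res{};
    for (std::size_t i = 0; i < 3; ++i) res[i] = tmp[i] + c[i];
    return res;
}

// Hand-written version: a single pass, no temporary.
Pixel mul_add_fused(const Pixel& a, const Pixel& b, const Pixel& c) {
    Pixel res{};
    for (std::size_t i = 0; i < 3; ++i) res[i] = a[i] * b[i] + c[i];
    return res;
}
```

The price of the fused version is the one named above: it has to be written separately for every combination of operations and for scalar versus vector pixel types.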
Re: [hugin-ptx] Re: hugin Fast Panorama preview code for a lot faster "stitching"?
I've no idea how easy it is to speed up nona. Code optimisation should always follow profiling, but this is straightforward with nona as it only does one thing.

--
Bruno
Re: [hugin-ptx] Re: hugin Fast Panorama preview code for a lot faster "stitching"?
Bruno,

care to estimate what kind of speed improvement optimizing the code would be good for?
What would it need to get it done?
Who is qualified to do it?

Jan

On Sat, Dec 18, 2010 at 11:59 AM, Bruno Postle wrote:
> Regarding speeding up stitching with nona, there has been very little
> optimisation of the code over the years. For instance nona still does a full
> panorama calculation for every pixel in the output, where it could easily
> interpolate most of these values.
>
> --
> Bruno

--
http://www.DIY-streetview.org
Re: [hugin-ptx] Re: hugin Fast Panorama preview code for a lot faster "stitching"?
Regarding speeding up stitching with nona, there has been very little optimisation of the code over the years. For instance nona still does a full panorama calculation for every pixel in the output, where it could easily interpolate most of these values.

--
Bruno
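A rough sketch of the interpolation idea, assuming the mapping from output to input coordinates is smooth: evaluate the expensive transform only at the nodes of a coarse grid and interpolate bilinearly in between. `exact_transform` is a made-up stand-in, not nona's or libpano's transform, and `interpolated_map`, `Point` and the grid step are likewise hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { double x, y; };

// Stand-in for the expensive per-pixel panorama calculation
// (hypothetical; any smooth mapping would do for the sketch).
Point exact_transform(double x, double y) {
    return { x + 0.001 * x * y, y + 0.002 * x * x };
}

inline double lerp(double a, double b, double t) { return a + t * (b - a); }

// Coordinate map for a w x h output image. exact_transform is called
// only at grid nodes spaced `step` pixels apart; every other pixel is
// bilinearly interpolated from the four surrounding nodes.
std::vector<Point> interpolated_map(int w, int h, int step) {
    assert(w > 0 && h > 0 && step > 0);
    const int gw = (w - 1) / step + 2;  // nodes per row, covers the last pixel
    const int gh = (h - 1) / step + 2;  // nodes per column
    std::vector<Point> grid(static_cast<std::size_t>(gw) * gh);
    for (int gy = 0; gy < gh; ++gy)
        for (int gx = 0; gx < gw; ++gx)
            grid[static_cast<std::size_t>(gy) * gw + gx] =
                exact_transform(gx * step, gy * step);

    std::vector<Point> out(static_cast<std::size_t>(w) * h);
    for (int y = 0; y < h; ++y) {
        const int gy = y / step;
        const double fy = static_cast<double>(y % step) / step;
        for (int x = 0; x < w; ++x) {
            const int gx = x / step;
            const double fx = static_cast<double>(x % step) / step;
            const Point& p00 = grid[static_cast<std::size_t>(gy) * gw + gx];
            const Point& p10 = grid[static_cast<std::size_t>(gy) * gw + gx + 1];
            const Point& p01 = grid[static_cast<std::size_t>(gy + 1) * gw + gx];
            const Point& p11 = grid[static_cast<std::size_t>(gy + 1) * gw + gx + 1];
            out[static_cast<std::size_t>(y) * w + x] = {
                lerp(lerp(p00.x, p10.x, fx), lerp(p01.x, p11.x, fx), fy),
                lerp(lerp(p00.y, p10.y, fx), lerp(p01.y, p11.y, fx), fy)
            };
        }
    }
    return out;
}
```

With a step of 8 the transform runs roughly once per 64 output pixels. The interpolation error grows with the step and with the curvature of the mapping, so a real implementation would need to pick the step (or refine the grid locally) against an error bound.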
Re: [hugin-ptx] Re: hugin Fast Panorama preview code for a lot faster "stitching"?
On Sat, 2010-12-11 at 23:39 -0500, Yuval Levy wrote:
> Hi James,
>
> On December 11, 2010 11:48:22 am James Legg wrote:
> > This isn't working on my system. It would be a more user-friendly kind of
> > not working with the attached patch.
>
> you have access to the Enblend repo now.

Thanks. I've committed the patch to the default branch.

-James
Re: [hugin-ptx] Re: hugin Fast Panorama preview code for a lot faster "stitching"?
Hi James,

On December 11, 2010 11:48:22 am James Legg wrote:
> This isn't working on my system. It would be a more user-friendly kind of
> not working with the attached patch.

you have access to the Enblend repo now.

Yuv
Re: [hugin-ptx] Re: hugin Fast Panorama preview code for a lot faster "stitching"?
On Fri, 2010-12-10 at 10:12 -0500, Yuval Levy wrote:
> Enblend (properly compiled) has had GPU blending for longer than Hugin has
> had GPU remapping.
>
> enblend --gpu

This isn't working on my system. It would be a more user-friendly kind of not working with the attached patch.

If both nona and enblend have GPU support, would it be possible to get nona to render to a texture that enblend picks up? This would avoid copying the image from video RAM to main RAM, encoding an image file, decoding it again, then copying it back to VRAM.

-James

diff -r 77fbc2c95fb5 src/gpu.cc
--- a/src/gpu.cc Sun Sep 05 17:04:20 2010 +0200
+++ b/src/gpu.cc Sat Dec 11 16:43:48 2010 +
@@ -218,16 +218,19 @@
     const GLboolean has_arb_vertex_shader = glewGetExtension("GL_ARB_vertex_shader");
     const GLboolean has_arb_shader_objects = glewGetExtension("GL_ARB_shader_objects");
     const GLboolean has_arb_shading_language = glewGetExtension("GL_ARB_shading_language_100");
+    const GLboolean has_arb_texture_float = glewGetExtension("GL_ARB_texture_float");
 
     if (!(has_arb_fragment_shader &&
           has_arb_vertex_shader &&
           has_arb_shader_objects &&
-          has_arb_shading_language)) {
+          has_arb_shading_language &&
+          has_arb_texture_float)) {
         const char* msg[] = {"false", "true"};
         cerr << command << ": extension GL_ARB_fragment_shader = " << msg[has_arb_fragment_shader] << "\n"
              << command << ": extension GL_ARB_vertex_shader = " << msg[has_arb_vertex_shader] << "\n"
              << command << ": extension GL_ARB_shader_objects = " << msg[has_arb_shader_objects] << "\n"
              << command << ": extension GL_ARB_shading_language_100 = " << msg[has_arb_shading_language] << "\n"
+             << command << ": extension GL_ARB_texture_float = " << msg[has_arb_texture_float] << "\n"
              << command << ": graphics card lacks the necessary extensions for \"--gpu\";" << "\n"
              << command << ": \"--gpu\" flag is not going to work on this machine" << endl;
 #ifdef HAVE_APPLE_OPENGL_FRAMEWORK
Re: [hugin-ptx] Re: hugin Fast Panorama preview code for a lot faster "stitching"?
On December 10, 2010 10:06:43 am Aron H wrote:
> It seems to me that Enblend is the missing piece.

Enblend (properly compiled) has had GPU blending for longer than Hugin has had GPU remapping.

enblend --gpu

> Another possibility is the OpenMP build of enblend, which uses modern
> processor optimizations.

Again: it will depend on what CPU power is pitted against what GPU power. Also: OpenMP does not scale as well as GPU. An eight-core CPU won't be 8 times faster than a single-core CPU. I think somebody once posted test results, and most of what there is to be gained with OpenMP is within the 3-4 core range.

Yuv
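The flattening-out Yuv describes is what Amdahl's law predicts whenever part of the work stays serial. A minimal sketch follows; `amdahl_speedup` is a hypothetical helper, and the parallel fraction used in the note afterwards is an assumption chosen for illustration, not a measurement of enblend.

```cpp
#include <cassert>

// Amdahl's law: overall speedup on n cores when a fraction p of the
// single-core runtime parallelises perfectly and the rest stays serial.
double amdahl_speedup(double p, int n) {
    assert(p >= 0.0 && p <= 1.0 && n >= 1);
    return 1.0 / ((1.0 - p) + p / n);
}
```

If, say, 75% of the runtime parallelised, `amdahl_speedup(0.75, 4)` would be about 2.3 and `amdahl_speedup(0.75, 8)` about 2.9, so doubling from four to eight cores would buy comparatively little. The real fraction for a given enblend build has to come from measurement.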