-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Perhaps we're getting a bit closer now.
c) compiles fine (I actually tried that in between as well).
d) fails again.

The following, however, works:

__kernel void
blendop_mask_RAW (__read_only image2d_t in_a, __read_only image2d_t in_b,
                  __write_only image2d_t mask, const int width, const int height,
                  const float gopacity, const int blendif,
                  global const float *blendif_parameters)
{
  const int x = get_global_id(0);
  const int y = get_global_id(1);

  if(x >= width || y >= height) return;

  float4 a = read_imagef(in_a, sampleri, (int2)(x, y));
  float4 b = read_imagef(in_b, sampleri, (int2)(x, y));

  float bif = blendif_factor_Lab(a, b, blendif, blendif_parameters);
  float opacity = gopacity * bif;
  opacity /= bif;

  write_imagef(mask, (int2)(x, y), opacity);
}

...and should at least give the right result as well. Do you know how
smart the compiler is? Is this optimised away in the binary?

Jens

On 10/30/2012 08:11 PM, Ulrich Pegelow wrote:
> Let's see what the following does:
>
> c) 1:1 copy of blendop_mask_Lab
>
> __kernel void
> blendop_mask_RAW (__read_only image2d_t in_a, __read_only image2d_t in_b,
>                   __write_only image2d_t mask, const int width, const int height,
>                   const float gopacity, const int blendif,
>                   global const float *blendif_parameters)
> {
>   const int x = get_global_id(0);
>   const int y = get_global_id(1);
>
>   if(x >= width || y >= height) return;
>
>   float4 a = read_imagef(in_a, sampleri, (int2)(x, y));
>   float4 b = read_imagef(in_b, sampleri, (int2)(x, y));
>
>   float opacity = gopacity * blendif_factor_Lab(a, b, blendif, blendif_parameters);
>
>   write_imagef(mask, (int2)(x, y), opacity);
> }
>
>
> d) copy of blendop_mask_Lab, but adapted so that it gives the right
> result for RAW
>
> __kernel void
> blendop_mask_RAW (__read_only image2d_t in_a, __read_only image2d_t in_b,
>                   __write_only image2d_t mask, const int width, const int height,
>                   const float gopacity, const int blendif,
>                   global const float *blendif_parameters)
> {
>   const int x = get_global_id(0);
>   const int y = get_global_id(1);
>
>   if(x >= width || y >= height) return;
>
>   float4 a = read_imagef(in_a, sampleri, (int2)(x, y));
>   float4 b = read_imagef(in_b, sampleri, (int2)(x, y));
>
>   float opacity = gopacity;
>
>   write_imagef(mask, (int2)(x, y), opacity);
> }
>
> On 30.10.2012 18:31, Jens Fendler wrote:
>> Hi Ulrich,
>>
>> thanks, but both options fail with code -30 just as before.
>>
>> Jens
>>
>> On 10/30/2012 07:20 PM, Ulrich Pegelow wrote:
>>> Hi Jens,
>>>
>>> strange, that kernel is probably the least thrilling in there.
>>> The only thing I could imagine is some problem with your
>>> compiler. Here are a few ad-hoc ideas:
>>>
>>> a)
>>>
>>> __kernel void
>>> blendop_mask_RAW (__read_only image2d_t in_a, __read_only image2d_t in_b,
>>>                   __write_only image2d_t mask, const int width, const int height,
>>>                   const float gopacity, const int blendif,
>>>                   global const float *blendif_parameters)
>>> {
>>>   const int x = get_global_id(0);
>>>   const int y = get_global_id(1);
>>>
>>>   if(x >= width || y >= height) return;
>>>
>>>   float opacity = gopacity;
>>>
>>>   write_imagef(mask, (int2)(x, y), opacity);
>>> }
>>>
>>>
>>> b)
>>>
>>> __kernel void
>>> blendop_mask_RAW (__read_only image2d_t in_a, __read_only image2d_t in_b,
>>>                   __write_only image2d_t mask, const int width, const int height,
>>>                   const float gopacity, const int blendif,
>>>                   global const float *blendif_parameters)
>>> {
>>>   const int x = get_global_id(0);
>>>   const int y = get_global_id(1);
>>>
>>>   if(x >= width || y >= height) return;
>>>
>>>   write_imagef(mask, (int2)(x, y), (float4)gopacity);
>>> }
>>>
>>>
>>> Ulrich
>>>
>>> On 30.10.2012 16:37, Jens Fendler wrote:
>>>> Hi all,
>>>>
>>>> thanks a lot for your support so far, but it seems nothing
>>>> has really worked yet. I always get stuck at the same point,
>>>> which is: other GL apps run fine through optirun, but
>>>> darktable's OpenCL support fails when it comes to compiling
>>>> blendop.cl.
>>>>
>>>> However, I could make some progress (still testing) by
>>>> removing individual kernels from blendop.cl.
>>>> The problem seems to be with one kernel only: __kernel void
>>>> blendop_mask_RAW( ... ). In particular, the problem is with
>>>> the last statement:
>>>>
>>>>   write_imagef(mask, (int2)(x, y), gopacity);
>>>>
>>>> If that last line is commented out, dt compiles all OpenCL
>>>> code flawlessly and seems to work just fine.
>>>>
>>>> The only significant difference I see between those
>>>> blendop_mask_* kernels is that the RAW version doesn't call
>>>> read_imagef() first. Well, I actually have no clue when it
>>>> comes to OpenCL; it's just what I noticed for now.
>>>>
>>>> Can someone please suggest a way forward from here?
>>>>
>>>> Thanks a lot,
>>>> Jens
>>>>
>>>> PS: Johannes and Kevin: you both mentioned running dt on
>>>> Optimus cards as well. Does one of you happen to also have
>>>> the same card (i.e. GT640M)?
>>>>
>>> ------------------------------------------------------------------------------
>>> Everyone hates slow websites. So do we.
>>> Make your web apps faster with AppDynamics
>>> Download AppDynamics Lite for free today:
>>> http://p.sf.net/sfu/appdyn_sfd2d_oct
>>> _______________________________________________
>>> darktable-devel mailing list
>>> darktable-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/darktable-devel
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/

iEYEARECAAYFAlCQHGIACgkQbyTZFOIr/d1gBQCg4B7bdXZifm4LRNatX++9Amsu
lVwAoL4nZT7kJPyUgm77YXh87gtviayQ
=hHWV
-----END PGP SIGNATURE-----
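
A note on the question "Is this optimised away in the binary?": under strict IEEE 754 rules a compiler is not free to fold (gopacity * bif) / bif into plain gopacity, because the two expressions differ when bif is 0 (0/0 yields NaN). OpenCL does define build options such as -cl-fast-relaxed-math that relax these rules; whether darktable's build or this particular driver uses them is not established in this thread. The same corner case also means the workaround only gives "the right result" as long as blendif_factor_Lab() never returns 0, which is likewise an assumption here. Below is a minimal sketch in plain host-side C (not OpenCL and not darktable code; workaround_opacity is a hypothetical name used only for illustration) that reproduces the arithmetic of the working kernel:

```c
#include <math.h>

/* Hypothetical stand-in for the arithmetic in the working kernel:
 * multiply by the blendif factor, then divide by it again.
 * This is an illustration only, not code taken from darktable. */
static float workaround_opacity(float gopacity, float bif)
{
  float opacity = gopacity * bif;
  opacity /= bif;
  return opacity;
}

/* Returns 1 if the multiply/divide pair hands back the input unchanged,
 * 0 otherwise (including the NaN case, since NaN != anything). */
static int roundtrip_is_identity(float gopacity, float bif)
{
  return workaround_opacity(gopacity, bif) == gopacity;
}
```

With a factor of 1.0f the input comes back unchanged; with a factor of 0.0f the result is NaN under default (non-fast-math) compilation, so a mask written from it would not match what kernel d) was meant to produce.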