-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Perhaps we're getting a bit closer now.

c) compiles fine (I actually tried that in between as well).
d) fails again.

The following, however, works:

__kernel void
blendop_mask_RAW (__read_only image2d_t in_a, __read_only image2d_t in_b,
                  __write_only image2d_t mask, const int width, const int height,
                  const float gopacity, const int blendif,
                  global const float *blendif_parameters)
{
  const int x = get_global_id(0);
  const int y = get_global_id(1);

  if(x >= width || y >= height) return;

  float4 a = read_imagef(in_a, sampleri, (int2)(x, y));
  float4 b = read_imagef(in_b, sampleri, (int2)(x, y));

  float bif = blendif_factor_Lab(a, b, blendif, blendif_parameters);
  float opacity = gopacity * bif;
  opacity /= bif;

  write_imagef(mask, (int2)(x, y), opacity);
}

...and should at least give the right result as well. Do you know how
smart the compiler is? Is this optimised away in the binary?
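
A rough way to reason about that question on the host side, as a sketch only: under strict IEEE 754 rules a compiler is not supposed to fold (gopacity * bif) / bif back into gopacity, because the two differ when bif is 0 (the result is NaN) and can differ by rounding. With relaxed-math build options such as -cl-fast-relaxed-math or -cl-unsafe-math-optimizations the OpenCL compiler may simplify it; whether darktable passes such options is not stated in this thread. The helper names below are hypothetical and not part of darktable, and this is plain C on the CPU, so it only illustrates the arithmetic (device-side division is allowed some rounding error by the OpenCL spec).

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Mirrors the workaround kernel's arithmetic: scale by bif, then divide it
 * out again. The volatile keeps the host compiler from folding the pair. */
static float roundtrip_opacity(float gopacity, float bif)
{
    volatile float opacity = gopacity * bif;
    opacity /= bif;
    return opacity;
}

/* True when the round trip reproduces gopacity exactly. */
static bool roundtrip_is_exact(float gopacity, float bif)
{
    return roundtrip_opacity(gopacity, bif) == gopacity;
}

/* A few cases that can be checked by hand. */
static void roundtrip_selfcheck(void)
{
    assert(roundtrip_is_exact(0.5f, 1.0f));        /* 0.5 * 1 / 1       */
    assert(roundtrip_is_exact(0.5f, 0.25f));       /* 0.125 / 0.25      */
    assert(isnan(roundtrip_opacity(0.5f, 0.0f)));  /* 0 / 0 gives NaN   */
}
```

So the workaround matches plain gopacity only as long as blendif_factor_Lab() never returns 0 for a pixel; if it can, the mask would receive NaN there.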

Jens

On 10/30/2012 08:11 PM, Ulrich Pegelow wrote:
> Let's see what the following does:
> 
> c) 1:1 copy of blendop_mask_Lab
> 
> __kernel void
> blendop_mask_RAW (__read_only image2d_t in_a, __read_only image2d_t in_b,
>                   __write_only image2d_t mask, const int width, const int height,
>                   const float gopacity, const int blendif,
>                   global const float *blendif_parameters)
> {
>   const int x = get_global_id(0);
>   const int y = get_global_id(1);
>
>   if(x >= width || y >= height) return;
>
>   float4 a = read_imagef(in_a, sampleri, (int2)(x, y));
>   float4 b = read_imagef(in_b, sampleri, (int2)(x, y));
>
>   float opacity = gopacity * blendif_factor_Lab(a, b, blendif, blendif_parameters);
>
>   write_imagef(mask, (int2)(x, y), opacity);
> }
> 
> 
> 
> d) copy of blendop_mask_Lab, but adapted so that it gives the right
> result for RAW
> 
> __kernel void
> blendop_mask_RAW (__read_only image2d_t in_a, __read_only image2d_t in_b,
>                   __write_only image2d_t mask, const int width, const int height,
>                   const float gopacity, const int blendif,
>                   global const float *blendif_parameters)
> {
>   const int x = get_global_id(0);
>   const int y = get_global_id(1);
>
>   if(x >= width || y >= height) return;
>
>   float4 a = read_imagef(in_a, sampleri, (int2)(x, y));
>   float4 b = read_imagef(in_b, sampleri, (int2)(x, y));
>
>   float opacity = gopacity;
>
>   write_imagef(mask, (int2)(x, y), opacity);
> }
> 
> On 30.10.2012 18:31, Jens Fendler wrote:
>> Hi Ulrich,
>> 
>> thanks, but both options fail with code -30 just as before.
>> 
>> Jens
>> 
>> On 10/30/2012 07:20 PM, Ulrich Pegelow wrote:
>>> Hi Jens,
>> 
>>> strange, that kernel is probably the least thrilling in there.
>>> The only thing I could imagine is some problem of your
>>> compiler. Here are a few ad-hoc ideas:
>> 
>>> a)
>> 
>>> __kernel void
>>> blendop_mask_RAW (__read_only image2d_t in_a, __read_only image2d_t in_b,
>>>                   __write_only image2d_t mask, const int width, const int height,
>>>                   const float gopacity, const int blendif,
>>>                   global const float *blendif_parameters)
>>> {
>>>   const int x = get_global_id(0);
>>>   const int y = get_global_id(1);
>>>
>>>   if(x >= width || y >= height) return;
>>>
>>>   float opacity = gopacity;
>>>
>>>   write_imagef(mask, (int2)(x, y), opacity);
>>> }
>> 
>> 
>>> b)
>> 
>>> __kernel void
>>> blendop_mask_RAW (__read_only image2d_t in_a, __read_only image2d_t in_b,
>>>                   __write_only image2d_t mask, const int width, const int height,
>>>                   const float gopacity, const int blendif,
>>>                   global const float *blendif_parameters)
>>> {
>>>   const int x = get_global_id(0);
>>>   const int y = get_global_id(1);
>>>
>>>   if(x >= width || y >= height) return;
>>>
>>>   write_imagef(mask, (int2)(x, y), (float4)gopacity);
>>> }
>> 
>> 
>>> Ulrich
>> 
>>> On 30.10.2012 16:37, Jens Fendler wrote:
>>>> Hi all,
>>>> 
>>>> thanks a lot for your support so far, but it seems nothing
>>>> has really worked. I always get stuck at the same point:
>>>> other GL apps run fine through optirun, but darktable's
>>>> OpenCL support fails when it comes to compiling
>>>> blendop.cl.
>>>> 
>>>> However, I could make some progress (still testing) by
>>>> removing individual kernels from blendop.cl. The problem
>>>> seems to be with one kernel only: __kernel void
>>>> blendop_mask_RAW( ... ). In particular, the problem is with
>>>> the last statement: write_imagef(mask, (int2)(x, y),
>>>> gopacity);
>>>> 
>>>> If that last line is commented out, dt compiles all OpenCL
>>>> code flawlessly and seems to work just fine.
>>>> 
>>>> The only significant difference I see between those
>>>> blendop_mask_* kernels is that the RAW version doesn't call
>>>> read_imagef() first. Well, I actually have no clue when it
>>>> comes to OpenCL; it's just what I have noticed so far.
>>>> 
>>>> Can someone please suggest a way forward from here?
>>>> 
>>>> Thanks a lot, Jens
>>>> 
>>>> PS: Johannes and Kevin: you both mentioned running dt on
>>>> optimus cards as well. Does one of you happen to also have
>>>> the same card (i.e. GT640M)?
>>>> 
>> 
>>> ------------------------------------------------------------------------------
>>> Everyone hates slow websites. So do we.
>>> Make your web apps faster with AppDynamics
>>> Download AppDynamics Lite for free today:
>>> http://p.sf.net/sfu/appdyn_sfd2d_oct
>>> _______________________________________________
>>> darktable-devel mailing list
>>> darktable-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/darktable-devel
>> 
> 

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/

iEYEARECAAYFAlCQHGIACgkQbyTZFOIr/d1gBQCg4B7bdXZifm4LRNatX++9Amsu
lVwAoL4nZT7kJPyUgm77YXh87gtviayQ
=hHWV
-----END PGP SIGNATURE-----

