Ralf
I will compare performance. I'd be happy to use more efficient algorithms
if they are available!
I also have to think about a licence for vecmathlib. I'd prefer something
that could ultimately be used with all of pocl, LLVM, GCC, ...
-erik
On Tue, Feb 5, 2013 at 9:34 AM, Ralf Karrenberg <[email protected]> wrote:
> Hi Erik,
>
> have you done any measurements, e.g. how does your implementation compare
> against the code of Julien Pommier (google "SSE math fun")?
> This is what I am currently using, but unfortunately the list of
> implemented functions is a lot shorter even than what Pekka posted...
>
> Best,
> Ralf
>
>
> On 2/5/13 2:55 PM, Erik Schnetter wrote:
>
>> Ralf
>>
>> Much of vecmathlib comes from another project where I needed this
>> functionality. In particular, I am using finite differences on
>> multi-dimensional arrays that can benefit greatly from vectorisation.
>>
>> I have now extracted intrinsics from there and added them to vecmathlib;
>> they load and store numbers from/to memory, i.e. arrays. These functions
>> are mostly equivalent to vload* and vstore* in OpenCL. This provides two
>> important capabilities:
>>
>> (1) The load/store functions accept a mask parameter, allowing the
>> vectorisation of loops whose iteration count is not an exact multiple of
>> the vector length.
>> (2) The load/store functions distinguish between aligned and unaligned
>> memory accesses, where aligned accesses are faster. This may require
>> adjusting the lower loop bound to start on an aligned memory location.
>>
>> The number of loop iterations is in general not a multiple of the vector
>> size. Using scalar loop iterations for the left-over iterations is not a
>> good alternative, since this is much slower and increases the code size.
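
A minimal sketch of the masked-loop idea described above. It does not use
vecmathlib's actual API: the 4-lane vector type is emulated with plain
arrays, and the names Vec, Mask, make_mask, load_masked, store_masked and
scale are all hypothetical, chosen only for this illustration. Alignment
handling (point 2) is not modelled here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 4-lane vector type, emulated with plain arrays. A real SIMD
// library would map these operations onto hardware instructions.
constexpr std::size_t VL = 4;
using Vec = std::array<double, VL>;
using Mask = std::array<bool, VL>;

// Build a mask that enables lane i exactly when element (base + i) < n.
Mask make_mask(std::size_t base, std::size_t n) {
  Mask m{};
  for (std::size_t i = 0; i < VL; ++i) m[i] = base + i < n;
  return m;
}

// Masked load: disabled lanes are never read from memory and yield 0.0.
Vec load_masked(const double* p, const Mask& m) {
  Vec v{};
  for (std::size_t i = 0; i < VL; ++i)
    if (m[i]) v[i] = p[i];
  return v;
}

// Masked store: disabled lanes leave memory untouched.
void store_masked(double* p, const Vec& v, const Mask& m) {
  for (std::size_t i = 0; i < VL; ++i)
    if (m[i]) p[i] = v[i];
}

// Compute y[i] = a * x[i] for i in [0, n). The final, partial vector goes
// through the same loop body via the mask, so no scalar clean-up loop is
// needed.
void scale(double a, const double* x, double* y, std::size_t n) {
  for (std::size_t base = 0; base < n; base += VL) {
    const Mask m = make_mask(base, n);
    Vec v = load_masked(x + base, m);
    for (double& e : v) e *= a;
    store_masked(y + base, v, m);
  }
}
```

With n = 10 and a vector length of 4, the third iteration runs with only two
lanes enabled, so elements beyond index 9 are neither read nor written.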
>>
>> -erik
>>
>>
>>
>> On Tue, Feb 5, 2013 at 6:55 AM, Ralf Karrenberg <[email protected]> wrote:
>>
>> Hi,
>>
>> I haven't had a look at the code, but from what you are writing,
>> this sounds like exactly what I would need to integrate into libWFV.
>> The vectorizer has an API to specify mappings of functions to SIMD
>> equivalents, which is all that you need if all the implementations
>> are there already.
>> So, WFV should be able to work with your library within a few hours
>> of integration work. I'll look into that later.
>>
>> By the way, I recall a discussion on integrating such a library
>> (possibly as a .bc file) into LLVM. You may want to have a look at
>> the thread and respond:
>> http://llvm.1065342.n5.nabble.com/SIMD-trigonometry-logarithms-tt54215.html#none
>>
>> Cheers,
>> Ralf
>>
>>
>> On 2/3/13 7:02 PM, Erik Schnetter wrote:
>>
>> On Sun, Feb 3, 2013 at 12:25 PM, Pekka Jääskeläinen
>> <[email protected]> wrote:
>>
>> On 02/03/2013 03:56 PM, Erik Schnetter wrote:
>> > In my mind, the vectorizer would never look into sqrt() or any other
>> > functions defined in the language standard, but would simply expect
>> > efficient vector implementations of these. Instead of looking into the
>> > language standard we could also add a respective attribute to the
>> > function definitions. This attribute would then confirm that e.g.
>> > double2 sqrt(double2) is equivalent to double sqrt(double).
>> > __attribute__((__vector_equivalence__)) could be a name.
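
To spell out what "equivalent" means in the paragraph above, here is a small
sketch, assuming the equivalence is lane-wise: every lane of the vector
result must equal the scalar function applied to the same lane of the input.
The double2 type is a stand-in defined just for this example, and vec_sqrt
and lanewise_equivalent are hypothetical names, not part of any existing
library.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Stand-in for OpenCL's double2. The name follows the thread, the
// definition is an assumption made for this sketch.
using double2 = std::array<double, 2>;

// Hypothetical vector sqrt. Here it is simply applied lane by lane; an
// optimized implementation would use SIMD instructions instead.
double2 vec_sqrt(double2 x) { return {std::sqrt(x[0]), std::sqrt(x[1])}; }

// Returns true if vec_fn agrees with scalar_fn in every lane of x. This is
// the property the proposed attribute would assert to the vectorizer.
template <typename VecFn, typename ScalarFn>
bool lanewise_equivalent(VecFn vec_fn, ScalarFn scalar_fn, double2 x) {
  const double2 r = vec_fn(x);
  for (std::size_t i = 0; i < x.size(); ++i)
    if (r[i] != scalar_fn(x[i])) return false;
  return true;
}
```

A vectorizer relying on such an attribute would not run this check itself;
it would trust the annotation and replace two scalar calls with one vector
call.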
>>
>> OK. The "known" functions should not be inlined but the vectorizer
>> should recognize them (if we do not go towards the intrinsics
>> approach). In the end, the autovectorized work group function and an
>> explicitly vectorized kernel will call the same vector-optimized
>> function in this scheme.
>>
>> For starters we might just use a "white list" for the known
>> vectorizable functions, and assume a trivial scalar to vector mapping
>> for the arguments and the return value. Or use intrinsics for the
>> known ones.
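
The white-list idea could be sketched roughly as follows. This is only an
illustration under stated assumptions: signatures are modelled as strings,
the "trivial" mapping appends the vector width to each scalar type name in
OpenCL style (double becomes double2), and Signature, kVectorizable, widen
and map_to_vector are hypothetical names. The list contents are a subset of
the function names mentioned in this thread.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// A function signature modelled as plain strings: name, return type and
// argument types.
struct Signature {
  std::string name;
  std::string ret;
  std::vector<std::string> args;
};

// Hypothetical white list of functions known to have vector versions.
const std::set<std::string> kVectorizable = {
    "sqrt", "sin", "cos", "exp", "log", "pow", "fma"};

// Trivial type mapping: append the vector width to the scalar type name.
std::string widen(const std::string& type, unsigned width) {
  return type + std::to_string(width);
}

// If sig.name is on the white list, store the widened signature in out and
// return true. Otherwise return false and leave out unchanged.
bool map_to_vector(const Signature& sig, unsigned width, Signature& out) {
  if (kVectorizable.count(sig.name) == 0) return false;
  Signature result;
  result.name = sig.name;
  result.ret = widen(sig.ret, width);
  for (const std::string& a : sig.args) result.args.push_back(widen(a, width));
  out = result;
  return true;
}
```

Functions that are not on the list are rejected, so the vectorizer would
have to fall back to some other strategy for them.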
>>
>> Looking at the code of LLVM's LoopVectorize, it seems to be able to
>> vectorize some intrinsics already:
>>
>> case Intrinsic::sqrt:
>> case Intrinsic::sin:
>> case Intrinsic::cos:
>> case Intrinsic::exp:
>> case Intrinsic::exp2:
>> case Intrinsic::log:
>> case Intrinsic::log10:
>> case Intrinsic::log2:
>> case Intrinsic::fabs:
>> case Intrinsic::floor:
>> case Intrinsic::ceil:
>> case Intrinsic::trunc:
>> case Intrinsic::rint:
>> case Intrinsic::nearbyint:
>> case Intrinsic::pow:
>> case Intrinsic::fma:
>> case Intrinsic::fmuladd:
>>
>> Are there some important ones missing? If not, then we could think of
>> going the intrinsics route for these calls. I.e., call the intrinsics
>> from the kernel lib and expand them to calls to your functions+inline
>> after autovectorization.
>>
>>
>> "Important" probably depends on how frequently they are used in
>> real-world code, or in benchmarks. The actual list of intrinsics (as
>> listed e.g. in the OpenCL or C standard) is probably three or four
>> times as long. I would also add the various convert* and as* (i.e.
>> cast) functions to the list.
>>
>> I could create a longer list if that would be helpful.
>>
>> These functions should still be inlined, but only after vectorization.
>>
>> -erik
>>
>> --
>> Erik Schnetter <[email protected]>
>> http://www.perimeterinstitute.ca/personal/eschnetter/
>> AIM: eschnett247, Skype: eschnett, Google Talk: [email protected]
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> pocl-devel mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/pocl-devel
>>
>>
>>
>>
>>
>>
>>
--
Erik Schnetter <[email protected]>
http://www.perimeterinstitute.ca/personal/eschnetter/
AIM: eschnett247, Skype: eschnett, Google Talk: [email protected]