Re: (SIMD) Optimized multi-byte chunk scanning

Cecil Ward via Digitalmars-d-learn Sat, 26 Aug 2017 09:41:56 -0700

On Friday, 25 August 2017 at 18:52:57 UTC, Nordlöw wrote:

On Friday, 25 August 2017 at 09:40:28 UTC, Igor wrote:
As for a nice reference of intel intrinsics: https://software.intel.com/sites/landingpage/IntrinsicsGuide/
Wow, what a fabulous UX!

The pcmpestri instruction is probably what you are looking for? There is a useful resource in the Intel optimisation guide. There is also an Intel article about speeding up XML parsing with this instruction, but if memory serves it's really messy, a right palaver. A lot depends on how much control, if any, you have over the input data; typically none, I quite understand.


Based on this article,
    https://graphics.stanford.edu/~seander/bithacks.html

I wrote a short D routine to help me learn the language, as I was thinking about a faster strlen using larger-sized gulps. The above article has a good test for whether or not a wide word contains a particular byte value somewhere in it. I wrote a

   bool hasZeroByte( in uint64_t x )
function based on that method.
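
The routine itself is not shown here, so the following is only a minimal sketch of what such a function might look like, assuming the "determine if a word has a zero byte" test from the bithacks page widened to 64 bits. The names hasByte and wordStrlen are hypothetical additions for illustration and are not from the original post. wordStrlen also assumes that reading a whole aligned 8-byte word which may extend past the terminator is acceptable on the target platform.

```d
import core.stdc.stdint : uint64_t;

/// True if any of the eight bytes of x is zero (bithacks "haszero" test).
bool hasZeroByte(in uint64_t x) pure nothrow @nogc @safe
{
    enum uint64_t lows  = 0x0101_0101_0101_0101UL;
    enum uint64_t highs = 0x8080_8080_8080_8080UL;
    // Subtracting 1 from a byte sets its top bit only if the byte was zero
    // or already >= 0x80; "& ~x" discards the second case.
    return ((x - lows) & ~x & highs) != 0;
}

/// Hypothetical helper: true if any byte of x equals b.
/// XOR with b repeated in every byte turns matching bytes into zero.
bool hasByte(in uint64_t x, in ubyte b) pure nothrow @nogc @safe
{
    enum uint64_t lows = 0x0101_0101_0101_0101UL;
    return hasZeroByte(x ^ (lows * b));
}

/// Hypothetical sketch of a strlen that scans in 8-byte gulps.
/// Assumes s points at a zero-terminated string.
size_t wordStrlen(const(char)* s) nothrow @nogc
{
    const(char)* p = s;
    // Step byte by byte until p is 8-byte aligned.
    while ((cast(size_t) p & 7) != 0)
    {
        if (*p == 0)
            return cast(size_t)(p - s);
        ++p;
    }
    // Scan whole aligned words until one contains a zero byte.
    auto w = cast(const(uint64_t)*) p;
    while (!hasZeroByte(*w))
        ++w;
    // Locate the exact terminator inside that word.
    p = cast(const(char)*) w;
    while (*p != 0)
        ++p;
    return cast(size_t)(p - s);
}
```

The aligned word reads never straddle a page boundary, which is why this style of scan is commonly used, but it does read bytes beyond the terminator, so treat it as a learning sketch rather than a drop-in replacement for the C library routine.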

I'm intending to write a friendlier D convenience routine to give access to inline pcmpestri code generation in GDC when I get round to it (one instruction, fully inlined and flexibly optimised at compile time, with no subroutine call wrapped around the instruction).

Agner Fog's libraries and articles are superb; take a look. He must have published code to deal with these C standard library byte string processing functions efficiently with wide aligned machine words, unless I'm getting very forgetful.


A bit of googling?
