On Wed, Jan 31, 2018 at 10:22 AM, Glen <g...@ugcs.caltech.edu> wrote:

>
>> I think I at least somewhat understand what he means.
> It occurred to me some time ago, that there is no C library
> routine to do what MVCIN does.  Not that it is hard to do,
> but there just isn't one.  I doubt many compilers compile
> the appropriate for loop into MVCIN.
>
There are some interesting "hardware built-in functions" in z/OS XL C/C++
that cover many interesting instructions, but MVCIN didn't make the cut.
(BTW: MVCIN is an instruction that I will *never* forget, and not because
I've actually used it before. :-)

It's possible to roll your own using the "__asm" inlining support, which now
works in C, Metal C, and C++.
The syntax for using this is as wacky as HLASM macros, but I think that
this is right for MVCIN:

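/* EXRL executes the MVCIN with its length code taken from len-1.
   Note: MVCIN's second-operand address designates the rightmost byte
   of the source field, so src must point at the last source byte.
   len must be 1..255 here (an unsigned char cannot hold 256). */
static void
HW_MVCIN(unsigned char* dest, unsigned char* src, unsigned char len) {
__asm  (
"  BRU   *+10\n"
"  MVCIN 0(*-*,%0),0(%1)\n"
"  BCTR %2,0\n"
"  EXRL %2,*-8\n"
: :
"a"(dest),
"a"(src),
"r"(len));
}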

The compiler will inline this in the calling function if it can and decides
that it should.


Kirk Wolf
Dovetailed Technologies
http://dovetail.com

PS> Of course, you would start with the following and forget it until you
found that you actually need to tune it in the context of what you are
doing:

#include <stddef.h>  /* size_t */

void memcpy_reversed(char *dest, const char *src, size_t len)
{
   src = src + len - 1; /* last character */
   while (len--)
       *dest++ = *src--;
}

If you compile this with full optimization and look at the generated
assembly, you will find a fancy unrolled loop.
Who knows if MVCIN is any faster?   A single MVCIN moves at most 256 bytes,
so on its own it certainly won't handle len>256 cases.
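
For what it's worth, the len>256 case could be covered by calling a
256-byte-at-a-time primitive in a loop. Here is a rough, portable sketch of
that idea. The names reverse_chunk, memcpy_reversed_chunked and MVCIN_MAX are
made up for illustration; reverse_chunk is only a plain C stand-in for one
MVCIN (it takes a pointer to the LAST source byte, the way MVCIN addresses its
second operand), not the instruction itself.

```c
#include <assert.h>
#include <stddef.h>

#define MVCIN_MAX 256  /* most bytes one MVCIN can move */

/* Hypothetical stand-in for a single MVCIN: copies n bytes (1..256)
   into dest in reverse order. src_last points at the last source byte. */
static void reverse_chunk(unsigned char *dest,
                          const unsigned char *src_last, size_t n)
{
    assert(n >= 1 && n <= MVCIN_MAX);
    for (size_t i = 0; i < n; i++)
        dest[i] = *(src_last - i);
}

/* Reverse-copy len bytes from src to dest, at most MVCIN_MAX per step.
   The buffers must not overlap. */
void memcpy_reversed_chunked(unsigned char *dest,
                             const unsigned char *src, size_t len)
{
    while (len > 0) {
        size_t n = len < MVCIN_MAX ? len : MVCIN_MAX;
        /* The last n unread source bytes become the next n dest bytes. */
        reverse_chunk(dest, src + len - 1, n);
        dest += n;
        len -= n;
    }
}
```

Whether a loop like this around a real MVCIN would beat the compiler's
unrolled loop is something you would have to measure.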
