On Wed, Jan 31, 2018 at 10:22 AM, Glen <g...@ugcs.caltech.edu> wrote:
> >> I think I at least somewhat understand what he means.
>
> It occurred to me some time ago, that there is no C library
> routine to do what MVCIN does. Not that it is hard to do,
> but there just isn't one. I doubt many compilers compile
> the appropriate for loop into MVCIN.
>

There are some interesting "hardware built-in functions" in z/OS XL C/C++
that cover many interesting instructions, but MVCIN didn't make the cut.
(BTW: MVCIN is an instruction that I will *never* forget, and not because
I've actually used it before. :-)

It's possible to roll your own using the "__asm" inlining support, which
now works in C, Metal C, and C++. The syntax for using this is as wacky as
HLASM macros, but I think that this is right for MVCIN:

    /* Note: MVCIN addresses its second operand by the RIGHTMOST byte,
       so src must point at the last byte of the source field. */
    static void HW_MVCIN(unsigned char* dest, unsigned char* src, unsigned char len) {
      __asm (
        " BRU *+10\n"                /* skip the 6-byte MVCIN below       */
        " MVCIN 0(*-*,%0),0(%1)\n"   /* EXRL target; length filled in     */
        " BCTR %2,0\n"               /* len - 1 = machine length code     */
        " EXRL %2,*-8\n"             /* execute the MVCIN with that length */
        : : "a"(dest), "a"(src), "r"(len));
    }

The compiler will inline this in the calling function if it can and decides
that it should.

Kirk Wolf
Dovetailed Technologies
http://dovetail.com

PS> Of course, you would start with the following and forget it until you
found that you actually need to tune it in the context of what you are
doing:

    void memcpy_reversed(char *dest, char *src, size_t len) {
      src = src + len - 1;   /* last character */
      while (len--) *dest++ = *src--;
    }

If you compile this with full optimization and look at the generated
assembly, you will find a fancy unrolled loop. Who knows if MVCIN is any
faster? It certainly won't handle len>256 cases.
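
On the len>256 point, here is a minimal sketch of how a caller might work around the limit by walking the source backwards in pieces of at most 256 bytes. The names reverse_copy_piece and memcpy_reversed_chunked are made up for illustration, and the inner loop is only a portable stand-in for the spot where a single MVCIN (for example via HW_MVCIN) would go. It assumes the buffers do not overlap, and whether it beats the plain loop is untested.

```c
#include <assert.h>
#include <stddef.h>

/* Portable stand-in for one MVCIN-sized move (hypothetical helper).
   src_last points at the LAST byte of the source piece; len is 1..256. */
static void reverse_copy_piece(unsigned char *dest,
                               const unsigned char *src_last,
                               size_t len)
{
    assert(len >= 1 && len <= 256);
    for (size_t i = 0; i < len; i++)
        dest[i] = *(src_last - i);
}

/* Hypothetical wrapper: reverse-copy any length by taking pieces of up
   to 256 bytes from the end of the source towards its start. */
void memcpy_reversed_chunked(unsigned char *dest,
                             const unsigned char *src,
                             size_t len)
{
    const unsigned char *end = src + len;   /* one past the last source byte */

    while (len > 0) {
        size_t n = len > 256 ? 256 : len;   /* size of this piece */
        reverse_copy_piece(dest, end - 1, n);
        dest += n;
        end  -= n;
        len  -= n;
    }
}
```

Calling memcpy_reversed_chunked(dst, src, 600) would issue three pieces (256, 256 and 88 bytes), and a len of 0 does nothing.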