A question has been asked about why I deleted the "#pragma push_macro"
and "#pragma pop_macro" that was there around the old lrotl.
The problem push/pop was trying to solve is quite simple. In
x86intrin.h, there is a line like this:
#define _lrotl(a,b) __rolq((a), (b))
Now, with that in mind, what happens in intrin.h if you have this
*after* that #define:
unsigned long __cdecl _lrotl(unsigned long _Val,int _Shift);
It ends up mangling it. So, what the existing code does is:
#pragma push_macro ("_lrotl") // Save the definition
#undef _lrotl // Temporarily delete it
unsigned long __cdecl _lrotl(unsigned long _Val,int _Shift);
#pragma pop_macro ("_lrotl") // Restore it
The effect of this is that intrin.h was able to do a prototype for
_lrotl even if x86intrin.h has done a #define, without permanently
deleting the #define.
Now, why don't we need it anymore?
There are 2 parts to this answer:
1) If you are including intrin.h, there is an inline (non-asm) version
of lrotl defined in intrin-impl.h. We don't need to push/pop a
definition since we're *removing* x86intrin's mappings to rolq and
making our own. Among other things, this means the definition is there
for non-x86 platforms. And it avoids gcc's bug (#61662). Also, our
code will work on that day in the future when longs become 128 bit...
2) If you are including stdlib.h, we only do the #undef's if we see that
the buggy version of x86intrin.h was included. Otherwise, we leave
x86intrin's definitions there in case intrin.h is never included.
Hope this helps explain my thinking here. If I have missed something,
please let me know.
dw
On 2/21/2015 12:03 PM, dw wrote:
On 2/19/2015 10:47 AM, Martell Malone wrote:
This has ended up in my spam folder as it did not pass some yahoo checks.
I assume others have the same issue which is why there was no reply.
Could you give me a commit msg to go with the signed off so we can
look at getting it merged
How about something along the lines of: _lrotl gave wrong answer on LLP64.
However, before you proceed, you should be aware that kai had some
comments about the patch. I haven't replied before now because while
I don't agree with his comments, providing convincing reasons why is hard.
He asked:
1) "why disallowing to save user's macro-definitions anymore?"
Presumably this refers to the #undefs I have in the patch. They are
there to remove the buggy definitions of lrotl and lrotr that are in
x86intrin.h (actually i86intrin.h) in gcc builds prior to 4.9.2. They
also resolve conflicts between the implementation defined in
intrin-impl.h and i86intrin.h.
As for "disallowing" saving user's macro-definitions, I'm not sure I
follow. If someone wants to override the definitions for lrotl (an
odd thing to do, but possible), they would do the same thing for lrotl
that they would do for any of the other functions implemented in
intrin-impl.h: Include the header, *then* define the replacement
macro. For example, trying to do #define __stosb __mystosb *before*
including intrin.h wouldn't have the desired effect (since it would
also rename the implementation). And if you do #define _lrotl _myrotl
*after* including intrin.h, everything works as expected. No sure
what I have "disallowed."
2) "I would like to see instead of use of 'long' the macro '__LONG32'"
This requires a bit more thought. For 32bit and 64bit native Windows
(LLP64
<http://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models>),
his suggestion and my current implementation return exactly the same
results. The only case where they return different results is
cigwin64 (LP64).
The key question here is: If longs are 64bits, what would you expect
mingw-w64's _lrotr(1,1) to return? 0x80000000 (kai's suggestion)? Or
0x8000000000000000 (my patch)? An argument could be made for either.
Does using one approach or the other make things easier if users
mistakenly send a 64bit long to LONG32 or vice versa? No, it doesn't,
so that doesn't add support to either his approach or mine.
Probably the most compelling reason to decide one way or the other is
that doing what kai suggests produces results that are inconsistent
with what gcc itself does in x86intrin.h. Following his approach (on
LP64 systems), if you #include <x86intrin.h>, _lrotl(1,1) would return
0x8000000000000000, while #include <intrin.h> would return 0x80000000.
Returning different results depending on which headers are included
seems bad. I believe being consistent with gcc (my current patch) so
that _lrotl always returns the same result is the better plan.
There may be circumstances I haven't considered that would change my
mind, but currently I see no reason to change this patch.
dw
On Wed, Feb 11, 2015 at 9:56 AM, dw <limegreenso...@yahoo.com
<mailto:limegreenso...@yahoo.com>> wrote:
I wasn't completely satisfied with my previous patch for lrotl
(which is why I didn't push to get it committed). The changes
for intrin.h and intrin-impl.h were fine, but I wasn't satisfied
with the changes to stdlib.h. Now that I've had a chance to
think about it again, I finally settled on this (attached).
It performs pretty much the same as what's in the current file
(to avoid potential breakage) while making at least some attempt
to fix the problem that can occur with old (broken) versions of
x86intrin.h.
My previous post attempted to try to do mappings, but that just
leads to conflicts with other headers and undefined symbols.
Comments? Suggestions? Or can we finally call this done?
dw
------------------------------------------------------------------------------
Dive into the World of Parallel Programming. The Go Parallel Website,
sponsored by Intel and developed in partnership with Slashdot
Media, is your
hub for all things parallel software development, from weekly thought
leadership blogs to news, videos, case studies, tutorials and
more. Take a
look and join the conversation now.
http://goparallel.sourceforge.net/
_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
<mailto:Mingw-w64-public@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
------------------------------------------------------------------------------
Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server
from Actuate! Instantly Supercharge Your Business Reports and Dashboards
with Interactivity, Sharing, Native Excel Exports, App Integration & more
Get technology previously reserved for billion-dollar corporations, FREE
http://pubads.g.doubleclick.net/gampad/clk?id=190641631&iu=/4140/ostg.clktrk
_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
------------------------------------------------------------------------------
Dive into the World of Parallel Programming The Go Parallel Website, sponsored
by Intel and developed in partnership with Slashdot Media, is your hub for all
things parallel software development, from weekly thought leadership blogs to
news, videos, case studies, tutorials and more. Take a look and join the
conversation now. http://goparallel.sourceforge.net/
_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public