This is a really delayed reply; dw asked me to join the conversation.

Hey Jacek, thanks for your thoughts on this. However, they don't seem to have brought us to a conclusion. I've been trying to avoid "nagging" (something I am prone to do), especially since I've seen how busy Kai has been with his own work. But it has now been nearly a month, v3 is upon us, and we've had no movement here.

When I first sent this patch, I was in a hurry since I thought v3 was just a day or two away. As a result, I included several things in this single patch that should probably have been done as separate patches. Some of those things are bug fixes for functions that currently return wrong answers. Looking back at the original mail, here's everything this patch includes:

=============================================================
1) __movsb, __movsd, __movsq, __movsw: Moved to intrin-impl.
2) __rdtsc: Change to use builtin, moved to intrin-impl, resolved conflict with ia32intrin.
3) _umul128 & _mul128: Moved to intrin-impl.
4) __shiftright128 & __shiftleft128: Re-written as asm, moved to intrin-impl.h.
5) _lrotr, _lrotl: Fix bug caused by ia32intrin.h when longs are 4 bytes long.
6) RtlSecureZeroMemory: According to msdn, this is not an intrinsic and should only be defined in winnt.h. *File deleted from intrincs.*
7) UnsignedMultiplyExtract128 & MultiplyExtract128: According to msdn, these are not intrinsics. Also, MultiplyExtract128 doesn't work right. *Files deleted from intrincs* and code fixed in winnt.h.
8) _InterlockedAdd & _InterlockedAdd64: According to msdn, these intrinsics are only available for itanium. I'm not sure the inline asm we have will run properly there, and there are no #if's around it to limit it to that platform. Note that winnt.h has inlines for x86/x64 for these. *Files deleted from intrincs*.
=============================================================

1-4: These 9 functions were ONLY available in the .a file. This patch includes them as inline intrinsics, a performance win.

5 & 7: These fix actual bugs that result in the functions returning incorrect answers under certain conditions.
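
A minimal sketch of the kind of wrong answer item 5 refers to, in portable C. It assumes the failure amounts to a 64-bit rotate being applied to a value that is only 32 bits wide, which is what can happen when long is 4 bytes. The helper names rotl32 and rotl64 are hypothetical stand-ins for illustration, not the actual intrinsics or the code in the patch.

```c
#include <stdint.h>

/* Hypothetical helper: rotate a 32-bit value left by n bits.
   This is the behaviour expected of a "long" rotate when long is 4 bytes. */
static inline uint32_t rotl32(uint32_t v, unsigned n)
{
    n &= 31;                                   /* keep the count in range */
    return (v << n) | (v >> ((32 - n) & 31));  /* bits leaving the top re-enter at the bottom */
}

/* Hypothetical helper: rotate a 64-bit value left by n bits.
   Applied to a 32-bit value, bits leaving bit 31 land in bit 32 and up
   instead of wrapping, so truncating the result loses them. */
static inline uint64_t rotl64(uint64_t v, unsigned n)
{
    n &= 63;
    return (v << n) | (v >> ((64 - n) & 63));
}
```

With the input 0x80000001 and a count of 1, rotl32 gives 0x00000003, while truncating the result of rotl64 to 32 bits gives 0x00000002: the top bit never wraps around.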

I don't believe there is any controversy regarding these points. Even if we can't come to an agreement on the rest, I believe these should be included in v3. I'm prepared to produce a patch for just these upon request and we can continue to debate the rest.

As we have discussed before, the problem here seems to be the deletions from 6, 7 & 8. I've tried to understand the requirement here, and I'm sure Kai is as frustrated with trying to explain it to me as I am about trying to get it explained.

For example, Kai says:

> The need to add it to libmingwex is that this function isn't present
> on all supported Windoof OSes, so we need to handle that.

This is confusing since these functions aren't supported by the Windows OS. They aren't exported from any DLL that I'm aware of. The only place they exist in the MS world is as inlines in winnt.h, which is the same thing I'm trying to do.

Other than this, Kai seems to be saying that there is a requirement that all of these functions also be available in non-inline form. However, I'm not clear on why these specific functions have this requirement. Not only is this inconsistent with MS's definition, there are a number of other functions in the platform headers that are marked as FORCEINLINE, so why are these functions different? Deleting these files and using FORCEINLINE still seems to me to be the most logical course here.

However, if that's not acceptable, perhaps there is an alternative. If the requirement I'm violating here is simply that these specific functions must be able to support not being inlined, then I believe simply changing them from "FORCEINLINE" to "inline" would satisfy this requirement. It seems like having them in the library is still redundant. Would this change make the deletions acceptable?
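
A rough sketch of the distinction between the two spellings, under several assumptions: SKETCH_FORCEINLINE is a hypothetical macro showing one way GCC can express a forced inline (the real FORCEINLINE in the headers may be defined differently), the function names are made up, and the body only loosely follows the msdn description of an unsigned multiply-and-extract. It relies on the GCC/Clang unsigned __int128 extension, so it only builds for 64-bit targets, and it assumes shift counts below 64.

```c
#include <stdint.h>

/* Hypothetical stand-in for FORCEINLINE; not the macro from the real headers. */
#define SKETCH_FORCEINLINE static inline __attribute__((always_inline))

/* Forced variant: the compiler is told to inline every direct call. */
SKETCH_FORCEINLINE uint64_t mul_extract_forced(uint64_t a, uint64_t b, unsigned shift)
{
    unsigned __int128 product = (unsigned __int128)a * b;  /* full 128-bit product */
    return (uint64_t)(product >> (shift & 63));            /* low 64 bits after the shift */
}

/* Plain variant: "inline" is only a hint, so the compiler is free to emit
   an ordinary out-of-line copy of the function when it chooses to. */
static inline uint64_t mul_extract_plain(uint64_t a, uint64_t b, unsigned shift)
{
    unsigned __int128 product = (unsigned __int128)a * b;
    return (uint64_t)(product >> (shift & 63));
}
```

Both variants compute the same value; the only difference is how much freedom the compiler has over inlining.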

Thirdly, if the requirement is really that these functions must exist in the .a file, then just let me know. While I don't understand why (and I'd like to), I'm prepared to do it anyway. I would still need to fix the library version of MultiplyExtract128, but if this is what I need to do, just say so.

I'm obviously not going to check anything in that isn't approved. But there are bug fixes and performance improvements here that I think are worth including in v3. Let me know how I should proceed.

dw