On Sun, Feb 19, 2012 at 7:02 PM, Aaron Ballman <[email protected]>wrote:
> On Sun, Feb 19, 2012 at 8:48 PM, Aaron Ballman <[email protected]> > wrote: > > On Sun, Feb 19, 2012 at 7:41 PM, Chandler Carruth <[email protected]> > wrote: > >> On Sun, Feb 19, 2012 at 5:35 PM, Aaron Ballman <[email protected]> > >> wrote: > >>> > >>> On Sun, Feb 19, 2012 at 5:18 PM, Chandler Carruth < > [email protected]> > >>> wrote: > >>> > On Sun, Feb 19, 2012 at 3:09 PM, Aaron Ballman < > [email protected]> > >>> > wrote: > >>> >> > >>> >> On Sun, Feb 19, 2012 at 4:45 PM, Eli Friedman < > [email protected]> > >>> >> wrote: > >>> >> > On Sun, Feb 19, 2012 at 10:28 AM, Aaron Ballman > >>> >> > <[email protected]> > >>> >> > wrote: > >>> >> >> On Windows, the default search paths for includes would put the > >>> >> >> CMake-found headers (<build dir>\lib\clang\3.1\include) before > the > >>> >> >> OS > >>> >> >> headers. For the most part, this is fine. However, there are > some > >>> >> >> header files that share names with the system headers, which are > >>> >> >> incorrect. For instance, immintrin.h is completely different > from > >>> >> >> the > >>> >> >> SDK include. This causes miscompilations in surprising places > >>> >> >> because > >>> >> >> a system header will attempt to include one of these duplicates, > and > >>> >> >> find the clang header instead of the other system header. > >>> >> >> > >>> >> >> A simple test case of: #include <map> when building from MSVC > >>> >> >> demonstrates the issue. > >>> >> >> > >>> >> >> A user can work around this behavior by specifying -nobuiltininc > on > >>> >> >> the command line, but that seems rather obtuse for such normal > use > >>> >> >> cases. > >>> >> >> > >>> >> >> I've fixed this by changing the order that the headers are > included > >>> >> >> from the toolchain. The SDK headers get added *before* the clang > >>> >> >> headers, and this resolves the issue. > >>> >> >> > >>> >> >> This patch does cause one regression test to fail > >>> >> >> (Modules\compiler_builtins.m). This is because the platform SDK > >>> >> >> intrinsics are differently structured from the ones that come > with > >>> >> >> clang. The OS APIs expect the SDK intrinsics, but our tests > seem to > >>> >> >> require our own intrinsics. I'm open to suggestions. > >>> >> > > >>> >> > If on Windows, we specifically need the OS version of > immintrin.h, we > >>> >> > should modify our version to forward to it; completely ignoring > the > >>> >> > builtin headers is a bad idea (for example, > >>> >> > http://llvm.org/bugs/show_bug.cgi?id=11918 is a miscompile > caused by > >>> >> > pulling in the wrong version of stdarg.h). > >>> >> > >>> >> The patch doesn't completely ignore the builtin headers though -- it > >>> >> simply prefers the OS headers over the builtin headers. > >>> > > >>> > > >>> > That doesn't work in the general case though -- look at all the > places > >>> > in > >>> > our builtin headers where we specifically forward *to* the system > >>> > headers. > >>> > If the builtin doesn't come first, lots of things go wrong here. > >>> > > >>> > We should make our builtins always come first, and always be correct > >>> > when > >>> > they come first. > >>> > > >>> >> That being > >>> >> said, I am not against forwarding our version. What would be a good > >>> >> approach to it? > >>> > > >>> > > >>> > Look at the other builtin headers that use '__has_include_next' and > >>> > 'include_next' to fold together a Clang builtin header and a system > >>> > builtin > >>> > header. > >>> > > >>> > I suspect you will have to essentially tack on the system > 'immintrin.h' > >>> > header to the end of the builtin, as we do expect #include > <immintrin.h> > >>> > to > >>> > pull in the Clang builtin bits... Essentially, this doesn't seem > like it > >>> > should be either-or, it should be both. > >>> > >>> So how should I handle the duplicate declarations? For instance: > >>> > >>> // mmintrin.h from MSVC > >>> typedef union __declspec(intrin_type) _CRT_ALIGN(8) __m64 > >>> { > >>> unsigned __int64 m64_u64; > >>> float m64_f32[2]; > >>> __int8 m64_i8[8]; > >>> __int16 m64_i16[4]; > >>> __int32 m64_i32[2]; > >>> __int64 m64_i64; > >>> unsigned __int8 m64_u8[8]; > >>> unsigned __int16 m64_u16[4]; > >>> unsigned __int32 m64_u32[2]; > >>> } __m64; > >>> > >>> void _m_empty(void); > >>> #define _mm_empty _m_empty > >>> > >>> // mmintrin.h from clang > >>> typedef long long __m64 __attribute__((__vector_size__(8))); > >>> > >>> static __inline__ void __attribute__((__always_inline__, __nodebug__)) > >>> _mm_empty(void) > >>> { > >>> __builtin_ia32_emms(); > >>> } > >>> #define _m_empty _mm_empty > >>> > >>> It seems to me that trying to keep *both* headers is going to be > >>> rather challenging given that both make the same typedefs in > >>> incompatible ways, the declarations/macros are opposite one another, > >>> etc. > >> > >> > >> Hold on -- the MS one is trying to provide the same intel intrinsics? > Now I > >> think that the original search order and header file is correct -- we > don't > >> want the MS version of this header as it is likely to use MS extensions > >> Clang doesn't support (or for which Clang has better models). Maybe > what we > >> really need is to understand what the clang immintrin.h *lacks* in > order for > >> MS code to use it? > > > > Yes, MS has the same intrinsics, but with differently-structured > > header files. The MS immintrin.h has all of the AVX intrinsics and > > includes only wmmintrin.h, which includes nmmintrin.h for SSE4.2, etc > > The clang immintrin.h checks various defines (__AVX__, __AES__, etc) > > before including all of the files. > > > > One thing which does get us further in parsing is to force define the > > includes from clang's intrinsics: > > > > #define __AVX__ > > #define __AES__ > > #define __SSE3__ > > #define __SSSE3__ > > #define __SSE4_1__ > > #define __SSE4_2__ > > > > There are still compile errors, however, because of the way > > Microsoft's intrin.h header file works. It has some macro hacks to > > basically erase some declarations. For instance: > > > > #define __MACHINE(X) X; > > #define __MACHINEX86X_IA64 __MACHINE > > ... > > #define __MACHINEZ(X) /* NOTHING */ > > ... > > #if !(defined(_M_IX86) || defined(_M_IA64)) > > #undef __MACHINEX86X_IA64 > > #define __MACHINEX86X_IA64 __MACHINEZ > > #endif > > ... > > __MACHINEX86X_IA64(__m64 _m_pshufw(__m64,int)) > > > > // Microsoft headers > > extern __m64 _m_pshufw(__m64, int); > > #define _mm_shuffle_pi16 _m_pshufw > > > > // Clang headers > > #define _mm_shuffle_pi16(a, n) __extension__ ({ \ > > __m64 __a = (a); \ > > (__m64)__builtin_ia32_pshufw((__v4hi)__a, (n)); }) > > #define _m_pshufw _mm_shuffle_pi16 > > > > This causes compile errors with clang like this: > > > > error: expected unqualified-id > > __MACHINEX86X_IA64(__m64 _m_pshufw(__m64,int)) > > > > Because this expands out into: > > > > __m64 __extension__ ({ __m64 __a = (__m64); > > (__m64)__builtin_ia32_pshufw((__v4hi)__a, (int)); }); > > I wonder if it would make sense to find all of the macro declarations > in the intrinsics and turn them into something like this: > > > #if defined(_MSC_VER) > static __inline__ void __attribute__((__always_inline__, __nodebug__)) > _mm_shuffle_pi16(__m64 a, int n) { > return __buildin_ia32_pshufw((__v4hi)a, n); > } > #else > #define _mm_shuffle_pi16(a, n) __extension__ ({ \ > __m64 __a = (a); \ > (__m64)__builtin_ia32_pshufw((__v4hi)__a, (n)); }) > #endif > > There appear to only be a dozen or so cases of this. But I'm not > certain how much customization we'd like in these header files, > either. No, we shouldn't need two versions. See my previous email, I don't think this is the root cause of the problem, I think there is just some other interfaces that MS requires these headers to provide which we don't yet provide.
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
