On Saturday, 23 April 2016 at 10:40:12 UTC, Johan Engelen wrote:
On Monday, 18 April 2016 at 00:27:06 UTC, Joe Duarte wrote:
Someone else talked about marking "Broadwell" and other
generation names. As others have said, it's better to specify
features. I wanted to chime in with a couple of additional
examples. Intel's transactional memory accelerating
instructions (TSX) are only available on some Broadwell parts
because there was a bug in the original implementation
(Haswell and early Broadwell) and it's disabled on most. But
the new Broadwell server chips have it, and it's a big deal
for some DB workloads. Similarly, only some Skylake chips have
the Software Guard Extensions (SGX), which are very powerful
for creating secure enclaves on an untrusted host.
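To illustrate querying the feature itself rather than inferring it from a generation name, here is a minimal D sketch of a run-time check. It assumes your druntime's core.cpuid exposes the hle and rtm properties (the two parts of TSX); the variable names are made up for the example.

```d
import core.cpuid : hle, rtm, sse2;
import std.stdio : writeln;

void main()
{
    // Ask the CPU what it supports instead of guessing from "Haswell",
    // "Broadwell", etc. Parts of the same generation can differ.
    const bool hasHle = hle;   // Hardware Lock Elision (part of TSX)
    const bool hasRtm = rtm;   // Restricted Transactional Memory (part of TSX)
    const bool hasSse2 = sse2;

    writeln("TSX HLE: ", hasHle);
    writeln("TSX RTM: ", hasRtm);
    writeln("SSE2:    ", hasSse2);
}
```

This reports what the machine running the program supports, which is a separate question from what the compiler was told it may assume at build time.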
Thanks, I've seen similar comments in LLVM code.
I have a question that perhaps you can comment on.
With LLVM, it is possible to specify something like
"+sse3,-sse2" (I did not test whether this actually results in
SSE3 instructions being used, but no SSE2 instructions). What
should be returned when querying whether the "sse3" feature is
enabled?
Should __traits(targetHasFeature, "sse3") == true mean that
implied features (such as sse and sse2) are also available?
If you specify SSE3, you should definitely get SSE2 and plain old
SSE with it. SSE3 is a superset of SSE2 and includes all the SSE2
instructions (more than 100, I think).
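Until the semantics of the query are settled, code that depends on the implied features can check them explicitly. A minimal sketch, assuming LDC's __traits(targetHasFeature, ...) as discussed above; the name sse3WithImplied is made up for the example, and other compilers fall back to false:

```d
version (LDC)
{
    // Check sse3 and the features it is expected to imply, one by one,
    // so the result does not depend on how "+sse3,-sse2" gets resolved.
    enum bool sse3WithImplied =
        __traits(targetHasFeature, "sse3") &&
        __traits(targetHasFeature, "sse2") &&
        __traits(targetHasFeature, "sse");
}
else
{
    // The trait is not available here; assume nothing about the target.
    enum bool sse3WithImplied = false;
}

void main()
{
    import std.stdio : writeln;

    static if (sse3WithImplied)
        writeln("sse3, sse2 and sse are all enabled for this target");
    else
        writeln("sse3, sse2 or sse is not enabled (or this is not LDC)");
}
```

If the trait ends up guaranteeing implied features, the two extra checks become redundant but stay harmless.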
I'm not sure about your syntax – I thought the hyphen meant to
include the option, not remove it, and I haven't seen the
addition sign used for those settings. But I haven't done much
with those optimization flags.
You wouldn't want to exclude SSE2 support because it's becoming
the bare minimum baseline for modern systems, the de facto FP
unit. Windows 10 requires a CPU with SSE2, as do more and more
applications on the archaic Unix-like platforms.