Hi Guys,

I'd like to discuss changing the defaults of CMAKE_C_FLAGS_RELEASE and CMAKE_CXX_FLAGS_RELEASE on gcc, and potentially on gcc-like compilers such as Clang and the Intel compiler.

Currently the default is "-O3 -DNDEBUG". I would like to discuss changing this to "-O3 -march=native -mtune=native -DNDEBUG". This change lets the compiler use the instruction set extensions of the CPU it is running on and tune for that CPU, which typically produces faster code for the machine doing the compilation.

The majority of people want CPU-specific optimization enabled when compiling code for use on their own system. However, many folks (specifically users of scientific codes, who are often not computer scientists) don't realize that they need to pass the -march and -mtune flags to enable CPU-specific instruction sets. Of course, there are developers making binaries for distribution who do not want to use CPU-specific instructions, but their number is small compared to the folks who just need to compile code for their own use. The folks making binaries are also well aware of the issues and can be expected to know to disable such features.
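For reference, a project can already opt in to these flags from its own CMakeLists.txt. The following is only a minimal sketch, assuming gcc or a compatible compiler. The option name ENABLE_NATIVE_ARCH and the result variables C_HAS_MARCH_NATIVE and CXX_HAS_MARCH_NATIVE are hypothetical names chosen for illustration, not anything CMake provides; CheckCCompilerFlag and CheckCXXCompilerFlag are the standard CMake modules.

```cmake
cmake_minimum_required(VERSION 3.5)
project(native_flags_sketch C CXX)

include(CheckCCompilerFlag)
include(CheckCXXCompilerFlag)

# Hypothetical option name. ON mirrors the proposed default;
# anyone building binaries for distribution can turn it OFF.
option(ENABLE_NATIVE_ARCH "Target the build host CPU in Release builds" ON)

if(ENABLE_NATIVE_ARCH)
  # Only add the flags if the compiler actually accepts them.
  check_c_compiler_flag("-march=native" C_HAS_MARCH_NATIVE)
  check_cxx_compiler_flag("-march=native" CXX_HAS_MARCH_NATIVE)

  if(C_HAS_MARCH_NATIVE)
    set(CMAKE_C_FLAGS_RELEASE
        "${CMAKE_C_FLAGS_RELEASE} -march=native -mtune=native")
  endif()
  if(CXX_HAS_MARCH_NATIVE)
    set(CMAKE_CXX_FLAGS_RELEASE
        "${CMAKE_CXX_FLAGS_RELEASE} -march=native -mtune=native")
  endif()
endif()
```

With a sketch like this, someone packaging binaries would configure with -DENABLE_NATIVE_ARCH=OFF to keep the current "-O3 -DNDEBUG" behavior.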

On my system, here is the difference.

-O3:

   $gcc -O3 -Q --help=target | grep enabled
      -m64                                [enabled]
      -m80387                             [enabled]
      -m96bit-long-double                 [enabled]
      -malign-stringops                   [enabled]
      -mfancy-math-387                    [enabled]
      -mfentry                            [enabled]
      -mfp-ret-in-387                     [enabled]
      -mglibc                             [enabled]
      -mhard-float                        [enabled]
      -mieee-fp                           [enabled]
      -mlong-double-80                    [enabled]
      -mno-sse4                           [enabled]
      -mpush-args                         [enabled]
      -mred-zone                          [enabled]
      -mstackrealign                      [enabled]
      -mtls-direct-seg-refs               [enabled]

-O3 -march=native -mtune=native:

   $gcc -O3 -march=native -mtune=native -Q --help=target | grep enabled
      -m64                                [enabled]
      -m80387                             [enabled]
      -m96bit-long-double                 [enabled]
      -maes                               [enabled]
      -malign-stringops                   [enabled]
      -mavx                               [enabled]
      -mcx16                              [enabled]
      -mf16c                              [enabled]
      -mfancy-math-387                    [enabled]
      -mfentry                            [enabled]
      -mfp-ret-in-387                     [enabled]
      -mfsgsbase                          [enabled]
      -mfxsr                              [enabled]
      -mglibc                             [enabled]
      -mhard-float                        [enabled]
      -mieee-fp                           [enabled]
      -mlong-double-80                    [enabled]
      -mmmx                               [enabled]
      -mpclmul                            [enabled]
      -mpopcnt                            [enabled]
      -mpush-args                         [enabled]
      -mrdrnd                             [enabled]
      -mred-zone                          [enabled]
      -msahf                              [enabled]
      -msse                               [enabled]
      -msse2                              [enabled]
      -msse3                              [enabled]
      -msse4                              [enabled]
      -msse4.1                            [enabled]
      -msse4.2                            [enabled]
      -mssse3                             [enabled]
      -mstackrealign                      [enabled]
      -mtls-direct-seg-refs               [enabled]
      -mxsave                             [enabled]
      -mxsaveopt                          [enabled]

Notice specifically that SSE and AVX are absent from the plain -O3 list! These instruction sets play a major role in the computational performance of modern CPUs, and that's why I think they should be enabled by default in CMake Release builds. Clang and the Intel compiler also support these flags, although there may be better alternatives for those compilers.

What do you think?

Burlen