[PATCH 13/14] doc, x86: Clean up x86 options documentation [PR122243]

Sandra Loosemore Thu, 08 Jan 2026 11:45:37 -0800

Besides the usual fixes in this series to make the options summary
agree with the options listed in the detailed documentation and add
missing @opindex entries, I decided it was not very helpful to users
to have dozens of ISA extension options documented as a group spanning
multiple pages in the manual.  I broke that up so each of those
options is described separately, using the documentation string from
the .opt file.
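
As an illustration only, the per-option layout described above can be sketched in Python. The series does not say whether the new entries were written by hand or generated, so nothing here should be read as the method actually used. The helper names `parse_opt_records` and `texinfo_entry` are hypothetical, and the parser assumes only the simple three-part record shape (name, properties, doc string) visible in the i386.opt hunks.

```python
import textwrap

# Hypothetical sample in .opt record form: option name, properties line,
# then the documentation string.  Lines starting with ';' are comments.
SAMPLE = """\
; Does nothing.
malign-loops=
Target Undocumented RejectNegative Joined UInteger
Loop code aligned to this power of 2.

mavx512cd
Target Mask(ISA_AVX512CD) Var(ix86_isa_flags) Save
Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and AVX512CD built-in functions and code generation.
"""


def parse_opt_records(text):
    """Return a list of (name, properties, doc) tuples.

    Assumption: records are separated by blank lines and have at least
    three non-comment lines.  Anything else is skipped.
    """
    records = []
    for block in text.split("\n\n"):
        lines = [ln for ln in block.splitlines()
                 if ln.strip() and not ln.startswith(";")]
        if len(lines) < 3:
            continue
        name, props, doc = lines[0], lines[1].split(), " ".join(lines[2:])
        records.append((name, props, doc))
    return records


def texinfo_entry(name, props, doc, width=72):
    """Format one record as a Texinfo table entry, or None if hidden."""
    if "Undocumented" in props:
        return None  # options marked Undocumented get no manual entry
    out = [f"@opindex {name}"]
    if "RejectNegative" not in props:
        # Assumes the option name starts with 'm', as x86 -m options do.
        out.append(f"@opindex mno-{name[1:]}")
    out.append(f"@item -{name}")
    out.extend(textwrap.wrap(doc, width))
    return "\n".join(out)


if __name__ == "__main__":
    for record in parse_opt_records(SAMPLE):
        entry = texinfo_entry(*record)
        if entry is not None:
            print(entry)
            print()
```

Run on the sample, the sketch skips the Undocumented alignment option and prints a single `@opindex`/`@item` entry for `-mavx512cd`, in the same shape as the entries added to invoke.texi.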


gcc/ChangeLog
        PR other/122243
        * config/i386/i386.opt (malign-functions): Mark undocumented/unused
        option as Undocumented.
        (malign-jumps): Likewise.
        (malign-loops): Likewise.
        (mbranch-cost, mforce-drap): Mark undocumented options likely
        intended for developer use only as Undocumented.
        (mstv): Correct sense of option in doc string.
        (mavx512cd): Remove extra "and" from doc string.
        (mavx512dq): Likewise.
        (mavx512bw): Likewise.
        (mavx512vl): Likewise.
        (mavx512ifma): Likewise.
        (mavx512vbmi): Likewise.
        * doc/invoke.texi (Options Summary) <x86 Options>: Add
        missing options.  Correct whitespace and re-wrap long lines.
        Remove -mthreads which is now classed as a MinGW option.
        (Cygwin and MinGW Options): Replace existing documentation of
        -mthreads with the more detailed text moved from x86 Options.
        (x86 Options): Move introductory text about ISA extensions before
        the individual options instead of after.  Document them all
        individually instead of as a group, and move immediately after
        -march/-mtune documentation.  Rewrap long lines.  Document
        interaction between SSE and AVX with -mfpmath=sse.  Move -masm
        documentation farther down instead of grouped with options
        affecting floating-point behavior.  Add missing @opindex
        entries.  Rewrite the -mdaz-ftz documentation.  Document
        -mstack-arg-probe.  Copy-editing.  Document -mstv.  Remove
        obsolete warning about -mskip-rax-setup in very old GCC versions.
        Rewrite the -mapx-inline-asm-use-gpr32 documentation.
        Document -mgather and -mscatter.  Split -miamcu documentation
        from -m32/-m64/etc.  Rewrite -munroll-only-small-loops documentation.
        Document -mdispatch-scheduler.
---
 gcc/config/i386/i386.opt |   34 +-
 gcc/doc/invoke.texi      | 1232 ++++++++++++++++++++++++--------------
 2 files changed, 793 insertions(+), 473 deletions(-)
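
For reviewers who want to spot-check the "missing @opindex entries" part of this patch, a minimal sketch follows. It is not part of the series and not an existing GCC script. The function name `missing_opindex` and the two sample snippets are hypothetical; the snippets are modelled on the `-mno-80387` hunk below. The check only covers the option names that appear on `@item`/`@itemx` lines, not their negative forms.

```python
import re

# Option names on @item/@itemx lines, without the leading '-' and
# without any '=@var{...}' argument.
ITEM_RE = re.compile(r"^@itemx? -(m[^\s=@{]+)", re.MULTILINE)
# Raw @opindex arguments; any '=...' suffix is dropped before comparing.
OPINDEX_RE = re.compile(r"^@opindex (\S+)", re.MULTILINE)


def missing_opindex(texi):
    """Return option names that have an @item but no matching @opindex."""
    indexed = {arg.split("=")[0] for arg in OPINDEX_RE.findall(texi)}
    missing = []
    for match in ITEM_RE.finditer(texi):
        opt = match.group(1)
        if opt not in indexed and opt not in missing:
            missing.append(opt)
    return missing


# Hypothetical snippets: the index entry before and after the fix.
BEFORE = """\
@opindex no-80387
@opindex msoft-float
@item -mno-80387
@itemx -msoft-float
"""

AFTER = BEFORE.replace("@opindex no-80387", "@opindex mno-80387")

if __name__ == "__main__":
    print("before:", missing_opindex(BEFORE))
    print("after: ", missing_opindex(AFTER))
```

On the sample text, the check flags `mno-80387` before the fix and nothing after it.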

diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 99bb674812b..4942f512417 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -225,16 +225,19 @@ malign-double
 Target Mask(ALIGN_DOUBLE) Save
 Align some doubles on dword boundary.
 
+; Does nothing.
 malign-functions=
-Target RejectNegative Joined UInteger
+Target Undocumented RejectNegative Joined UInteger
 Function starts are aligned to this power of 2.
 
+; Does nothing.
 malign-jumps=
-Target RejectNegative Joined UInteger
+Target Undocumented RejectNegative Joined UInteger
 Jump targets are aligned to this power of 2.
 
+; Does nothing.
 malign-loops=
-Target RejectNegative Joined UInteger
+Target Undocumented RejectNegative Joined UInteger
 Loop code aligned to this power of 2.
 
 malign-stringops
@@ -277,7 +280,7 @@ EnumValue
 Enum(asm_dialect) String(att) Value(ASM_ATT)
 
 mbranch-cost=
-Target RejectNegative Joined UInteger Var(ix86_branch_cost) IntegerRange(0, 5)
+Target Undocumented RejectNegative Joined UInteger Var(ix86_branch_cost) IntegerRange(0, 5)
 Branches are this expensive (arbitrary units).
 
 mlarge-data-threshold=
@@ -328,8 +331,13 @@ mfancy-math-387
 Target RejectNegative InverseMask(NO_FANCY_MATH_387, USE_FANCY_MATH_387) Save
 Generate sin, cos, sqrt for FPU.
 
+; This option seems deliberately undocumented as other options added in
+; the same commit were properly documented in the manual.
+; DRAP is usually used only in functions that do dynamic stack
+; allocation (e.g. alloca); this option makes it happen everywhere.
+; Maybe it was intended for debugging?
 mforce-drap
-Target Var(ix86_force_drap)
+Target Undocumented Var(ix86_force_drap)
 Always use Dynamic Realigned Argument Pointer (DRAP) to realign stack.
 
 mfp-ret-in-387
@@ -599,8 +607,8 @@ the function.
 
 mstv
 Target Mask(STV) Save
-Disable Scalar to Vector optimization pass transforming 64-bit integer
-computations into a vector ones.
+Enable Scalar to Vector optimization pass transforming 64-bit integer
+computations into vector ones.
 
 -param=x86-stv-max-visits=
 Target Joined UInteger Var(x86_stv_max_visits) Init(10000) IntegerRange(1, 1000000) Param
@@ -726,27 +734,27 @@ Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 and AVX512F built
 
 mavx512cd
 Target Mask(ISA_AVX512CD) Var(ix86_isa_flags) Save
-Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 and AVX512F and AVX512CD built-in functions and code generation.
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and AVX512CD built-in functions and code generation.
 
 mavx512dq
 Target Mask(ISA_AVX512DQ) Var(ix86_isa_flags) Save
-Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 and AVX512F and AVX512DQ built-in functions and code generation.
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and AVX512DQ built-in functions and code generation.
 
 mavx512bw
 Target Mask(ISA_AVX512BW) Var(ix86_isa_flags) Save
-Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 and AVX512F and AVX512BW built-in functions and code generation.
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and AVX512BW built-in functions and code generation.
 
 mavx512vl
 Target Mask(ISA_AVX512VL) Var(ix86_isa_flags) Save
-Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 and AVX512F and AVX512VL built-in functions and code generation.
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and AVX512VL built-in functions and code generation.
 
 mavx512ifma
 Target Mask(ISA_AVX512IFMA) Var(ix86_isa_flags) Save
-Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 and AVX512F and AVX512IFMA built-in functions and code generation.
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and AVX512IFMA built-in functions and code generation.
 
 mavx512vbmi
 Target Mask(ISA_AVX512VBMI) Var(ix86_isa_flags) Save
-Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 and AVX512F and AVX512VBMI built-in functions and code generation.
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and AVX512VBMI built-in functions and code generation.
 
 mavx512vpopcntdq
 Target Mask(ISA_AVX512VPOPCNTDQ) Var(ix86_isa_flags) Save
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 8de21d7b193..c48a73650cc 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1503,15 +1503,15 @@ See RS/6000 and PowerPC Options.
 -mtune-ctrl=@var{feature-list}  -mdump-tune-features  -mno-default
 -mfpmath=@var{unit}
 -masm=@var{dialect}  -mno-fancy-math-387
--mno-fp-ret-in-387  -m80387  -mhard-float  -msoft-float
--mno-wide-multiply  -mrtd  -malign-double
+-mno-fp-ret-in-387  -m80387  -mhard-float  -msoft-float  -mieee-fp
+-mrtd  -malign-double
 -mpreferred-stack-boundary=@var{num}
 -mincoming-stack-boundary=@var{num}
--mcld  -mcx16  -msahf  -mmovbe  -mcrc32 -mmwait
+-mcld  -mcx16  -msahf  -mmovbe  -mcrc32  -mmwait
 -mrecip  -mrecip=@var{opt}
--mvzeroupper  -mprefer-avx128  -mprefer-vector-width=@var{opt}
+-mvzeroupper  -mstv  -mprefer-avx128  -mprefer-vector-width=@var{opt}
 -mpartial-vector-fp-math
--mmove-max=@var{bits} -mstore-max=@var{bits}
+-mmove-max=@var{bits}  -mstore-max=@var{bits}
 -mnoreturn-no-callee-saved-registers
 -mmmx  -msse  -msse2  -msse3  -mssse3  -msse4.1  -msse4.2  -msse4  -mavx
 -mavx2  -mavx512f  -mavx512cd  -mavx512vl
@@ -1520,40 +1520,47 @@ See RS/6000 and PowerPC Options.
 -mptwrite  -mclflushopt  -mclwb  -mxsavec  -mxsaves
 -msse4a  -m3dnow  -m3dnowa  -mpopcnt  -mabm  -mbmi  -mtbm  -mfma4  -mxop
 -madx  -mlzcnt  -mbmi2  -mfxsr  -mxsave  -mxsaveopt  -mrtm  -mhle  -mlwp
--mmwaitx  -mclzero  -mpku  -mthreads  -mgfni  -mvaes  -mwaitpkg
--mshstk -mmanual-endbr -mcet-switch -mforce-indirect-call
--mavx512vbmi2 -mavx512bf16 -menqcmd
+-mmwaitx  -mclzero  -mpku  -mgfni  -mvaes  -mwaitpkg
+-mshstk  -mmanual-endbr  -mcet-switch  -mforce-indirect-call
+-mavx512vbmi2  -mavx512bf16  -menqcmd
 -mvpclmulqdq  -mavx512bitalg  -mmovdiri  -mmovdir64b  -mavx512vpopcntdq
 -mavx512vnni  -mprfchw  -mrdpid
--mrdseed  -msgx -mavx512vp2intersect -mserialize -mtsxldtrk
--mamx-tile  -mamx-int8  -mamx-bf16 -muintr -mhreset -mavxvnni -mamx-fp8
--mavx512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16
--mprefetchi -mraoint -mamx-complex -mavxvnniint16 -msm3 -msha512 -msm4 -mapxf
--musermsr -mavx10.1 -mavx10.2 -mamx-avx512 -mamx-tf32 -mmovrs -mamx-movrs
--mavx512bmm -mcldemote -mms-bitfields -mno-align-stringops -minline-all-stringops
+-mrdseed  -msgx  -mavx512vp2intersect  -mserialize  -mtsxldtrk
+-mamx-tile  -mamx-int8  -mamx-bf16  -muintr  -mhreset
+-mavxvnni  -mamx-fp8  -mavx512fp16  -mavxifma  -mavxvnniint8
+-mavxneconvert  -mcmpccxadd  -mamx-fp16  -mprefetchi  -mraoint
+-mamx-complex  -mavxvnniint16  -msm3  -msha512  -msm4  -mapxf
+-musermsr  -mavx10.1  -mavx10.2  -mamx-avx512  -mamx-tf32  -mmovrs
+-mamx-movrs  -mavx512bmm  -mcldemote  -mms-bitfields
+-mno-align-stringops  -minline-all-stringops
 -minline-stringops-dynamically  -mstringop-strategy=@var{alg}
--mkl -mwidekl
+-mkl  -mwidekl
 -mmemcpy-strategy=@var{strategy}  -mmemset-strategy=@var{strategy}
 -mpush-args  -maccumulate-outgoing-args  -m128bit-long-double
 -m96bit-long-double  -mlong-double-64  -mlong-double-80  -mlong-double-128
 -mregparm=@var{num}  -msseregparm
 -mveclibabi=@var{type}  -mvect8-ret-in-mem
--mpc32  -mpc64  -mpc80  -mdaz-ftz -mstackrealign
+-mpc32  -mpc64  -mpc80  -mdaz-ftz  -mstackrealign  -mstack-arg-probe
 -momit-leaf-frame-pointer  -mno-red-zone  -mno-tls-direct-seg-refs
 -mcmodel=@var{code-model}  -mabi=@var{name}  -maddress-mode=@var{mode}
 -m32  -m64  -mx32  -m16  -miamcu  -mlarge-data-threshold=@var{num}
 -msse2avx  -mfentry  -mrecord-mcount  -mnop-mcount  -m8bit-idiv
--minstrument-return=@var{type} -mfentry-name=@var{name} -mfentry-section=@var{name}
+-minstrument-return=@var{type}  -mrecord-return
+-mfentry-name=@var{name}  -mfentry-section=@var{name}
+-mskip-rax-setup
 -mavx256-split-unaligned-load  -mavx256-split-unaligned-store
 -malign-data=@var{type}  -mstack-protector-guard=@var{guard}
 -mstack-protector-guard-reg=@var{reg}
 -mstack-protector-guard-offset=@var{offset}
 -mstack-protector-guard-symbol=@var{symbol}
--mgeneral-regs-only  -mcall-ms2sysv-xlogues -mrelax-cmpxchg-loop
+-mgeneral-regs-only  -mcall-ms2sysv-xlogues  -mtls-dialect=@var{type}
+-mrelax-cmpxchg-loop
 -mindirect-branch=@var{choice}  -mfunction-return=@var{choice}
--mindirect-branch-register -mharden-sls=@var{choice}
--mindirect-branch-cs-prefix -mneeded -mno-direct-extern-access
--munroll-only-small-loops -mlam=@var{choice}}
+-mindirect-branch-register  -mharden-sls=@var{choice}
+-mindirect-branch-cs-prefix  -mapx-inline-asm-use-gpr32
+-mgather  -mscatter
+-mneeded  -mno-direct-extern-access
+-munroll-only-small-loops  -mdispatch-scheduler  -mlam=@var{choice}}
 
 @emph{x86 Windows Options}
 See Cygwin and MinGW Options.
@@ -26781,8 +26788,11 @@ specifies that the @code{dllimport} attribute should be ignored.
 
 @opindex mthreads
 @item -mthreads
-This option is available for MinGW targets. It specifies
-that MinGW-specific thread support is to be used.
+Support thread-safe exception handling on MinGW.  Programs that rely
+on thread-safe exception handling must compile and link all code with the
+@option{-mthreads} option.  When compiling, @option{-mthreads} defines
+@option{-D_MT}; when linking, it links in a special thread helper library
+@option{-lmingwthrd} which cleans up per-thread exception-handling data.
 
 @opindex municode
 @opindex mno-unicode
@@ -35717,7 +35727,11 @@ is defined for compatibility with Diab.
 @subsection x86 Options
 @cindex x86 Options
 
-These @samp{-m} options are defined for the x86 family of computers.
+This section documents @option{-m} options available for the x86 family
+of computers.
+
+The following group of options allows compilation to target a specific
+processor.
 
 @table @gcctabopt
 
@@ -36338,10 +36352,583 @@ instruction set applicable to all processors.  In contrast,
 @option{-mtune} indicates the processor (or, in this case, collection of
 processors) for which the code is optimized.
 @end table
+@end table
 
-@opindex mcpu
-@item -mcpu=@var{cpu-type}
-A deprecated synonym for @option{-mtune}.
+The following options allow more detailed control over which instruction
+set extensions are targeted by GCC.
+Each has a corresponding @option{-mno-} option to disable use of these
+instructions.
+
+These extensions are also available as built-in functions: see
+@ref{x86 Built-in Functions}, for details of the functions enabled and
+disabled by these switches.
+
+These options enable GCC to use these extended instructions in
+generated code.  Applications that
+perform run-time CPU detection must compile separate files for each
+supported architecture, using the appropriate flags.  In particular,
+the file containing the CPU detection code should be compiled without
+these options.
+
+To control whether 387 or SSE/AVX instructions are generated
+automatically for floating-point arithmetic, see @option{-mfpmath=}, below.
+
+@table @gcctabopt
+@opindex mmmx
+@opindex mno-mmx
+@item -mmmx
+Support MMX built-in functions.
+
+@opindex msse
+@opindex mno-sse
+@item -msse
+Support MMX and SSE built-in functions and code generation.
+
+@opindex msse2
+@opindex mno-sse2
+@item -msse2
+Support MMX, SSE and SSE2 built-in functions and code generation.
+
+@opindex msse3
+@opindex mno-sse3
+@item -msse3
+Support MMX, SSE, SSE2 and SSE3 built-in functions and code generation.
+
+@opindex mssse3
+@opindex mno-ssse3
+@item -mssse3
+Support MMX, SSE, SSE2, SSE3 and SSSE3 built-in functions and code generation.
+
+@opindex msse4.1
+@opindex mno-sse4.1
+@item -msse4.1
+Support MMX, SSE, SSE2, SSE3, SSSE3 and SSE4.1 built-in functions and
+code generation.
+
+@opindex msse4.2
+@opindex mno-sse4.2
+@item -msse4.2
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1 and SSE4.2 built-in functions
+and code generation.
+
+@opindex msse4
+@opindex mno-sse4
+@item -msse4
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1 and SSE4.2 built-in functions
+and code generation.
+
+Note that @option{-msse4} enables both SSE4.1 and SSE4.2 support,
+while @option{-mno-sse4} turns off those features; neither form of the
+option affects SSE4A support, controlled separately by
+@option{-msse4a}.
+
+@opindex msse4a
+@opindex mno-sse4a
+@item -msse4a
+Support MMX, SSE, SSE2, SSE3 and SSE4A built-in functions and code generation.
+
+@opindex mavx
+@opindex mno-avx
+@item -mavx
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2 and AVX built-in
+functions and code generation.
+
+@opindex mavx2
+@opindex mno-avx2
+@item -mavx2
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX and AVX2
+built-in functions and code generation.
+
+@opindex mavx512f
+@opindex mno-avx512f
+@item -mavx512f
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 and
+AVX512F built-in functions and code generation.
+
+@opindex mavx512cd
+@opindex mno-avx512cd
+@item -mavx512cd
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2,
+AVX512F and AVX512CD built-in functions and code generation.
+
+@opindex mavx512vl
+@opindex mno-avx512vl
+@item -mavx512vl
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2,
+AVX512F and AVX512VL built-in functions and code generation.
+
+@opindex mavx512bw
+@opindex mno-avx512bw
+@item -mavx512bw
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2,
+AVX512F and AVX512BW built-in functions and code generation.
+
+@opindex mavx512dq
+@opindex mno-avx512dq
+@item -mavx512dq
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2,
+AVX512F and AVX512DQ built-in functions and code generation.
+
+@opindex mavx512ifma
+@opindex mno-avx512ifma
+@item -mavx512ifma
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2,
+AVX512F and AVX512IFMA built-in functions and code generation.
+
+@opindex mavx512vbmi
+@opindex mno-avx512vbmi
+@item -mavx512vbmi
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2,
+AVX512F and AVX512VBMI built-in functions and code generation.
+
+@opindex mavx512vpopcntdq
+@opindex mno-avx512vpopcntdq
+@item -mavx512vpopcntdq
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2,
+AVX512F and AVX512VPOPCNTDQ built-in functions and code generation.
+
+@opindex mavx512vp2intersect
+@opindex mno-avx512vp2intersect
+@item -mavx512vp2intersect
+Support AVX512VP2INTERSECT built-in functions and code generation.
+
+@opindex mavx512vnni
+@opindex mno-avx512vnni
+@item -mavx512vnni
+Support AVX512VNNI built-in functions and code generation.
+
+@opindex mavx512vbmi2
+@opindex mno-avx512vbmi2
+@item -mavx512vbmi2
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F
+and AVX512VBMI2 built-in functions and code generation.
+
+@opindex mavx512bf16
+@opindex mno-avx512bf16
+@item -mavx512bf16
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and
+AVX512BF16 built-in functions and code generation.
+
+@opindex mavx512fp16
+@opindex mno-avx512fp16
+@item -mavx512fp16
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and
+AVX512-FP16 built-in functions and code generation.
+
+@opindex mavx512bitalg
+@opindex mno-avx512bitalg
+@item -mavx512bitalg
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and
+AVX512BITALG built-in functions and code generation.
+
+@opindex mavx512bmm
+@opindex mno-avx512bmm
+@item -mavx512bmm
+Support AVX512BMM built-in functions and code generation.
+
+@opindex mavxvnni
+@opindex mno-avxvnni
+@item -mavxvnni
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, and
+AVXVNNI built-in functions and code generation.
+
+@opindex mavxifma
+@opindex mno-avxifma
+@item -mavxifma
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, and
+AVXIFMA built-in functions and code generation.
+
+@opindex mavxvnniint8
+@opindex mno-avxvnniint8
+@item -mavxvnniint8
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 and
+AVXVNNIINT8 built-in functions and code generation.
+
+@opindex mavxneconvert
+@opindex mno-avxneconvert
+@item -mavxneconvert
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, and
+AVXNECONVERT built-in functions and code generation.
+
+@opindex mavxvnniint16
+@opindex mno-avxvnniint16
+@item -mavxvnniint16
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 and
+AVXVNNIINT16 built-in functions and code generation.
+
+@opindex mavx10.1
+@opindex mno-avx10.1
+@item -mavx10.1
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2,
+and AVX10.1 built-in functions and code generation.
+
+@opindex mavx10.2
+@opindex mno-avx10.2
+@item -mavx10.2
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2,
+AVX10.1 and AVX10.2 built-in functions and code generation.
+
+@opindex msha
+@opindex mno-sha
+@item -msha
+Support SHA1 and SHA256 built-in functions and code generation.
+
+@opindex maes
+@opindex mno-aes
+@item -maes
+Support AES built-in functions and code generation.
+
+@opindex mpclmul
+@opindex mno-pclmul
+@item -mpclmul
+Support PCLMUL built-in functions and code generation.
+
+@opindex mclflushopt
+@opindex mno-clflushopt
+@item -mclflushopt
+Support CLFLUSHOPT instructions.
+
+@opindex mclwb
+@opindex mno-clwb
+@item -mclwb
+Support CLWB instruction.
+
+@opindex mfsgsbase
+@opindex mno-fsgsbase
+@item -mfsgsbase
+Support FSGSBASE built-in functions and code generation.
+
+@opindex mptwrite
+@opindex mno-ptwrite
+@item -mptwrite
+Support PTWRITE built-in functions and code generation.
+
+@opindex mrdrnd
+@opindex mno-rdrnd
+@item -mrdrnd
+Support RDRND built-in functions and code generation.
+
+@opindex mf16c
+@opindex mno-f16c
+@item -mf16c
+Support F16C built-in functions and code generation.
+
+@opindex mfma
+@opindex mno-fma
+@item -mfma
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX and FMA
+built-in functions and code generation.
+
+@opindex mfma4
+@opindex mno-fma4
+@item -mfma4
+Support FMA4 built-in functions and code generation.
+
+@opindex mpconfig
+@opindex mno-pconfig
+@item -mpconfig
+Support PCONFIG built-in functions and code generation.
+
+@opindex mwbnoinvd
+@opindex mno-wbnoinvd
+@item -mwbnoinvd
+Support WBNOINVD built-in functions and code generation.
+
+@opindex mprfchw
+@opindex mno-prfchw
+@item -mprfchw
+Support PREFETCHW instruction.
+
+@opindex mrdpid
+@opindex mno-rdpid
+@item -mrdpid
+Support RDPID built-in functions and code generation.
+
+@opindex mrdseed
+@opindex mno-rdseed
+@item -mrdseed
+Support RDSEED instruction.
+
+@opindex msgx
+@opindex mno-sgx
+@item -msgx
+Support SGX built-in functions and code generation.
+
+@opindex mxop
+@opindex mno-xop
+@item -mxop
+Support XOP built-in functions and code generation.
+
+@opindex mlwp
+@opindex mno-lwp
+@item -mlwp
+Support LWP built-in functions and code generation.
+
+@opindex m3dnow
+@opindex mno-3dnow
+@item -m3dnow
+Support 3DNow! built-in functions.
+
+@opindex m3dnowa
+@opindex mno-3dnowa
+@item -m3dnowa
+Support Athlon 3DNow! built-in functions.
+
+@opindex mpopcnt
+@opindex mno-popcnt
+@item -mpopcnt
+Support code generation of the popcnt instruction.
+
+@opindex mabm
+@opindex mno-abm
+@item -mabm
+Support code generation of Advanced Bit Manipulation (ABM) instructions.
+
+@opindex madx
+@opindex mno-adx
+@item -madx
+Support flag-preserving add-carry instructions.
+
+@opindex mbmi
+@opindex mno-bmi
+@item -mbmi
+Support BMI built-in functions and code generation.
+
+@opindex mbmi2
+@opindex mno-bmi2
+@item -mbmi2
+Support BMI2 built-in functions and code generation.
+
+@opindex mlzcnt
+@opindex mno-lzcnt
+@item -mlzcnt
+Support LZCNT built-in function and code generation.
+
+@opindex mfxsr
+@opindex mno-fxsr
+@item -mfxsr
+Support FXSAVE and FXRSTOR instructions.
+
+@opindex mxsave
+@opindex mno-xsave
+@item -mxsave
+Support XSAVE and XRSTOR instructions.
+
+@opindex mxsaveopt
+@opindex mno-xsaveopt
+@item -mxsaveopt
+Support XSAVEOPT instruction.
+
+@opindex mxsavec
+@opindex mno-xsavec
+@item -mxsavec
+Support XSAVEC instructions.
+
+@opindex mxsaves
+@opindex mno-xsaves
+@item -mxsaves
+Support XSAVES and XRSTORS instructions.
+
+@opindex mrtm
+@opindex mno-rtm
+@item -mrtm
+Support RTM built-in functions and code generation.
+
+@opindex mhle
+@opindex mno-hle
+@item -mhle
+Support Hardware Lock Elision prefixes.
+
+@opindex mtbm
+@opindex mno-tbm
+@item -mtbm
+Support TBM built-in functions and code generation.
+
+@opindex mmwaitx
+@opindex mno-mwaitx
+@item -mmwaitx
+Support MWAITX and MONITORX built-in functions and code generation.
+
+@opindex mclzero
+@opindex mno-clzero
+@item -mclzero
+Support CLZERO built-in functions and code generation.
+
+@opindex mpku
+@opindex mno-pku
+@item -mpku
+Support PKU built-in functions and code generation.
+
+@opindex mgfni
+@opindex mno-gfni
+@item -mgfni
+Support GFNI built-in functions and code generation.
+
+@opindex mvaes
+@opindex mno-vaes
+@item -mvaes
+Support VAES built-in functions and code generation.
+
+@opindex mwaitpkg
+@opindex mno-waitpkg
+@item -mwaitpkg
+Support WAITPKG built-in functions and code generation.
+
+@opindex mvpclmulqdq
+@opindex mno-vpclmulqdq
+@item -mvpclmulqdq
+Support VPCLMULQDQ built-in functions and code generation.
+
+@opindex mmovdiri
+@opindex mno-movdiri
+@item -mmovdiri
+Support MOVDIRI built-in functions and code generation.
+
+@opindex mmovdir64b
+@opindex mno-movdir64b
+@item -mmovdir64b
+Support MOVDIR64B built-in functions and code generation.
+
+@opindex menqcmd
+@opindex mno-enqcmd
+@item -menqcmd
+Support ENQCMD built-in functions and code generation.
+
+@opindex muintr
+@opindex mno-uintr
+@item -muintr
+Support UINTR built-in functions and code generation.
+
+@opindex mtsxldtrk
+@opindex mno-tsxldtrk
+@item -mtsxldtrk
+Support TSXLDTRK built-in functions and code generation.
+
+@opindex mcldemote
+@opindex mno-cldemote
+@item -mcldemote
+Support CLDEMOTE built-in functions and code generation.
+
+@opindex mserialize
+@opindex mno-serialize
+@item -mserialize
+Support SERIALIZE built-in functions and code generation.
+
+@opindex mamx-tile
+@opindex mno-amx-tile
+@item -mamx-tile
+Support AMX-TILE built-in functions and code generation.
+
+@opindex mamx-int8
+@opindex mno-amx-int8
+@item -mamx-int8
+Support AMX-INT8 built-in functions and code generation.
+
+@opindex mamx-bf16
+@opindex mno-amx-bf16
+@item -mamx-bf16
+Support AMX-BF16 built-in functions and code generation.
+
+@opindex mhreset
+@opindex mno-hreset
+@item -mhreset
+Support HRESET built-in functions and code generation.
+
+@opindex mkl
+@opindex mno-kl
+@item -mkl
+Support KL built-in functions and code generation.
+
+@opindex mwidekl
+@opindex mno-widekl
+@item -mwidekl
+Support WIDEKL built-in functions and code generation.
+
+@opindex mcmpccxadd
+@opindex mno-cmpccxadd
+@item -mcmpccxadd
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, and
+CMPCCXADD built-in functions and code generation.
+
+@opindex mamx-fp16
+@opindex mno-amx-fp16
+@item -mamx-fp16
+Support AMX-FP16 built-in functions and code generation.
+
+@opindex mprefetchi
+@opindex mno-prefetchi
+@item -mprefetchi
+Support PREFETCHI built-in functions and code generation.
+
+@opindex mraoint
+@opindex mno-raoint
+@item -mraoint
+Support RAOINT built-in functions and code generation.
+
+@opindex mamx-complex
+@opindex mno-amx-complex
+@item -mamx-complex
+Support AMX-COMPLEX built-in functions and code generation.
+
+@opindex msm3
+@opindex mno-sm3
+@item -msm3
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX and
+SM3 built-in functions and code generation.
+
+@opindex msm4
+@opindex mno-sm4
+@item -msm4
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX and
+SM4 built-in functions and code generation.
+
+@opindex msha512
+@opindex mno-sha512
+@item -msha512
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX and
+SHA512 built-in functions and code generation.
+
+@opindex mapxf
+@opindex mno-apxf
+@item -mapxf
+Support code generation for APX features, including EGPR, PUSH2POP2,
+NDD, PPX, NF, CCMP and ZU.
+
+@opindex musermsr
+@opindex mno-usermsr
+@item -musermsr
+Support USER_MSR built-in functions and code generation.
+
+@opindex mamx-avx512
+@opindex mno-amx-avx512
+@item -mamx-avx512
+Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2,
+AVX10.1, AVX10.2 and AMX-AVX512 built-in functions and code generation.
+
+@opindex mamx-tf32
+@opindex mno-amx-tf32
+@item -mamx-tf32
+Support AMX-TF32 built-in functions and code generation.
+
+@opindex mamx-fp8
+@opindex mno-amx-fp8
+@item -mamx-fp8
+Support AMX-FP8 built-in functions and code generation.
+
+@opindex mmovrs
+@opindex mno-movrs
+@item -mmovrs
+Support MOVRS built-in functions and code generation.
+
+@opindex mamx-movrs
+@opindex mno-amx-movrs
+@item -mamx-movrs
+Support AMX-MOVRS built-in functions and code generation.
+@end table
+
+These additional options are available for the x86 processor family.
+
+@table @gcctabopt
 
 @opindex mfpmath
 @item -mfpmath=@var{unit}
@@ -36350,8 +36937,9 @@ for @var{unit} are:
 
 @table @samp
 @item 387
-Use the standard 387 floating-point coprocessor present on the majority of chips and
-emulated otherwise.  Code compiled with this option runs almost everywhere.
+Use the standard 387 floating-point coprocessor present on the majority
+of chips and emulated otherwise.
+Code compiled with this option runs almost everywhere.
 The temporary results are computed in 80-bit precision instead of the precision
 specified by the type, resulting in slightly different results compared to most
 of other chips.  See @option{-ffloat-store} for more detailed description.
@@ -36361,18 +36949,20 @@ This is the default choice for non-Darwin x86-32 targets.
 @item sse
 Use scalar floating-point instructions present in the SSE instruction set.
 This instruction set is supported by Pentium III and newer chips,
-and in the AMD line
-by Athlon-4, Athlon XP and Athlon MP chips.  The earlier version of the SSE
+and in the AMD line by Athlon-4, Athlon XP and Athlon MP chips.
+The earlier version of the SSE
 instruction set supports only single-precision arithmetic, thus the double and
-extended-precision arithmetic are still done using 387.  A later version, present
-only in Pentium 4 and AMD x86-64 chips, supports double-precision
-arithmetic too.
+extended-precision arithmetic are still done using 387.
+A later version, present only in Pentium 4 and AMD x86-64 chips,
+supports double-precision arithmetic too.
 
-For the x86-32 compiler, you must use @option{-march=@var{cpu-type}}, @option{-msse}
+For the x86-32 compiler, you must use @option{-march=@var{cpu-type}},
+@option{-msse}
 or @option{-msse2} switches to enable SSE extensions and make this option
 effective.  For the x86-64 compiler, these extensions are enabled by default.
 
-The resulting code should be considerably faster in the majority of cases and avoid
+The resulting code should be considerably faster in the majority of cases
+and avoid
 the numerical instability problems of 387 code, but may break some existing
 code that expects temporaries to be 80 bits.
 
@@ -36380,6 +36970,11 @@ This is the default choice for the x86-64 compiler, Darwin x86-32 targets,
 and the default choice for x86-32 targets with the SSE2 instruction set
 when @option{-ffast-math} is enabled.
 
+GCC suppresses SSE instructions when @option{-mavx} (or another option
+enabling AVX extensions) is used.  Instead, it
+generates new AVX instructions or AVX equivalents for all SSE instructions
+when needed.
+
 @item sse,387
 @itemx sse+387
 @itemx both
@@ -36390,14 +36985,6 @@ still experimental, because the GCC register allocator does not model separate
 functional units well, resulting in unstable performance.
 @end table
 
-@opindex masm=@var{dialect}
-@item -masm=@var{dialect}
-Output assembly instructions using selected @var{dialect}.  Also affects
-which dialect is used for basic @code{asm} (@pxref{Basic Asm}) and
-extended @code{asm} (@pxref{Extended Asm}). Supported choices (in dialect
-order) are @samp{att} or @samp{intel}. The default is @samp{att}. Darwin does
-not support @samp{intel}.
-
 @opindex mieee-fp
 @opindex mno-ieee-fp
 @item -mieee-fp
@@ -36412,7 +36999,7 @@ comparison is unordered.
 @itemx -mhard-float
 Generate output containing 80387 instructions for floating point.
 
-@opindex no-80387
+@opindex mno-80387
 @opindex msoft-float
 @item -mno-80387
 @itemx -msoft-float
@@ -36533,6 +37120,7 @@ objects larger than @var{threshold} are placed in large data sections.  The
 default is 65535.
 
 @opindex mrtd
+@opindex mno-rtd
 @item -mrtd
 Use a different function-calling convention, in which functions that
 take a fixed number of arguments return with the @code{ret @var{num}}
@@ -36583,6 +37171,7 @@ modules with the same value, including any libraries.  This includes
 the system libraries and startup modules.
 
 @opindex mvect8-ret-in-mem
+@opindex mno-vect8-ret-in-mem
 @item -mvect8-ret-in-mem
 Return 8-byte vectors in memory instead of MMX registers.  This is the
 default on VxWorks to match the ABI of the Sun Studio compilers until
@@ -36615,16 +37204,20 @@ loss of accuracy, typically through so-called ``catastrophic cancellation'',
 when this option is used to set the precision to less than extended precision.
 
 @opindex mdaz-ftz
+@opindex mno-daz-ftz
 @item -mdaz-ftz
-
-The flush-to-zero (FTZ) and denormals-are-zero (DAZ) flags in the MXCSR register
-are used to control floating-point calculations.SSE and AVX instructions
-including scalar and vector instructions could benefit from enabling the FTZ
-and DAZ flags when @option{-mdaz-ftz} is specified. Don't set FTZ/DAZ flags
-when @option{-mno-daz-ftz} or @option{-shared} is specified, @option{-mdaz-ftz}
-will set FTZ/DAZ flags even with @option{-shared}.
+@itemx -mno-daz-ftz
+The flush-to-zero (FTZ) and denormals-are-zero (DAZ) flags in the
+MXCSR register are used to control floating-point calculations.  The
+@option{-Ofast}, @option{-ffast-math}, or
+@option{-funsafe-math-optimizations} options normally link in startup code
+that sets these flags except when building a shared library
+(@option{-shared}).  You can use the @option{-mdaz-ftz} and
+@option{-mno-daz-ftz} options to explicitly enable or disable setting
+these flags, regardless of other options passed to GCC.
 
 @opindex mstackrealign
+@opindex mno-stackrealign
 @item -mstackrealign
 Realign the stack at entry.  On the x86, the @option{-mstackrealign}
 option generates an alternate prologue and epilogue that realigns the
@@ -36633,6 +37226,11 @@ run-time stack if necessary.  This supports mixing legacy codes that keep
 SSE compatibility.  See also the attribute @code{force_align_arg_pointer},
 applicable to individual functions.
 
+@opindex mstack-arg-probe
+@opindex mno-stack-arg-probe
+@item -mstack-arg-probe
+Emit stack probing code in the function prologue.
+
 @opindex mpreferred-stack-boundary
 @item -mpreferred-stack-boundary=@var{num}
 Attempt to keep the stack boundary aligned to a 2 raised to @var{num}
@@ -36679,349 +37277,6 @@ increases code size.  Code that is sensitive to stack space usage, such
 as embedded systems and operating system kernels, may want to reduce the
 preferred alignment to @option{-mpreferred-stack-boundary=2}.
 
-@need 200
-@opindex mmmx
-@item -mmmx
-@need 200
-@opindex msse
-@itemx -msse
-@need 200
-@opindex msse2
-@itemx -msse2
-@need 200
-@opindex msse3
-@itemx -msse3
-@need 200
-@opindex mssse3
-@itemx -mssse3
-@need 200
-@opindex msse4
-@itemx -msse4
-@need 200
-@opindex msse4a
-@itemx -msse4a
-@need 200
-@opindex msse4.1
-@itemx -msse4.1
-@need 200
-@opindex msse4.2
-@itemx -msse4.2
-@need 200
-@opindex mavx
-@itemx -mavx
-@need 200
-@opindex mavx2
-@itemx -mavx2
-@need 200
-@opindex mavx512f
-@itemx -mavx512f
-@need 200
-@opindex mavx512cd
-@itemx -mavx512cd
-@need 200
-@opindex mavx512vl
-@itemx -mavx512vl
-@need 200
-@opindex mavx512bw
-@itemx -mavx512bw
-@need 200
-@opindex mavx512dq
-@itemx -mavx512dq
-@need 200
-@opindex mavx512ifma
-@itemx -mavx512ifma
-@need 200
-@opindex mavx512vbmi
-@itemx -mavx512vbmi
-@need 200
-@opindex msha
-@itemx -msha
-@need 200
-@opindex maes
-@itemx -maes
-@need 200
-@opindex mpclmul
-@itemx -mpclmul
-@need 200
-@opindex mclflushopt
-@itemx -mclflushopt
-@need 200
-@opindex mclwb
-@itemx -mclwb
-@need 200
-@opindex mfsgsbase
-@itemx -mfsgsbase
-@need 200
-@opindex mptwrite
-@itemx -mptwrite
-@need 200
-@opindex mrdrnd
-@itemx -mrdrnd
-@need 200
-@opindex mf16c
-@itemx -mf16c
-@need 200
-@opindex mfma
-@itemx -mfma
-@need 200
-@opindex mpconfig
-@itemx -mpconfig
-@need 200
-@opindex mwbnoinvd
-@itemx -mwbnoinvd
-@need 200
-@opindex mfma4
-@itemx -mfma4
-@need 200
-@opindex mprfchw
-@itemx -mprfchw
-@need 200
-@opindex mrdpid
-@itemx -mrdpid
-@need 200
-@opindex mrdseed
-@itemx -mrdseed
-@need 200
-@opindex msgx
-@itemx -msgx
-@need 200
-@opindex mxop
-@itemx -mxop
-@need 200
-@opindex mlwp
-@itemx -mlwp
-@need 200
-@opindex m3dnow
-@itemx -m3dnow
-@need 200
-@opindex m3dnowa
-@itemx -m3dnowa
-@need 200
-@opindex mpopcnt
-@itemx -mpopcnt
-@need 200
-@opindex mabm
-@itemx -mabm
-@need 200
-@opindex madx
-@itemx -madx
-@need 200
-@opindex mbmi
-@itemx -mbmi
-@need 200
-@opindex mbmi2
-@itemx -mbmi2
-@need 200
-@opindex mlzcnt
-@itemx -mlzcnt
-@need 200
-@opindex mfxsr
-@itemx -mfxsr
-@need 200
-@opindex mxsave
-@itemx -mxsave
-@need 200
-@opindex mxsaveopt
-@itemx -mxsaveopt
-@need 200
-@opindex mxsavec
-@itemx -mxsavec
-@need 200
-@opindex mxsaves
-@itemx -mxsaves
-@need 200
-@opindex mrtm
-@itemx -mrtm
-@need 200
-@opindex mhle
-@itemx -mhle
-@need 200
-@opindex mtbm
-@itemx -mtbm
-@need 200
-@opindex mmwaitx
-@itemx -mmwaitx
-@need 200
-@opindex mclzero
-@itemx -mclzero
-@need 200
-@opindex mpku
-@itemx -mpku
-@need 200
-@opindex mavx512vbmi2
-@itemx -mavx512vbmi2
-@need 200
-@opindex mavx512bf16
-@itemx -mavx512bf16
-@need 200
-@opindex mavx512fp16
-@itemx -mavx512fp16
-@need 200
-@opindex mgfni
-@itemx -mgfni
-@need 200
-@opindex mvaes
-@itemx -mvaes
-@need 200
-@opindex mwaitpkg
-@itemx -mwaitpkg
-@need 200
-@opindex mvpclmulqdq
-@itemx -mvpclmulqdq
-@need 200
-@opindex mavx512bitalg
-@itemx -mavx512bitalg
-@need 200
-@opindex mmovdiri
-@itemx -mmovdiri
-@need 200
-@opindex mmovdir64b
-@itemx -mmovdir64b
-@need 200
-@opindex menqcmd
-@opindex muintr
-@itemx -menqcmd
-@itemx -muintr
-@need 200
-@opindex mtsxldtrk
-@itemx -mtsxldtrk
-@need 200
-@opindex mavx512vpopcntdq
-@itemx -mavx512vpopcntdq
-@need 200
-@opindex mavx512vp2intersect
-@itemx -mavx512vp2intersect
-@need 200
-@opindex mavx512vnni
-@itemx -mavx512vnni
-@need 200
-@opindex mavxvnni
-@itemx -mavxvnni
-@need 200
-@opindex mcldemote
-@itemx -mcldemote
-@need 200
-@opindex mserialize
-@itemx -mserialize
-@need 200
-@opindex mamx-tile
-@itemx -mamx-tile
-@need 200
-@opindex mamx-int8
-@itemx -mamx-int8
-@need 200
-@opindex mamx-bf16
-@itemx -mamx-bf16
-@need 200
-@opindex mhreset
-@opindex mkl
-@itemx -mhreset
-@itemx -mkl
-@need 200
-@opindex mwidekl
-@itemx -mwidekl
-@need 200
-@opindex mavxifma
-@itemx -mavxifma
-@need 200
-@opindex mavxvnniint8
-@itemx -mavxvnniint8
-@need 200
-@opindex mavxneconvert
-@itemx -mavxneconvert
-@need 200
-@opindex mcmpccxadd
-@itemx -mcmpccxadd
-@need 200
-@opindex mamx-fp16
-@itemx -mamx-fp16
-@need 200
-@opindex mprefetchi
-@itemx -mprefetchi
-@need 200
-@opindex mraoint
-@itemx -mraoint
-@need 200
-@opindex mamx-complex
-@itemx -mamx-complex
-@need 200
-@opindex mavxvnniint16
-@itemx -mavxvnniint16
-@need 200
-@opindex msm3
-@itemx -msm3
-@need 200
-@opindex msha512
-@itemx -msha512
-@need 200
-@opindex msm4
-@itemx -msm4
-@need 200
-@opindex mapxf
-@itemx -mapxf
-@need 200
-@opindex musermsr
-@itemx -musermsr
-@need 200
-@opindex mavx10.1
-@itemx -mavx10.1
-@need 200
-@opindex mavx10.2
-@itemx -mavx10.2
-@need 200
-@opindex mamx-avx512
-@itemx -mamx-avx512
-@need 200
-@opindex mamx-tf32
-@itemx -mamx-tf32
-@need 200
-@itemx -mamx-fp8
-@opindex mamx-fp8
-@need 200
-@opindex mmovrs
-@itemx -mmovrs
-@need 200
-@opindex mamx-movrs
-@itemx -mamx-movrs
-@need 200
-@opindex mavx512bmm
-@itemx -mavx512bmm
-These switches enable the use of instructions in the MMX, SSE,
-AVX512CD, AVX512VL, AVX512BW, AVX512DQ, AVX512IFMA, AVX512VBMI, SHA, AES,
-PCLMUL, CLFLUSHOPT, CLWB, FSGSBASE, PTWRITE, RDRND, F16C, FMA, PCONFIG,
-WBNOINVD, FMA4, PREFETCHW, RDPID, RDSEED, SGX, XOP, LWP, 3DNow!@:,
-enhanced 3DNow!@:, POPCNT, ABM, ADX, BMI, BMI2, LZCNT, FXSR, XSAVE, XSAVEOPT,
-XSAVEC, XSAVES, RTM, HLE, TBM, MWAITX, CLZERO, PKU, AVX512VBMI2, GFNI, VAES,
-WAITPKG, VPCLMULQDQ, AVX512BITALG, MOVDIRI, MOVDIR64B, AVX512BF16, ENQCMD,
-AVX512VPOPCNTDQ, AVX512VNNI, SERIALIZE, UINTR, HRESET, AMXTILE, AMXINT8,
-AMXBF16, KL, WIDEKL, AVXVNNI, AVX512-FP16, AVXIFMA, AVXVNNIINT8, AVXNECONVERT,
-CMPCCXADD, AMX-FP16, PREFETCHI, RAOINT, AMX-COMPLEX, AVXVNNIINT16, SM3, SHA512,
-SM4, APX_F, USER_MSR, AVX10.1, AVX10.2, AMX-AVX512, AMX-TF32, AMX-FP8, MOVRS,
-AMX-MOVRS, AVX512BMM or CLDEMOTE extended instruction sets. Each has a
-corresponding @option{-mno-} option to disable use of these instructions.
-
-These extensions are also available as built-in functions: see
-@ref{x86 Built-in Functions}, for details of the functions enabled and
-disabled by these switches.
-
-Note that @option{-msse4} enables both SSE4.1 and SSE4.2 support,
-while @option{-mno-sse4} turns off those features; neither form of the
-option affects SSE4A support, controlled separately by
-@option{-msse4a}.
-
-To generate SSE/SSE2 instructions automatically from floating-point
-code (as opposed to 387 instructions), see @option{-mfpmath=sse}.
-
-GCC depresses SSEx instructions when @option{-mavx} is used. Instead, it
-generates new AVX instructions or AVX equivalence for all SSEx instructions
-when needed.
-
-These options enable GCC to use these extended instructions in
-generated code, even without @option{-mfpmath=sse}.  Applications that
-perform run-time CPU detection must compile separate files for each
-supported architecture, using the appropriate flags.  In particular,
-the file containing the CPU detection code should be compiled without
-these options.
 
 @opindex mdump-tune-features
 @item -mdump-tune-features
@@ -37031,10 +37286,10 @@ tuning features and default settings. The names can be used in
 
 @opindex mtune-ctrl=@var{feature-list}
 @item -mtune-ctrl=@var{feature-list}
-This option is used to do fine grain control of x86 code generation features.
-@var{feature-list} is a comma separated list of @var{feature} names. See also
-@option{-mdump-tune-features}. When specified, the @var{feature} is turned
-on if it is not preceded with @samp{^}, otherwise, it is turned off.
+This option is used to do fine-grain control of x86 code generation features.
+@var{feature-list} is a comma-separated list of @var{feature} names. See also
+@option{-mdump-tune-features}.  When specified, the @var{feature} is turned
+on if it is not preceded with @samp{^}; otherwise, it is turned off.
 @option{-mtune-ctrl=@var{feature-list}} is intended to be used by GCC
 developers. Using it may lead to code paths not covered by testing and can
 potentially result in compiler ICEs or runtime errors.
@@ -37045,6 +37300,7 @@ This option instructs GCC to turn off all tunable features. See also
 @option{-mtune-ctrl=@var{feature-list}} and @option{-mdump-tune-features}.
 
 @opindex mcld
+@opindex mno-cld
 @item -mcld
 This option instructs GCC to emit a @code{cld} instruction in the prologue
 of functions that use string instructions.  String instructions depend on
@@ -37059,13 +37315,23 @@ instructions can be suppressed with the @option{-mno-cld} compiler option
 in this case.
 
 @opindex mvzeroupper
+@opindex mno-vzeroupper
 @item -mvzeroupper
 This option instructs GCC to emit a @code{vzeroupper} instruction
 before a transfer of control flow out of the function to minimize
 the AVX to SSE transition penalty as well as remove unnecessary @code{zeroupper}
 intrinsics.
 
+@opindex mstv
+@opindex mno-stv
+@item -mstv
+@itemx -mno-stv
+Enable/disable the Scalar to Vector pass, which transforms 64-bit
+integer computations into vector ones.  This optimization is restricted
+to @option{-O2} and higher.
+
 @opindex mprefer-avx128
+@opindex mno-prefer-avx128
 @item -mprefer-avx128
 This option instructs GCC to use 128-bit AVX instructions instead of
 256-bit AVX instructions in the auto-vectorizer.
@@ -37076,7 +37342,9 @@ This option instructs GCC to use @var{opt}-bit vector width in instructions
 instead of default on the selected platform.
 
 @opindex mpartial-vector-fp-math
+@opindex mno-partial-vector-fp-math
 @item -mpartial-vector-fp-math
+@itemx -mno-partial-vector-fp-math
 This option enables GCC to generate floating-point operations that might
 affect the set of floating-point status flags on partial vectors, where
 vector elements reside in the low part of the 128-bit SSE register.  Unless
@@ -37119,6 +37387,7 @@ Prefer 512-bit vector width for instructions.
 @end table
 
 @opindex mnoreturn-no-callee-saved-registers
+@opindex mno-noreturn-no-callee-saved-registers
 @item -mnoreturn-no-callee-saved-registers
 This option optimizes functions with @code{noreturn} attribute or
 @code{_Noreturn} specifier by not saving in the function prologue callee-saved
@@ -37128,6 +37397,7 @@ register).  This option can interfere with debugging of the caller of the
 is not enabled by default.
 
 @opindex mcx16
+@opindex mno-cx16
 @item -mcx16
 This option enables GCC to generate @code{CMPXCHG16B} instructions in 64-bit
 code to implement compare-and-exchange operations on 16-byte aligned 128-bit
@@ -37137,6 +37407,7 @@ machine word in size.  The compiler uses this instruction to implement
 128-bit integers, a library call is always used.
 
 @opindex msahf
+@opindex mno-sahf
 @item -msahf
 This option enables generation of @code{SAHF} instructions in 64-bit code.
 Early Intel Pentium 4 CPUs with Intel 64 support,
@@ -37149,28 +37420,33 @@ In 64-bit mode, the @code{SAHF} instruction is used to optimize @code{fmod},
 see @ref{Other Builtins} for details.
 
 @opindex mmovbe
+@opindex mno-movbe
 @item -mmovbe
 This option enables use of the @code{movbe} instruction to optimize
 byte swapping of four and eight byte entities.
 
 @opindex mshstk
+@opindex mno-shstk
 @item -mshstk
 The @option{-mshstk} option enables shadow stack built-in functions
 from x86 Control-flow Enforcement Technology (CET).
 
 @opindex mcrc32
+@opindex mno-crc32
 @item -mcrc32
 This option enables built-in functions @code{__builtin_ia32_crc32qi},
 @code{__builtin_ia32_crc32hi}, @code{__builtin_ia32_crc32si} and
 @code{__builtin_ia32_crc32di} to generate the @code{crc32} machine instruction.
 
 @opindex mmwait
+@opindex mno-mwait
 @item -mmwait
 This option enables built-in functions @code{__builtin_ia32_monitor},
 and @code{__builtin_ia32_mwait} to generate the @code{monitor} and
 @code{mwait} machine instructions.
 
 @opindex mrecip
+@opindex mno-recip
 @item -mrecip
 This option enables use of @code{RCPSS} and @code{RSQRTSS} instructions
 (and their vectorized variants @code{RCPPS} and @code{RSQRTPS})
@@ -37193,7 +37469,7 @@ for vectorized single-float division and vectorized @code{sqrtf(@var{x})}
 already with @option{-ffast-math} (or the above option combination), and
 doesn't need @option{-mrecip}.
 
-@opindex mrecip=opt
+@opindex mrecip=
 @item -mrecip=@var{opt}
 This option controls which reciprocal estimate instructions
 may be used.  @var{opt} is a comma-separated list of options, which may
@@ -37289,13 +37565,23 @@ You can control this behavior for specific functions by
 using the function attributes @code{ms_abi} and @code{sysv_abi}.
 @xref{Function Attributes}.
 
+@opindex masm=@var{dialect}
+@item -masm=@var{dialect}
+Output assembly instructions using selected @var{dialect}.  Also affects
+which dialect is used for basic @code{asm} (@pxref{Basic Asm}) and
+extended @code{asm} (@pxref{Extended Asm}). Supported choices (in dialect
+order) are @samp{att} or @samp{intel}. The default is @samp{att}. Darwin does
+not support @samp{intel}.
+
 @opindex mforce-indirect-call
+@opindex mno-force-indirect-call
 @item -mforce-indirect-call
 Force all calls to functions to be indirect. This is useful
 when using Intel Processor Trace where it generates more precise timing
 information for function calls.
 
 @opindex mmanual-endbr
+@opindex mno-manual-endbr
 @item -mmanual-endbr
 Insert ENDBR instruction at function entry only via the @code{cf_check}
 function attribute. This is useful when used with the option
@@ -37303,6 +37589,7 @@ function attribute. This is useful when used with the option
 function entry.
 
 @opindex mcet-switch
+@opindex mno-cet-switch
 @item -mcet-switch
 By default, CET instrumentation is turned off on switch statements that
 use a jump table and indirect branch track is disabled.  Since jump
@@ -37341,6 +37628,7 @@ by default.  In some cases disabling it may improve performance because of
 improved scheduling and reduced dependencies.
 
 @opindex maccumulate-outgoing-args
+@opindex mno-accumulate-outgoing-args
 @item -maccumulate-outgoing-args
 If enabled, the maximum amount of space required for outgoing arguments is
 computed in the function prologue.  This is faster on most modern CPUs
@@ -37348,14 +37636,6 @@ because of reduced dependencies, improved scheduling and reduced stack usage
 when the preferred stack boundary is not equal to 2.  The drawback is a notable
 increase in code size.  This switch implies @option{-mno-push-args}.
 
-@opindex mthreads
-@item -mthreads
-Support thread-safe exception handling on MinGW.  Programs that rely
-on thread-safe exception handling must compile and link all code with the
-@option{-mthreads} option.  When compiling, @option{-mthreads} defines
-@option{-D_MT}; when linking, it links in a special thread helper library
-@option{-lmingwthrd} which cleans up per-thread exception-handling data.
-
 @opindex mms-bitfields
 @opindex mno-ms-bitfields
 @item -mms-bitfields
@@ -37458,11 +37738,11 @@ Taking this into account, it is important to note the following:
 @enumerate
 @item If a zero-length bit-field follows a normal bit-field, the type of the
 zero-length bit-field may affect the alignment of the structure as whole. For
-example, @code{t2} has a size of 4 bytes, since the zero-length bit-field follows a
-normal bit-field, and is of type short.
+example, @code{t2} has a size of 4 bytes, since the zero-length bit-field
+follows a normal bit-field, and is of type short.
 
-@item Even if a zero-length bit-field is not followed by a normal bit-field, it may
-still affect the alignment of the structure:
+@item Even if a zero-length bit-field is not followed by a normal bit-field,
+it may still affect the alignment of the structure:
 
 @smallexample
 struct
@@ -37500,6 +37780,7 @@ code size and improves performance in case the destination is already aligned,
 but GCC doesn't know about it.
 
 @opindex minline-all-stringops
+@opindex mno-inline-all-stringops
 @item -minline-all-stringops
 By default GCC inlines string operations only when the destination is
 known to be aligned to least a 4-byte boundary.
@@ -37510,6 +37791,7 @@ The option enables inline expansion of @code{strlen} for all
 pointer alignments.
 
 @opindex minline-stringops-dynamically
+@opindex mno-inline-stringops-dynamically
 @item -minline-stringops-dynamically
 For string operations of unknown size, use run-time checks with
 inline code for small blocks and a library call for large blocks.
@@ -37536,31 +37818,34 @@ Always use a library call.
 
 @opindex mmemcpy-strategy=@var{strategy}
 @item -mmemcpy-strategy=@var{strategy}
-Override the internal decision heuristic to decide if @code{__builtin_memcpy}
-should be inlined and what inline algorithm to use when the expected size
-of the copy operation is known. @var{strategy}
-is a comma-separated list of @var{alg}:@var{max_size}:@var{dest_align} triplets.
-@var{alg} is specified in @option{-mstringop-strategy}, @var{max_size} specifies
-the max byte size with which inline algorithm @var{alg} is allowed.  For the last
-triplet, the @var{max_size} must be @code{-1}. The @var{max_size} of the triplets
-in the list must be specified in increasing order.  The minimal byte size for
-@var{alg} is @code{0} for the first triplet and @code{@var{max_size} + 1} of the
-preceding range.
+Override the internal decision heuristic to decide if
+@code{__builtin_memcpy} should be inlined and what inline algorithm to
+use when the expected size of the copy operation is known.
+@var{strategy} is a comma-separated list of
+@var{alg}:@var{max_size}:@var{dest_align} triplets.  @var{alg} is
+specified in @option{-mstringop-strategy}, @var{max_size} specifies
+the max byte size with which inline algorithm @var{alg} is allowed.
+For the last triplet, the @var{max_size} must be @code{-1}.
+The @var{max_size} of the triplets in the list must be specified in
+increasing order.  The minimal byte size for @var{alg} is @code{0} for
+the first triplet and @code{@var{max_size} + 1} of the preceding range.
 
 @opindex mmemset-strategy=@var{strategy}
 @item -mmemset-strategy=@var{strategy}
-The option is similar to @option{-mmemcpy-strategy=} except that it is to control
-@code{__builtin_memset} expansion.
+This option is similar to @option{-mmemcpy-strategy=} except that it
+controls the @code{__builtin_memset} expansion.
 
 @opindex momit-leaf-frame-pointer
+@opindex mno-omit-leaf-frame-pointer
 @item -momit-leaf-frame-pointer
 Don't keep the frame pointer in a register for leaf functions.  This
 avoids the instructions to save, set up, and restore frame pointers and
 makes an extra register available in leaf functions.  The option
-@option{-momit-leaf-frame-pointer} removes the frame pointer for leaf functions,
-which might make debugging harder.
+@option{-momit-leaf-frame-pointer} removes the frame pointer for leaf
+functions, which might make debugging harder.
 
 @opindex mtls-direct-seg-refs
+@opindex mno-tls-direct-seg-refs
 @item -mtls-direct-seg-refs
 @itemx -mno-tls-direct-seg-refs
 Controls whether TLS variables may be accessed with offsets from the
@@ -37572,12 +37857,14 @@ segment to cover the entire TLS area.
 For systems that use the GNU C Library, the default is on.
 
 @opindex msse2avx
+@opindex mno-sse2avx
 @item -msse2avx
 @itemx -mno-sse2avx
 Specify that the assembler should encode SSE instructions with VEX
 prefix.  The option @option{-mavx} turns this on by default.
 
 @opindex mfentry
+@opindex mno-fentry
 @item -mfentry
 @itemx -mno-fentry
 If profiling is active (@option{-pg}), put the profiling
@@ -37586,55 +37873,58 @@ Note: On x86 architectures the attribute @code{ms_hook_prologue}
 isn't possible at the moment for @option{-mfentry} and @option{-pg}.
 
 @opindex mrecord-mcount
+@opindex mno-record-mcount
 @item -mrecord-mcount
 @itemx -mno-record-mcount
-If profiling is active (@option{-pg}), generate a __mcount_loc section
-that contains pointers to each profiling call. This is useful for
-automatically patching and out calls.
+If profiling is active (@option{-pg}), generate a section that contains
+pointers to each profiling call; this is useful for automatically patching
+the calls.  You can use the @option{-mfentry-section=} option to set the
+name of the section; it defaults to @samp{__mcount_loc}.
 
 @opindex mnop-mcount
+@opindex mno-nop-mcount
 @item -mnop-mcount
 @itemx -mno-nop-mcount
 If profiling is active (@option{-pg}), generate the calls to
-the profiling functions as NOPs. This is useful when they
+the profiling functions as NOPs.  This is useful when they
 should be patched in later dynamically. This is likely only
 useful together with @option{-mrecord-mcount}.
 
 @opindex minstrument-return
 @item -minstrument-return=@var{type}
-Instrument function exit in -pg -mfentry instrumented functions with
-call to specified function. This only instruments true returns ending
-with ret, but not sibling calls ending with jump. Valid types
-are @var{none} to not instrument, @var{call} to generate a call to __return__,
-or @var{nop5} to generate a 5 byte nop.
+With @option{-pg -mfentry}, instrument function exits according to
+@var{type}, which may be one of @samp{none} to not instrument,
+@samp{call} to generate a call to @code{__return__}, or @samp{nop5} to
+generate a 5-byte nop sequence.  This option only instruments true
+returns ending with a @code{ret}, not sibling calls via a jump.
 
 @opindex mrecord-return
+@opindex mno-record-return
 @item -mrecord-return
 @itemx -mno-record-return
-Generate a __return_loc section pointing to all return instrumentation code.
+Generate a @code{__return_loc} section pointing to all return instrumentation
+code.
 
 @opindex mfentry-name
 @item -mfentry-name=@var{name}
-Set name of __fentry__ symbol called at function entry for -pg -mfentry functions.
+Set name of @code{__fentry__} symbol called at function entry for
+@option{-pg -mfentry} functions.
 
 @opindex mfentry-section
 @item -mfentry-section=@var{name}
-Set name of section to record -mrecord-mcount calls (default __mcount_loc).
+Set name of section to record @option{-mrecord-mcount} calls.  The
+default is @samp{__mcount_loc}.
 
 @opindex mskip-rax-setup
+@opindex mno-skip-rax-setup
 @item -mskip-rax-setup
 @itemx -mno-skip-rax-setup
 When generating code for the x86-64 architecture with SSE extensions
 disabled, @option{-mskip-rax-setup} can be used to skip setting up RAX
 register when there are no variable arguments passed in vector registers.
 
-@strong{Warning:} Since RAX register is used to avoid unnecessarily
-saving vector registers on stack when passing variable arguments, the
-impacts of this option are callees may waste some stack space,
-misbehave or jump to a random location.  GCC 4.4 or newer don't have
-those issues, regardless the RAX register value.
-
 @opindex m8bit-idiv
+@opindex mno-8bit-idiv
 @item -m8bit-idiv
 @itemx -mno-8bit-idiv
 On some processors, like Intel Atom, 8-bit unsigned integer divide is
@@ -37644,7 +37934,9 @@ to 255, 8-bit unsigned integer divide is used instead of
 32-bit/64-bit integer divide.
 
 @opindex mavx256-split-unaligned-load
+@opindex mno-avx256-split-unaligned-load
 @opindex mavx256-split-unaligned-store
+@opindex mno-avx256-split-unaligned-store
 @item -mavx256-split-unaligned-load
 @itemx -mavx256-split-unaligned-store
 Split 32-byte AVX unaligned load and store.
@@ -37679,6 +37971,7 @@ prevents the compiler from using floating-point, vector, mask and bound
 registers.
 
 @opindex mrelax-cmpxchg-loop
+@opindex mno-relax-cmpxchg-loop
 @item -mrelax-cmpxchg-loop
 When emitting a compare-and-swap loop for @ref{__sync Builtins}
 and @ref{__atomic Builtins} lacking a native instruction, optimize
@@ -37728,6 +38021,7 @@ not be reachable in the large code model.
 
 
 @opindex mindirect-branch-register
+@opindex mno-indirect-branch-register
 @item -mindirect-branch-register
 Force indirect call and jump via register.
 
@@ -37740,20 +38034,27 @@ hardening.  @samp{return} enables SLS hardening for function returns.
 @samp{all} enables all SLS hardening.
 
 @opindex mindirect-branch-cs-prefix
+@opindex mno-indirect-branch-cs-prefix
 @item -mindirect-branch-cs-prefix
 Add CS prefix to call and jmp to indirect thunk with branch target in
-r8-r15 registers so that the call and jmp instruction length is 6 bytes
-to allow them to be replaced with @samp{lfence; call *%r8-r15} or
-@samp{lfence; jmp *%r8-r15} at run-time.
+r8-r15 registers, so that the call and jmp instruction length is 6 bytes.
+This allows them to potentially be replaced with @samp{lfence; call *%r8-r15}
+or @samp{lfence; jmp *%r8-r15} at run time.
 
 @opindex mapx-inline-asm-use-gpr32
+@opindex mno-apx-inline-asm-use-gpr32
 @item -mapx-inline-asm-use-gpr32
-For inline asm support with APX, by default the EGPR feature was
-disabled to prevent potential illegal instruction with EGPR occurs.
-To invoke egpr usage in inline asm, use new compiler option
--mapx-inline-asm-use-gpr32 and user should ensure the instruction
-supports EGPR.
+By default, EGPR usage is disabled in inline asm constraints because GCC
+cannot know whether the asm instructions support EGPR.
+If your inline asm can handle EGPR, use @option{-mapx-inline-asm-use-gpr32}.
 
+@opindex mgather
+@opindex mno-gather
+@opindex mscatter
+@opindex mno-scatter
+@item -mgather
+@itemx -mscatter
+Enable vectorization for gather and scatter instructions, respectively.
 @end table
 
 These @samp{-m} switches are supported in addition to the above
@@ -37764,12 +38065,10 @@ on x86-64 processors in 64-bit environments.
 @opindex m64
 @opindex mx32
 @opindex m16
-@opindex miamcu
 @item -m32
 @itemx -m64
 @itemx -mx32
 @itemx -m16
-@itemx -miamcu
 Generate code for a 16-bit, 32-bit or 64-bit environment.
 The @option{-m32} option sets @code{int}, @code{long}, and pointer types
 to 32 bits, and
@@ -37788,7 +38087,10 @@ The @option{-m16} option is the same as @option{-m32}, except for that
 it outputs the @code{.code16gcc} assembly directive at the beginning of
 the assembly output so that the binary can run in 16-bit mode.
 
-The @option{-miamcu} option generates code which conforms to Intel MCU
+@opindex miamcu
+@opindex mno-iamcu
+@item -miamcu
+The @option{-miamcu} option generates code that conforms to Intel MCU
 psABI.  It requires the @option{-m32} option to be turned on.
 
 @opindex mno-red-zone
@@ -37840,6 +38142,7 @@ and x32 environments.  It is the default address mode for 32-bit and
 x32 environments.
 
 @opindex mneeded
+@opindex mno-needed
 @item -mneeded
 @itemx -mno-needed
 Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property for Linux target to
@@ -37848,7 +38151,7 @@ indicate the micro-architecture ISA level required to execute the binary.
 @opindex mno-direct-extern-access
 @opindex mdirect-extern-access
 @item -mno-direct-extern-access
-Without @option{-fpic} nor @option{-fPIC}, always use the GOT pointer
+Without @option{-fpic} or @option{-fPIC}, always use the GOT pointer
 to access external symbols.  With @option{-fpic} or @option{-fPIC},
 treat access to protected symbols as local symbols.  The default is
 @option{-mdirect-extern-access}.
@@ -37861,10 +38164,19 @@ protected symbols are used in shared libraries and executable.
 @opindex munroll-only-small-loops
 @opindex mno-unroll-only-small-loops
 @item -munroll-only-small-loops
-Controls conservative small loop unrolling. It is default enabled by
-O2, and unrolls loop with less than 4 insns by 1 time. Explicit
--f[no-]unroll-[all-]loops would disable this flag to avoid any
-unintended unrolling behavior that user does not want.
+@itemx -mno-unroll-only-small-loops
+Controls conservative small loop unrolling.  It is enabled by default with
+@option{-O2}, and unrolls loops with fewer than 4 instructions by 1 time.
+This gives better utilization of the instruction decoding pipeline on modern
+processors.  You can disable this with @option{-mno-unroll-only-small-loops},
+and it is also disabled if the more general options @option{-funroll-loops}
+or @option{-funroll-all-loops} are either enabled or explicitly disabled.
+
+@opindex mdispatch-scheduler
+@item -mdispatch-scheduler
+Enable dispatch scheduling.  This is only supported on @samp{bdver1},
+@samp{bdver2}, @samp{bdver3}, @samp{bdver4}, and @samp{znver1} processors and
+additionally requires @option{-fschedule-insns -fsched-pressure}.
 
 @opindex mlam
 @item -mlam=@var{choice}
-- 
2.39.5
