Ping: [PATCH v3] doc: Correction of Tree SSA Passes info.

2024-05-09 Thread Chenghui Pan

Ping. Is this version of patch ok for trunk? thanks!

On 2024/3/26 09:54, Chenghui Pan wrote:

Current document of Tree SSA passes contains many parts that is not
updated for many years.

This patch removes some info that is outdated and not existed in
current GCC codebase, and fixes some wrong code location descriptions
based on current codebase status and ChangeLogs.

Changes since v1:
* v3: Add reference to related PRs.
* v2: Add correct info for pass_build_alias.

gcc/ChangeLog:

PR rtl-optimization/951
PR tree-optimization/13756
* doc/passes.texi: Correction of Tree SSA Passes info.
---
  gcc/doc/passes.texi | 75 +
  1 file changed, 7 insertions(+), 68 deletions(-)

diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
index b50d3d5635b..b13ad06c5a9 100644
--- a/gcc/doc/passes.texi
+++ b/gcc/doc/passes.texi
@@ -450,17 +450,6 @@ The following briefly describes the Tree optimization 
passes that are
  run after gimplification and what source files they are located in.
  
  @itemize @bullet

-@item Remove useless statements
-
-This pass is an extremely simple sweep across the gimple code in which
-we identify obviously dead code and remove it.  Here we do things like
-simplify @code{if} statements with constant conditions, remove
-exception handling constructs surrounding code that obviously cannot
-throw, remove lexical bindings that contain no variables, and other
-assorted simplistic cleanups.  The idea is to get rid of the obvious
-stuff quickly rather than wait until later when it's more work to get
-rid of it.  This pass is located in @file{tree-cfg.cc} and described by
-@code{pass_remove_useless_stmts}.
  
  @item OpenMP lowering
  
@@ -478,7 +467,7 @@ described by @code{pass_lower_omp}.
  
  If OpenMP generation (@option{-fopenmp}) is enabled, this pass expands

  parallel regions into their own functions to be invoked by the thread
-library.  The pass is located in @file{omp-low.cc} and is described by
+library.  The pass is located in @file{omp-expand.cc} and is described by
  @code{pass_expand_omp}.
  
  @item Lower control flow

@@ -511,15 +500,6 @@ This pass decomposes a function into basic blocks and 
creates all of
  the edges that connect them.  It is located in @file{tree-cfg.cc} and
  is described by @code{pass_build_cfg}.
  
-@item Find all referenced variables

-
-This pass walks the entire function and collects an array of all
-variables referenced in the function, @code{referenced_vars}.  The
-index at which a variable is found in the array is used as a UID
-for the variable within this function.  This data is needed by the
-SSA rewriting routines.  The pass is located in @file{tree-dfa.cc}
-and is described by @code{pass_referenced_vars}.
-
  @item Enter static single assignment form
  
  This pass rewrites the function such that it is in SSA form.  After

@@ -562,15 +542,6 @@ variables that are used once into the expression that uses 
them and
  seeing if the result can be simplified.  It is located in
  @file{tree-ssa-forwprop.cc} and is described by @code{pass_forwprop}.
  
-@item Copy Renaming

-
-This pass attempts to change the name of compiler temporaries involved in
-copy operations such that SSA->normal can coalesce the copy away.  When 
compiler
-temporaries are copies of user variables, it also renames the compiler
-temporary to the user variable resulting in better use of user symbols.  It is
-located in @file{tree-ssa-copyrename.c} and is described by
-@code{pass_copyrename}.
-
  @item PHI node optimizations
  
  This pass recognizes forms of PHI inputs that can be represented as

@@ -581,12 +552,8 @@ It is located in @file{tree-ssa-phiopt.cc} and is 
described by
  @item May-alias optimization
  
  This pass performs a flow sensitive SSA-based points-to analysis.

-The resulting may-alias, must-alias, and escape analysis information
-is used to promote variables from in-memory addressable objects to
-non-aliased variables that can be renamed into SSA form.  We also
-update the @code{VDEF}/@code{VUSE} memory tags for non-renameable
-aggregates so that we get fewer false kills.  The pass is located
-in @file{tree-ssa-alias.cc} and is described by @code{pass_may_alias}.
+It is located in @file{tree-ssa-structalias.cc} and is described
+by @code{pass_build_alias}.
  
  Interprocedural points-to information is located in

  @file{tree-ssa-structalias.cc} and described by @code{pass_ipa_pta}.
@@ -604,7 +571,7 @@ is described by @code{pass_ipa_tree_profile}.
  This pass implements series of heuristics to guess propababilities
  of branches.  The resulting predictions are turned into edge profile
  by propagating branches across the control flow graphs.
-The pass is located in @file{tree-profile.cc} and is described by
+The pass is located in @file{predict.cc} and is described by
  @code{pass_profile}.
  
  @item Lower complex arithmetic

@@ -653,7 +620,7 @@ in @file{tree-ssa-math-opts.cc} and is descri

[PATCH v3] doc: Correction of Tree SSA Passes info.

2024-03-25 Thread Chenghui Pan
Current document of Tree SSA passes contains many parts that is not
updated for many years.

This patch removes some info that is outdated and not existed in
current GCC codebase, and fixes some wrong code location descriptions
based on current codebase status and ChangeLogs.

Changes since v1:
* v3: Add reference to related PRs.
* v2: Add correct info for pass_build_alias.

gcc/ChangeLog:

PR rtl-optimization/951
PR tree-optimization/13756
* doc/passes.texi: Correction of Tree SSA Passes info.
---
 gcc/doc/passes.texi | 75 +
 1 file changed, 7 insertions(+), 68 deletions(-)

diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
index b50d3d5635b..b13ad06c5a9 100644
--- a/gcc/doc/passes.texi
+++ b/gcc/doc/passes.texi
@@ -450,17 +450,6 @@ The following briefly describes the Tree optimization 
passes that are
 run after gimplification and what source files they are located in.
 
 @itemize @bullet
-@item Remove useless statements
-
-This pass is an extremely simple sweep across the gimple code in which
-we identify obviously dead code and remove it.  Here we do things like
-simplify @code{if} statements with constant conditions, remove
-exception handling constructs surrounding code that obviously cannot
-throw, remove lexical bindings that contain no variables, and other
-assorted simplistic cleanups.  The idea is to get rid of the obvious
-stuff quickly rather than wait until later when it's more work to get
-rid of it.  This pass is located in @file{tree-cfg.cc} and described by
-@code{pass_remove_useless_stmts}.
 
 @item OpenMP lowering
 
@@ -478,7 +467,7 @@ described by @code{pass_lower_omp}.
 
 If OpenMP generation (@option{-fopenmp}) is enabled, this pass expands
 parallel regions into their own functions to be invoked by the thread
-library.  The pass is located in @file{omp-low.cc} and is described by
+library.  The pass is located in @file{omp-expand.cc} and is described by
 @code{pass_expand_omp}.
 
 @item Lower control flow
@@ -511,15 +500,6 @@ This pass decomposes a function into basic blocks and 
creates all of
 the edges that connect them.  It is located in @file{tree-cfg.cc} and
 is described by @code{pass_build_cfg}.
 
-@item Find all referenced variables
-
-This pass walks the entire function and collects an array of all
-variables referenced in the function, @code{referenced_vars}.  The
-index at which a variable is found in the array is used as a UID
-for the variable within this function.  This data is needed by the
-SSA rewriting routines.  The pass is located in @file{tree-dfa.cc}
-and is described by @code{pass_referenced_vars}.
-
 @item Enter static single assignment form
 
 This pass rewrites the function such that it is in SSA form.  After
@@ -562,15 +542,6 @@ variables that are used once into the expression that uses 
them and
 seeing if the result can be simplified.  It is located in
 @file{tree-ssa-forwprop.cc} and is described by @code{pass_forwprop}.
 
-@item Copy Renaming
-
-This pass attempts to change the name of compiler temporaries involved in
-copy operations such that SSA->normal can coalesce the copy away.  When 
compiler
-temporaries are copies of user variables, it also renames the compiler
-temporary to the user variable resulting in better use of user symbols.  It is
-located in @file{tree-ssa-copyrename.c} and is described by
-@code{pass_copyrename}.
-
 @item PHI node optimizations
 
 This pass recognizes forms of PHI inputs that can be represented as
@@ -581,12 +552,8 @@ It is located in @file{tree-ssa-phiopt.cc} and is 
described by
 @item May-alias optimization
 
 This pass performs a flow sensitive SSA-based points-to analysis.
-The resulting may-alias, must-alias, and escape analysis information
-is used to promote variables from in-memory addressable objects to
-non-aliased variables that can be renamed into SSA form.  We also
-update the @code{VDEF}/@code{VUSE} memory tags for non-renameable
-aggregates so that we get fewer false kills.  The pass is located
-in @file{tree-ssa-alias.cc} and is described by @code{pass_may_alias}.
+It is located in @file{tree-ssa-structalias.cc} and is described
+by @code{pass_build_alias}.
 
 Interprocedural points-to information is located in
 @file{tree-ssa-structalias.cc} and described by @code{pass_ipa_pta}.
@@ -604,7 +571,7 @@ is described by @code{pass_ipa_tree_profile}.
 This pass implements series of heuristics to guess propababilities
 of branches.  The resulting predictions are turned into edge profile
 by propagating branches across the control flow graphs.
-The pass is located in @file{tree-profile.cc} and is described by
+The pass is located in @file{predict.cc} and is described by
 @code{pass_profile}.
 
 @item Lower complex arithmetic
@@ -653,7 +620,7 @@ in @file{tree-ssa-math-opts.cc} and is described by
 @item Full redundancy elimination
 
 This is a simpler form of PRE that only eliminates redundancies that
-occur on all paths.  It is 

Re: [PATCH v1] doc: Correction of Tree SSA Passes info.

2024-03-25 Thread Chenghui Pan

Seems right. I will reference them.

Also I think maybe some documents for the passes are still missing and 
we can add them


in the future...

On 2024/3/26 08:11, Andrew Pinski wrote:

On Sun, Mar 24, 2024 at 8:46 PM Chenghui Pan  wrote:

Current document of Tree SSA passes contains many parts that is not
updated for many years.

This patch removes some info that is outdated and not existed in
current GCC codebase, and fixes some wrong code location descriptions
based on current codebase status and ChangeLogs.


This improves the situation for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=951 (and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=13756 ). Maybe it should
include a reference to those 2 also.

Thanks,
Andrew


gcc/ChangeLog:

 * doc/passes.texi: Correction of Tree SSA Passes info.
---
  gcc/doc/passes.texi | 70 -
  1 file changed, 6 insertions(+), 64 deletions(-)

diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
index b50d3d5635b..068036acb7d 100644
--- a/gcc/doc/passes.texi
+++ b/gcc/doc/passes.texi
@@ -450,17 +450,6 @@ The following briefly describes the Tree optimization 
passes that are
  run after gimplification and what source files they are located in.

  @itemize @bullet
-@item Remove useless statements
-
-This pass is an extremely simple sweep across the gimple code in which
-we identify obviously dead code and remove it.  Here we do things like
-simplify @code{if} statements with constant conditions, remove
-exception handling constructs surrounding code that obviously cannot
-throw, remove lexical bindings that contain no variables, and other
-assorted simplistic cleanups.  The idea is to get rid of the obvious
-stuff quickly rather than wait until later when it's more work to get
-rid of it.  This pass is located in @file{tree-cfg.cc} and described by
-@code{pass_remove_useless_stmts}.

  @item OpenMP lowering

@@ -478,7 +467,7 @@ described by @code{pass_lower_omp}.

  If OpenMP generation (@option{-fopenmp}) is enabled, this pass expands
  parallel regions into their own functions to be invoked by the thread
-library.  The pass is located in @file{omp-low.cc} and is described by
+library.  The pass is located in @file{omp-expand.cc} and is described by
  @code{pass_expand_omp}.

  @item Lower control flow
@@ -511,15 +500,6 @@ This pass decomposes a function into basic blocks and 
creates all of
  the edges that connect them.  It is located in @file{tree-cfg.cc} and
  is described by @code{pass_build_cfg}.

-@item Find all referenced variables
-
-This pass walks the entire function and collects an array of all
-variables referenced in the function, @code{referenced_vars}.  The
-index at which a variable is found in the array is used as a UID
-for the variable within this function.  This data is needed by the
-SSA rewriting routines.  The pass is located in @file{tree-dfa.cc}
-and is described by @code{pass_referenced_vars}.
-
  @item Enter static single assignment form

  This pass rewrites the function such that it is in SSA form.  After
@@ -562,15 +542,6 @@ variables that are used once into the expression that uses 
them and
  seeing if the result can be simplified.  It is located in
  @file{tree-ssa-forwprop.cc} and is described by @code{pass_forwprop}.

-@item Copy Renaming
-
-This pass attempts to change the name of compiler temporaries involved in
-copy operations such that SSA->normal can coalesce the copy away.  When 
compiler
-temporaries are copies of user variables, it also renames the compiler
-temporary to the user variable resulting in better use of user symbols.  It is
-located in @file{tree-ssa-copyrename.c} and is described by
-@code{pass_copyrename}.
-
  @item PHI node optimizations

  This pass recognizes forms of PHI inputs that can be represented as
@@ -585,8 +556,7 @@ The resulting may-alias, must-alias, and escape analysis 
information
  is used to promote variables from in-memory addressable objects to
  non-aliased variables that can be renamed into SSA form.  We also
  update the @code{VDEF}/@code{VUSE} memory tags for non-renameable
-aggregates so that we get fewer false kills.  The pass is located
-in @file{tree-ssa-alias.cc} and is described by @code{pass_may_alias}.
+aggregates so that we get fewer false kills.

  Interprocedural points-to information is located in
  @file{tree-ssa-structalias.cc} and described by @code{pass_ipa_pta}.
@@ -604,7 +574,7 @@ is described by @code{pass_ipa_tree_profile}.
  This pass implements series of heuristics to guess propababilities
  of branches.  The resulting predictions are turned into edge profile
  by propagating branches across the control flow graphs.
-The pass is located in @file{tree-profile.cc} and is described by
+The pass is located in @file{predict.cc} and is described by
  @code{pass_profile}.

  @item Lower complex arithmetic
@@ -653,7 +623,7 @@ in @file{tree-ssa-math-opts.cc} and is described by
  @item Full redundancy eliminat

[PATCH v2] doc: Correction of Tree SSA Passes info.

2024-03-25 Thread Chenghui Pan
Current document of Tree SSA passes contains many parts that is not
updated for many years.

This patch removes some info that is outdated and not existed in
current GCC codebase, and fixes some wrong code location descriptions
based on current codebase status and ChangeLogs.

Changes since v1: Add correct info for pass_build_alias.

gcc/ChangeLog:

* doc/passes.texi: Correction of Tree SSA Passes info.
---
 gcc/doc/passes.texi | 75 +
 1 file changed, 7 insertions(+), 68 deletions(-)

diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
index b50d3d5635b..b13ad06c5a9 100644
--- a/gcc/doc/passes.texi
+++ b/gcc/doc/passes.texi
@@ -450,17 +450,6 @@ The following briefly describes the Tree optimization 
passes that are
 run after gimplification and what source files they are located in.
 
 @itemize @bullet
-@item Remove useless statements
-
-This pass is an extremely simple sweep across the gimple code in which
-we identify obviously dead code and remove it.  Here we do things like
-simplify @code{if} statements with constant conditions, remove
-exception handling constructs surrounding code that obviously cannot
-throw, remove lexical bindings that contain no variables, and other
-assorted simplistic cleanups.  The idea is to get rid of the obvious
-stuff quickly rather than wait until later when it's more work to get
-rid of it.  This pass is located in @file{tree-cfg.cc} and described by
-@code{pass_remove_useless_stmts}.
 
 @item OpenMP lowering
 
@@ -478,7 +467,7 @@ described by @code{pass_lower_omp}.
 
 If OpenMP generation (@option{-fopenmp}) is enabled, this pass expands
 parallel regions into their own functions to be invoked by the thread
-library.  The pass is located in @file{omp-low.cc} and is described by
+library.  The pass is located in @file{omp-expand.cc} and is described by
 @code{pass_expand_omp}.
 
 @item Lower control flow
@@ -511,15 +500,6 @@ This pass decomposes a function into basic blocks and 
creates all of
 the edges that connect them.  It is located in @file{tree-cfg.cc} and
 is described by @code{pass_build_cfg}.
 
-@item Find all referenced variables
-
-This pass walks the entire function and collects an array of all
-variables referenced in the function, @code{referenced_vars}.  The
-index at which a variable is found in the array is used as a UID
-for the variable within this function.  This data is needed by the
-SSA rewriting routines.  The pass is located in @file{tree-dfa.cc}
-and is described by @code{pass_referenced_vars}.
-
 @item Enter static single assignment form
 
 This pass rewrites the function such that it is in SSA form.  After
@@ -562,15 +542,6 @@ variables that are used once into the expression that uses 
them and
 seeing if the result can be simplified.  It is located in
 @file{tree-ssa-forwprop.cc} and is described by @code{pass_forwprop}.
 
-@item Copy Renaming
-
-This pass attempts to change the name of compiler temporaries involved in
-copy operations such that SSA->normal can coalesce the copy away.  When 
compiler
-temporaries are copies of user variables, it also renames the compiler
-temporary to the user variable resulting in better use of user symbols.  It is
-located in @file{tree-ssa-copyrename.c} and is described by
-@code{pass_copyrename}.
-
 @item PHI node optimizations
 
 This pass recognizes forms of PHI inputs that can be represented as
@@ -581,12 +552,8 @@ It is located in @file{tree-ssa-phiopt.cc} and is 
described by
 @item May-alias optimization
 
 This pass performs a flow sensitive SSA-based points-to analysis.
-The resulting may-alias, must-alias, and escape analysis information
-is used to promote variables from in-memory addressable objects to
-non-aliased variables that can be renamed into SSA form.  We also
-update the @code{VDEF}/@code{VUSE} memory tags for non-renameable
-aggregates so that we get fewer false kills.  The pass is located
-in @file{tree-ssa-alias.cc} and is described by @code{pass_may_alias}.
+It is located in @file{tree-ssa-structalias.cc} and is described
+by @code{pass_build_alias}.
 
 Interprocedural points-to information is located in
 @file{tree-ssa-structalias.cc} and described by @code{pass_ipa_pta}.
@@ -604,7 +571,7 @@ is described by @code{pass_ipa_tree_profile}.
 This pass implements series of heuristics to guess propababilities
 of branches.  The resulting predictions are turned into edge profile
 by propagating branches across the control flow graphs.
-The pass is located in @file{tree-profile.cc} and is described by
+The pass is located in @file{predict.cc} and is described by
 @code{pass_profile}.
 
 @item Lower complex arithmetic
@@ -653,7 +620,7 @@ in @file{tree-ssa-math-opts.cc} and is described by
 @item Full redundancy elimination
 
 This is a simpler form of PRE that only eliminates redundancies that
-occur on all paths.  It is located in @file{tree-ssa-pre.cc} and
+occur on all paths.  It is located in @file{tree-ssa-sccvn.cc} and
 

[PATCH v1] doc: Correction of Tree SSA Passes info.

2024-03-24 Thread Chenghui Pan
Current document of Tree SSA passes contains many parts that is not
updated for many years.

This patch removes some info that is outdated and not existed in
current GCC codebase, and fixes some wrong code location descriptions
based on current codebase status and ChangeLogs.

gcc/ChangeLog:

* doc/passes.texi: Correction of Tree SSA Passes info.
---
 gcc/doc/passes.texi | 70 -
 1 file changed, 6 insertions(+), 64 deletions(-)

diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
index b50d3d5635b..068036acb7d 100644
--- a/gcc/doc/passes.texi
+++ b/gcc/doc/passes.texi
@@ -450,17 +450,6 @@ The following briefly describes the Tree optimization 
passes that are
 run after gimplification and what source files they are located in.
 
 @itemize @bullet
-@item Remove useless statements
-
-This pass is an extremely simple sweep across the gimple code in which
-we identify obviously dead code and remove it.  Here we do things like
-simplify @code{if} statements with constant conditions, remove
-exception handling constructs surrounding code that obviously cannot
-throw, remove lexical bindings that contain no variables, and other
-assorted simplistic cleanups.  The idea is to get rid of the obvious
-stuff quickly rather than wait until later when it's more work to get
-rid of it.  This pass is located in @file{tree-cfg.cc} and described by
-@code{pass_remove_useless_stmts}.
 
 @item OpenMP lowering
 
@@ -478,7 +467,7 @@ described by @code{pass_lower_omp}.
 
 If OpenMP generation (@option{-fopenmp}) is enabled, this pass expands
 parallel regions into their own functions to be invoked by the thread
-library.  The pass is located in @file{omp-low.cc} and is described by
+library.  The pass is located in @file{omp-expand.cc} and is described by
 @code{pass_expand_omp}.
 
 @item Lower control flow
@@ -511,15 +500,6 @@ This pass decomposes a function into basic blocks and 
creates all of
 the edges that connect them.  It is located in @file{tree-cfg.cc} and
 is described by @code{pass_build_cfg}.
 
-@item Find all referenced variables
-
-This pass walks the entire function and collects an array of all
-variables referenced in the function, @code{referenced_vars}.  The
-index at which a variable is found in the array is used as a UID
-for the variable within this function.  This data is needed by the
-SSA rewriting routines.  The pass is located in @file{tree-dfa.cc}
-and is described by @code{pass_referenced_vars}.
-
 @item Enter static single assignment form
 
 This pass rewrites the function such that it is in SSA form.  After
@@ -562,15 +542,6 @@ variables that are used once into the expression that uses 
them and
 seeing if the result can be simplified.  It is located in
 @file{tree-ssa-forwprop.cc} and is described by @code{pass_forwprop}.
 
-@item Copy Renaming
-
-This pass attempts to change the name of compiler temporaries involved in
-copy operations such that SSA->normal can coalesce the copy away.  When 
compiler
-temporaries are copies of user variables, it also renames the compiler
-temporary to the user variable resulting in better use of user symbols.  It is
-located in @file{tree-ssa-copyrename.c} and is described by
-@code{pass_copyrename}.
-
 @item PHI node optimizations
 
 This pass recognizes forms of PHI inputs that can be represented as
@@ -585,8 +556,7 @@ The resulting may-alias, must-alias, and escape analysis 
information
 is used to promote variables from in-memory addressable objects to
 non-aliased variables that can be renamed into SSA form.  We also
 update the @code{VDEF}/@code{VUSE} memory tags for non-renameable
-aggregates so that we get fewer false kills.  The pass is located
-in @file{tree-ssa-alias.cc} and is described by @code{pass_may_alias}.
+aggregates so that we get fewer false kills.
 
 Interprocedural points-to information is located in
 @file{tree-ssa-structalias.cc} and described by @code{pass_ipa_pta}.
@@ -604,7 +574,7 @@ is described by @code{pass_ipa_tree_profile}.
 This pass implements series of heuristics to guess propababilities
 of branches.  The resulting predictions are turned into edge profile
 by propagating branches across the control flow graphs.
-The pass is located in @file{tree-profile.cc} and is described by
+The pass is located in @file{predict.cc} and is described by
 @code{pass_profile}.
 
 @item Lower complex arithmetic
@@ -653,7 +623,7 @@ in @file{tree-ssa-math-opts.cc} and is described by
 @item Full redundancy elimination
 
 This is a simpler form of PRE that only eliminates redundancies that
-occur on all paths.  It is located in @file{tree-ssa-pre.cc} and
+occur on all paths.  It is located in @file{tree-ssa-sccvn.cc} and
 described by @code{pass_fre}.
 
 @item Loop optimization
@@ -708,7 +678,7 @@ to align the number of iterations, and to align the memory 
accesses in the
 loop.
 The pass is implemented in @file{tree-vectorizer.cc} (the main driver),
 @file{tree-vect-loop.cc} and 

[PATCH v2 0/3] LoongArch: Cleanup unused/redundant codes.

2024-03-14 Thread Chenghui Pan
Changes from v1: Some correction about ChangeLog format.

There's some unused/redundant definitions inside LoongArch target support
codes, these patches make a simple cleanup. Regression test passed.

Chenghui Pan (3):
  LoongArch: Remove unused/useless definitions.
  LoongArch: Change loongarch_expand_vec_cmp()'s return type from bool
to void.
  LoongArch: Combine UNITS_PER_FP_REG and UNITS_PER_FPREG macros.

 gcc/config/loongarch/lasx.md|  6 ++--
 gcc/config/loongarch/loongarch-protos.h |  7 +
 gcc/config/loongarch/loongarch.cc   | 39 -
 gcc/config/loongarch/loongarch.h|  7 ++---
 gcc/config/loongarch/lsx.md |  6 ++--
 5 files changed, 13 insertions(+), 52 deletions(-)

-- 
2.39.3



[PATCH v2 3/3] LoongArch: Combine UNITS_PER_FP_REG and UNITS_PER_FPREG macros.

2024-03-14 Thread Chenghui Pan
These macros are completely same in definition, so we can keep the previous one
and eliminate later one.

gcc/ChangeLog:

* config/loongarch/loongarch.cc
(loongarch_hard_regno_mode_ok_uncached): Combine UNITS_PER_FP_REG and
UNITS_PER_FPREG macros.
(loongarch_hard_regno_nregs): Ditto.
(loongarch_class_max_nregs): Ditto.
(loongarch_get_separate_components): Ditto.
(loongarch_process_components): Ditto.
* config/loongarch/loongarch.h (UNITS_PER_FPREG): Ditto.
(UNITS_PER_HWFPVALUE): Ditto.
(UNITS_PER_FPVALUE): Ditto.
---
 gcc/config/loongarch/loongarch.cc | 10 +-
 gcc/config/loongarch/loongarch.h  |  7 ++-
 2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 7ef04329668..8f657ee1f9c 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -6770,7 +6770,7 @@ loongarch_hard_regno_mode_ok_uncached (unsigned int 
regno, machine_mode mode)
 and TRUNC.  There's no point allowing sizes smaller than a word,
 because the FPU has no appropriate load/store instructions.  */
   if (mclass == MODE_INT)
-   return size >= MIN_UNITS_PER_WORD && size <= UNITS_PER_FPREG;
+   return size >= MIN_UNITS_PER_WORD && size <= UNITS_PER_FP_REG;
 }
 
   return false;
@@ -6813,7 +6813,7 @@ loongarch_hard_regno_nregs (unsigned int regno, 
machine_mode mode)
   if (LASX_SUPPORTED_MODE_P (mode))
return 1;
 
-  return (GET_MODE_SIZE (mode) + UNITS_PER_FPREG - 1) / UNITS_PER_FPREG;
+  return (GET_MODE_SIZE (mode) + UNITS_PER_FP_REG - 1) / UNITS_PER_FP_REG;
 }
 
   /* All other registers are word-sized.  */
@@ -6848,7 +6848,7 @@ loongarch_class_max_nregs (enum reg_class rclass, 
machine_mode mode)
  else if (LSX_SUPPORTED_MODE_P (mode))
size = MIN (size, UNITS_PER_LSX_REG);
  else
-   size = MIN (size, UNITS_PER_FPREG);
+   size = MIN (size, UNITS_PER_FP_REG);
}
   left &= ~reg_class_contents[FP_REGS];
 }
@@ -8222,7 +8222,7 @@ loongarch_get_separate_components (void)
if (IMM12_OPERAND (offset))
  bitmap_set_bit (components, regno);
 
-   offset -= UNITS_PER_FPREG;
+   offset -= UNITS_PER_FP_REG;
   }
 
   /* Don't mess with the hard frame pointer.  */
@@ -8301,7 +8301,7 @@ loongarch_process_components (sbitmap components, 
loongarch_save_restore_fn fn)
if (bitmap_bit_p (components, regno))
  loongarch_save_restore_reg (mode, regno, offset, fn);
 
-   offset -= UNITS_PER_FPREG;
+   offset -= UNITS_PER_FP_REG;
   }
 }
 
diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index bf2351f0968..888a633961d 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -138,19 +138,16 @@ along with GCC; see the file COPYING3.  If not see
 /* Width of a LASX vector register in bits.  */
 #define BITS_PER_LASX_REG (UNITS_PER_LASX_REG * BITS_PER_UNIT)
 
-/* For LARCH, width of a floating point register.  */
-#define UNITS_PER_FPREG (TARGET_DOUBLE_FLOAT ? 8 : 4)
-
 /* The largest size of value that can be held in floating-point
registers and moved with a single instruction.  */
 #define UNITS_PER_HWFPVALUE \
-  (TARGET_SOFT_FLOAT ? 0 : UNITS_PER_FPREG)
+  (TARGET_SOFT_FLOAT ? 0 : UNITS_PER_FP_REG)
 
 /* The largest size of value that can be held in floating-point
registers.  */
 #define UNITS_PER_FPVALUE \
   (TARGET_SOFT_FLOAT ? 0 \
-   : TARGET_SINGLE_FLOAT ? UNITS_PER_FPREG \
+   : TARGET_SINGLE_FLOAT ? UNITS_PER_FP_REG \
 : LONG_DOUBLE_TYPE_SIZE / BITS_PER_UNIT)
 
 /* The number of bytes in a double.  */
-- 
2.39.3



[PATCH v2 2/3] LoongArch: Change loongarch_expand_vec_cmp()'s return type from bool to void.

2024-03-14 Thread Chenghui Pan
This function is always return true at the end of function implementation,
so the return value is useless.

gcc/ChangeLog:

* config/loongarch/lasx.md (vec_cmp): Remove checking
of loongarch_expand_vec_cmp()'s return value.
(vec_cmpu): Ditto.
* config/loongarch/lsx.md (vec_cmp): Ditto.
(vec_cmpu): Ditto.
* config/loongarch/loongarch-protos.h
(loongarch_expand_vec_cmp): Change loongarch_expand_vec_cmp()'s return
type from bool to void.
* config/loongarch/loongarch.cc (loongarch_expand_vec_cmp): Ditto.
---
 gcc/config/loongarch/lasx.md| 6 ++
 gcc/config/loongarch/loongarch-protos.h | 2 +-
 gcc/config/loongarch/loongarch.cc   | 3 +--
 gcc/config/loongarch/lsx.md | 6 ++
 4 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index ac84db7f0ce..8d4c6b4ec35 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -1383,8 +1383,7 @@ (define_expand "vec_cmp"
   (match_operand:LASX 3 "register_operand")]))]
   "ISA_HAS_LASX"
 {
-  bool ok = loongarch_expand_vec_cmp (operands);
-  gcc_assert (ok);
+  loongarch_expand_vec_cmp (operands);
   DONE;
 })
 
@@ -1395,8 +1394,7 @@ (define_expand "vec_cmpu"
   (match_operand:ILASX 3 "register_operand")]))]
   "ISA_HAS_LASX"
 {
-  bool ok = loongarch_expand_vec_cmp (operands);
-  gcc_assert (ok);
+  loongarch_expand_vec_cmp (operands);
   DONE;
 })
 
diff --git a/gcc/config/loongarch/loongarch-protos.h 
b/gcc/config/loongarch/loongarch-protos.h
index 871544f760c..e3ed2b912a5 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -95,7 +95,7 @@ extern void loongarch_split_lsx_fill_d (rtx, rtx);
 extern const char *loongarch_output_move (rtx, rtx);
 #ifdef RTX_CODE
 extern void loongarch_expand_scc (rtx *);
-extern bool loongarch_expand_vec_cmp (rtx *);
+extern void loongarch_expand_vec_cmp (rtx *);
 extern void loongarch_expand_conditional_branch (rtx *);
 extern void loongarch_expand_conditional_move (rtx *);
 extern void loongarch_expand_conditional_trap (rtx);
diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index b25624c9406..7ef04329668 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -10801,13 +10801,12 @@ loongarch_expand_vec_cond_mask_expr (machine_mode 
mode, machine_mode vimode,
 }
 
 /* Expand integer vector comparison */
-bool
+void
 loongarch_expand_vec_cmp (rtx operands[])
 {
 
   rtx_code code = GET_CODE (operands[1]);
   loongarch_expand_lsx_cmp (operands[0], code, operands[2], operands[3]);
-  return true;
 }
 
 /* Implement TARGET_PROMOTE_FUNCTION_MODE.  */
diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
index b9b94b9079c..87d3e7c5d9f 100644
--- a/gcc/config/loongarch/lsx.md
+++ b/gcc/config/loongarch/lsx.md
@@ -518,8 +518,7 @@ (define_expand "vec_cmp"
   (match_operand:LSX 3 "register_operand")]))]
   "ISA_HAS_LSX"
 {
-  bool ok = loongarch_expand_vec_cmp (operands);
-  gcc_assert (ok);
+  loongarch_expand_vec_cmp (operands);
   DONE;
 })
 
@@ -530,8 +529,7 @@ (define_expand "vec_cmpu"
   (match_operand:ILSX 3 "register_operand")]))]
   "ISA_HAS_LSX"
 {
-  bool ok = loongarch_expand_vec_cmp (operands);
-  gcc_assert (ok);
+  loongarch_expand_vec_cmp (operands);
   DONE;
 })
 
-- 
2.39.3



[PATCH v2 1/3] LoongArch: Remove unused/useless definitions.

2024-03-14 Thread Chenghui Pan
This patch removes some unnecessary definitions of target hook functions
according to the documentation of GCC.

gcc/ChangeLog:

* config/loongarch/loongarch-protos.h
(loongarch_cfun_has_cprestore_slot_p): Delete.
(loongarch_adjust_insn_length): Delete.
(current_section_name): Delete.
(loongarch_split_symbol_type): Delete.
* config/loongarch/loongarch.cc
(loongarch_case_values_threshold): Delete.
(loongarch_spill_class): Delete.
(TARGET_OPTAB_SUPPORTED_P): Delete.
(TARGET_CASE_VALUES_THRESHOLD): Delete.
(TARGET_SPILL_CLASS): Delete.
---
 gcc/config/loongarch/loongarch-protos.h |  5 -
 gcc/config/loongarch/loongarch.cc   | 26 -
 2 files changed, 31 deletions(-)

diff --git a/gcc/config/loongarch/loongarch-protos.h 
b/gcc/config/loongarch/loongarch-protos.h
index 1fdfda9af01..871544f760c 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -93,7 +93,6 @@ extern void loongarch_split_lsx_copy_d (rtx, rtx, rtx, rtx 
(*)(rtx, rtx, rtx));
 extern void loongarch_split_lsx_insert_d (rtx, rtx, rtx, rtx);
 extern void loongarch_split_lsx_fill_d (rtx, rtx);
 extern const char *loongarch_output_move (rtx, rtx);
-extern bool loongarch_cfun_has_cprestore_slot_p (void);
 #ifdef RTX_CODE
 extern void loongarch_expand_scc (rtx *);
 extern bool loongarch_expand_vec_cmp (rtx *);
@@ -135,7 +134,6 @@ extern int loongarch_class_max_nregs (enum reg_class, 
machine_mode);
 extern machine_mode loongarch_hard_regno_caller_save_mode (unsigned int,
   unsigned int,
   machine_mode);
-extern int loongarch_adjust_insn_length (rtx_insn *, int);
 extern const char *loongarch_output_conditional_branch (rtx_insn *, rtx *,
const char *,
const char *);
@@ -157,7 +155,6 @@ extern bool loongarch_global_symbol_noweak_p (const_rtx);
 extern bool loongarch_weak_symbol_p (const_rtx);
 extern bool loongarch_symbol_binds_local_p (const_rtx);
 
-extern const char *current_section_name (void);
 extern unsigned int current_section_flags (void);
 extern bool loongarch_use_ins_ext_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
 extern bool loongarch_check_zero_div_p (void);
@@ -198,8 +195,6 @@ extern bool loongarch_epilogue_uses (unsigned int);
 extern bool loongarch_load_store_bonding_p (rtx *, machine_mode, bool);
 extern bool loongarch_split_symbol_type (enum loongarch_symbol_type);
 
-typedef rtx (*mulsidi3_gen_fn) (rtx, rtx, rtx);
-
 extern void loongarch_register_frame_header_opt (void);
 extern void loongarch_expand_vec_cond_expr (machine_mode, machine_mode, rtx *);
 extern void loongarch_expand_vec_cond_mask_expr (machine_mode, machine_mode,
diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 70e31bb831c..b25624c9406 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -10810,23 +10810,6 @@ loongarch_expand_vec_cmp (rtx operands[])
   return true;
 }
 
-/* Implement TARGET_CASE_VALUES_THRESHOLD.  */
-
-unsigned int
-loongarch_case_values_threshold (void)
-{
-  return default_case_values_threshold ();
-}
-
-/* Implement TARGET_SPILL_CLASS.  */
-
-static reg_class_t
-loongarch_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED,
-  machine_mode mode ATTRIBUTE_UNUSED)
-{
-  return NO_REGS;
-}
-
 /* Implement TARGET_PROMOTE_FUNCTION_MODE.  */
 
 /* This function is equivalent to default_promote_function_mode_always_promote
@@ -11281,9 +11264,6 @@ loongarch_asm_code_end (void)
 #undef TARGET_FUNCTION_ARG_BOUNDARY
 #define TARGET_FUNCTION_ARG_BOUNDARY loongarch_function_arg_boundary
 
-#undef TARGET_OPTAB_SUPPORTED_P
-#define TARGET_OPTAB_SUPPORTED_P loongarch_optab_supported_p
-
 #undef TARGET_VECTOR_MODE_SUPPORTED_P
 #define TARGET_VECTOR_MODE_SUPPORTED_P loongarch_vector_mode_supported_p
 
@@ -11353,18 +11333,12 @@ loongarch_asm_code_end (void)
 #undef TARGET_SCHED_REASSOCIATION_WIDTH
 #define TARGET_SCHED_REASSOCIATION_WIDTH loongarch_sched_reassociation_width
 
-#undef TARGET_CASE_VALUES_THRESHOLD
-#define TARGET_CASE_VALUES_THRESHOLD loongarch_case_values_threshold
-
 #undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV
 #define TARGET_ATOMIC_ASSIGN_EXPAND_FENV loongarch_atomic_assign_expand_fenv
 
 #undef TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS
 #define TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS true
 
-#undef TARGET_SPILL_CLASS
-#define TARGET_SPILL_CLASS loongarch_spill_class
-
 #undef TARGET_HARD_REGNO_NREGS
 #define TARGET_HARD_REGNO_NREGS loongarch_hard_regno_nregs
 #undef TARGET_HARD_REGNO_MODE_OK
-- 
2.39.3



[PATCH v2] LoongArch: Remove masking process for operand 3 of xvpermi.q.

2024-03-13 Thread Chenghui Pan
The behavior of non-zero unused bits in xvpermi.q instruction's
third operand is undefined on LoongArch, according to our
discussion (https://github.com/llvm/llvm-project/pull/83540),
we think that keeping original insn operand as unmodified
state is better solution.

This patch partially reverts 7b158e036a95b1ab40793dd53bed7dbd770ffdaf.

gcc/ChangeLog:

* config/loongarch/lasx.md (lasx_xvpermi_q_):
Remove masking of operand 3.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c:
Reposition operand 3's value into instruction's defined accept range.
---
 gcc/config/loongarch/lasx.md| 5 -
 .../gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c   | 6 +++---
 2 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index ac84db7f0ce..3f25c0c1756 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -640,8 +640,6 @@ (define_insn "lasx_xvpermi_d__1"
(set_attr "mode" "")])
 
 ;; xvpermi.q
-;; Unused bits in operands[3] need be set to 0 to avoid
-;; causing undefined behavior on LA464.
 (define_insn "lasx_xvpermi_q_"
   [(set (match_operand:LASX 0 "register_operand" "=f")
(unspec:LASX
@@ -651,9 +649,6 @@ (define_insn "lasx_xvpermi_q_"
  UNSPEC_LASX_XVPERMI_Q))]
   "ISA_HAS_LASX"
 {
-  int mask = 0x33;
-  mask &= INTVAL (operands[3]);
-  operands[3] = GEN_INT (mask);
   return "xvpermi.q\t%u0,%u2,%3";
 }
   [(set_attr "type" "simd_splat")
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c 
b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c
index dbc29d2fb22..f89dfc31120 100644
--- a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c
@@ -27,7 +27,7 @@ main ()
   *((unsigned long*)& __m256i_result[2]) = 0x7fff7fff7fff;
   *((unsigned long*)& __m256i_result[1]) = 0x7fe37fe3001d001d;
   *((unsigned long*)& __m256i_result[0]) = 0x7fff7fff7fff;
-  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0x2a);
+  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0x22);
   ASSERTEQ_64 (__LINE__, __m256i_result, __m256i_out);
 
   *((unsigned long*)& __m256i_op0[3]) = 0x;
@@ -42,7 +42,7 @@ main ()
   *((unsigned long*)& __m256i_result[2]) = 0x0019001c;
   *((unsigned long*)& __m256i_result[1]) = 0x;
   *((unsigned long*)& __m256i_result[0]) = 0x01fe;
-  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0xb9);
+  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0x31);
   ASSERTEQ_64 (__LINE__, __m256i_result, __m256i_out);
 
   *((unsigned long*)& __m256i_op0[3]) = 0x00ff00ff00ff00ff;
@@ -57,7 +57,7 @@ main ()
   *((unsigned long*)& __m256i_result[2]) = 0x;
   *((unsigned long*)& __m256i_result[1]) = 0x00ff00ff00ff00ff;
   *((unsigned long*)& __m256i_result[0]) = 0x00ff00ff00ff00ff;
-  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0xca);
+  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0x02);
   ASSERTEQ_64 (__LINE__, __m256i_result, __m256i_out);
 
   return 0;
-- 
2.39.3



[PATCH v1] LoongArch: Remove masking process for operand 3 of xvpermi.q.

2024-03-11 Thread Chenghui Pan
The behavior of non-zero unused bits in xvpermi.q instruction's
third operand is undefined on LoongArch, according to our
discussion (https://github.com/llvm/llvm-project/pull/83540),
we think that keeping original insn operand as unmodified
state is better solution.

This patch partially reverts 7b158e036a95b1ab40793dd53bed7dbd770ffdaf.

gcc/ChangeLog:

* config/loongarch/lasx.md: Remove masking of operand 3.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c:
  Reposition operand 3's value into instruction's defined accept range.
---
 gcc/config/loongarch/lasx.md| 5 -
 .../gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c   | 6 +++---
 2 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index ac84db7f0ce..3f25c0c1756 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -640,8 +640,6 @@ (define_insn "lasx_xvpermi_d__1"
(set_attr "mode" "")])
 
 ;; xvpermi.q
-;; Unused bits in operands[3] need be set to 0 to avoid
-;; causing undefined behavior on LA464.
 (define_insn "lasx_xvpermi_q_"
   [(set (match_operand:LASX 0 "register_operand" "=f")
(unspec:LASX
@@ -651,9 +649,6 @@ (define_insn "lasx_xvpermi_q_"
  UNSPEC_LASX_XVPERMI_Q))]
   "ISA_HAS_LASX"
 {
-  int mask = 0x33;
-  mask &= INTVAL (operands[3]);
-  operands[3] = GEN_INT (mask);
   return "xvpermi.q\t%u0,%u2,%3";
 }
   [(set_attr "type" "simd_splat")
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c 
b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c
index dbc29d2fb22..f89dfc31120 100644
--- a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c
@@ -27,7 +27,7 @@ main ()
   *((unsigned long*)& __m256i_result[2]) = 0x7fff7fff7fff;
   *((unsigned long*)& __m256i_result[1]) = 0x7fe37fe3001d001d;
   *((unsigned long*)& __m256i_result[0]) = 0x7fff7fff7fff;
-  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0x2a);
+  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0x22);
   ASSERTEQ_64 (__LINE__, __m256i_result, __m256i_out);
 
   *((unsigned long*)& __m256i_op0[3]) = 0x;
@@ -42,7 +42,7 @@ main ()
   *((unsigned long*)& __m256i_result[2]) = 0x0019001c;
   *((unsigned long*)& __m256i_result[1]) = 0x;
   *((unsigned long*)& __m256i_result[0]) = 0x01fe;
-  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0xb9);
+  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0x31);
   ASSERTEQ_64 (__LINE__, __m256i_result, __m256i_out);
 
   *((unsigned long*)& __m256i_op0[3]) = 0x00ff00ff00ff00ff;
@@ -57,7 +57,7 @@ main ()
   *((unsigned long*)& __m256i_result[2]) = 0x;
   *((unsigned long*)& __m256i_result[1]) = 0x00ff00ff00ff00ff;
   *((unsigned long*)& __m256i_result[0]) = 0x00ff00ff00ff00ff;
-  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0xca);
+  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0x02);
   ASSERTEQ_64 (__LINE__, __m256i_result, __m256i_out);
 
   return 0;
-- 
2.39.3



[PATCH v1 1/3] LoongArch: Remove unused/useless definitions.

2024-03-11 Thread Chenghui Pan
This patch removes some unnecessary definitions of target hook
functions according to the documentation of GCC.

gcc/ChangeLog:

* config/loongarch/loongarch-protos.h 
(loongarch_cfun_has_cprestore_slot_p): Delete.
(loongarch_adjust_insn_length): Delete.
(current_section_name): Delete.
(loongarch_split_symbol_type): Delete.
* config/loongarch/loongarch.cc (loongarch_case_values_threshold): 
Delete.
(loongarch_spill_class): Delete.
(TARGET_OPTAB_SUPPORTED_P): Delete.
(TARGET_CASE_VALUES_THRESHOLD): Delete.
(TARGET_SPILL_CLASS): Delete.

Change-Id: I115b3f5e45170d67dcbd7afb5d73e5ac4108b647
---
 gcc/config/loongarch/loongarch-protos.h |  5 -
 gcc/config/loongarch/loongarch.cc   | 26 -
 2 files changed, 31 deletions(-)

diff --git a/gcc/config/loongarch/loongarch-protos.h 
b/gcc/config/loongarch/loongarch-protos.h
index 1fdfda9af01..871544f760c 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -93,7 +93,6 @@ extern void loongarch_split_lsx_copy_d (rtx, rtx, rtx, rtx 
(*)(rtx, rtx, rtx));
 extern void loongarch_split_lsx_insert_d (rtx, rtx, rtx, rtx);
 extern void loongarch_split_lsx_fill_d (rtx, rtx);
 extern const char *loongarch_output_move (rtx, rtx);
-extern bool loongarch_cfun_has_cprestore_slot_p (void);
 #ifdef RTX_CODE
 extern void loongarch_expand_scc (rtx *);
 extern bool loongarch_expand_vec_cmp (rtx *);
@@ -135,7 +134,6 @@ extern int loongarch_class_max_nregs (enum reg_class, 
machine_mode);
 extern machine_mode loongarch_hard_regno_caller_save_mode (unsigned int,
   unsigned int,
   machine_mode);
-extern int loongarch_adjust_insn_length (rtx_insn *, int);
 extern const char *loongarch_output_conditional_branch (rtx_insn *, rtx *,
const char *,
const char *);
@@ -157,7 +155,6 @@ extern bool loongarch_global_symbol_noweak_p (const_rtx);
 extern bool loongarch_weak_symbol_p (const_rtx);
 extern bool loongarch_symbol_binds_local_p (const_rtx);
 
-extern const char *current_section_name (void);
 extern unsigned int current_section_flags (void);
 extern bool loongarch_use_ins_ext_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
 extern bool loongarch_check_zero_div_p (void);
@@ -198,8 +195,6 @@ extern bool loongarch_epilogue_uses (unsigned int);
 extern bool loongarch_load_store_bonding_p (rtx *, machine_mode, bool);
 extern bool loongarch_split_symbol_type (enum loongarch_symbol_type);
 
-typedef rtx (*mulsidi3_gen_fn) (rtx, rtx, rtx);
-
 extern void loongarch_register_frame_header_opt (void);
 extern void loongarch_expand_vec_cond_expr (machine_mode, machine_mode, rtx *);
 extern void loongarch_expand_vec_cond_mask_expr (machine_mode, machine_mode,
diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 0428b6e65d5..8f13d9ca264 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -10797,23 +10797,6 @@ loongarch_expand_vec_cmp (rtx operands[])
   return true;
 }
 
-/* Implement TARGET_CASE_VALUES_THRESHOLD.  */
-
-unsigned int
-loongarch_case_values_threshold (void)
-{
-  return default_case_values_threshold ();
-}
-
-/* Implement TARGET_SPILL_CLASS.  */
-
-static reg_class_t
-loongarch_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED,
-  machine_mode mode ATTRIBUTE_UNUSED)
-{
-  return NO_REGS;
-}
-
 /* Implement TARGET_PROMOTE_FUNCTION_MODE.  */
 
 /* This function is equivalent to default_promote_function_mode_always_promote
@@ -11268,9 +11251,6 @@ loongarch_asm_code_end (void)
 #undef TARGET_FUNCTION_ARG_BOUNDARY
 #define TARGET_FUNCTION_ARG_BOUNDARY loongarch_function_arg_boundary
 
-#undef TARGET_OPTAB_SUPPORTED_P
-#define TARGET_OPTAB_SUPPORTED_P loongarch_optab_supported_p
-
 #undef TARGET_VECTOR_MODE_SUPPORTED_P
 #define TARGET_VECTOR_MODE_SUPPORTED_P loongarch_vector_mode_supported_p
 
@@ -11340,18 +11320,12 @@ loongarch_asm_code_end (void)
 #undef TARGET_SCHED_REASSOCIATION_WIDTH
 #define TARGET_SCHED_REASSOCIATION_WIDTH loongarch_sched_reassociation_width
 
-#undef TARGET_CASE_VALUES_THRESHOLD
-#define TARGET_CASE_VALUES_THRESHOLD loongarch_case_values_threshold
-
 #undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV
 #define TARGET_ATOMIC_ASSIGN_EXPAND_FENV loongarch_atomic_assign_expand_fenv
 
 #undef TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS
 #define TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS true
 
-#undef TARGET_SPILL_CLASS
-#define TARGET_SPILL_CLASS loongarch_spill_class
-
 #undef TARGET_HARD_REGNO_NREGS
 #define TARGET_HARD_REGNO_NREGS loongarch_hard_regno_nregs
 #undef TARGET_HARD_REGNO_MODE_OK
-- 
2.39.3



[PATCH v1 0/3] LoongArch: Cleanup unused/redundant codes.

2024-03-11 Thread Chenghui Pan
There's some ununsed/useless definition inside LoongArch target support
codes, these patches make a simple cleanup. Regression test passed.

Chenghui Pan (3):
  LoongArch: Remove unused/useless definitions.
  LoongArch: Change loongarch_expand_vec_cmp()'s return type from bool
to void.
  LoongArch: Combine UNITS_PER_FP_REG and UNITS_PER_FPREG macros.

 gcc/config/loongarch/lasx.md|  6 ++--
 gcc/config/loongarch/loongarch-protos.h |  7 +
 gcc/config/loongarch/loongarch.cc   | 39 -
 gcc/config/loongarch/loongarch.h|  7 ++---
 gcc/config/loongarch/lsx.md |  6 ++--
 5 files changed, 13 insertions(+), 52 deletions(-)

-- 
2.39.3



[PATCH v1 3/3] LoongArch: Combine UNITS_PER_FP_REG and UNITS_PER_FPREG macros.

2024-03-11 Thread Chenghui Pan
These macros are completely same in definition, so we can keep the previous one 
and
eliminate later one.

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_hard_regno_mode_ok_uncached):
Combine UNITS_PER_FP_REG and UNITS_PER_FPREG macros.
(loongarch_hard_regno_nregs): Ditto.
(loongarch_class_max_nregs): Ditto.
(loongarch_get_separate_components): Ditto.
(loongarch_process_components): Ditto.
* config/loongarch/loongarch.h (UNITS_PER_FPREG): Ditto.
(UNITS_PER_HWFPVALUE): Ditto.
(UNITS_PER_FPVALUE): Ditto.

Change-Id: Ic3b846bcc5af710a3bfd9ae1db771ae93411ea13
---
 gcc/config/loongarch/loongarch.cc | 10 +-
 gcc/config/loongarch/loongarch.h  |  7 ++-
 2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index e1073c9debd..519e29db1c3 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -6757,7 +6757,7 @@ loongarch_hard_regno_mode_ok_uncached (unsigned int 
regno, machine_mode mode)
 and TRUNC.  There's no point allowing sizes smaller than a word,
 because the FPU has no appropriate load/store instructions.  */
   if (mclass == MODE_INT)
-   return size >= MIN_UNITS_PER_WORD && size <= UNITS_PER_FPREG;
+   return size >= MIN_UNITS_PER_WORD && size <= UNITS_PER_FP_REG;
 }
 
   return false;
@@ -6800,7 +6800,7 @@ loongarch_hard_regno_nregs (unsigned int regno, 
machine_mode mode)
   if (LASX_SUPPORTED_MODE_P (mode))
return 1;
 
-  return (GET_MODE_SIZE (mode) + UNITS_PER_FPREG - 1) / UNITS_PER_FPREG;
+  return (GET_MODE_SIZE (mode) + UNITS_PER_FP_REG - 1) / UNITS_PER_FP_REG;
 }
 
   /* All other registers are word-sized.  */
@@ -6835,7 +6835,7 @@ loongarch_class_max_nregs (enum reg_class rclass, 
machine_mode mode)
  else if (LSX_SUPPORTED_MODE_P (mode))
size = MIN (size, UNITS_PER_LSX_REG);
  else
-   size = MIN (size, UNITS_PER_FPREG);
+   size = MIN (size, UNITS_PER_FP_REG);
}
   left &= ~reg_class_contents[FP_REGS];
 }
@@ -8209,7 +8209,7 @@ loongarch_get_separate_components (void)
if (IMM12_OPERAND (offset))
  bitmap_set_bit (components, regno);
 
-   offset -= UNITS_PER_FPREG;
+   offset -= UNITS_PER_FP_REG;
   }
 
   /* Don't mess with the hard frame pointer.  */
@@ -8288,7 +8288,7 @@ loongarch_process_components (sbitmap components, 
loongarch_save_restore_fn fn)
if (bitmap_bit_p (components, regno))
  loongarch_save_restore_reg (mode, regno, offset, fn);
 
-   offset -= UNITS_PER_FPREG;
+   offset -= UNITS_PER_FP_REG;
   }
 }
 
diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index bf2351f0968..888a633961d 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -138,19 +138,16 @@ along with GCC; see the file COPYING3.  If not see
 /* Width of a LASX vector register in bits.  */
 #define BITS_PER_LASX_REG (UNITS_PER_LASX_REG * BITS_PER_UNIT)
 
-/* For LARCH, width of a floating point register.  */
-#define UNITS_PER_FPREG (TARGET_DOUBLE_FLOAT ? 8 : 4)
-
 /* The largest size of value that can be held in floating-point
registers and moved with a single instruction.  */
 #define UNITS_PER_HWFPVALUE \
-  (TARGET_SOFT_FLOAT ? 0 : UNITS_PER_FPREG)
+  (TARGET_SOFT_FLOAT ? 0 : UNITS_PER_FP_REG)
 
 /* The largest size of value that can be held in floating-point
registers.  */
 #define UNITS_PER_FPVALUE \
   (TARGET_SOFT_FLOAT ? 0 \
-   : TARGET_SINGLE_FLOAT ? UNITS_PER_FPREG \
+   : TARGET_SINGLE_FLOAT ? UNITS_PER_FP_REG \
 : LONG_DOUBLE_TYPE_SIZE / BITS_PER_UNIT)
 
 /* The number of bytes in a double.  */
-- 
2.39.3



[PATCH v1 2/3] LoongArch: Change loongarch_expand_vec_cmp()'s return type from bool to void.

2024-03-11 Thread Chenghui Pan
This function is always return true at the end of function implementation,
so the return value is useless.

gcc/ChangeLog:

* config/loongarch/lasx.md: Remove checking of 
loongarch_expand_vec_cmp()'s
return value.
* config/loongarch/loongarch-protos.h (loongarch_expand_vec_cmp): Change
loongarch_expand_vec_cmp()'s return type from bool to void.
* config/loongarch/loongarch.cc (loongarch_expand_vec_cmp): Ditto.
* config/loongarch/lsx.md: Remove checking of 
loongarch_expand_vec_cmp()'s
return value.

Change-Id: I4925ec9c7355125a231f2ab8b8b00c14b72739a2
---
 gcc/config/loongarch/lasx.md| 6 ++
 gcc/config/loongarch/loongarch-protos.h | 2 +-
 gcc/config/loongarch/loongarch.cc   | 3 +--
 gcc/config/loongarch/lsx.md | 6 ++
 4 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index ac84db7f0ce..8d4c6b4ec35 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -1383,8 +1383,7 @@
   (match_operand:LASX 3 "register_operand")]))]
   "ISA_HAS_LASX"
 {
-  bool ok = loongarch_expand_vec_cmp (operands);
-  gcc_assert (ok);
+  loongarch_expand_vec_cmp (operands);
   DONE;
 })
 
@@ -1395,8 +1394,7 @@
   (match_operand:ILASX 3 "register_operand")]))]
   "ISA_HAS_LASX"
 {
-  bool ok = loongarch_expand_vec_cmp (operands);
-  gcc_assert (ok);
+  loongarch_expand_vec_cmp (operands);
   DONE;
 })
 
diff --git a/gcc/config/loongarch/loongarch-protos.h 
b/gcc/config/loongarch/loongarch-protos.h
index 871544f760c..e3ed2b912a5 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -95,7 +95,7 @@ extern void loongarch_split_lsx_fill_d (rtx, rtx);
 extern const char *loongarch_output_move (rtx, rtx);
 #ifdef RTX_CODE
 extern void loongarch_expand_scc (rtx *);
-extern bool loongarch_expand_vec_cmp (rtx *);
+extern void loongarch_expand_vec_cmp (rtx *);
 extern void loongarch_expand_conditional_branch (rtx *);
 extern void loongarch_expand_conditional_move (rtx *);
 extern void loongarch_expand_conditional_trap (rtx);
diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 8f13d9ca264..e1073c9debd 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -10788,13 +10788,12 @@ loongarch_expand_vec_cond_mask_expr (machine_mode 
mode, machine_mode vimode,
 }
 
 /* Expand integer vector comparison */
-bool
+void
 loongarch_expand_vec_cmp (rtx operands[])
 {
 
   rtx_code code = GET_CODE (operands[1]);
   loongarch_expand_lsx_cmp (operands[0], code, operands[2], operands[3]);
-  return true;
 }
 
 /* Implement TARGET_PROMOTE_FUNCTION_MODE.  */
diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
index b9b94b9079c..87d3e7c5d9f 100644
--- a/gcc/config/loongarch/lsx.md
+++ b/gcc/config/loongarch/lsx.md
@@ -518,8 +518,7 @@
   (match_operand:LSX 3 "register_operand")]))]
   "ISA_HAS_LSX"
 {
-  bool ok = loongarch_expand_vec_cmp (operands);
-  gcc_assert (ok);
+  loongarch_expand_vec_cmp (operands);
   DONE;
 })
 
@@ -530,8 +529,7 @@
   (match_operand:ILSX 3 "register_operand")]))]
   "ISA_HAS_LSX"
 {
-  bool ok = loongarch_expand_vec_cmp (operands);
-  gcc_assert (ok);
+  loongarch_expand_vec_cmp (operands);
   DONE;
 })
 
-- 
2.39.3



[PATCH v1] LoongArch: Fix insn output of vec_concat templates for LASX.

2023-12-22 Thread Chenghui Pan
When investigaing failure of gcc.dg/vect/slp-reduc-sad.c, following
instruction block are being generated by vec_concatv32qi (which is
generated by vec_initv32qiv16qi) at entrance of foo() function:

  vldx$vr3,$r5,$r6
  vld $vr2,$r5,0
  xvpermi.q   $xr2,$xr3,0x20

causes the reversion of vec_initv32qiv16qi operation's high and
low 128-bit part.

According to other target's similar impl and LSX impl for following
RTL representation, current definition in lasx.md of "vec_concat"
are wrong:

  (set (op0) (vec_concat (op1) (op2)))

For correct behavior, the last argument of xvpermi.q should be 0x02
instead of 0x20. This patch fixes this issue and cleanup the vec_concat
template impl.

gcc/ChangeLog:

* config/loongarch/lasx.md (vec_concatv4di): Delete.
(vec_concatv8si): Delete.
(vec_concatv16hi): Delete.
(vec_concatv32qi): Delete.
(vec_concatv4df): Delete.
(vec_concatv8sf): Delete.
(vec_concat): New template with insn output fixed.
---
 gcc/config/loongarch/lasx.md | 74 
 1 file changed, 7 insertions(+), 67 deletions(-)

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index eeac8cd984b..a9d948bb606 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -590,77 +590,17 @@ (define_insn "lasx_xvinsgr2vr_"
   [(set_attr "type" "simd_insert")
(set_attr "mode" "")])
 
-(define_insn "vec_concatv4di"
-  [(set (match_operand:V4DI 0 "register_operand" "=f")
-   (vec_concat:V4DI
- (match_operand:V2DI 1 "register_operand" "0")
- (match_operand:V2DI 2 "register_operand" "f")))]
-  "ISA_HAS_LASX"
-{
-  return "xvpermi.q\t%u0,%u2,0x20";
-}
-  [(set_attr "type" "simd_splat")
-   (set_attr "mode" "V4DI")])
-
-(define_insn "vec_concatv8si"
-  [(set (match_operand:V8SI 0 "register_operand" "=f")
-   (vec_concat:V8SI
- (match_operand:V4SI 1 "register_operand" "0")
- (match_operand:V4SI 2 "register_operand" "f")))]
-  "ISA_HAS_LASX"
-{
-  return "xvpermi.q\t%u0,%u2,0x20";
-}
-  [(set_attr "type" "simd_splat")
-   (set_attr "mode" "V4DI")])
-
-(define_insn "vec_concatv16hi"
-  [(set (match_operand:V16HI 0 "register_operand" "=f")
-   (vec_concat:V16HI
- (match_operand:V8HI 1 "register_operand" "0")
- (match_operand:V8HI 2 "register_operand" "f")))]
-  "ISA_HAS_LASX"
-{
-  return "xvpermi.q\t%u0,%u2,0x20";
-}
-  [(set_attr "type" "simd_splat")
-   (set_attr "mode" "V4DI")])
-
-(define_insn "vec_concatv32qi"
-  [(set (match_operand:V32QI 0 "register_operand" "=f")
-   (vec_concat:V32QI
- (match_operand:V16QI 1 "register_operand" "0")
- (match_operand:V16QI 2 "register_operand" "f")))]
-  "ISA_HAS_LASX"
-{
-  return "xvpermi.q\t%u0,%u2,0x20";
-}
-  [(set_attr "type" "simd_splat")
-   (set_attr "mode" "V4DI")])
-
-(define_insn "vec_concatv4df"
-  [(set (match_operand:V4DF 0 "register_operand" "=f")
-   (vec_concat:V4DF
- (match_operand:V2DF 1 "register_operand" "0")
- (match_operand:V2DF 2 "register_operand" "f")))]
-  "ISA_HAS_LASX"
-{
-  return "xvpermi.q\t%u0,%u2,0x20";
-}
-  [(set_attr "type" "simd_splat")
-   (set_attr "mode" "V4DF")])
-
-(define_insn "vec_concatv8sf"
-  [(set (match_operand:V8SF 0 "register_operand" "=f")
-   (vec_concat:V8SF
- (match_operand:V4SF 1 "register_operand" "0")
- (match_operand:V4SF 2 "register_operand" "f")))]
+(define_insn "vec_concat"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+   (vec_concat:LASX
+ (match_operand: 1 "register_operand" "0")
+ (match_operand: 2 "register_operand" "f")))]
   "ISA_HAS_LASX"
 {
-  return "xvpermi.q\t%u0,%u2,0x20";
+  return "xvpermi.q\t%u0,%u2,0x02";
 }
   [(set_attr "type" "simd_splat")
-   (set_attr "mode" "V4DI")])
+   (set_attr "mode" "")])
 
 ;; xshuf.w
 (define_insn "lasx_xvperm_"
-- 
2.39.3



[PATCH v1] LoongArch: Fix ICE when passing two same vector argument consecutively

2023-12-22 Thread Chenghui Pan
Following code will cause ICE on LoongArch target:

  #include 

  extern void bar (__m128i, __m128i);

  __m128i a;

  void
  foo ()
  {
bar (a, a);
  }

It is caused by missing constraint definition in mov_lsx. This
patch fixes the template and remove the unnecessary processing from
loongarch_split_move () function.

This patch also cleanup the redundant definition from
loongarch_split_move () and loongarch_split_move_p ().

gcc/ChangeLog:

* config/loongarch/lasx.md: Use loongarch_split_move and
  loongarch_split_move_p directly.
* config/loongarch/loongarch-protos.h
(loongarch_split_move): Remove unnecessary argument.
(loongarch_split_move_insn_p): Delete.
(loongarch_split_move_insn): Delete.
* config/loongarch/loongarch.cc
(loongarch_split_move_insn_p): Delete.
(loongarch_load_store_insns): Use loongarch_split_move_p
  directly.
(loongarch_split_move): remove the unnecessary processing.
(loongarch_split_move_insn): Delete.
* config/loongarch/lsx.md: Use loongarch_split_move and
  loongarch_split_move_p directly.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vector/lsx/lsx-mov-1.c: New test.
---
 gcc/config/loongarch/lasx.md  |  4 +-
 gcc/config/loongarch/loongarch-protos.h   |  4 +-
 gcc/config/loongarch/loongarch.cc | 49 +--
 gcc/config/loongarch/lsx.md   | 10 ++--
 .../loongarch/vector/lsx/lsx-mov-1.c  | 14 ++
 5 files changed, 24 insertions(+), 57 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-mov-1.c

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index eeac8cd984b..6418ff52fe5 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -912,10 +912,10 @@ (define_split
   [(set (match_operand:LASX 0 "nonimmediate_operand")
(match_operand:LASX 1 "move_operand"))]
   "reload_completed && ISA_HAS_LASX
-   && loongarch_split_move_insn_p (operands[0], operands[1])"
+   && loongarch_split_move_p (operands[0], operands[1])"
   [(const_int 0)]
 {
-  loongarch_split_move_insn (operands[0], operands[1], curr_insn);
+  loongarch_split_move (operands[0], operands[1]);
   DONE;
 })
 
diff --git a/gcc/config/loongarch/loongarch-protos.h 
b/gcc/config/loongarch/loongarch-protos.h
index c66ab932d67..7bf21a45c69 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -82,11 +82,9 @@ extern rtx loongarch_legitimize_call_address (rtx);
 
 extern rtx loongarch_subword (rtx, bool);
 extern bool loongarch_split_move_p (rtx, rtx);
-extern void loongarch_split_move (rtx, rtx, rtx);
+extern void loongarch_split_move (rtx, rtx);
 extern bool loongarch_addu16i_imm12_operand_p (HOST_WIDE_INT, machine_mode);
 extern void loongarch_split_plus_constant (rtx *, machine_mode);
-extern bool loongarch_split_move_insn_p (rtx, rtx);
-extern void loongarch_split_move_insn (rtx, rtx, rtx);
 extern void loongarch_split_128bit_move (rtx, rtx);
 extern bool loongarch_split_128bit_move_p (rtx, rtx);
 extern void loongarch_split_256bit_move (rtx, rtx);
diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 390e3206a17..98709123770 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -2562,7 +2562,6 @@ loongarch_split_const_insns (rtx x)
   return low + high;
 }
 
-bool loongarch_split_move_insn_p (rtx dest, rtx src);
 /* Return one word of 128-bit value OP, taking into account the fixed
endianness of certain registers.  BYTE selects from the byte address.  */
 
@@ -2602,7 +2601,7 @@ loongarch_load_store_insns (rtx mem, rtx_insn *insn)
 {
   set = single_set (insn);
   if (set
- && !loongarch_split_move_insn_p (SET_DEST (set), SET_SRC (set)))
+ && !loongarch_split_move_p (SET_DEST (set), SET_SRC (set)))
might_split_p = false;
 }
 
@@ -4220,7 +4219,7 @@ loongarch_split_move_p (rtx dest, rtx src)
SPLIT_TYPE describes the split condition.  */
 
 void
-loongarch_split_move (rtx dest, rtx src, rtx insn_)
+loongarch_split_move (rtx dest, rtx src)
 {
   rtx low_dest;
 
@@ -4258,33 +4257,6 @@ loongarch_split_move (rtx dest, rtx src, rtx insn_)
   loongarch_subword (src, true));
}
 }
-
-  /* This is a hack.  See if the next insn uses DEST and if so, see if we
- can forward SRC for DEST.  This is most useful if the next insn is a
- simple store.  */
-  rtx_insn *insn = (rtx_insn *) insn_;
-  struct loongarch_address_info addr = {};
-  if (insn)
-{
-  rtx_insn *next = next_nonnote_nondebug_insn_bb (insn);
-  if (next)
-   {
- rtx set = single_set (next);
- if (set && SET_SRC (set) == dest)
-   {
- if (MEM_P (src))
-   {
- rtx tmp = XEXP (src, 0);
- 

Re: [PATCH 0/2] LoongArch: Fix PR113033 and clean up code

2023-12-19 Thread Chenghui Pan

The patches are look ok and no problem in spec2017 correctness check.

On 2023/12/19 15:26, chenglulu wrote:

We will read and test these patches as soon as possible.

Thanks!

在 2023/12/19 下午2:59, Xi Ruoyao 写道:

Superseds
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640871.html.

Per Jakub's response, vec_init patterns do not have a predicate on the
input operand so the operand can be *anything*.  It's not safe to simply
move it into an reg, and we have to use force_reg instead.

The code clean up is separated into the 2nd patch to make reviewing
easier.

Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?

Xi Ruoyao (2):
   LoongArch: Use force_reg instead of gen_reg_rtx + emit_move_insn in
 vec_init expander [PR113033]
   LoongArch: Clean up vec_init expander

  gcc/config/loongarch/loongarch.cc | 54 +++
  gcc/testsuite/gcc.target/loongarch/pr113033.c | 23 
  2 files changed, 43 insertions(+), 34 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/loongarch/pr113033.c





Re: Fwd: [PATCH] LoongArch: Fix FP vector comparsons [PR113034]

2023-12-19 Thread Chenghui Pan
Hi, I checked the correctness of spec2017 and regression test of gcc, it 
seems ok!


On 2023/12/18 17:04, chenglulu wrote:

 转发的消息 
主题: [PATCH] LoongArch: Fix FP vector comparsons [PR113034]
日期: Sun, 17 Dec 2023 23:12:18 +0800
发件人:Xi Ruoyao 
收件人:gcc-patches@gcc.gnu.org
抄送: 	chenglulu , i...@xen0n.name, 
xucheng...@loongson.cn, Jiahao Xu , c...@jia.je, Xi 
Ruoyao 




We had the following mappings between vfcmp submenmonics and RTX
codes:

(define_code_attr fcc
[(unordered "cun")
(ordered "cor")
(eq "ceq")
(ne "cne")
(uneq "cueq")
(unle "cule")
(unlt "cult")
(le "cle")
(lt "clt")])

This is inconsistent with scalar code:

(define_code_attr fcond [(unordered "cun")
(uneq "cueq")
(unlt "cult")
(unle "cule")
(eq "ceq")
(lt "slt")
(le "sle")
(ordered "cor")
(ltgt "sne")
(ne "cune")
(ge "sge")
(gt "sgt")
(unge "cuge")
(ungt "cugt")])

For every RTX code for which the LSX/LASX code is different from the
scalar code, the scalar code is correct and the LSX/LASX code is wrong.
Most seriously, the RTX code NE should be mapped to "cneq", not "cne".


The "cneq" in the commit info may be "cune" according to the context?



Rewrite vfcmp define_insns in simd.md using the same mapping as
scalar fcmp.

Note that GAS does not support [x]vfcmp.{c/s}[u]{ge/gt} (pseudo)
instruction (although fcmp.{c/s}[u]{ge/gt} is supported), so we need to
switch the order of inputs and use [x]vfcmp.{c/s}[u]{le/lt} instead.

The vfcmp.{sult/sule/clt/cle}.{s/d} instructions do not have a single
RTX code, but they can be modeled as an inversed RTX code following a
"not" operation. Doing so allows the compiler to optimized vectorized
__builtin_isless etc. to a single instruction. This optimization should
be added for scalar code too and I'll do it later.

Tests are added for mapping between C code, IEC 60559 operations, and
vfcmp instructions.

[1]:https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640713.html

gcc/ChangeLog:

PR target/113034
* config/loongarch/lasx.md (UNSPEC_LASX_XVFCMP_*): Remove.
(lasx_xvfcmp_caf_): Remove.
(lasx_xvfcmp_cune_): Remove.
(FSC256_UNS): Remove.
(fsc256): Remove.
(lasx_xvfcmp__): Remove.
(lasx_xvfcmp__): Remove.
* config/loongarch/lsx.md (UNSPEC_LSX_XVFCMP_*): Remove.
(lsx_vfcmp_caf_): Remove.
(lsx_vfcmp_cune_): Remove.
(vfcond): Remove.
(fcc): Remove.
(FSC_UNS): Remove.
(fsc): Remove.
(lsx_vfcmp__): Remove.
(lsx_vfcmp__): Remove.
* config/loongarch/simd.md
(fcond_simd): New define_code_iterator.
(_vfcmp__):
New define_insn.
(fcond_simd_rev): New define_code_iterator.
(fcond_rev_asm): New define_code_attr.
(_vfcmp__):
New define_insn.
(fcond_inv): New define_code_iterator.
(fcond_inv_rev): New define_code_iterator.
(fcond_inv_rev_asm): New define_code_attr.
(_vfcmp__): New define_insn.
(_vfcmp__):
New define_insn.
(UNSPEC_SIMD_FCMP_CAF, UNSPEC_SIMD_FCMP_SAF,
UNSPEC_SIMD_FCMP_SEQ, UNSPEC_SIMD_FCMP_SUN,
UNSPEC_SIMD_FCMP_SUEQ, UNSPEC_SIMD_FCMP_CNE,
UNSPEC_SIMD_FCMP_SOR, UNSPEC_SIMD_FCMP_SUNE): New unspecs.
(SIMD_FCMP): New define_int_iterator.
(fcond_unspec): New define_int_attr.
(_vfcmp__): New define_insn.
* config/loongarch/loongarch.cc (loongarch_expand_lsx_cmp):
Remove unneeded special cases.

gcc/testsuite/ChangeLog:

PR target/113034
* gcc.target/loongarch/vfcmp-f.c: New test.
* gcc.target/loongarch/vfcmp-d.c: New test.
* gcc.target/loongarch/xvfcmp-f.c: New test.
* gcc.target/loongarch/xvfcmp-d.c: New test.
* gcc.target/loongarch/vector/lasx/lasx-vcond-2.c: Scan for cune
instead of cne.
* gcc.target/loongarch/vector/lsx/lsx-vcond-2.c: Likewise.
---

Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk?

gcc/config/loongarch/lasx.md | 76 
gcc/config/loongarch/loongarch.cc | 60 +-
gcc/config/loongarch/lsx.md | 83 
gcc/config/loongarch/simd.md | 118 
.../loongarch/vector/lasx/lasx-vcond-2.c | 4 +-
.../loongarch/vector/lsx/lsx-vcond-2.c | 4 +-
gcc/testsuite/gcc.target/loongarch/vfcmp-d.c | 28 +++
gcc/testsuite/gcc.target/loongarch/vfcmp-f.c | 178 ++
gcc/testsuite/gcc.target/loongarch/xvfcmp-d.c | 29 +++
gcc/testsuite/gcc.target/loongarch/xvfcmp-f.c | 27 +++
10 files changed, 385 insertions(+), 222 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/loongarch/vfcmp-d.c
create mode 100644 gcc/testsuite/gcc.target/loongarch/vfcmp-f.c
create mode 100644 gcc/testsuite/gcc.target/loongarch/xvfcmp-d.c
create mode 100644 gcc/testsuite/gcc.target/loongarch/xvfcmp-f.c

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index eeac8cd984b..921ce0eebed 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -32,9 +32,7 @@ (define_c_enum "unspec" [
UNSPEC_LASX_XVBITREVI
UNSPEC_LASX_XVBITSET
UNSPEC_LASX_XVBITSETI
- UNSPEC_LASX_XVFCMP_CAF
UNSPEC_LASX_XVFCLASS
- UNSPEC_LASX_XVFCMP_CUNE
UNSPEC_LASX_XVFCVT
UNSPEC_LASX_XVFCVTH
UNSPEC_LASX_XVFCVTL
@@ -44,17 +42,6 @@ (define_c_enum "unspec" [
UNSPEC_LASX_XVFRINT
UNSPEC_LASX_XVFRSQRT
UNSPEC_LASX_XVFRSQRTE
- UNSPEC_LASX_XVFCMP_SAF
- 

[PATCH v1] LoongArch: Fix instruction name typo in lsx_vreplgr2vr_ template

2023-11-03 Thread Chenghui Pan
gcc/ChangeLog:

* config/loongarch/lsx.md: Fix instruction name typo in
lsx_vreplgr2vr_ template.
---
 gcc/config/loongarch/lsx.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
index 4af32c8dfe1..55c7d79a030 100644
--- a/gcc/config/loongarch/lsx.md
+++ b/gcc/config/loongarch/lsx.md
@@ -1523,7 +1523,7 @@ (define_insn "lsx_vreplgr2vr_"
   "ISA_HAS_LSX"
 {
   if (which_alternative == 1)
-return "ldi.\t%w0,0";
+return "vldi.\t%w0,0";
 
   if (!TARGET_64BIT && (mode == V2DImode || mode == V2DFmode))
 return "#";
-- 
2.36.0



[PATCH v1] LoongArch: Fix vfrint-releated comments in lsxintrin.h and lasxintrin.h

2023-10-22 Thread Chenghui Pan
The comment of vfrint-related intrinsic functions does not match the return
value type in definition. This patch fixes these comments.

gcc/ChangeLog:

* config/loongarch/lasxintrin.h (__lasx_xvftintrnel_l_s): Fix comments.
(__lasx_xvfrintrne_s): Ditto.
(__lasx_xvfrintrne_d): Ditto.
(__lasx_xvfrintrz_s): Ditto.
(__lasx_xvfrintrz_d): Ditto.
(__lasx_xvfrintrp_s): Ditto.
(__lasx_xvfrintrp_d): Ditto.
(__lasx_xvfrintrm_s): Ditto.
(__lasx_xvfrintrm_d): Ditto.
* config/loongarch/lsxintrin.h (__lsx_vftintrneh_l_s): Ditto.
(__lsx_vfrintrne_s): Ditto.
(__lsx_vfrintrne_d): Ditto.
(__lsx_vfrintrz_s): Ditto.
(__lsx_vfrintrz_d): Ditto.
(__lsx_vfrintrp_s): Ditto.
(__lsx_vfrintrp_d): Ditto.
(__lsx_vfrintrm_s): Ditto.
(__lsx_vfrintrm_d): Ditto.
---
 gcc/config/loongarch/lasxintrin.h | 16 
 gcc/config/loongarch/lsxintrin.h  | 16 
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/gcc/config/loongarch/lasxintrin.h 
b/gcc/config/loongarch/lasxintrin.h
index d3937992746..7bce2c757f1 100644
--- a/gcc/config/loongarch/lasxintrin.h
+++ b/gcc/config/loongarch/lasxintrin.h
@@ -3368,7 +3368,7 @@ __m256i __lasx_xvftintrnel_l_s (__m256 _1)
 }
 
 /* Assembly instruction format:xd, xj.  */
-/* Data types in instruction templates:  V8SI, V8SF.  */
+/* Data types in instruction templates:  V8SF, V8SF.  */
 extern __inline __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __m256 __lasx_xvfrintrne_s (__m256 _1)
 {
@@ -3376,7 +3376,7 @@ __m256 __lasx_xvfrintrne_s (__m256 _1)
 }
 
 /* Assembly instruction format:xd, xj.  */
-/* Data types in instruction templates:  V4DI, V4DF.  */
+/* Data types in instruction templates:  V4DF, V4DF.  */
 extern __inline __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __m256d __lasx_xvfrintrne_d (__m256d _1)
 {
@@ -3384,7 +3384,7 @@ __m256d __lasx_xvfrintrne_d (__m256d _1)
 }
 
 /* Assembly instruction format:xd, xj.  */
-/* Data types in instruction templates:  V8SI, V8SF.  */
+/* Data types in instruction templates:  V8SF, V8SF.  */
 extern __inline __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __m256 __lasx_xvfrintrz_s (__m256 _1)
 {
@@ -3392,7 +3392,7 @@ __m256 __lasx_xvfrintrz_s (__m256 _1)
 }
 
 /* Assembly instruction format:xd, xj.  */
-/* Data types in instruction templates:  V4DI, V4DF.  */
+/* Data types in instruction templates:  V4DF, V4DF.  */
 extern __inline __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __m256d __lasx_xvfrintrz_d (__m256d _1)
 {
@@ -3400,7 +3400,7 @@ __m256d __lasx_xvfrintrz_d (__m256d _1)
 }
 
 /* Assembly instruction format:xd, xj.  */
-/* Data types in instruction templates:  V8SI, V8SF.  */
+/* Data types in instruction templates:  V8SF, V8SF.  */
 extern __inline __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __m256 __lasx_xvfrintrp_s (__m256 _1)
 {
@@ -3408,7 +3408,7 @@ __m256 __lasx_xvfrintrp_s (__m256 _1)
 }
 
 /* Assembly instruction format:xd, xj.  */
-/* Data types in instruction templates:  V4DI, V4DF.  */
+/* Data types in instruction templates:  V4DF, V4DF.  */
 extern __inline __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __m256d __lasx_xvfrintrp_d (__m256d _1)
 {
@@ -3416,7 +3416,7 @@ __m256d __lasx_xvfrintrp_d (__m256d _1)
 }
 
 /* Assembly instruction format:xd, xj.  */
-/* Data types in instruction templates:  V8SI, V8SF.  */
+/* Data types in instruction templates:  V8SF, V8SF.  */
 extern __inline __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __m256 __lasx_xvfrintrm_s (__m256 _1)
 {
@@ -3424,7 +3424,7 @@ __m256 __lasx_xvfrintrm_s (__m256 _1)
 }
 
 /* Assembly instruction format:xd, xj.  */
-/* Data types in instruction templates:  V4DI, V4DF.  */
+/* Data types in instruction templates:  V4DF, V4DF.  */
 extern __inline __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __m256d __lasx_xvfrintrm_d (__m256d _1)
 {
diff --git a/gcc/config/loongarch/lsxintrin.h b/gcc/config/loongarch/lsxintrin.h
index ec42069904d..29553c093fa 100644
--- a/gcc/config/loongarch/lsxintrin.h
+++ b/gcc/config/loongarch/lsxintrin.h
@@ -3412,7 +3412,7 @@ __m128i __lsx_vftintrneh_l_s (__m128 _1)
 }
 
 /* Assembly instruction format:vd, vj.  */
-/* Data types in instruction templates:  V4SI, V4SF.  */
+/* Data types in instruction templates:  V4SF, V4SF.  */
 extern __inline __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __m128 __lsx_vfrintrne_s (__m128 _1)
 {
@@ -3420,7 +3420,7 @@ __m128 __lsx_vfrintrne_s (__m128 _1)
 }
 
 /* Assembly instruction format:vd, vj.  */
-/* Data types in instruction templates:  V2DI, V2DF.  */
+/* Data types in instruction templates:  V2DF, V2DF.  */
 extern __inline __attribute__((__gnu_inline__, 

[PATCH v1] LoongArch: Fix vec_initv32qiv16qi template to avoid ICE.

2023-10-11 Thread Chenghui Pan
Following test code triggers unrecognized insn ICE on LoongArch target
with "-O3 -mlasx":

void
foo (unsigned char *dst, unsigned char *src)
{
  for (int y = 0; y < 16; y++)
{
  for (int x = 0; x < 16; x++)
dst[x] = src[x] + 1;
  dst += 32;
  src += 32;
}
}

ICE info:
./test.c: In function ‘foo’:
./test.c:8:1: error: unrecognizable insn:
8 | }
  | ^
(insn 15 14 16 4 (set (reg:V32QI 185 [ vect__24.7 ])
(vec_concat:V32QI (reg:V16QI 186)
(const_vector:V16QI [
(const_int 0 [0]) repeated x16
]))) "./test.c":4:19 -1
 (nil))
during RTL pass: vregs
./test.c:8:1: internal compiler error: in extract_insn, at recog.cc:2791
0x12028023b _fatal_insn(char const*, rtx_def const*, char const*, int, char 
const*)
/home/panchenghui/upstream/gcc/gcc/rtl-error.cc:108
0x12028026f _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
/home/panchenghui/upstream/gcc/gcc/rtl-error.cc:116
0x120a03c5b extract_insn(rtx_insn*)
/home/panchenghui/upstream/gcc/gcc/recog.cc:2791
0x12067ff73 instantiate_virtual_regs_in_insn
/home/panchenghui/upstream/gcc/gcc/function.cc:1610
0x12067ff73 instantiate_virtual_regs
/home/panchenghui/upstream/gcc/gcc/function.cc:1983
0x12067ff73 execute
/home/panchenghui/upstream/gcc/gcc/function.cc:2030

This RTL is generated inside loongarch_expand_vector_group_init function 
(related
to vec_initv32qiv16qi template). Original impl doesn't ensure all vec_concat 
arguments
are register type. This patch adds force_reg() to the vec_concat argument 
generation.

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_expand_vector_group_init):
  fix impl related to vec_initv32qiv16qi template to avoid ICE.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vector/lasx/lasx-vec-init-1.c: New test.
---
 gcc/config/loongarch/loongarch.cc  |  3 ++-
 .../loongarch/vector/lasx/lasx-vec-init-1.c| 14 ++
 2 files changed, 16 insertions(+), 1 deletion(-)
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-vec-init-1.c

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 3420e002efc..14dd0db1674 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -10206,7 +10206,8 @@ loongarch_gen_const_int_vector_shuffle (machine_mode 
mode, int val)
 void
 loongarch_expand_vector_group_init (rtx target, rtx vals)
 {
-  rtx ops[2] = { XVECEXP (vals, 0, 0), XVECEXP (vals, 0, 1) };
+  rtx ops[2] = { force_reg (E_V16QImode, XVECEXP (vals, 0, 0)),
+  force_reg (E_V16QImode, XVECEXP (vals, 0, 1)) };
   emit_insn (gen_rtx_SET (target, gen_rtx_VEC_CONCAT (E_V32QImode, ops[0],
  ops[1])));
 }
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-vec-init-1.c 
b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-vec-init-1.c
new file mode 100644
index 000..28be329822e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-vec-init-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+void
+foo (unsigned char *dst, unsigned char *src)
+{
+  for (int y = 0; y < 16; y++)
+{
+  for (int x = 0; x < 16; x++)
+dst[x] = src[x] + 1;
+  dst += 32;
+  src += 32;
+}
+}
-- 
2.36.0



[PATCH v3 2/2] LoongArch: Modify check_effective_target_vect_int_mod according to SX/ASX capabilities.

2023-09-28 Thread Chenghui Pan
gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add LoongArch in
check_effective_target_vect_int_mod according to SX/ASX capabilities.
---
 gcc/testsuite/lib/target-supports.exp | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 25958aaf0c5..5632904ddfd 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -8747,6 +8747,8 @@ proc check_effective_target_vect_int_mod { } {
 return [check_cached_effective_target_indexed vect_int_mod {
   expr { ([istarget powerpc*-*-*]
  && [check_effective_target_has_arch_pwr10])
+ || ([istarget loongarch*-*-*]
+ && [check_effective_target_loongarch_sx])
  || [istarget amdgcn-*-*] }}]
 }
 
@@ -12824,6 +12826,14 @@ proc 
check_effective_target_const_volatile_readonly_section { } {
   return 1
 }
 
+proc check_effective_target_loongarch_sx { } {
+return [check_no_compiler_messages loongarch_lsx assembly {
+   #if !defined(__loongarch_sx)
+   #error "LSX not defined"
+   #endif
+}]
+}
+
 proc check_effective_target_loongarch_sx_hw { } {
 return [check_runtime loongarch_sx_hw {
#include 
@@ -12836,6 +12846,14 @@ proc check_effective_target_loongarch_sx_hw { } {
 } "-mlsx"]
 }
 
+proc check_effective_target_loongarch_asx { } {
+return [check_no_compiler_messages loongarch_asx assembly {
+   #if !defined(__loongarch_asx)
+   #error "LASX not defined"
+   #endif
+}]
+}
+
 proc check_effective_target_loongarch_asx_hw { } {
 return [check_runtime loongarch_asx_hw {
#include 
-- 
2.36.0



[PATCH v3 0/2] LoongArch: Update target-supports.exp for LoongArch SX/ASX.

2023-09-28 Thread Chenghui Pan
This is the update of: 
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631379.html

This version does not include changes for codes, but fixes the commit title
format and appends the missing PR info.

Chenghui Pan (2):
  LoongArch: Enable vect.exp for LoongArch. [PR111424]
  LoongArch: Modify check_effective_target_vect_int_mod according to
SX/ASX capabilities.

 gcc/testsuite/lib/target-supports.exp | 49 +++
 1 file changed, 49 insertions(+)

-- 
2.36.0



[PATCH v3 1/2] LoongArch: Enable vect.exp for LoongArch. [PR111424]

2023-09-28 Thread Chenghui Pan
gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Enable vect.exp for LoongArch.
---
 gcc/testsuite/lib/target-supports.exp | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 3a472943a9b..25958aaf0c5 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -11335,6 +11335,13 @@ proc check_vect_support_and_set_flags { } {
lappend DEFAULT_VECTCFLAGS "--param" "riscv-vector-abi"
set dg-do-what-default compile
}
+} elseif [istarget loongarch*-*-*] {
+  lappend DEFAULT_VECTCFLAGS "-mdouble-float" "-mlasx"
+  if [check_effective_target_loongarch_asx_hw] {
+ set dg-do-what-default run
+  } else {
+ set dg-do-what-default compile
+  }
 } else {
 return 0
 }
@@ -12817,6 +12824,30 @@ proc 
check_effective_target_const_volatile_readonly_section { } {
   return 1
 }
 
+proc check_effective_target_loongarch_sx_hw { } {
+return [check_runtime loongarch_sx_hw {
+   #include 
+   int main (void)
+   {
+ __m128i a, b, c;
+ c = __lsx_vand_v (a, b);
+ return 0;
+   }
+} "-mlsx"]
+}
+
+proc check_effective_target_loongarch_asx_hw { } {
+return [check_runtime loongarch_asx_hw {
+   #include 
+   int main (void)
+   {
+ __m256i a, b, c;
+ c = __lasx_xvand_v (a, b);
+ return 0;
+   }
+} "-mlasx"]
+}
+
 # Appends necessary Python flags to extra-tool-flags if Python.h is supported.
 # Otherwise, modifies dg-do-what.
 proc dg-require-python-h { args } {
-- 
2.36.0



Re: [PATCH v2 0/2] Update target-supports.exp for LoongArch SX/ASX.

2023-09-26 Thread Chenghui Pan
Correction: vect.exp will be set to run when target is capable of
running LASX instructions, otherwise it will be compiled only.

On Tue, 2023-09-26 at 19:56 +0800, Chenghui Pan wrote:
> This is an update of:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630953.html
> 
> This version of patch set contains code that enable vect.exp for
> LoongArch
> target when target environment is capable of running LASX
> instructions.
> 
> After some attemptions, we still need
> "check_effective_target_loongarch_sx" 
> in "proc check_effective_target_vect_int_mod {}" to choose correct
> dg-final
> directives for LoongArch, because DEFAULT_VECTCFLAGS cannot affect
> pr104992.c
> which is invoked by gcc.dg/dg.exp (not vect.exp).
> 
> Chenghui Pan (2):
>   Enable vect.exp for LoongArch.
>   Add LoongArch in check_effective_target_vect_int_mod according to
> ISA
>     capabilities.
> 
>  gcc/testsuite/lib/target-supports.exp | 49
> +++
>  1 file changed, 49 insertions(+)
> 



[PATCH v2 0/2] Update target-supports.exp for LoongArch SX/ASX.

2023-09-26 Thread Chenghui Pan
This is an update of:
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630953.html

This version of patch set contains code that enable vect.exp for LoongArch
target when target environment is capable of running LASX instructions.

After some attemptions, we still need "check_effective_target_loongarch_sx" 
in "proc check_effective_target_vect_int_mod {}" to choose correct dg-final
directives for LoongArch, because DEFAULT_VECTCFLAGS cannot affect pr104992.c
which is invoked by gcc.dg/dg.exp (not vect.exp).

Chenghui Pan (2):
  Enable vect.exp for LoongArch.
  Add LoongArch in check_effective_target_vect_int_mod according to ISA
capabilities.

 gcc/testsuite/lib/target-supports.exp | 49 +++
 1 file changed, 49 insertions(+)

-- 
2.36.0



[PATCH v2 2/2] Add LoongArch in check_effective_target_vect_int_mod according to ISA capabilities.

2023-09-26 Thread Chenghui Pan
gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add LoongArch in
check_effective_target_vect_int_mod according to ISA capabilities.
---
 gcc/testsuite/lib/target-supports.exp | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 17863288ff0..4a84dee430b 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -8586,6 +8586,8 @@ proc check_effective_target_vect_int_mod { } {
 return [check_cached_effective_target_indexed vect_int_mod {
   expr { ([istarget powerpc*-*-*]
  && [check_effective_target_has_arch_pwr10])
+ || ([istarget loongarch*-*-*]
+ && [check_effective_target_loongarch_sx])
  || [istarget amdgcn-*-*] }}]
 }
 
@@ -12663,6 +12665,14 @@ proc 
check_effective_target_const_volatile_readonly_section { } {
   return 1
 }
 
+proc check_effective_target_loongarch_sx { } {
+return [check_no_compiler_messages loongarch_lsx assembly {
+   #if !defined(__loongarch_sx)
+   #error "LSX not defined"
+   #endif
+}]
+}
+
 proc check_effective_target_loongarch_sx_hw { } {
 return [check_runtime loongarch_sx_hw {
#include 
@@ -12675,6 +12685,14 @@ proc check_effective_target_loongarch_sx_hw { } {
 } "-mlsx"]
 }
 
+proc check_effective_target_loongarch_asx { } {
+return [check_no_compiler_messages loongarch_asx assembly {
+   #if !defined(__loongarch_asx)
+   #error "LASX not defined"
+   #endif
+}]
+}
+
 proc check_effective_target_loongarch_asx_hw { } {
 return [check_runtime loongarch_asx_hw {
#include 
-- 
2.36.0



[PATCH v2 1/2] Enable vect.exp for LoongArch.

2023-09-26 Thread Chenghui Pan
gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Enable vect.exp for LoongArch.
---
 gcc/testsuite/lib/target-supports.exp | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 2de41cef2f6..17863288ff0 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -11174,6 +11174,13 @@ proc check_vect_support_and_set_flags { } {
lappend DEFAULT_VECTCFLAGS "--param" "riscv-vector-abi"
set dg-do-what-default compile
}
+} elseif [istarget loongarch*-*-*] {
+  lappend DEFAULT_VECTCFLAGS "-mdouble-float" "-mlasx"
+  if [check_effective_target_loongarch_asx_hw] {
+ set dg-do-what-default run
+  } else {
+ set dg-do-what-default compile
+  }
 } else {
 return 0
 }
@@ -12656,6 +12663,30 @@ proc 
check_effective_target_const_volatile_readonly_section { } {
   return 1
 }
 
+proc check_effective_target_loongarch_sx_hw { } {
+return [check_runtime loongarch_sx_hw {
+   #include 
+   int main (void)
+   {
+ __m128i a, b, c;
+ c = __lsx_vand_v (a, b);
+ return 0;
+   }
+} "-mlsx"]
+}
+
+proc check_effective_target_loongarch_asx_hw { } {
+return [check_runtime loongarch_asx_hw {
+   #include 
+   int main (void)
+   {
+ __m256i a, b, c;
+ c = __lasx_xvand_v (a, b);
+ return 0;
+   }
+} "-mlasx"]
+}
+
 # Appends necessary Python flags to extra-tool-flags if Python.h is supported.
 # Otherwise, modifies dg-do-what.
 proc dg-require-python-h { args } {
-- 
2.36.0



Re: [PATCH v1] Update check_effective_target_vect_int_mod according to LoongArch SX/ASX capabilities.

2023-09-25 Thread Chenghui Pan
Thanks! I will try to improve it.

On Mon, 2023-09-25 at 17:44 +0800, Xi Ruoyao wrote:
> On Mon, 2023-09-25 at 17:38 +0800, Chenghui Pan wrote:
> > Hi!
> > 
> > After some attemptions, I think we still ne to check
> > "check_effective_target_loongarch_sx" in vect_int_mod. I wrote some
> > temp logics in gcc/testsuite/lib/target-supports.exp like this:
> > 
> > diff --git a/gcc/testsuite/lib/target-supports.exp
> > b/gcc/testsuite/lib/target-supports.exp
> > index 2de41cef2f6..91e1c22a6e1 100644
> > --- a/gcc/testsuite/lib/target-supports.exp
> > +++ b/gcc/testsuite/lib/target-supports.exp
> > @@ -8586,7 +8586,8 @@ proc check_effective_target_vect_int_mod { }
> > {
> >  return [check_cached_effective_target_indexed vect_int_mod {
> >    expr { ([istarget powerpc*-*-*]
> >   && [check_effective_target_has_arch_pwr10])
> > - || [istarget amdgcn-*-*] }}]
> > + || [istarget loongarch*-*-*]
> > + || [istarget amdgcn-*-*] }}]
> >  }
> >  
> >  # Return 1 if the target supports vector even/odd elements
> > extraction,
> > 0 otherwise.
> > @@ -11174,6 +11175,12 @@ proc check_vect_support_and_set_flags { }
> > {
> >     lappend DEFAULT_VECTCFLAGS "--param" "riscv-vector-abi"
> >     set dg-do-what-default compile
> >     }
> > +    } elseif [istarget loongarch*-*-*] {
> > +  if [check_effective_target_loongarch_asx_hw] {
> > + lappend DEFAULT_VECTCFLAGS "-mdouble-float" "-mlasx"
> > +  } elseif [check_effective_target_loongarch_sx_hw] {
> > + lappend DEFAULT_VECTCFLAGS "-mdouble-float" "-mlsx"
> > +  }
> 
> I think we can always enable LASX in DEFAULT_VECTCFLAGS, but set dg-
> do-
> what-default to "run" only if both the hardware and the kernel
> supports
> LASX.  If the kernel or the hardware is not capable we set dg-do-
> what-
> default to "compile".
> 
> >  } else {
> >  return 0
> >  }
> > \* temp impl of sx/asx hw proc *\
> > 
> > And then in make check without --target_board=unix/-mlasx, vect.exp
> > is
> > invoked with expected vector isa options, but pr104992.c failed
> > because
> > it expected result with "vect_int_mod returns 1" but it was
> > compiled
> > without -mlsx/-mlasx. Seems pr104992.c is invoked by gcc.dg/dg.exp,
> > pr104992.c is not affected by DEFAULT_CFLAGS, so we still need to
> > check
> > if LSX/LASX is available in vect_int_mod. 
> > 
> > Other parts of new patch is still WIP.
> 



Re: [PATCH v1] Update check_effective_target_vect_int_mod according to LoongArch SX/ASX capabilities.

2023-09-25 Thread Chenghui Pan
Hi!

After some attemptions, I think we still ne to check
"check_effective_target_loongarch_sx" in vect_int_mod. I wrote some
temp logics in gcc/testsuite/lib/target-supports.exp like this:

diff --git a/gcc/testsuite/lib/target-supports.exp
b/gcc/testsuite/lib/target-supports.exp
index 2de41cef2f6..91e1c22a6e1 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -8586,7 +8586,8 @@ proc check_effective_target_vect_int_mod { } {
 return [check_cached_effective_target_indexed vect_int_mod {
   expr { ([istarget powerpc*-*-*]
  && [check_effective_target_has_arch_pwr10])
- || [istarget amdgcn-*-*] }}]
+ || [istarget loongarch*-*-*]
+ || [istarget amdgcn-*-*] }}]
 }
 
 # Return 1 if the target supports vector even/odd elements extraction,
0 otherwise.
@@ -11174,6 +11175,12 @@ proc check_vect_support_and_set_flags { } {
lappend DEFAULT_VECTCFLAGS "--param" "riscv-vector-abi"
set dg-do-what-default compile
}
+} elseif [istarget loongarch*-*-*] {
+  if [check_effective_target_loongarch_asx_hw] {
+ lappend DEFAULT_VECTCFLAGS "-mdouble-float" "-mlasx"
+  } elseif [check_effective_target_loongarch_sx_hw] {
+ lappend DEFAULT_VECTCFLAGS "-mdouble-float" "-mlsx"
+  }
 } else {
 return 0
 }
\* temp impl of sx/asx hw proc *\

And then in make check without --target_board=unix/-mlasx, vect.exp is
invoked with expected vector isa options, but pr104992.c failed because
it expected result with "vect_int_mod returns 1" but it was compiled
without -mlsx/-mlasx. Seems pr104992.c is invoked by gcc.dg/dg.exp,
pr104992.c is not affected by DEFAULT_CFLAGS, so we still need to check
if LSX/LASX is available in vect_int_mod. 

Other parts of new patch is still WIP.

On Sun, 2023-09-24 at 18:05 +0800, Xi Ruoyao wrote:
> On Wed, 2023-09-20 at 09:15 +0800, Chenghui Pan wrote:
> > LoongArch failed to pass gcc.dg/pr104992.c with -mlsx and -mlasx.
> > This test uses
> > different dg-final directives depending on the vect_int_mod result,
> > LoongArch
> > SX/ASX supports this operations but corresponding description is
> > not defined in
> > target-supports.exp. This patch solves the problem above with some
> > modification in proc check_effective_target_vect_int_mod.
> 
> I think we can just add -mdouble-float -mlasx into DEFAULT_VECTCFLAGS
> and always enable vect_int_mod for LoongArch.  This will make
> vect.exp
> tests automatically run for every "make check" on LoongArch.
> 
> > gcc/testsuite/ChangeLog:
> > 
> > * lib/target-supports.exp: Update
> > check_effective_target_vect_int_mod according to
> > LoongArch SX/ASX capabilities.
> > ---
> >  gcc/testsuite/lib/target-supports.exp | 18 ++
> >  1 file changed, 18 insertions(+)
> > 
> > diff --git a/gcc/testsuite/lib/target-supports.exp
> > b/gcc/testsuite/lib/target-supports.exp
> > index 2de41cef2f6..b253dc578d2 100644
> > --- a/gcc/testsuite/lib/target-supports.exp
> > +++ b/gcc/testsuite/lib/target-supports.exp
> > @@ -8586,6 +8586,8 @@ proc check_effective_target_vect_int_mod { }
> > {
> >  return [check_cached_effective_target_indexed vect_int_mod {
> >    expr { ([istarget powerpc*-*-*]
> >   && [check_effective_target_has_arch_pwr10])
> > +    || ([istarget loongarch*-*-*]
> > +    && [check_effective_target_loongarch_sx])
> >   || [istarget amdgcn-*-*] }}]
> >  }
> >  
> > @@ -12656,6 +12658,22 @@ proc
> > check_effective_target_const_volatile_readonly_section { } {
> >    return 1
> >  }
> >  
> > +proc check_effective_target_loongarch_sx { } {
> > +    return [check_no_compiler_messages loongarch_lsx assembly {
> > +   #if !defined(__loongarch_sx)
> > +   #error "LSX not defined"
> > +   #endif
> > +    }]
> > +}
> > +
> > +proc check_effective_target_loongarch_asx { } {
> > +    return [check_no_compiler_messages loongarch_asx assembly {
> > +   #if !defined(__loongarch_asx)
> > +   #error "LASX not defined"
> > +   #endif
> > +    }]
> > +}
> > +
> >  # Appends necessary Python flags to extra-tool-flags if Python.h
> > is supported.
> >  # Otherwise, modifies dg-do-what.
> >  proc dg-require-python-h { args } {
> 



Re: [PATCH v1] Update check_effective_target_vect_int_mod according to LoongArch SX/ASX capabilities.

2023-09-24 Thread Chenghui Pan
Thanks for your advice! I will check it out and submit a new version of
patch.

On Sun, 2023-09-24 at 18:05 +0800, Xi Ruoyao wrote:
> On Wed, 2023-09-20 at 09:15 +0800, Chenghui Pan wrote:
> > LoongArch failed to pass gcc.dg/pr104992.c with -mlsx and -mlasx.
> > This test uses
> > different dg-final directives depending on the vect_int_mod result,
> > LoongArch
> > SX/ASX supports this operations but corresponding description is
> > not defined in
> > target-supports.exp. This patch solves the problem above with some
> > modification in proc check_effective_target_vect_int_mod.
> 
> I think we can just add -mdouble-float -mlasx into DEFAULT_VECTCFLAGS
> and always enable vect_int_mod for LoongArch.  This will make
> vect.exp
> tests automatically run for every "make check" on LoongArch.
> 
> > gcc/testsuite/ChangeLog:
> > 
> > * lib/target-supports.exp: Update
> > check_effective_target_vect_int_mod according to
> > LoongArch SX/ASX capabilities.
> > ---
> >  gcc/testsuite/lib/target-supports.exp | 18 ++
> >  1 file changed, 18 insertions(+)
> > 
> > diff --git a/gcc/testsuite/lib/target-supports.exp
> > b/gcc/testsuite/lib/target-supports.exp
> > index 2de41cef2f6..b253dc578d2 100644
> > --- a/gcc/testsuite/lib/target-supports.exp
> > +++ b/gcc/testsuite/lib/target-supports.exp
> > @@ -8586,6 +8586,8 @@ proc check_effective_target_vect_int_mod { }
> > {
> >  return [check_cached_effective_target_indexed vect_int_mod {
> >    expr { ([istarget powerpc*-*-*]
> >   && [check_effective_target_has_arch_pwr10])
> > +    || ([istarget loongarch*-*-*]
> > +    && [check_effective_target_loongarch_sx])
> >   || [istarget amdgcn-*-*] }}]
> >  }
> >  
> > @@ -12656,6 +12658,22 @@ proc
> > check_effective_target_const_volatile_readonly_section { } {
> >    return 1
> >  }
> >  
> > +proc check_effective_target_loongarch_sx { } {
> > +    return [check_no_compiler_messages loongarch_lsx assembly {
> > +   #if !defined(__loongarch_sx)
> > +   #error "LSX not defined"
> > +   #endif
> > +    }]
> > +}
> > +
> > +proc check_effective_target_loongarch_asx { } {
> > +    return [check_no_compiler_messages loongarch_asx assembly {
> > +   #if !defined(__loongarch_asx)
> > +   #error "LASX not defined"
> > +   #endif
> > +    }]
> > +}
> > +
> >  # Appends necessary Python flags to extra-tool-flags if Python.h
> > is supported.
> >  # Otherwise, modifies dg-do-what.
> >  proc dg-require-python-h { args } {
> 



[PATCH v1] Update check_effective_target_vect_int_mod according to LoongArch SX/ASX capabilities.

2023-09-19 Thread Chenghui Pan
LoongArch failed to pass gcc.dg/pr104992.c with -mlsx and -mlasx. This test uses
different dg-final directives depending on the vect_int_mod result, LoongArch
SX/ASX supports this operations but corresponding description is not defined in
target-supports.exp. This patch solves the problem above with some
modification in proc check_effective_target_vect_int_mod.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Update check_effective_target_vect_int_mod 
according to
LoongArch SX/ASX capabilities.
---
 gcc/testsuite/lib/target-supports.exp | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 2de41cef2f6..b253dc578d2 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -8586,6 +8586,8 @@ proc check_effective_target_vect_int_mod { } {
 return [check_cached_effective_target_indexed vect_int_mod {
   expr { ([istarget powerpc*-*-*]
  && [check_effective_target_has_arch_pwr10])
+|| ([istarget loongarch*-*-*]
+&& [check_effective_target_loongarch_sx])
  || [istarget amdgcn-*-*] }}]
 }
 
@@ -12656,6 +12658,22 @@ proc 
check_effective_target_const_volatile_readonly_section { } {
   return 1
 }
 
+proc check_effective_target_loongarch_sx { } {
+return [check_no_compiler_messages loongarch_lsx assembly {
+   #if !defined(__loongarch_sx)
+   #error "LSX not defined"
+   #endif
+}]
+}
+
+proc check_effective_target_loongarch_asx { } {
+return [check_no_compiler_messages loongarch_asx assembly {
+   #if !defined(__loongarch_asx)
+   #error "LASX not defined"
+   #endif
+}]
+}
+
 # Appends necessary Python flags to extra-tool-flags if Python.h is supported.
 # Otherwise, modifies dg-do-what.
 proc dg-require-python-h { args } {
-- 
2.36.0



Re: [PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target.

2023-08-31 Thread Chenghui Pan
Thanks for the testing work! I will continue to try to find and resolve
some subtle issues too (Such as use compiler to compile some large
project). I'm also curious about the partly saved register problem and
will take some learning and investigation in the future.

On Thu, 2023-08-31 at 17:41 +0800, Xi Ruoyao wrote:
> On Thu, 2023-08-31 at 17:08 +0800, Chenghui Pan wrote:
> > This is an update of:
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628303.html
> > 
> > Changes since last version of patch set:
> > - "dg-skip-if"-related Changes of the g++.dg/torture/vshuf*
> > testcases are reverted.
> >   (Replaced by __builtin_shuffle fix)
> > - Add fix of __builtin_shuffle() for Loongson SX/ASX (Implemeted by
> > adding
> >   vand/xvand insn in front of shuffle operation). There's no
> > significant performance
> >   impact in current state.
> 
> I think it's the correct fix, thanks!
> 
> I'm still unsure about the "partly saved register" issue (I'll need
> to
> resolve similar issues for "ILP32 ABI on loongarch64") but it seems
> GCC
> just don't attempt to preserve any vectors in register across
> function
> call.
> 
> After the patches are committed I (and Xuerui, maybe) will perform
> full
> system rebuild with LASX enabled to see if there are subtle issues. 
> IMO
> we still have plenty of time to fix them (if there are any) before
> GCC
> 14 release.
> 
> > - Rebased on the top of Yang Yujie's latest target configuration
> > interface patch set
> >  
> > (https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628772.html)
> > .
> > 
> > Brief history of patch set:
> > v1 -> v2:
> > - Reduce usage of "unspec" in RTL template.
> > - Append Support of ADDR_REG_REG in LSX and LASX.
> > - Constraint docs are appended in gcc/doc/md.texi and ccomment
> > block.
> > - Codes related to vecarg are removed.
> > - Testsuite of LSX and LASX is added in v2. (Because of the size
> > limitation of
> >   mail list, these patches are not shown)
> > - Adjust the loongarch_expand_vector_init() function to reduce
> > instruction 
> > output amount.
> > - Some minor implementation changes of RTL templates.
> > 
> > v2 -> v3:
> > - Revert vabsd/xvabsd RTL templates to unspec impl.
> > - Resolve warning in gcc/config/loongarch/loongarch.cc when
> > bootstrapping 
> >   with BOOT_CFLAGS="-O2 -ftree-vectorize -fno-vect-cost-model -
> > mlasx".
> > - Remove redundant definitions in lasxintrin.h.
> > - Refine commit info.
> > 
> > v3 -> v4:
> > - Code simplification.
> > - Testsuite patches are splited from this patch set again and will
> > be
> >   submitted independently in the future.
> > 
> > v4 -> v5:
> > - Regression test fix (pr54346.c)
> > - Combine vilvh/xvilvh insn's RTL template impl.
> > - Add dg-skip-if for loongarch*-*-* in vshuf test inside
> > g++.dg/torture
> >   (reverted in this version)
> > 
> > Lulu Cheng (4):
> >   LoongArch: Add Loongson SX base instruction support.
> >   LoongArch: Add Loongson SX directive builtin function support.
> >   LoongArch: Add Loongson ASX base instruction support.
> >   LoongArch: Add Loongson ASX directive builtin function support.
> > 
> >  gcc/config.gcc    |    2 +-
> >  gcc/config/loongarch/constraints.md   |  131 +-
> >  gcc/config/loongarch/genopts/loongarch.opt.in |    4 +
> >  gcc/config/loongarch/lasx.md  | 5104
> > 
> >  gcc/config/loongarch/lasxintrin.h | 5338
> > +
> >  gcc/config/loongarch/loongarch-builtins.cc    | 2686 -
> >  gcc/config/loongarch/loongarch-ftypes.def |  666 +-
> >  gcc/config/loongarch/loongarch-modes.def  |   39 +
> >  gcc/config/loongarch/loongarch-protos.h   |   35 +
> >  gcc/config/loongarch/loongarch.cc | 4751
> > ++-
> >  gcc/config/loongarch/loongarch.h  |  117 +-
> >  gcc/config/loongarch/loongarch.md |   56 +-
> >  gcc/config/loongarch/loongarch.opt    |    4 +
> >  gcc/config/loongarch/lsx.md   | 4467
> > ++
> >  gcc/config/loongarch/lsxintrin.h  | 5181
> > 
> >  gcc/config/loongarch/predicates.md    |  333 +-
> >  gcc/doc/md.texi   |   11 +
> >  17 files changed, 28645 insertions(+), 280 deletions(-)
> >  create mode 100644 gcc/config/loongarch/lasx.md
> >  create mode 100644 gcc/config/loongarch/lasxintrin.h
> >  create mode 100644 gcc/config/loongarch/lsx.md
> >  create mode 100644 gcc/config/loongarch/lsxintrin.h
> > 
> 



[PATCH v6 0/4] Add Loongson SX/ASX instruction support to LoongArch target.

2023-08-31 Thread Chenghui Pan
This is an update of:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628303.html

Changes since last version of patch set:
- "dg-skip-if"-related Changes of the g++.dg/torture/vshuf* testcases are 
reverted.
  (Replaced by __builtin_shuffle fix)
- Add fix of __builtin_shuffle() for Loongson SX/ASX (Implemeted by adding
  vand/xvand insn in front of shuffle operation). There's no significant 
performance
  impact in current state.
- Rebased on the top of Yang Yujie's latest target configuration interface 
patch set
  (https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628772.html).

Brief history of patch set:
v1 -> v2:
- Reduce usage of "unspec" in RTL template.
- Append Support of ADDR_REG_REG in LSX and LASX.
- Constraint docs are appended in gcc/doc/md.texi and ccomment block.
- Codes related to vecarg are removed.
- Testsuite of LSX and LASX is added in v2. (Because of the size limitation of
  mail list, these patches are not shown)
- Adjust the loongarch_expand_vector_init() function to reduce instruction 
output amount.
- Some minor implementation changes of RTL templates.

v2 -> v3:
- Revert vabsd/xvabsd RTL templates to unspec impl.
- Resolve warning in gcc/config/loongarch/loongarch.cc when bootstrapping 
  with BOOT_CFLAGS="-O2 -ftree-vectorize -fno-vect-cost-model -mlasx".
- Remove redundant definitions in lasxintrin.h.
- Refine commit info.

v3 -> v4:
- Code simplification.
- Testsuite patches are splited from this patch set again and will be
  submitted independently in the future.

v4 -> v5:
- Regression test fix (pr54346.c)
- Combine vilvh/xvilvh insn's RTL template impl.
- Add dg-skip-if for loongarch*-*-* in vshuf test inside g++.dg/torture
  (reverted in this version)

Lulu Cheng (4):
  LoongArch: Add Loongson SX base instruction support.
  LoongArch: Add Loongson SX directive builtin function support.
  LoongArch: Add Loongson ASX base instruction support.
  LoongArch: Add Loongson ASX directive builtin function support.

 gcc/config.gcc|2 +-
 gcc/config/loongarch/constraints.md   |  131 +-
 gcc/config/loongarch/genopts/loongarch.opt.in |4 +
 gcc/config/loongarch/lasx.md  | 5104 
 gcc/config/loongarch/lasxintrin.h | 5338 +
 gcc/config/loongarch/loongarch-builtins.cc| 2686 -
 gcc/config/loongarch/loongarch-ftypes.def |  666 +-
 gcc/config/loongarch/loongarch-modes.def  |   39 +
 gcc/config/loongarch/loongarch-protos.h   |   35 +
 gcc/config/loongarch/loongarch.cc | 4751 ++-
 gcc/config/loongarch/loongarch.h  |  117 +-
 gcc/config/loongarch/loongarch.md |   56 +-
 gcc/config/loongarch/loongarch.opt|4 +
 gcc/config/loongarch/lsx.md   | 4467 ++
 gcc/config/loongarch/lsxintrin.h  | 5181 
 gcc/config/loongarch/predicates.md|  333 +-
 gcc/doc/md.texi   |   11 +
 17 files changed, 28645 insertions(+), 280 deletions(-)
 create mode 100644 gcc/config/loongarch/lasx.md
 create mode 100644 gcc/config/loongarch/lasxintrin.h
 create mode 100644 gcc/config/loongarch/lsx.md
 create mode 100644 gcc/config/loongarch/lsxintrin.h

-- 
2.36.0



[PATCH v5 1/6] LoongArch: Add Loongson SX vector directive compilation framework.

2023-08-23 Thread Chenghui Pan
From: Lulu Cheng 

gcc/ChangeLog:

* config/loongarch/genopts/loongarch-strings: Add compilation framework.
* config/loongarch/genopts/loongarch.opt.in: Ditto.
* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins): Ditto.
* config/loongarch/loongarch-def.c: Ditto.
* config/loongarch/loongarch-def.h (N_ISA_EXT_TYPES): Ditto.
(ISA_EXT_SIMD_LSX): Ditto.
(N_SWITCH_TYPES): Ditto.
(SW_LSX): Ditto.
(struct loongarch_isa): Ditto.
* config/loongarch/loongarch-driver.cc (APPEND_SWITCH): Ditto.
(driver_get_normalized_m_opts): Ditto.
* config/loongarch/loongarch-driver.h (driver_get_normalized_m_opts): 
Ditto.
* config/loongarch/loongarch-opts.cc (loongarch_config_target): Ditto.
(isa_str): Ditto.
* config/loongarch/loongarch-opts.h (ISA_HAS_LSX): Ditto.
* config/loongarch/loongarch-str.h (OPTSTR_LSX): Ditto.
* config/loongarch/loongarch.opt: Ditto.
---
 .../loongarch/genopts/loongarch-strings   |  3 +
 gcc/config/loongarch/genopts/loongarch.opt.in |  8 +-
 gcc/config/loongarch/loongarch-c.cc   |  7 ++
 gcc/config/loongarch/loongarch-def.c  |  4 +
 gcc/config/loongarch/loongarch-def.h  |  7 +-
 gcc/config/loongarch/loongarch-driver.cc  | 10 +++
 gcc/config/loongarch/loongarch-driver.h   |  1 +
 gcc/config/loongarch/loongarch-opts.cc| 82 ++-
 gcc/config/loongarch/loongarch-opts.h |  1 +
 gcc/config/loongarch/loongarch-str.h  |  2 +
 gcc/config/loongarch/loongarch.opt|  8 +-
 11 files changed, 128 insertions(+), 5 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch-strings 
b/gcc/config/loongarch/genopts/loongarch-strings
index a40998ead97..24a5025061f 100644
--- a/gcc/config/loongarch/genopts/loongarch-strings
+++ b/gcc/config/loongarch/genopts/loongarch-strings
@@ -40,6 +40,9 @@ OPTSTR_SOFT_FLOAT soft-float
 OPTSTR_SINGLE_FLOAT   single-float
 OPTSTR_DOUBLE_FLOAT   double-float
 
+# SIMD extensions
+OPTSTR_LSX lsx
+
 # -mabi=
 OPTSTR_ABI_BASE  abi
 STR_ABI_BASE_LP64Dlp64d
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index 4b9b4ac273e..338d77a7e40 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -76,6 +76,9 @@ m@@OPTSTR_DOUBLE_FLOAT@@
 Target Driver RejectNegative Var(la_opt_switches) Mask(FORCE_F64) 
Negative(m@@OPTSTR_SOFT_FLOAT@@)
 Allow hardware floating-point instructions to cover both 32-bit and 64-bit 
operations.
 
+m@@OPTSTR_LSX@@
+Target RejectNegative Var(la_opt_switches) Mask(LSX) Negative(m@@OPTSTR_LSX@@)
+Enable LoongArch SIMD Extension (LSX).
 
 ;; Base target models (implies ISA & tune parameters)
 Enum
@@ -125,11 +128,14 @@ Target RejectNegative Joined ToLower Enum(abi_base) 
Var(la_opt_abi_base) Init(M_
 Variable
 int la_opt_abi_ext = M_OPTION_NOT_SEEN
 
-
 mbranch-cost=
 Target RejectNegative Joined UInteger Var(loongarch_branch_cost)
 -mbranch-cost=COST Set the cost of branches to roughly COST instructions.
 
+mmemvec-cost=
+Target RejectNegative Joined UInteger Var(loongarch_vector_access_cost) 
IntegerRange(1, 5)
+mmemvec-cost=COST  Set the cost of vector memory access instructions.
+
 mcheck-zero-division
 Target Mask(CHECK_ZERO_DIV)
 Trap on integer divide by zero.
diff --git a/gcc/config/loongarch/loongarch-c.cc 
b/gcc/config/loongarch/loongarch-c.cc
index 67911b78f28..b065921adc3 100644
--- a/gcc/config/loongarch/loongarch-c.cc
+++ b/gcc/config/loongarch/loongarch-c.cc
@@ -99,6 +99,13 @@ loongarch_cpu_cpp_builtins (cpp_reader *pfile)
   else
 builtin_define ("__loongarch_frlen=0");
 
+  if (ISA_HAS_LSX)
+{
+  builtin_define ("__loongarch_simd");
+  builtin_define ("__loongarch_sx");
+  builtin_define ("__loongarch_sx_width=128");
+}
+
   /* Native Data Sizes.  */
   builtin_define_with_int_value ("_LOONGARCH_SZINT", INT_TYPE_SIZE);
   builtin_define_with_int_value ("_LOONGARCH_SZLONG", LONG_TYPE_SIZE);
diff --git a/gcc/config/loongarch/loongarch-def.c 
b/gcc/config/loongarch/loongarch-def.c
index 6729c857f7c..28e24c62249 100644
--- a/gcc/config/loongarch/loongarch-def.c
+++ b/gcc/config/loongarch/loongarch-def.c
@@ -49,10 +49,12 @@ loongarch_cpu_default_isa[N_ARCH_TYPES] = {
   [CPU_LOONGARCH64] = {
   .base = ISA_BASE_LA64V100,
   .fpu = ISA_EXT_FPU64,
+  .simd = 0,
   },
   [CPU_LA464] = {
   .base = ISA_BASE_LA64V100,
   .fpu = ISA_EXT_FPU64,
+  .simd = ISA_EXT_SIMD_LSX,
   },
 };
 
@@ -147,6 +149,7 @@ loongarch_isa_ext_strings[N_ISA_EXT_TYPES] = {
   [ISA_EXT_FPU64] = STR_ISA_EXT_FPU64,
   [ISA_EXT_FPU32] = STR_ISA_EXT_FPU32,
   [ISA_EXT_NOFPU] = STR_ISA_EXT_NOFPU,
+  [ISA_EXT_SIMD_LSX] = OPTSTR_LSX,
 };
 
 const char*
@@ -176,6 +179,7 @@ loongarch_switch_strings[] = {
   [SW_SOFT_FLOAT]= OPTSTR_SOFT_FLOAT,
   [SW_SINGLE_FLOAT]  = 

[PATCH v5 4/6] LoongArch: Add Loongson ASX vector directive compilation framework.

2023-08-23 Thread Chenghui Pan
From: Lulu Cheng 

gcc/ChangeLog:

* config/loongarch/genopts/loongarch-strings: Add compilation framework.
* config/loongarch/genopts/loongarch.opt.in: Ditto.
* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins): Ditto.
* config/loongarch/loongarch-def.c: Ditto.
* config/loongarch/loongarch-def.h (N_ISA_EXT_TYPES): Ditto.
(ISA_EXT_SIMD_LASX): Ditto.
(N_SWITCH_TYPES): Ditto.
(SW_LASX): Ditto.
* config/loongarch/loongarch-driver.cc (driver_get_normalized_m_opts): 
Ditto.
* config/loongarch/loongarch-driver.h (driver_get_normalized_m_opts): 
Ditto.
* config/loongarch/loongarch-opts.cc (isa_str): Ditto.
* config/loongarch/loongarch-opts.h (ISA_HAS_LSX): Ditto.
(ISA_HAS_LASX): Ditto.
* config/loongarch/loongarch-str.h (OPTSTR_LASX): Ditto.
* config/loongarch/loongarch.opt: Ditto.
---
 gcc/config/loongarch/genopts/loongarch-strings |  1 +
 gcc/config/loongarch/genopts/loongarch.opt.in  |  4 
 gcc/config/loongarch/loongarch-c.cc| 11 +++
 gcc/config/loongarch/loongarch-def.c   |  4 +++-
 gcc/config/loongarch/loongarch-def.h   |  6 --
 gcc/config/loongarch/loongarch-driver.cc   |  2 +-
 gcc/config/loongarch/loongarch-driver.h|  1 +
 gcc/config/loongarch/loongarch-opts.cc |  9 -
 gcc/config/loongarch/loongarch-opts.h  |  4 +++-
 gcc/config/loongarch/loongarch-str.h   |  1 +
 gcc/config/loongarch/loongarch.opt |  4 
 11 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch-strings 
b/gcc/config/loongarch/genopts/loongarch-strings
index 24a5025061f..35d08f5967d 100644
--- a/gcc/config/loongarch/genopts/loongarch-strings
+++ b/gcc/config/loongarch/genopts/loongarch-strings
@@ -42,6 +42,7 @@ OPTSTR_DOUBLE_FLOAT   double-float
 
 # SIMD extensions
 OPTSTR_LSX lsx
+OPTSTR_LASXlasx
 
 # -mabi=
 OPTSTR_ABI_BASE  abi
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index 338d77a7e40..afde23c9661 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -80,6 +80,10 @@ m@@OPTSTR_LSX@@
 Target RejectNegative Var(la_opt_switches) Mask(LSX) Negative(m@@OPTSTR_LSX@@)
 Enable LoongArch SIMD Extension (LSX).
 
+m@@OPTSTR_LASX@@
+Target RejectNegative Var(la_opt_switches) Mask(LASX) 
Negative(m@@OPTSTR_LASX@@)
+Enable LoongArch Advanced SIMD Extension (LASX).
+
 ;; Base target models (implies ISA & tune parameters)
 Enum
 Name(cpu_type) Type(int)
diff --git a/gcc/config/loongarch/loongarch-c.cc 
b/gcc/config/loongarch/loongarch-c.cc
index b065921adc3..2747fb9e472 100644
--- a/gcc/config/loongarch/loongarch-c.cc
+++ b/gcc/config/loongarch/loongarch-c.cc
@@ -104,8 +104,19 @@ loongarch_cpu_cpp_builtins (cpp_reader *pfile)
   builtin_define ("__loongarch_simd");
   builtin_define ("__loongarch_sx");
   builtin_define ("__loongarch_sx_width=128");
+
+  if (!ISA_HAS_LASX)
+   builtin_define ("__loongarch_simd_width=128");
 }
 
+  if (ISA_HAS_LASX)
+{
+  builtin_define ("__loongarch_asx");
+  builtin_define ("__loongarch_asx_width=256");
+  builtin_define ("__loongarch_simd_width=256");
+}
+
+
   /* Native Data Sizes.  */
   builtin_define_with_int_value ("_LOONGARCH_SZINT", INT_TYPE_SIZE);
   builtin_define_with_int_value ("_LOONGARCH_SZLONG", LONG_TYPE_SIZE);
diff --git a/gcc/config/loongarch/loongarch-def.c 
b/gcc/config/loongarch/loongarch-def.c
index 28e24c62249..bff92c86532 100644
--- a/gcc/config/loongarch/loongarch-def.c
+++ b/gcc/config/loongarch/loongarch-def.c
@@ -54,7 +54,7 @@ loongarch_cpu_default_isa[N_ARCH_TYPES] = {
   [CPU_LA464] = {
   .base = ISA_BASE_LA64V100,
   .fpu = ISA_EXT_FPU64,
-  .simd = ISA_EXT_SIMD_LSX,
+  .simd = ISA_EXT_SIMD_LASX,
   },
 };
 
@@ -150,6 +150,7 @@ loongarch_isa_ext_strings[N_ISA_EXT_TYPES] = {
   [ISA_EXT_FPU32] = STR_ISA_EXT_FPU32,
   [ISA_EXT_NOFPU] = STR_ISA_EXT_NOFPU,
   [ISA_EXT_SIMD_LSX] = OPTSTR_LSX,
+  [ISA_EXT_SIMD_LASX] = OPTSTR_LASX,
 };
 
 const char*
@@ -180,6 +181,7 @@ loongarch_switch_strings[] = {
   [SW_SINGLE_FLOAT]  = OPTSTR_SINGLE_FLOAT,
   [SW_DOUBLE_FLOAT]  = OPTSTR_DOUBLE_FLOAT,
   [SW_LSX]   = OPTSTR_LSX,
+  [SW_LASX]  = OPTSTR_LASX,
 };
 
 
diff --git a/gcc/config/loongarch/loongarch-def.h 
b/gcc/config/loongarch/loongarch-def.h
index f34cffcfb9b..0bbcdb03d22 100644
--- a/gcc/config/loongarch/loongarch-def.h
+++ b/gcc/config/loongarch/loongarch-def.h
@@ -64,7 +64,8 @@ extern const char* loongarch_isa_ext_strings[];
 #define ISA_EXT_FPU642
 #define N_ISA_EXT_FPU_TYPES   3
 #define ISA_EXT_SIMD_LSX  3
-#define N_ISA_EXT_TYPES  4
+#define ISA_EXT_SIMD_LASX 4
+#define N_ISA_EXT_TYPES  5
 
 /* enum abi_base */
 extern const char* 

[PATCH v5 0/6] Add Loongson SX/ASX instruction support to LoongArch target.

2023-08-23 Thread Chenghui Pan
This is an update of:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627413.html

Changes since last version of patch set:
- Fix regression test fail of pr54346.c with 
RUNTESTFLAGS="--target_board=unix/-mlsx".
  This is caused by the code simplification of 
loongarch_expand_vec_perm_const_2 () in
  last version.
- Combine vilvh/xvilvh insn's RTL template impl.
- Add dg-skip-if for loongarch*-*-* in vshuf test in g++.dg/torture, because
  vshuf/xvshuf insn's result is undefined when 6 or 7 bit of vector's element 
is set,
  and insns with this condition are generated in these testcases.

Brief version history of patch set:

v1 -> v2:
- Reduce usage of "unspec" in RTL template.
- Append Support of ADDR_REG_REG in LSX and LASX.
- Constraint docs are appended in gcc/doc/md.texi and ccomment block.
- Codes related to vecarg are removed.
- Testsuite of LSX and LASX is added in v2. (Because of the size limitation of
  mail list, these patches are not shown)
- Adjust the loongarch_expand_vector_init() function to reduce instruction 
  output amount.
- Some minor implementation changes of RTL templates.

v2 -> v3:
- Revert vabsd/xvabsd RTL templates to unspec impl.
- Resolve warning in gcc/config/loongarch/loongarch.cc when bootstrapping 
  with BOOT_CFLAGS="-O2 -ftree-vectorize -fno-vect-cost-model -mlasx".
- Remove redundant definitions in lasxintrin.h.
- Refine commit info.

v3 -> v4:
- Code simplification.
- Testsuite patches are splited from this patch set again and will be submitted 
independently in the future.

Lulu Cheng (6):
  LoongArch: Add Loongson SX vector directive compilation framework.
  LoongArch: Add Loongson SX base instruction support.
  LoongArch: Add Loongson SX directive builtin function support.
  LoongArch: Add Loongson ASX vector directive compilation framework.
  LoongArch: Add Loongson ASX base instruction support.
  LoongArch: Add Loongson ASX directive builtin function support.

 gcc/config.gcc|2 +-
 gcc/config/loongarch/constraints.md   |  131 +-
 .../loongarch/genopts/loongarch-strings   |4 +
 gcc/config/loongarch/genopts/loongarch.opt.in |   12 +-
 gcc/config/loongarch/lasx.md  | 5104 
 gcc/config/loongarch/lasxintrin.h | 5338 +
 gcc/config/loongarch/loongarch-builtins.cc| 2686 -
 gcc/config/loongarch/loongarch-c.cc   |   18 +
 gcc/config/loongarch/loongarch-def.c  |6 +
 gcc/config/loongarch/loongarch-def.h  |9 +-
 gcc/config/loongarch/loongarch-driver.cc  |   10 +
 gcc/config/loongarch/loongarch-driver.h   |2 +
 gcc/config/loongarch/loongarch-ftypes.def |  666 +-
 gcc/config/loongarch/loongarch-modes.def  |   39 +
 gcc/config/loongarch/loongarch-opts.cc|   89 +-
 gcc/config/loongarch/loongarch-opts.h |3 +
 gcc/config/loongarch/loongarch-protos.h   |   35 +
 gcc/config/loongarch/loongarch-str.h  |3 +
 gcc/config/loongarch/loongarch.cc | 4660 +-
 gcc/config/loongarch/loongarch.h  |  117 +-
 gcc/config/loongarch/loongarch.md |   56 +-
 gcc/config/loongarch/loongarch.opt|   12 +-
 gcc/config/loongarch/lsx.md   | 4467 ++
 gcc/config/loongarch/lsxintrin.h  | 5181 
 gcc/config/loongarch/predicates.md|  333 +-
 gcc/doc/md.texi   |   11 +
 gcc/testsuite/g++.dg/torture/vshuf-v16hi.C|1 +
 gcc/testsuite/g++.dg/torture/vshuf-v16qi.C|1 +
 gcc/testsuite/g++.dg/torture/vshuf-v2df.C |2 +
 gcc/testsuite/g++.dg/torture/vshuf-v2di.C |1 +
 gcc/testsuite/g++.dg/torture/vshuf-v4sf.C |2 +-
 gcc/testsuite/g++.dg/torture/vshuf-v8hi.C |1 +
 32 files changed, 28717 insertions(+), 285 deletions(-)
 create mode 100644 gcc/config/loongarch/lasx.md
 create mode 100644 gcc/config/loongarch/lasxintrin.h
 create mode 100644 gcc/config/loongarch/lsx.md
 create mode 100644 gcc/config/loongarch/lsxintrin.h

-- 
2.36.0



Re: [PATCH v4 0/6] Add Loongson SX/ASX instruction support to LoongArch target.

2023-08-17 Thread Chenghui Pan
Hi! I try to investigate on this problem, and modify the testcase to
compile and run on aarch64 for reference, but I get some strange result
(comment shows the info that I see by stepping through by using gdb):

typedef double __attribute__((vector_size(16))) v2df;

void use1(double d) {}

__attribute__((noipa)) v2df use(double d)
{
  //reg v8's value: {1, 2}
  register v2df x asm("v8") = {5, 9};
  //reg v8's value: {5, 9}
  __asm__("" : "+w" (x));
  return x;
}

void test(void)
{
  register v2df x asm("v8") = {1, 2};
  __asm__("" : "+w" (x));
  //reg v8's value: {1, 2}
  use(x[0]);
  //reg v8's value: {1, 0}
  use1(x[1]);
}

int main(int argc, char **argv)
{
  test();
  return 0;
}

The compile command is: gcc -march=armv8-a -Og -g 1.c (gcc
8.3.0+binutils 2.31)

Disassembly of test() and use():
00400558 :   
  400558:   fc1f0fe8str d8, [sp, #-16]!   
  40055c:   9000adrpx0, 40 <_init-0x3e0>  
  400560:   3dc19c08ldr q8, [x0, #1648]   
  400564:   4ea81d00mov v0.16b, v8.16b
  400568:   fc4107e8ldr d8, [sp], #16 
  40056c:   d65f03c0ret

00400570 :  
  400570:   a9be7bfdstp x29, x30, [sp, #-32]! 
  400574:   910003fdmov x29, sp   
  400578:   fd000be8str d8, [sp, #16] 
  40057c:   9000adrpx0, 40 <_init-0x3e0>  
  400580:   3dc1a008ldr q8, [x0, #1664]   
  400584:   5e080500mov d0, v8.d[0]   
  400588:   97f4bl  400558   
  40058c:   fd400be8ldr d8, [sp, #16] 
  400590:   a8c27bfdldp x29, x30, [sp], #32   
  400594:   d65f03c0ret 

As the register value in the comments, The compiling output on aarch64
also clobbers the high parts of vector register. I googled for some
documents and I find this:
https://developer.arm.com/documentation/den0024/a/The-ABI-for-ARM-64
-bit-Architecture/Register-use-in-the-AArch64-Procedure-Call-Standard
/Parameters-in-NEON-and-floating-point-registers 

Seems ARMv8-A only guarantees to preserve low 64-bit value of
NEON/floating-point register value. I'm not sure that I modify the
testcase in the right way and maybe we need more investigations. Any
ideas or suggestion?

On Wed, 2023-08-16 at 11:27 +0800, Xi Ruoyao wrote:
> The implementation fails to handle this test case properly:
> 
> typedef double __attribute__((vector_size(32))) v4df;
> 
> void use1(double);
> 
> __attribute__((noipa)) double use(double)
> {
>   register double x asm("f24") = 114.514;
>   __asm__("" : "+f" (x));
>   return x;
> }
> 
> void test(void)
> {
>   register v4df x asm("f24") = {1, 2, 3, 4};
>   __asm__("" : "+f" (x));
>   use(x[1]);
>   use1(x[3]);
> }
> 
> Here use() attempts to save and restore f24, but it uses fst.d/fld.d,
> clobbering the high 192 bits of xr24.  Now test() passes a wrong
> value
> of x[3] to use1().
> 
> Note that saving and restoring f24 with xvst/xvld in use() won't
> really
> fix the issue because in real life use() can be in another
> translation
> unit (or even a shared library) compiled with -mno-lsx.  So it seems
> we
> need to tell the compiler "a function call may clobber the high bits
> of
> a vector register even if the corresponding floating-point register
> is
> saved".  I'm not sure how to accomplish this...
> 
> On Tue, 2023-08-15 at 09:05 +0800, Chenghui Pan wrote:
> > This is an update of:
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626194.html
> > 
> > This version of patch set only introduces some small simplications
> > of
> > implementation. Because I missed the size limitation of mail size,
> > the
> > huge testsuite patches of v2 and v3 are not shown in the mail list.
> > So,
> > testsuite patches are splited from this patch set again and will be
> > submitted 
> > independently in the future.
> > 
> > Binutils-gdb introduced LSX/LASX support since 2.41 release:
> > https://lists.gnu.org/archive/html/info-gnu/2023-07/msg9.html
> > 
> > Brief history of patch set version:
> > v1 -> v2:
> > - Reduce usage of "unspec" in RTL template.
> > - Append Support of ADDR_REG_REG in LSX and LASX.
> > - Constraint docs are appended in gcc/doc/md.texi and ccomment
> > block.
> >

[PATCH v4 1/6] LoongArch: Add Loongson SX vector directive compilation framework.

2023-08-14 Thread Chenghui Pan
From: Lulu Cheng 

gcc/ChangeLog:

* config/loongarch/genopts/loongarch-strings: Add compilation framework.
* config/loongarch/genopts/loongarch.opt.in: Ditto.
* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins): Ditto.
* config/loongarch/loongarch-def.c: Ditto.
* config/loongarch/loongarch-def.h (N_ISA_EXT_TYPES): Ditto.
(ISA_EXT_SIMD_LSX): Ditto.
(N_SWITCH_TYPES): Ditto.
(SW_LSX): Ditto.
(struct loongarch_isa): Ditto.
* config/loongarch/loongarch-driver.cc (APPEND_SWITCH): Ditto.
(driver_get_normalized_m_opts): Ditto.
* config/loongarch/loongarch-driver.h (driver_get_normalized_m_opts): 
Ditto.
* config/loongarch/loongarch-opts.cc (loongarch_config_target): Ditto.
(isa_str): Ditto.
* config/loongarch/loongarch-opts.h (ISA_HAS_LSX): Ditto.
* config/loongarch/loongarch-str.h (OPTSTR_LSX): Ditto.
* config/loongarch/loongarch.opt: Ditto.
---
 .../loongarch/genopts/loongarch-strings   |  3 +
 gcc/config/loongarch/genopts/loongarch.opt.in |  8 +-
 gcc/config/loongarch/loongarch-c.cc   |  7 ++
 gcc/config/loongarch/loongarch-def.c  |  4 +
 gcc/config/loongarch/loongarch-def.h  |  7 +-
 gcc/config/loongarch/loongarch-driver.cc  | 10 +++
 gcc/config/loongarch/loongarch-driver.h   |  1 +
 gcc/config/loongarch/loongarch-opts.cc| 82 ++-
 gcc/config/loongarch/loongarch-opts.h |  1 +
 gcc/config/loongarch/loongarch-str.h  |  2 +
 gcc/config/loongarch/loongarch.opt|  8 +-
 11 files changed, 128 insertions(+), 5 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch-strings 
b/gcc/config/loongarch/genopts/loongarch-strings
index a40998ead97..24a5025061f 100644
--- a/gcc/config/loongarch/genopts/loongarch-strings
+++ b/gcc/config/loongarch/genopts/loongarch-strings
@@ -40,6 +40,9 @@ OPTSTR_SOFT_FLOAT soft-float
 OPTSTR_SINGLE_FLOAT   single-float
 OPTSTR_DOUBLE_FLOAT   double-float
 
+# SIMD extensions
+OPTSTR_LSX lsx
+
 # -mabi=
 OPTSTR_ABI_BASE  abi
 STR_ABI_BASE_LP64Dlp64d
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index 4b9b4ac273e..338d77a7e40 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -76,6 +76,9 @@ m@@OPTSTR_DOUBLE_FLOAT@@
 Target Driver RejectNegative Var(la_opt_switches) Mask(FORCE_F64) 
Negative(m@@OPTSTR_SOFT_FLOAT@@)
 Allow hardware floating-point instructions to cover both 32-bit and 64-bit 
operations.
 
+m@@OPTSTR_LSX@@
+Target RejectNegative Var(la_opt_switches) Mask(LSX) Negative(m@@OPTSTR_LSX@@)
+Enable LoongArch SIMD Extension (LSX).
 
 ;; Base target models (implies ISA & tune parameters)
 Enum
@@ -125,11 +128,14 @@ Target RejectNegative Joined ToLower Enum(abi_base) 
Var(la_opt_abi_base) Init(M_
 Variable
 int la_opt_abi_ext = M_OPTION_NOT_SEEN
 
-
 mbranch-cost=
 Target RejectNegative Joined UInteger Var(loongarch_branch_cost)
 -mbranch-cost=COST Set the cost of branches to roughly COST instructions.
 
+mmemvec-cost=
+Target RejectNegative Joined UInteger Var(loongarch_vector_access_cost) 
IntegerRange(1, 5)
+mmemvec-cost=COST  Set the cost of vector memory access instructions.
+
 mcheck-zero-division
 Target Mask(CHECK_ZERO_DIV)
 Trap on integer divide by zero.
diff --git a/gcc/config/loongarch/loongarch-c.cc 
b/gcc/config/loongarch/loongarch-c.cc
index 67911b78f28..b065921adc3 100644
--- a/gcc/config/loongarch/loongarch-c.cc
+++ b/gcc/config/loongarch/loongarch-c.cc
@@ -99,6 +99,13 @@ loongarch_cpu_cpp_builtins (cpp_reader *pfile)
   else
 builtin_define ("__loongarch_frlen=0");
 
+  if (ISA_HAS_LSX)
+{
+  builtin_define ("__loongarch_simd");
+  builtin_define ("__loongarch_sx");
+  builtin_define ("__loongarch_sx_width=128");
+}
+
   /* Native Data Sizes.  */
   builtin_define_with_int_value ("_LOONGARCH_SZINT", INT_TYPE_SIZE);
   builtin_define_with_int_value ("_LOONGARCH_SZLONG", LONG_TYPE_SIZE);
diff --git a/gcc/config/loongarch/loongarch-def.c 
b/gcc/config/loongarch/loongarch-def.c
index 6729c857f7c..28e24c62249 100644
--- a/gcc/config/loongarch/loongarch-def.c
+++ b/gcc/config/loongarch/loongarch-def.c
@@ -49,10 +49,12 @@ loongarch_cpu_default_isa[N_ARCH_TYPES] = {
   [CPU_LOONGARCH64] = {
   .base = ISA_BASE_LA64V100,
   .fpu = ISA_EXT_FPU64,
+  .simd = 0,
   },
   [CPU_LA464] = {
   .base = ISA_BASE_LA64V100,
   .fpu = ISA_EXT_FPU64,
+  .simd = ISA_EXT_SIMD_LSX,
   },
 };
 
@@ -147,6 +149,7 @@ loongarch_isa_ext_strings[N_ISA_EXT_TYPES] = {
   [ISA_EXT_FPU64] = STR_ISA_EXT_FPU64,
   [ISA_EXT_FPU32] = STR_ISA_EXT_FPU32,
   [ISA_EXT_NOFPU] = STR_ISA_EXT_NOFPU,
+  [ISA_EXT_SIMD_LSX] = OPTSTR_LSX,
 };
 
 const char*
@@ -176,6 +179,7 @@ loongarch_switch_strings[] = {
   [SW_SOFT_FLOAT]= OPTSTR_SOFT_FLOAT,
   [SW_SINGLE_FLOAT]  = 

[PATCH v4 4/6] LoongArch: Add Loongson ASX vector directive compilation framework.

2023-08-14 Thread Chenghui Pan
From: Lulu Cheng 

gcc/ChangeLog:

* config/loongarch/genopts/loongarch-strings: Add compilation framework.
* config/loongarch/genopts/loongarch.opt.in: Ditto.
* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins): Ditto.
* config/loongarch/loongarch-def.c: Ditto.
* config/loongarch/loongarch-def.h (N_ISA_EXT_TYPES): Ditto.
(ISA_EXT_SIMD_LASX): Ditto.
(N_SWITCH_TYPES): Ditto.
(SW_LASX): Ditto.
* config/loongarch/loongarch-driver.cc (driver_get_normalized_m_opts): 
Ditto.
* config/loongarch/loongarch-driver.h (driver_get_normalized_m_opts): 
Ditto.
* config/loongarch/loongarch-opts.cc (isa_str): Ditto.
* config/loongarch/loongarch-opts.h (ISA_HAS_LSX): Ditto.
(ISA_HAS_LASX): Ditto.
* config/loongarch/loongarch-str.h (OPTSTR_LASX): Ditto.
* config/loongarch/loongarch.opt: Ditto.
---
 gcc/config/loongarch/genopts/loongarch-strings |  1 +
 gcc/config/loongarch/genopts/loongarch.opt.in  |  4 
 gcc/config/loongarch/loongarch-c.cc| 11 +++
 gcc/config/loongarch/loongarch-def.c   |  4 +++-
 gcc/config/loongarch/loongarch-def.h   |  6 --
 gcc/config/loongarch/loongarch-driver.cc   |  2 +-
 gcc/config/loongarch/loongarch-driver.h|  1 +
 gcc/config/loongarch/loongarch-opts.cc |  9 -
 gcc/config/loongarch/loongarch-opts.h  |  4 +++-
 gcc/config/loongarch/loongarch-str.h   |  1 +
 gcc/config/loongarch/loongarch.opt |  4 
 11 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch-strings 
b/gcc/config/loongarch/genopts/loongarch-strings
index 24a5025061f..35d08f5967d 100644
--- a/gcc/config/loongarch/genopts/loongarch-strings
+++ b/gcc/config/loongarch/genopts/loongarch-strings
@@ -42,6 +42,7 @@ OPTSTR_DOUBLE_FLOAT   double-float
 
 # SIMD extensions
 OPTSTR_LSX lsx
+OPTSTR_LASXlasx
 
 # -mabi=
 OPTSTR_ABI_BASE  abi
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index 338d77a7e40..afde23c9661 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -80,6 +80,10 @@ m@@OPTSTR_LSX@@
 Target RejectNegative Var(la_opt_switches) Mask(LSX) Negative(m@@OPTSTR_LSX@@)
 Enable LoongArch SIMD Extension (LSX).
 
+m@@OPTSTR_LASX@@
+Target RejectNegative Var(la_opt_switches) Mask(LASX) 
Negative(m@@OPTSTR_LASX@@)
+Enable LoongArch Advanced SIMD Extension (LASX).
+
 ;; Base target models (implies ISA & tune parameters)
 Enum
 Name(cpu_type) Type(int)
diff --git a/gcc/config/loongarch/loongarch-c.cc 
b/gcc/config/loongarch/loongarch-c.cc
index b065921adc3..2747fb9e472 100644
--- a/gcc/config/loongarch/loongarch-c.cc
+++ b/gcc/config/loongarch/loongarch-c.cc
@@ -104,8 +104,19 @@ loongarch_cpu_cpp_builtins (cpp_reader *pfile)
   builtin_define ("__loongarch_simd");
   builtin_define ("__loongarch_sx");
   builtin_define ("__loongarch_sx_width=128");
+
+  if (!ISA_HAS_LASX)
+   builtin_define ("__loongarch_simd_width=128");
 }
 
+  if (ISA_HAS_LASX)
+{
+  builtin_define ("__loongarch_asx");
+  builtin_define ("__loongarch_asx_width=256");
+  builtin_define ("__loongarch_simd_width=256");
+}
+
+
   /* Native Data Sizes.  */
   builtin_define_with_int_value ("_LOONGARCH_SZINT", INT_TYPE_SIZE);
   builtin_define_with_int_value ("_LOONGARCH_SZLONG", LONG_TYPE_SIZE);
diff --git a/gcc/config/loongarch/loongarch-def.c 
b/gcc/config/loongarch/loongarch-def.c
index 28e24c62249..bff92c86532 100644
--- a/gcc/config/loongarch/loongarch-def.c
+++ b/gcc/config/loongarch/loongarch-def.c
@@ -54,7 +54,7 @@ loongarch_cpu_default_isa[N_ARCH_TYPES] = {
   [CPU_LA464] = {
   .base = ISA_BASE_LA64V100,
   .fpu = ISA_EXT_FPU64,
-  .simd = ISA_EXT_SIMD_LSX,
+  .simd = ISA_EXT_SIMD_LASX,
   },
 };
 
@@ -150,6 +150,7 @@ loongarch_isa_ext_strings[N_ISA_EXT_TYPES] = {
   [ISA_EXT_FPU32] = STR_ISA_EXT_FPU32,
   [ISA_EXT_NOFPU] = STR_ISA_EXT_NOFPU,
   [ISA_EXT_SIMD_LSX] = OPTSTR_LSX,
+  [ISA_EXT_SIMD_LASX] = OPTSTR_LASX,
 };
 
 const char*
@@ -180,6 +181,7 @@ loongarch_switch_strings[] = {
   [SW_SINGLE_FLOAT]  = OPTSTR_SINGLE_FLOAT,
   [SW_DOUBLE_FLOAT]  = OPTSTR_DOUBLE_FLOAT,
   [SW_LSX]   = OPTSTR_LSX,
+  [SW_LASX]  = OPTSTR_LASX,
 };
 
 
diff --git a/gcc/config/loongarch/loongarch-def.h 
b/gcc/config/loongarch/loongarch-def.h
index f34cffcfb9b..0bbcdb03d22 100644
--- a/gcc/config/loongarch/loongarch-def.h
+++ b/gcc/config/loongarch/loongarch-def.h
@@ -64,7 +64,8 @@ extern const char* loongarch_isa_ext_strings[];
 #define ISA_EXT_FPU642
 #define N_ISA_EXT_FPU_TYPES   3
 #define ISA_EXT_SIMD_LSX  3
-#define N_ISA_EXT_TYPES  4
+#define ISA_EXT_SIMD_LASX 4
+#define N_ISA_EXT_TYPES  5
 
 /* enum abi_base */
 extern const char* 

[PATCH v4 0/6] Add Loongson SX/ASX instruction support to LoongArch target.

2023-08-14 Thread Chenghui Pan
This is an update of:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626194.html

This version of patch set only introduces some small simplications of
implementation. Because I missed the size limitation of mail size, the
huge testsuite patches of v2 and v3 are not shown in the mail list. So,
testsuite patches are splited from this patch set again and will be submitted 
independently in the future.

Binutils-gdb introduced LSX/LASX support since 2.41 release:
https://lists.gnu.org/archive/html/info-gnu/2023-07/msg9.html

Brief history of patch set version:
v1 -> v2:
- Reduce usage of "unspec" in RTL template.
- Append Support of ADDR_REG_REG in LSX and LASX.
- Constraint docs are appended in gcc/doc/md.texi and ccomment block.
- Codes related to vecarg are removed.
- Testsuite of LSX and LASX is added in v2. (Because of the size limitation of
  mail list, these patches are not shown)
- Adjust the loongarch_expand_vector_init() function to reduce instruction 
  output amount.
- Some minor implementation changes of RTL templates.

v2 -> v3:
- Revert vabsd/xvabsd RTL templates to unspec impl.
- Resolve warning in gcc/config/loongarch/loongarch.cc when bootstrapping 
  with BOOT_CFLAGS="-O2 -ftree-vectorize -fno-vect-cost-model -mlasx".
- Remove redundant definitions in lasxintrin.h.
- Refine commit info.

Lulu Cheng (6):
  LoongArch: Add Loongson SX vector directive compilation framework.
  LoongArch: Add Loongson SX base instruction support.
  LoongArch: Add Loongson SX directive builtin function support.
  LoongArch: Add Loongson ASX vector directive compilation framework.
  LoongArch: Add Loongson ASX base instruction support.
  LoongArch: Add Loongson ASX directive builtin function support.

 gcc/config.gcc|2 +-
 gcc/config/loongarch/constraints.md   |  131 +-
 .../loongarch/genopts/loongarch-strings   |4 +
 gcc/config/loongarch/genopts/loongarch.opt.in |   12 +-
 gcc/config/loongarch/lasx.md  | 5122 
 gcc/config/loongarch/lasxintrin.h | 5338 +
 gcc/config/loongarch/loongarch-builtins.cc| 2686 -
 gcc/config/loongarch/loongarch-c.cc   |   18 +
 gcc/config/loongarch/loongarch-def.c  |6 +
 gcc/config/loongarch/loongarch-def.h  |9 +-
 gcc/config/loongarch/loongarch-driver.cc  |   10 +
 gcc/config/loongarch/loongarch-driver.h   |2 +
 gcc/config/loongarch/loongarch-ftypes.def |  666 +-
 gcc/config/loongarch/loongarch-modes.def  |   39 +
 gcc/config/loongarch/loongarch-opts.cc|   89 +-
 gcc/config/loongarch/loongarch-opts.h |3 +
 gcc/config/loongarch/loongarch-protos.h   |   35 +
 gcc/config/loongarch/loongarch-str.h  |3 +
 gcc/config/loongarch/loongarch.cc | 4586 +-
 gcc/config/loongarch/loongarch.h  |  117 +-
 gcc/config/loongarch/loongarch.md |   56 +-
 gcc/config/loongarch/loongarch.opt|   12 +-
 gcc/config/loongarch/lsx.md   | 4481 ++
 gcc/config/loongarch/lsxintrin.h  | 5181 
 gcc/config/loongarch/predicates.md|  333 +-
 gcc/doc/md.texi   |   11 +
 26 files changed, 28668 insertions(+), 284 deletions(-)
 create mode 100644 gcc/config/loongarch/lasx.md
 create mode 100644 gcc/config/loongarch/lasxintrin.h
 create mode 100644 gcc/config/loongarch/lsx.md
 create mode 100644 gcc/config/loongarch/lsxintrin.h

-- 
2.36.0



[PATCH v3 4/8] LoongArch: Add Loongson ASX vector directive compilation framework.

2023-08-03 Thread Chenghui Pan
From: Lulu Cheng 

gcc/ChangeLog:

* config/loongarch/genopts/loongarch-strings: Add compilation framework.
* config/loongarch/genopts/loongarch.opt.in: Ditto.
* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins): Ditto.
* config/loongarch/loongarch-def.c: Ditto.
* config/loongarch/loongarch-def.h (N_ISA_EXT_TYPES): Ditto.
(ISA_EXT_SIMD_LASX): Ditto.
(N_SWITCH_TYPES): Ditto.
(SW_LASX): Ditto.
* config/loongarch/loongarch-driver.cc (driver_get_normalized_m_opts): 
Ditto.
* config/loongarch/loongarch-driver.h (driver_get_normalized_m_opts): 
Ditto.
* config/loongarch/loongarch-opts.cc (isa_str): Ditto.
* config/loongarch/loongarch-opts.h (ISA_HAS_LSX): Ditto.
(ISA_HAS_LASX): Ditto.
* config/loongarch/loongarch-str.h (OPTSTR_LASX): Ditto.
* config/loongarch/loongarch.opt: Ditto.
---
 gcc/config/loongarch/genopts/loongarch-strings |  1 +
 gcc/config/loongarch/genopts/loongarch.opt.in  |  4 
 gcc/config/loongarch/loongarch-c.cc| 11 +++
 gcc/config/loongarch/loongarch-def.c   |  4 +++-
 gcc/config/loongarch/loongarch-def.h   |  6 --
 gcc/config/loongarch/loongarch-driver.cc   |  2 +-
 gcc/config/loongarch/loongarch-driver.h|  1 +
 gcc/config/loongarch/loongarch-opts.cc |  9 -
 gcc/config/loongarch/loongarch-opts.h  |  4 +++-
 gcc/config/loongarch/loongarch-str.h   |  1 +
 gcc/config/loongarch/loongarch.opt |  4 
 11 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch-strings 
b/gcc/config/loongarch/genopts/loongarch-strings
index 24a5025061f..35d08f5967d 100644
--- a/gcc/config/loongarch/genopts/loongarch-strings
+++ b/gcc/config/loongarch/genopts/loongarch-strings
@@ -42,6 +42,7 @@ OPTSTR_DOUBLE_FLOAT   double-float
 
 # SIMD extensions
 OPTSTR_LSX lsx
+OPTSTR_LASXlasx
 
 # -mabi=
 OPTSTR_ABI_BASE  abi
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index 338d77a7e40..afde23c9661 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -80,6 +80,10 @@ m@@OPTSTR_LSX@@
 Target RejectNegative Var(la_opt_switches) Mask(LSX) Negative(m@@OPTSTR_LSX@@)
 Enable LoongArch SIMD Extension (LSX).
 
+m@@OPTSTR_LASX@@
+Target RejectNegative Var(la_opt_switches) Mask(LASX) 
Negative(m@@OPTSTR_LASX@@)
+Enable LoongArch Advanced SIMD Extension (LASX).
+
 ;; Base target models (implies ISA & tune parameters)
 Enum
 Name(cpu_type) Type(int)
diff --git a/gcc/config/loongarch/loongarch-c.cc 
b/gcc/config/loongarch/loongarch-c.cc
index b065921adc3..2747fb9e472 100644
--- a/gcc/config/loongarch/loongarch-c.cc
+++ b/gcc/config/loongarch/loongarch-c.cc
@@ -104,8 +104,19 @@ loongarch_cpu_cpp_builtins (cpp_reader *pfile)
   builtin_define ("__loongarch_simd");
   builtin_define ("__loongarch_sx");
   builtin_define ("__loongarch_sx_width=128");
+
+  if (!ISA_HAS_LASX)
+   builtin_define ("__loongarch_simd_width=128");
 }
 
+  if (ISA_HAS_LASX)
+{
+  builtin_define ("__loongarch_asx");
+  builtin_define ("__loongarch_asx_width=256");
+  builtin_define ("__loongarch_simd_width=256");
+}
+
+
   /* Native Data Sizes.  */
   builtin_define_with_int_value ("_LOONGARCH_SZINT", INT_TYPE_SIZE);
   builtin_define_with_int_value ("_LOONGARCH_SZLONG", LONG_TYPE_SIZE);
diff --git a/gcc/config/loongarch/loongarch-def.c 
b/gcc/config/loongarch/loongarch-def.c
index 28e24c62249..bff92c86532 100644
--- a/gcc/config/loongarch/loongarch-def.c
+++ b/gcc/config/loongarch/loongarch-def.c
@@ -54,7 +54,7 @@ loongarch_cpu_default_isa[N_ARCH_TYPES] = {
   [CPU_LA464] = {
   .base = ISA_BASE_LA64V100,
   .fpu = ISA_EXT_FPU64,
-  .simd = ISA_EXT_SIMD_LSX,
+  .simd = ISA_EXT_SIMD_LASX,
   },
 };
 
@@ -150,6 +150,7 @@ loongarch_isa_ext_strings[N_ISA_EXT_TYPES] = {
   [ISA_EXT_FPU32] = STR_ISA_EXT_FPU32,
   [ISA_EXT_NOFPU] = STR_ISA_EXT_NOFPU,
   [ISA_EXT_SIMD_LSX] = OPTSTR_LSX,
+  [ISA_EXT_SIMD_LASX] = OPTSTR_LASX,
 };
 
 const char*
@@ -180,6 +181,7 @@ loongarch_switch_strings[] = {
   [SW_SINGLE_FLOAT]  = OPTSTR_SINGLE_FLOAT,
   [SW_DOUBLE_FLOAT]  = OPTSTR_DOUBLE_FLOAT,
   [SW_LSX]   = OPTSTR_LSX,
+  [SW_LASX]  = OPTSTR_LASX,
 };
 
 
diff --git a/gcc/config/loongarch/loongarch-def.h 
b/gcc/config/loongarch/loongarch-def.h
index f34cffcfb9b..0bbcdb03d22 100644
--- a/gcc/config/loongarch/loongarch-def.h
+++ b/gcc/config/loongarch/loongarch-def.h
@@ -64,7 +64,8 @@ extern const char* loongarch_isa_ext_strings[];
 #define ISA_EXT_FPU642
 #define N_ISA_EXT_FPU_TYPES   3
 #define ISA_EXT_SIMD_LSX  3
-#define N_ISA_EXT_TYPES  4
+#define ISA_EXT_SIMD_LASX 4
+#define N_ISA_EXT_TYPES  5
 
 /* enum abi_base */
 extern const char* 

[PATCH v3 1/8] LoongArch: Add Loongson SX vector directive compilation framework.

2023-08-03 Thread Chenghui Pan
From: Lulu Cheng 

gcc/ChangeLog:

* config/loongarch/genopts/loongarch-strings: Add compilation framework.
* config/loongarch/genopts/loongarch.opt.in: Ditto.
* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins): Ditto.
* config/loongarch/loongarch-def.c: Ditto.
* config/loongarch/loongarch-def.h (N_ISA_EXT_TYPES): Ditto.
(ISA_EXT_SIMD_LSX): Ditto.
(N_SWITCH_TYPES): Ditto.
(SW_LSX): Ditto.
(struct loongarch_isa): Ditto.
* config/loongarch/loongarch-driver.cc (APPEND_SWITCH): Ditto.
(driver_get_normalized_m_opts): Ditto.
* config/loongarch/loongarch-driver.h (driver_get_normalized_m_opts): 
Ditto.
* config/loongarch/loongarch-opts.cc (loongarch_config_target): Ditto.
(isa_str): Ditto.
* config/loongarch/loongarch-opts.h (ISA_HAS_LSX): Ditto.
* config/loongarch/loongarch-str.h (OPTSTR_LSX): Ditto.
* config/loongarch/loongarch.opt: Ditto.
---
 .../loongarch/genopts/loongarch-strings   |  3 +
 gcc/config/loongarch/genopts/loongarch.opt.in |  8 +-
 gcc/config/loongarch/loongarch-c.cc   |  7 ++
 gcc/config/loongarch/loongarch-def.c  |  4 +
 gcc/config/loongarch/loongarch-def.h  |  7 +-
 gcc/config/loongarch/loongarch-driver.cc  | 10 +++
 gcc/config/loongarch/loongarch-driver.h   |  1 +
 gcc/config/loongarch/loongarch-opts.cc| 82 ++-
 gcc/config/loongarch/loongarch-opts.h |  1 +
 gcc/config/loongarch/loongarch-str.h  |  2 +
 gcc/config/loongarch/loongarch.opt|  8 +-
 11 files changed, 128 insertions(+), 5 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch-strings 
b/gcc/config/loongarch/genopts/loongarch-strings
index a40998ead97..24a5025061f 100644
--- a/gcc/config/loongarch/genopts/loongarch-strings
+++ b/gcc/config/loongarch/genopts/loongarch-strings
@@ -40,6 +40,9 @@ OPTSTR_SOFT_FLOAT soft-float
 OPTSTR_SINGLE_FLOAT   single-float
 OPTSTR_DOUBLE_FLOAT   double-float
 
+# SIMD extensions
+OPTSTR_LSX lsx
+
 # -mabi=
 OPTSTR_ABI_BASE  abi
 STR_ABI_BASE_LP64Dlp64d
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index 4b9b4ac273e..338d77a7e40 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -76,6 +76,9 @@ m@@OPTSTR_DOUBLE_FLOAT@@
 Target Driver RejectNegative Var(la_opt_switches) Mask(FORCE_F64) 
Negative(m@@OPTSTR_SOFT_FLOAT@@)
 Allow hardware floating-point instructions to cover both 32-bit and 64-bit 
operations.
 
+m@@OPTSTR_LSX@@
+Target RejectNegative Var(la_opt_switches) Mask(LSX) Negative(m@@OPTSTR_LSX@@)
+Enable LoongArch SIMD Extension (LSX).
 
 ;; Base target models (implies ISA & tune parameters)
 Enum
@@ -125,11 +128,14 @@ Target RejectNegative Joined ToLower Enum(abi_base) 
Var(la_opt_abi_base) Init(M_
 Variable
 int la_opt_abi_ext = M_OPTION_NOT_SEEN
 
-
 mbranch-cost=
 Target RejectNegative Joined UInteger Var(loongarch_branch_cost)
 -mbranch-cost=COST Set the cost of branches to roughly COST instructions.
 
+mmemvec-cost=
+Target RejectNegative Joined UInteger Var(loongarch_vector_access_cost) 
IntegerRange(1, 5)
+mmemvec-cost=COST  Set the cost of vector memory access instructions.
+
 mcheck-zero-division
 Target Mask(CHECK_ZERO_DIV)
 Trap on integer divide by zero.
diff --git a/gcc/config/loongarch/loongarch-c.cc 
b/gcc/config/loongarch/loongarch-c.cc
index 67911b78f28..b065921adc3 100644
--- a/gcc/config/loongarch/loongarch-c.cc
+++ b/gcc/config/loongarch/loongarch-c.cc
@@ -99,6 +99,13 @@ loongarch_cpu_cpp_builtins (cpp_reader *pfile)
   else
 builtin_define ("__loongarch_frlen=0");
 
+  if (ISA_HAS_LSX)
+{
+  builtin_define ("__loongarch_simd");
+  builtin_define ("__loongarch_sx");
+  builtin_define ("__loongarch_sx_width=128");
+}
+
   /* Native Data Sizes.  */
   builtin_define_with_int_value ("_LOONGARCH_SZINT", INT_TYPE_SIZE);
   builtin_define_with_int_value ("_LOONGARCH_SZLONG", LONG_TYPE_SIZE);
diff --git a/gcc/config/loongarch/loongarch-def.c 
b/gcc/config/loongarch/loongarch-def.c
index 6729c857f7c..28e24c62249 100644
--- a/gcc/config/loongarch/loongarch-def.c
+++ b/gcc/config/loongarch/loongarch-def.c
@@ -49,10 +49,12 @@ loongarch_cpu_default_isa[N_ARCH_TYPES] = {
   [CPU_LOONGARCH64] = {
   .base = ISA_BASE_LA64V100,
   .fpu = ISA_EXT_FPU64,
+  .simd = 0,
   },
   [CPU_LA464] = {
   .base = ISA_BASE_LA64V100,
   .fpu = ISA_EXT_FPU64,
+  .simd = ISA_EXT_SIMD_LSX,
   },
 };
 
@@ -147,6 +149,7 @@ loongarch_isa_ext_strings[N_ISA_EXT_TYPES] = {
   [ISA_EXT_FPU64] = STR_ISA_EXT_FPU64,
   [ISA_EXT_FPU32] = STR_ISA_EXT_FPU32,
   [ISA_EXT_NOFPU] = STR_ISA_EXT_NOFPU,
+  [ISA_EXT_SIMD_LSX] = OPTSTR_LSX,
 };
 
 const char*
@@ -176,6 +179,7 @@ loongarch_switch_strings[] = {
   [SW_SOFT_FLOAT]= OPTSTR_SOFT_FLOAT,
   [SW_SINGLE_FLOAT]  = 

[PATCH v3 0/8] Add Loongson SX/ASX instruction support to LoongArch target.

2023-08-03 Thread Chenghui Pan
This is an update of : 
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624770.html

Changes since last version:
- Revert vabsd/xvabsd RTL templates to unspec impl, because arithmetic RTL 
expression
  cannot cover the edge case of the instruction output. v2 impl of vabsd/xvabsd 
template
  cause the failure of gcc.dg/sabd_1.c when running regression test with 
  RUNTESTFLAGS="--target_board=unix/-mlsx".
- Resolve warning in gcc/config/loongarch/loongarch.cc when bootstrapping with 
  BOOT_CFLAGS="-O2 -ftree-vectorize -fno-vect-cost-model -mlasx".
- Remove redundant definitions in lasxintrin.h.
- Refine commit info.

Lulu Cheng (8):
  LoongArch: Add Loongson SX vector directive compilation framework.
  LoongArch: Add Loongson SX base instruction support.
  LoongArch: Add Loongson SX directive builtin function support.
  LoongArch: Add Loongson ASX vector directive compilation framework.
  LoongArch: Add Loongson ASX base instruction support.
  LoongArch: Add Loongson ASX directive builtin function support.
  LoongArch: Add Loongson SX directive test cases.
  LoongArch: Add Loongson ASX directive test cases.

 gcc/config.gcc| 2 +-
 gcc/config/loongarch/constraints.md   |   131 +-
 .../loongarch/genopts/loongarch-strings   | 4 +
 gcc/config/loongarch/genopts/loongarch.opt.in |12 +-
 gcc/config/loongarch/lasx.md  |  5122 +++
 gcc/config/loongarch/lasxintrin.h |  5338 +++
 gcc/config/loongarch/loongarch-builtins.cc|  2686 +-
 gcc/config/loongarch/loongarch-c.cc   |18 +
 gcc/config/loongarch/loongarch-def.c  | 6 +
 gcc/config/loongarch/loongarch-def.h  | 9 +-
 gcc/config/loongarch/loongarch-driver.cc  |10 +
 gcc/config/loongarch/loongarch-driver.h   | 2 +
 gcc/config/loongarch/loongarch-ftypes.def |   666 +-
 gcc/config/loongarch/loongarch-modes.def  |39 +
 gcc/config/loongarch/loongarch-opts.cc|89 +-
 gcc/config/loongarch/loongarch-opts.h | 3 +
 gcc/config/loongarch/loongarch-protos.h   |35 +
 gcc/config/loongarch/loongarch-str.h  | 3 +
 gcc/config/loongarch/loongarch.cc |  4669 +-
 gcc/config/loongarch/loongarch.h  |   117 +-
 gcc/config/loongarch/loongarch.md |56 +-
 gcc/config/loongarch/loongarch.opt|12 +-
 gcc/config/loongarch/lsx.md   |  4481 ++
 gcc/config/loongarch/lsxintrin.h  |  5181 +++
 gcc/config/loongarch/predicates.md|   333 +-
 gcc/doc/md.texi   |11 +
 .../gcc.target/loongarch/strict-align.c   |13 +
 .../vector/lasx/lasx-bit-manipulate.c | 27813 +++
 .../loongarch/vector/lasx/lasx-builtin.c  |  1509 +
 .../loongarch/vector/lasx/lasx-cmp.c  |  5361 +++
 .../loongarch/vector/lasx/lasx-fp-arith.c |  6259 +++
 .../loongarch/vector/lasx/lasx-fp-cvt.c   |  7315 +++
 .../loongarch/vector/lasx/lasx-int-arith.c| 38361 
 .../loongarch/vector/lasx/lasx-mem.c  |   147 +
 .../loongarch/vector/lasx/lasx-perm.c |  7730 
 .../vector/lasx/lasx-str-manipulate.c |   712 +
 .../loongarch/vector/lasx/lasx-xvldrepl.c |13 +
 .../loongarch/vector/lasx/lasx-xvstelm.c  |12 +
 .../loongarch/vector/loongarch-vector.exp |42 +
 .../loongarch/vector/lsx/lsx-bit-manipulate.c | 15586 +++
 .../loongarch/vector/lsx/lsx-builtin.c|  1461 +
 .../gcc.target/loongarch/vector/lsx/lsx-cmp.c |  3354 ++
 .../loongarch/vector/lsx/lsx-fp-arith.c   |  3713 ++
 .../loongarch/vector/lsx/lsx-fp-cvt.c |  4114 ++
 .../loongarch/vector/lsx/lsx-int-arith.c  | 22424 +
 .../gcc.target/loongarch/vector/lsx/lsx-mem.c |   537 +
 .../loongarch/vector/lsx/lsx-perm.c   |   +++
 .../loongarch/vector/lsx/lsx-str-manipulate.c |   408 +
 .../loongarch/vector/simd_correctness_check.h |39 +
 49 files changed, 181229 insertions(+), 284 deletions(-)
 create mode 100644 gcc/config/loongarch/lasx.md
 create mode 100644 gcc/config/loongarch/lasxintrin.h
 create mode 100644 gcc/config/loongarch/lsx.md
 create mode 100644 gcc/config/loongarch/lsxintrin.h
 create mode 100644 gcc/testsuite/gcc.target/loongarch/strict-align.c
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-bit-manipulate.c
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-builtin.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-cmp.c
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-fp-arith.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-fp-cvt.c
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-int-arith.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-mem.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-perm.c
 create mode 

[PATCH v2 0/8] Add Loongson SX/ASX instruction support to LoongArch target.

2023-07-18 Thread Chenghui Pan
This is an update of 
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/623262.html
In addition, LSX/LASX instructions support is added in the master branch of 
binutils-gdb,
and these GCC patches can be used with future releases of binutils-gdb.

Changes since v1:

- Some usages of "unspec" in lsx.md and lasx.md are replaced with arithmetic
  RTL expressions.
- Support ADDR_REG_REG in LSX and LASX.
- Constraint docs are appended in gcc/doc/md.texi and head comment block of
  gcc/config/loongarch/constraints.md.
- Codes related to vecarg in loongarch.cc and loongarch.opt.in are removed in 
v2.
- Testsuite of LSX and LASX is added in v2, and can be called by using
  loongarch-vector.exp independently.
- Adjust the loongarch_expand_vector_init() function to reduce instruction
  amount when initializing the vector with element-by-element style.
- Some minor implementation changes of RTL templates in lsx.md and lasx.md.

Lulu Cheng (8):
  LoongArch: Added Loongson SX vector directive compilation framework.
  LoongArch: Added Loongson SX base instruction support.
  LoongArch: Added Loongson SX directive builtin function support.
  LoongArch: Added Loongson ASX vector directive compilation framework.
  LoongArch: Added Loongson ASX base instruction support.
  LoongArch: Added Loongson ASX directive builtin function support.
  LoongArch: Add Loongson SX directive test cases.
  LoongArch: Add Loongson ASX directive test cases.

 gcc/config.gcc| 2 +-
 gcc/config/loongarch/constraints.md   |   131 +-
 .../loongarch/genopts/loongarch-strings   | 4 +
 gcc/config/loongarch/genopts/loongarch.opt.in |12 +-
 gcc/config/loongarch/lasx.md  |  5120 +++
 gcc/config/loongarch/lasxintrin.h |  5342 +++
 gcc/config/loongarch/loongarch-builtins.cc|  2686 +-
 gcc/config/loongarch/loongarch-c.cc   |18 +
 gcc/config/loongarch/loongarch-def.c  | 6 +
 gcc/config/loongarch/loongarch-def.h  | 9 +-
 gcc/config/loongarch/loongarch-driver.cc  |10 +
 gcc/config/loongarch/loongarch-driver.h   | 2 +
 gcc/config/loongarch/loongarch-ftypes.def |   666 +-
 gcc/config/loongarch/loongarch-modes.def  |39 +
 gcc/config/loongarch/loongarch-opts.cc|89 +-
 gcc/config/loongarch/loongarch-opts.h | 3 +
 gcc/config/loongarch/loongarch-protos.h   |35 +
 gcc/config/loongarch/loongarch-str.h  | 3 +
 gcc/config/loongarch/loongarch.cc |  4669 +-
 gcc/config/loongarch/loongarch.h  |   117 +-
 gcc/config/loongarch/loongarch.md |56 +-
 gcc/config/loongarch/loongarch.opt|12 +-
 gcc/config/loongarch/lsx.md   |  4479 ++
 gcc/config/loongarch/lsxintrin.h  |  5181 +++
 gcc/config/loongarch/predicates.md|   333 +-
 gcc/doc/md.texi   |11 +
 .../gcc.target/loongarch/strict-align.c   |13 +
 .../vector/lasx/lasx-bit-manipulate.c | 27813 +++
 .../loongarch/vector/lasx/lasx-builtin.c  |  1509 +
 .../loongarch/vector/lasx/lasx-cmp.c  |  5361 +++
 .../loongarch/vector/lasx/lasx-fp-arith.c |  6259 +++
 .../loongarch/vector/lasx/lasx-fp-cvt.c   |  7315 +++
 .../loongarch/vector/lasx/lasx-int-arith.c| 38361 
 .../loongarch/vector/lasx/lasx-mem.c  |   147 +
 .../loongarch/vector/lasx/lasx-perm.c |  7730 
 .../vector/lasx/lasx-str-manipulate.c |   712 +
 .../loongarch/vector/lasx/lasx-xvldrepl.c |13 +
 .../loongarch/vector/lasx/lasx-xvstelm.c  |12 +
 .../loongarch/vector/loongarch-vector.exp |42 +
 .../loongarch/vector/lsx/lsx-bit-manipulate.c | 15586 +++
 .../loongarch/vector/lsx/lsx-builtin.c|  1461 +
 .../gcc.target/loongarch/vector/lsx/lsx-cmp.c |  3354 ++
 .../loongarch/vector/lsx/lsx-fp-arith.c   |  3713 ++
 .../loongarch/vector/lsx/lsx-fp-cvt.c |  4114 ++
 .../loongarch/vector/lsx/lsx-int-arith.c  | 22424 +
 .../gcc.target/loongarch/vector/lsx/lsx-mem.c |   537 +
 .../loongarch/vector/lsx/lsx-perm.c   |   +++
 .../loongarch/vector/lsx/lsx-str-manipulate.c |   408 +
 .../loongarch/vector/simd_correctness_check.h |39 +
 49 files changed, 181229 insertions(+), 284 deletions(-)
 create mode 100644 gcc/config/loongarch/lasx.md
 create mode 100644 gcc/config/loongarch/lasxintrin.h
 create mode 100644 gcc/config/loongarch/lsx.md
 create mode 100644 gcc/config/loongarch/lsxintrin.h
 create mode 100644 gcc/testsuite/gcc.target/loongarch/strict-align.c
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-bit-manipulate.c
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-builtin.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-cmp.c
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-fp-arith.c
 create mode 100644 

[PATCH v2 4/8] LoongArch: Added Loongson ASX vector directive compilation framework.

2023-07-18 Thread Chenghui Pan
From: Lulu Cheng 

gcc/ChangeLog:

* config/loongarch/genopts/loongarch-strings: Added compilation 
framework.
* config/loongarch/genopts/loongarch.opt.in: Ditto.
* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins): Ditto.
* config/loongarch/loongarch-def.c: Ditto.
* config/loongarch/loongarch-def.h (N_ISA_EXT_TYPES): Ditto.
(ISA_EXT_SIMD_LASX): Ditto.
(N_SWITCH_TYPES): Ditto.
(SW_LASX): Ditto.
* config/loongarch/loongarch-driver.cc (driver_get_normalized_m_opts): 
Ditto.
* config/loongarch/loongarch-driver.h (driver_get_normalized_m_opts): 
Ditto.
* config/loongarch/loongarch-opts.cc (isa_str): Ditto.
* config/loongarch/loongarch-opts.h (ISA_HAS_LSX): Ditto.
(ISA_HAS_LASX): Ditto.
* config/loongarch/loongarch-str.h (OPTSTR_LASX): Ditto.
* config/loongarch/loongarch.opt: Ditto.
---
 gcc/config/loongarch/genopts/loongarch-strings |  1 +
 gcc/config/loongarch/genopts/loongarch.opt.in  |  4 
 gcc/config/loongarch/loongarch-c.cc| 11 +++
 gcc/config/loongarch/loongarch-def.c   |  4 +++-
 gcc/config/loongarch/loongarch-def.h   |  6 --
 gcc/config/loongarch/loongarch-driver.cc   |  2 +-
 gcc/config/loongarch/loongarch-driver.h|  1 +
 gcc/config/loongarch/loongarch-opts.cc |  9 -
 gcc/config/loongarch/loongarch-opts.h  |  4 +++-
 gcc/config/loongarch/loongarch-str.h   |  1 +
 gcc/config/loongarch/loongarch.opt |  4 
 11 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch-strings 
b/gcc/config/loongarch/genopts/loongarch-strings
index 24a5025061f..35d08f5967d 100644
--- a/gcc/config/loongarch/genopts/loongarch-strings
+++ b/gcc/config/loongarch/genopts/loongarch-strings
@@ -42,6 +42,7 @@ OPTSTR_DOUBLE_FLOAT   double-float
 
 # SIMD extensions
 OPTSTR_LSX lsx
+OPTSTR_LASXlasx
 
 # -mabi=
 OPTSTR_ABI_BASE  abi
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index 338d77a7e40..afde23c9661 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -80,6 +80,10 @@ m@@OPTSTR_LSX@@
 Target RejectNegative Var(la_opt_switches) Mask(LSX) Negative(m@@OPTSTR_LSX@@)
 Enable LoongArch SIMD Extension (LSX).
 
+m@@OPTSTR_LASX@@
+Target RejectNegative Var(la_opt_switches) Mask(LASX) 
Negative(m@@OPTSTR_LASX@@)
+Enable LoongArch Advanced SIMD Extension (LASX).
+
 ;; Base target models (implies ISA & tune parameters)
 Enum
 Name(cpu_type) Type(int)
diff --git a/gcc/config/loongarch/loongarch-c.cc 
b/gcc/config/loongarch/loongarch-c.cc
index b065921adc3..2747fb9e472 100644
--- a/gcc/config/loongarch/loongarch-c.cc
+++ b/gcc/config/loongarch/loongarch-c.cc
@@ -104,8 +104,19 @@ loongarch_cpu_cpp_builtins (cpp_reader *pfile)
   builtin_define ("__loongarch_simd");
   builtin_define ("__loongarch_sx");
   builtin_define ("__loongarch_sx_width=128");
+
+  if (!ISA_HAS_LASX)
+   builtin_define ("__loongarch_simd_width=128");
 }
 
+  if (ISA_HAS_LASX)
+{
+  builtin_define ("__loongarch_asx");
+  builtin_define ("__loongarch_asx_width=256");
+  builtin_define ("__loongarch_simd_width=256");
+}
+
+
   /* Native Data Sizes.  */
   builtin_define_with_int_value ("_LOONGARCH_SZINT", INT_TYPE_SIZE);
   builtin_define_with_int_value ("_LOONGARCH_SZLONG", LONG_TYPE_SIZE);
diff --git a/gcc/config/loongarch/loongarch-def.c 
b/gcc/config/loongarch/loongarch-def.c
index 28e24c62249..bff92c86532 100644
--- a/gcc/config/loongarch/loongarch-def.c
+++ b/gcc/config/loongarch/loongarch-def.c
@@ -54,7 +54,7 @@ loongarch_cpu_default_isa[N_ARCH_TYPES] = {
   [CPU_LA464] = {
   .base = ISA_BASE_LA64V100,
   .fpu = ISA_EXT_FPU64,
-  .simd = ISA_EXT_SIMD_LSX,
+  .simd = ISA_EXT_SIMD_LASX,
   },
 };
 
@@ -150,6 +150,7 @@ loongarch_isa_ext_strings[N_ISA_EXT_TYPES] = {
   [ISA_EXT_FPU32] = STR_ISA_EXT_FPU32,
   [ISA_EXT_NOFPU] = STR_ISA_EXT_NOFPU,
   [ISA_EXT_SIMD_LSX] = OPTSTR_LSX,
+  [ISA_EXT_SIMD_LASX] = OPTSTR_LASX,
 };
 
 const char*
@@ -180,6 +181,7 @@ loongarch_switch_strings[] = {
   [SW_SINGLE_FLOAT]  = OPTSTR_SINGLE_FLOAT,
   [SW_DOUBLE_FLOAT]  = OPTSTR_DOUBLE_FLOAT,
   [SW_LSX]   = OPTSTR_LSX,
+  [SW_LASX]  = OPTSTR_LASX,
 };
 
 
diff --git a/gcc/config/loongarch/loongarch-def.h 
b/gcc/config/loongarch/loongarch-def.h
index f34cffcfb9b..0bbcdb03d22 100644
--- a/gcc/config/loongarch/loongarch-def.h
+++ b/gcc/config/loongarch/loongarch-def.h
@@ -64,7 +64,8 @@ extern const char* loongarch_isa_ext_strings[];
 #define ISA_EXT_FPU642
 #define N_ISA_EXT_FPU_TYPES   3
 #define ISA_EXT_SIMD_LSX  3
-#define N_ISA_EXT_TYPES  4
+#define ISA_EXT_SIMD_LASX 4
+#define N_ISA_EXT_TYPES  5
 
 /* enum abi_base */
 extern const char* 

[PATCH v2 1/8] LoongArch: Added Loongson SX vector directive compilation framework.

2023-07-18 Thread Chenghui Pan
From: Lulu Cheng 

gcc/ChangeLog:

* config/loongarch/genopts/loongarch-strings: Added compilation 
framework.
* config/loongarch/genopts/loongarch.opt.in: Ditto.
* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins): Ditto.
* config/loongarch/loongarch-def.c: Ditto.
* config/loongarch/loongarch-def.h (N_ISA_EXT_TYPES): Ditto.
(ISA_EXT_SIMD_LSX): Ditto.
(N_SWITCH_TYPES): Ditto.
(SW_LSX): Ditto.
(struct loongarch_isa): Ditto.
* config/loongarch/loongarch-driver.cc (APPEND_SWITCH): Ditto.
(driver_get_normalized_m_opts): Ditto.
* config/loongarch/loongarch-driver.h (driver_get_normalized_m_opts): 
Ditto.
* config/loongarch/loongarch-opts.cc (loongarch_config_target): Ditto.
(isa_str): Ditto.
* config/loongarch/loongarch-opts.h (ISA_HAS_LSX): Ditto.
* config/loongarch/loongarch-str.h (OPTSTR_LSX): Ditto.
* config/loongarch/loongarch.opt: Ditto.
---
 .../loongarch/genopts/loongarch-strings   |  3 +
 gcc/config/loongarch/genopts/loongarch.opt.in |  8 +-
 gcc/config/loongarch/loongarch-c.cc   |  7 ++
 gcc/config/loongarch/loongarch-def.c  |  4 +
 gcc/config/loongarch/loongarch-def.h  |  7 +-
 gcc/config/loongarch/loongarch-driver.cc  | 10 +++
 gcc/config/loongarch/loongarch-driver.h   |  1 +
 gcc/config/loongarch/loongarch-opts.cc| 82 ++-
 gcc/config/loongarch/loongarch-opts.h |  1 +
 gcc/config/loongarch/loongarch-str.h  |  2 +
 gcc/config/loongarch/loongarch.opt|  8 +-
 11 files changed, 128 insertions(+), 5 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch-strings 
b/gcc/config/loongarch/genopts/loongarch-strings
index a40998ead97..24a5025061f 100644
--- a/gcc/config/loongarch/genopts/loongarch-strings
+++ b/gcc/config/loongarch/genopts/loongarch-strings
@@ -40,6 +40,9 @@ OPTSTR_SOFT_FLOAT soft-float
 OPTSTR_SINGLE_FLOAT   single-float
 OPTSTR_DOUBLE_FLOAT   double-float
 
+# SIMD extensions
+OPTSTR_LSX lsx
+
 # -mabi=
 OPTSTR_ABI_BASE  abi
 STR_ABI_BASE_LP64Dlp64d
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index 4b9b4ac273e..338d77a7e40 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -76,6 +76,9 @@ m@@OPTSTR_DOUBLE_FLOAT@@
 Target Driver RejectNegative Var(la_opt_switches) Mask(FORCE_F64) 
Negative(m@@OPTSTR_SOFT_FLOAT@@)
 Allow hardware floating-point instructions to cover both 32-bit and 64-bit 
operations.
 
+m@@OPTSTR_LSX@@
+Target RejectNegative Var(la_opt_switches) Mask(LSX) Negative(m@@OPTSTR_LSX@@)
+Enable LoongArch SIMD Extension (LSX).
 
 ;; Base target models (implies ISA & tune parameters)
 Enum
@@ -125,11 +128,14 @@ Target RejectNegative Joined ToLower Enum(abi_base) 
Var(la_opt_abi_base) Init(M_
 Variable
 int la_opt_abi_ext = M_OPTION_NOT_SEEN
 
-
 mbranch-cost=
 Target RejectNegative Joined UInteger Var(loongarch_branch_cost)
 -mbranch-cost=COST Set the cost of branches to roughly COST instructions.
 
+mmemvec-cost=
+Target RejectNegative Joined UInteger Var(loongarch_vector_access_cost) 
IntegerRange(1, 5)
+mmemvec-cost=COST  Set the cost of vector memory access instructions.
+
 mcheck-zero-division
 Target Mask(CHECK_ZERO_DIV)
 Trap on integer divide by zero.
diff --git a/gcc/config/loongarch/loongarch-c.cc 
b/gcc/config/loongarch/loongarch-c.cc
index 67911b78f28..b065921adc3 100644
--- a/gcc/config/loongarch/loongarch-c.cc
+++ b/gcc/config/loongarch/loongarch-c.cc
@@ -99,6 +99,13 @@ loongarch_cpu_cpp_builtins (cpp_reader *pfile)
   else
 builtin_define ("__loongarch_frlen=0");
 
+  if (ISA_HAS_LSX)
+{
+  builtin_define ("__loongarch_simd");
+  builtin_define ("__loongarch_sx");
+  builtin_define ("__loongarch_sx_width=128");
+}
+
   /* Native Data Sizes.  */
   builtin_define_with_int_value ("_LOONGARCH_SZINT", INT_TYPE_SIZE);
   builtin_define_with_int_value ("_LOONGARCH_SZLONG", LONG_TYPE_SIZE);
diff --git a/gcc/config/loongarch/loongarch-def.c 
b/gcc/config/loongarch/loongarch-def.c
index 6729c857f7c..28e24c62249 100644
--- a/gcc/config/loongarch/loongarch-def.c
+++ b/gcc/config/loongarch/loongarch-def.c
@@ -49,10 +49,12 @@ loongarch_cpu_default_isa[N_ARCH_TYPES] = {
   [CPU_LOONGARCH64] = {
   .base = ISA_BASE_LA64V100,
   .fpu = ISA_EXT_FPU64,
+  .simd = 0,
   },
   [CPU_LA464] = {
   .base = ISA_BASE_LA64V100,
   .fpu = ISA_EXT_FPU64,
+  .simd = ISA_EXT_SIMD_LSX,
   },
 };
 
@@ -147,6 +149,7 @@ loongarch_isa_ext_strings[N_ISA_EXT_TYPES] = {
   [ISA_EXT_FPU64] = STR_ISA_EXT_FPU64,
   [ISA_EXT_FPU32] = STR_ISA_EXT_FPU32,
   [ISA_EXT_NOFPU] = STR_ISA_EXT_NOFPU,
+  [ISA_EXT_SIMD_LSX] = OPTSTR_LSX,
 };
 
 const char*
@@ -176,6 +179,7 @@ loongarch_switch_strings[] = {
   [SW_SOFT_FLOAT]= OPTSTR_SOFT_FLOAT,
   [SW_SINGLE_FLOAT] 

Re: [PATCH v1 0/6] Add Loongson SX/ASX instruction support to LoongArch target.

2023-07-06 Thread Chenghui Pan

No, vld/vst can't guaranteed to be atomic in this condition. Seems we
can't implement this on LoongArch for now.

On 2023/7/5 20:57, Xi Ruoyao wrote:

A question: is vld/vst guaranteed to be atomic if the accessed address
is aligned?  If true we can use them to implement lock-free 128-bit
atomic load and store.  See https://gcc.gnu.org/bugzilla/PR104688 for
the background, and some people really hate using a lock for atomics.

On Fri, 2023-06-30 at 10:16 +0800, Chenghui Pan wrote:

These patches add the Loongson SX/ASX instruction support to the
LoongArch
target, and can be utilized by using the new "-mlsx" and
"-mlasx" option.

Patches are bootstrapped and tested on loongarch64-linux-gnu target.

Lulu Cheng (6):
   LoongArch: Added Loongson SX vector directive compilation framework.
   LoongArch: Added Loongson SX base instruction support.
   LoongArch: Added Loongson SX directive builtin function support.
   LoongArch: Added Loongson ASX vector directive compilation
framework.
   LoongArch: Added Loongson ASX base instruction support.
   LoongArch: Added Loongson ASX directive builtin function support.

  gcc/config.gcc    |    2 +-
  gcc/config/loongarch/constraints.md   |  128 +-
  .../loongarch/genopts/loongarch-strings   |    4 +
  gcc/config/loongarch/genopts/loongarch.opt.in |   16 +-
  gcc/config/loongarch/lasx.md  | 5147 
  gcc/config/loongarch/lasxintrin.h | 5342
+
  gcc/config/loongarch/loongarch-builtins.cc    | 2686 -
  gcc/config/loongarch/loongarch-c.cc   |   18 +
  gcc/config/loongarch/loongarch-def.c  |    6 +
  gcc/config/loongarch/loongarch-def.h  |    9 +-
  gcc/config/loongarch/loongarch-driver.cc  |   10 +
  gcc/config/loongarch/loongarch-driver.h   |    2 +
  gcc/config/loongarch/loongarch-ftypes.def |  666 +-
  gcc/config/loongarch/loongarch-modes.def  |   39 +
  gcc/config/loongarch/loongarch-opts.cc    |   89 +-
  gcc/config/loongarch/loongarch-opts.h |    3 +
  gcc/config/loongarch/loongarch-protos.h   |   35 +
  gcc/config/loongarch/loongarch-str.h  |    3 +
  gcc/config/loongarch/loongarch.cc | 4615 +-
  gcc/config/loongarch/loongarch.h  |  117 +-
  gcc/config/loongarch/loongarch.md |   56 +-
  gcc/config/loongarch/loongarch.opt    |   16 +-
  gcc/config/loongarch/lsx.md   | 4490 ++
  gcc/config/loongarch/lsxintrin.h  | 5181 
  gcc/config/loongarch/predicates.md    |  333 +-
  25 files changed, 28723 insertions(+), 290 deletions(-)
  create mode 100644 gcc/config/loongarch/lasx.md
  create mode 100644 gcc/config/loongarch/lasxintrin.h
  create mode 100644 gcc/config/loongarch/lsx.md
  create mode 100644 gcc/config/loongarch/lsxintrin.h





[PATCH v1 4/6] LoongArch: Added Loongson ASX vector directive compilation framework.

2023-06-29 Thread Chenghui Pan
From: Lulu Cheng 

gcc/ChangeLog:

* config/loongarch/genopts/loongarch-strings: Added compilation 
framework.
* config/loongarch/genopts/loongarch.opt.in: Ditto.
* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins): Ditto.
* config/loongarch/loongarch-def.c: Ditto.
* config/loongarch/loongarch-def.h (N_ISA_EXT_TYPES): Ditto.
(ISA_EXT_SIMD_LASX): Ditto.
(N_SWITCH_TYPES): Ditto.
(SW_LASX): Ditto.
* config/loongarch/loongarch-driver.cc (driver_get_normalized_m_opts): 
Ditto.
* config/loongarch/loongarch-driver.h (driver_get_normalized_m_opts): 
Ditto.
* config/loongarch/loongarch-opts.cc (isa_str): Ditto.
* config/loongarch/loongarch-opts.h (ISA_HAS_LSX): Ditto.
(ISA_HAS_LASX): Ditto.
* config/loongarch/loongarch-str.h (OPTSTR_LASX): Ditto.
* config/loongarch/loongarch.opt: Ditto.
---
 gcc/config/loongarch/genopts/loongarch-strings |  1 +
 gcc/config/loongarch/genopts/loongarch.opt.in  |  4 
 gcc/config/loongarch/loongarch-c.cc| 11 +++
 gcc/config/loongarch/loongarch-def.c   |  4 +++-
 gcc/config/loongarch/loongarch-def.h   |  6 --
 gcc/config/loongarch/loongarch-driver.cc   |  2 +-
 gcc/config/loongarch/loongarch-driver.h|  1 +
 gcc/config/loongarch/loongarch-opts.cc |  9 -
 gcc/config/loongarch/loongarch-opts.h  |  4 +++-
 gcc/config/loongarch/loongarch-str.h   |  1 +
 gcc/config/loongarch/loongarch.opt |  4 
 11 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch-strings 
b/gcc/config/loongarch/genopts/loongarch-strings
index 24a5025061f..35d08f5967d 100644
--- a/gcc/config/loongarch/genopts/loongarch-strings
+++ b/gcc/config/loongarch/genopts/loongarch-strings
@@ -42,6 +42,7 @@ OPTSTR_DOUBLE_FLOAT   double-float
 
 # SIMD extensions
 OPTSTR_LSX lsx
+OPTSTR_LASXlasx
 
 # -mabi=
 OPTSTR_ABI_BASE  abi
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index 7ea3f0dea5d..d1c2d2fef34 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -80,6 +80,10 @@ m@@OPTSTR_LSX@@
 Target RejectNegative Var(la_opt_switches) Mask(LSX) Negative(m@@OPTSTR_LSX@@)
 Enable LoongArch SIMD Extension (LSX).
 
+m@@OPTSTR_LASX@@
+Target RejectNegative Var(la_opt_switches) Mask(LASX) 
Negative(m@@OPTSTR_LASX@@)
+Enable LoongArch Advanced SIMD Extension (LASX).
+
 ;; Base target models (implies ISA & tune parameters)
 Enum
 Name(cpu_type) Type(int)
diff --git a/gcc/config/loongarch/loongarch-c.cc 
b/gcc/config/loongarch/loongarch-c.cc
index b065921adc3..2747fb9e472 100644
--- a/gcc/config/loongarch/loongarch-c.cc
+++ b/gcc/config/loongarch/loongarch-c.cc
@@ -104,8 +104,19 @@ loongarch_cpu_cpp_builtins (cpp_reader *pfile)
   builtin_define ("__loongarch_simd");
   builtin_define ("__loongarch_sx");
   builtin_define ("__loongarch_sx_width=128");
+
+  if (!ISA_HAS_LASX)
+   builtin_define ("__loongarch_simd_width=128");
 }
 
+  if (ISA_HAS_LASX)
+{
+  builtin_define ("__loongarch_asx");
+  builtin_define ("__loongarch_asx_width=256");
+  builtin_define ("__loongarch_simd_width=256");
+}
+
+
   /* Native Data Sizes.  */
   builtin_define_with_int_value ("_LOONGARCH_SZINT", INT_TYPE_SIZE);
   builtin_define_with_int_value ("_LOONGARCH_SZLONG", LONG_TYPE_SIZE);
diff --git a/gcc/config/loongarch/loongarch-def.c 
b/gcc/config/loongarch/loongarch-def.c
index 28e24c62249..bff92c86532 100644
--- a/gcc/config/loongarch/loongarch-def.c
+++ b/gcc/config/loongarch/loongarch-def.c
@@ -54,7 +54,7 @@ loongarch_cpu_default_isa[N_ARCH_TYPES] = {
   [CPU_LA464] = {
   .base = ISA_BASE_LA64V100,
   .fpu = ISA_EXT_FPU64,
-  .simd = ISA_EXT_SIMD_LSX,
+  .simd = ISA_EXT_SIMD_LASX,
   },
 };
 
@@ -150,6 +150,7 @@ loongarch_isa_ext_strings[N_ISA_EXT_TYPES] = {
   [ISA_EXT_FPU32] = STR_ISA_EXT_FPU32,
   [ISA_EXT_NOFPU] = STR_ISA_EXT_NOFPU,
   [ISA_EXT_SIMD_LSX] = OPTSTR_LSX,
+  [ISA_EXT_SIMD_LASX] = OPTSTR_LASX,
 };
 
 const char*
@@ -180,6 +181,7 @@ loongarch_switch_strings[] = {
   [SW_SINGLE_FLOAT]  = OPTSTR_SINGLE_FLOAT,
   [SW_DOUBLE_FLOAT]  = OPTSTR_DOUBLE_FLOAT,
   [SW_LSX]   = OPTSTR_LSX,
+  [SW_LASX]  = OPTSTR_LASX,
 };
 
 
diff --git a/gcc/config/loongarch/loongarch-def.h 
b/gcc/config/loongarch/loongarch-def.h
index f34cffcfb9b..0bbcdb03d22 100644
--- a/gcc/config/loongarch/loongarch-def.h
+++ b/gcc/config/loongarch/loongarch-def.h
@@ -64,7 +64,8 @@ extern const char* loongarch_isa_ext_strings[];
 #define ISA_EXT_FPU642
 #define N_ISA_EXT_FPU_TYPES   3
 #define ISA_EXT_SIMD_LSX  3
-#define N_ISA_EXT_TYPES  4
+#define ISA_EXT_SIMD_LASX 4
+#define N_ISA_EXT_TYPES  5
 
 /* enum abi_base */
 extern const char* 

[PATCH v1 1/6] LoongArch: Added Loongson SX vector directive compilation framework.

2023-06-29 Thread Chenghui Pan
From: Lulu Cheng 

gcc/ChangeLog:

* config/loongarch/genopts/loongarch-strings: Added compilation 
framework.
* config/loongarch/genopts/loongarch.opt.in: Ditto.
* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins): Ditto.
* config/loongarch/loongarch-def.c: Ditto.
* config/loongarch/loongarch-def.h (N_ISA_EXT_TYPES): Ditto.
(ISA_EXT_SIMD_LSX): Ditto.
(N_SWITCH_TYPES): Ditto.
(SW_LSX): Ditto.
(struct loongarch_isa): Ditto.
* config/loongarch/loongarch-driver.cc (APPEND_SWITCH): Ditto.
(driver_get_normalized_m_opts): Ditto.
* config/loongarch/loongarch-driver.h (driver_get_normalized_m_opts): 
Ditto.
* config/loongarch/loongarch-opts.cc (loongarch_config_target): Ditto.
(isa_str): Ditto.
* config/loongarch/loongarch-opts.h (ISA_HAS_LSX): Ditto.
* config/loongarch/loongarch-str.h (OPTSTR_LSX): Ditto.
* config/loongarch/loongarch.opt: Ditto.
---
 .../loongarch/genopts/loongarch-strings   |  3 +
 gcc/config/loongarch/genopts/loongarch.opt.in | 12 ++-
 gcc/config/loongarch/loongarch-c.cc   |  7 ++
 gcc/config/loongarch/loongarch-def.c  |  4 +
 gcc/config/loongarch/loongarch-def.h  |  7 +-
 gcc/config/loongarch/loongarch-driver.cc  | 10 +++
 gcc/config/loongarch/loongarch-driver.h   |  1 +
 gcc/config/loongarch/loongarch-opts.cc| 82 ++-
 gcc/config/loongarch/loongarch-opts.h |  1 +
 gcc/config/loongarch/loongarch-str.h  |  2 +
 gcc/config/loongarch/loongarch.opt| 12 ++-
 11 files changed, 136 insertions(+), 5 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch-strings 
b/gcc/config/loongarch/genopts/loongarch-strings
index a40998ead97..24a5025061f 100644
--- a/gcc/config/loongarch/genopts/loongarch-strings
+++ b/gcc/config/loongarch/genopts/loongarch-strings
@@ -40,6 +40,9 @@ OPTSTR_SOFT_FLOAT soft-float
 OPTSTR_SINGLE_FLOAT   single-float
 OPTSTR_DOUBLE_FLOAT   double-float
 
+# SIMD extensions
+OPTSTR_LSX lsx
+
 # -mabi=
 OPTSTR_ABI_BASE  abi
 STR_ABI_BASE_LP64Dlp64d
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index 4b9b4ac273e..7ea3f0dea5d 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -76,6 +76,9 @@ m@@OPTSTR_DOUBLE_FLOAT@@
 Target Driver RejectNegative Var(la_opt_switches) Mask(FORCE_F64) 
Negative(m@@OPTSTR_SOFT_FLOAT@@)
 Allow hardware floating-point instructions to cover both 32-bit and 64-bit 
operations.
 
+m@@OPTSTR_LSX@@
+Target RejectNegative Var(la_opt_switches) Mask(LSX) Negative(m@@OPTSTR_LSX@@)
+Enable LoongArch SIMD Extension (LSX).
 
 ;; Base target models (implies ISA & tune parameters)
 Enum
@@ -125,11 +128,18 @@ Target RejectNegative Joined ToLower Enum(abi_base) 
Var(la_opt_abi_base) Init(M_
 Variable
 int la_opt_abi_ext = M_OPTION_NOT_SEEN
 
-
 mbranch-cost=
 Target RejectNegative Joined UInteger Var(loongarch_branch_cost)
 -mbranch-cost=COST Set the cost of branches to roughly COST instructions.
 
+mvecarg
+Target Var(TARGET_VECARG) Init(1)
+Target pass vect arg uses vector register.
+
+mmemvec-cost=
+Target RejectNegative Joined UInteger Var(loongarch_vector_access_cost) 
IntegerRange(1, 5)
+mmemvec-cost=COST  Set the cost of vector memory access instructions.
+
 mcheck-zero-division
 Target Mask(CHECK_ZERO_DIV)
 Trap on integer divide by zero.
diff --git a/gcc/config/loongarch/loongarch-c.cc 
b/gcc/config/loongarch/loongarch-c.cc
index 67911b78f28..b065921adc3 100644
--- a/gcc/config/loongarch/loongarch-c.cc
+++ b/gcc/config/loongarch/loongarch-c.cc
@@ -99,6 +99,13 @@ loongarch_cpu_cpp_builtins (cpp_reader *pfile)
   else
 builtin_define ("__loongarch_frlen=0");
 
+  if (ISA_HAS_LSX)
+{
+  builtin_define ("__loongarch_simd");
+  builtin_define ("__loongarch_sx");
+  builtin_define ("__loongarch_sx_width=128");
+}
+
   /* Native Data Sizes.  */
   builtin_define_with_int_value ("_LOONGARCH_SZINT", INT_TYPE_SIZE);
   builtin_define_with_int_value ("_LOONGARCH_SZLONG", LONG_TYPE_SIZE);
diff --git a/gcc/config/loongarch/loongarch-def.c 
b/gcc/config/loongarch/loongarch-def.c
index 6729c857f7c..28e24c62249 100644
--- a/gcc/config/loongarch/loongarch-def.c
+++ b/gcc/config/loongarch/loongarch-def.c
@@ -49,10 +49,12 @@ loongarch_cpu_default_isa[N_ARCH_TYPES] = {
   [CPU_LOONGARCH64] = {
   .base = ISA_BASE_LA64V100,
   .fpu = ISA_EXT_FPU64,
+  .simd = 0,
   },
   [CPU_LA464] = {
   .base = ISA_BASE_LA64V100,
   .fpu = ISA_EXT_FPU64,
+  .simd = ISA_EXT_SIMD_LSX,
   },
 };
 
@@ -147,6 +149,7 @@ loongarch_isa_ext_strings[N_ISA_EXT_TYPES] = {
   [ISA_EXT_FPU64] = STR_ISA_EXT_FPU64,
   [ISA_EXT_FPU32] = STR_ISA_EXT_FPU32,
   [ISA_EXT_NOFPU] = STR_ISA_EXT_NOFPU,
+  [ISA_EXT_SIMD_LSX] = OPTSTR_LSX,
 };
 
 const char*
@@ -176,6 +179,7 @@ 

[PATCH v1 0/6] Add Loongson SX/ASX instruction support to LoongArch target.

2023-06-29 Thread Chenghui Pan
These patches add the Loongson SX/ASX instruction support to the LoongArch
target, and can be utilized by using the new "-mlsx" and
"-mlasx" option.

Patches are bootstrapped and tested on loongarch64-linux-gnu target.

Lulu Cheng (6):
  LoongArch: Added Loongson SX vector directive compilation framework.
  LoongArch: Added Loongson SX base instruction support.
  LoongArch: Added Loongson SX directive builtin function support.
  LoongArch: Added Loongson ASX vector directive compilation framework.
  LoongArch: Added Loongson ASX base instruction support.
  LoongArch: Added Loongson ASX directive builtin function support.

 gcc/config.gcc|2 +-
 gcc/config/loongarch/constraints.md   |  128 +-
 .../loongarch/genopts/loongarch-strings   |4 +
 gcc/config/loongarch/genopts/loongarch.opt.in |   16 +-
 gcc/config/loongarch/lasx.md  | 5147 
 gcc/config/loongarch/lasxintrin.h | 5342 +
 gcc/config/loongarch/loongarch-builtins.cc| 2686 -
 gcc/config/loongarch/loongarch-c.cc   |   18 +
 gcc/config/loongarch/loongarch-def.c  |6 +
 gcc/config/loongarch/loongarch-def.h  |9 +-
 gcc/config/loongarch/loongarch-driver.cc  |   10 +
 gcc/config/loongarch/loongarch-driver.h   |2 +
 gcc/config/loongarch/loongarch-ftypes.def |  666 +-
 gcc/config/loongarch/loongarch-modes.def  |   39 +
 gcc/config/loongarch/loongarch-opts.cc|   89 +-
 gcc/config/loongarch/loongarch-opts.h |3 +
 gcc/config/loongarch/loongarch-protos.h   |   35 +
 gcc/config/loongarch/loongarch-str.h  |3 +
 gcc/config/loongarch/loongarch.cc | 4615 +-
 gcc/config/loongarch/loongarch.h  |  117 +-
 gcc/config/loongarch/loongarch.md |   56 +-
 gcc/config/loongarch/loongarch.opt|   16 +-
 gcc/config/loongarch/lsx.md   | 4490 ++
 gcc/config/loongarch/lsxintrin.h  | 5181 
 gcc/config/loongarch/predicates.md|  333 +-
 25 files changed, 28723 insertions(+), 290 deletions(-)
 create mode 100644 gcc/config/loongarch/lasx.md
 create mode 100644 gcc/config/loongarch/lasxintrin.h
 create mode 100644 gcc/config/loongarch/lsx.md
 create mode 100644 gcc/config/loongarch/lsxintrin.h

-- 
2.36.0