[PATCH 4/7 v2] lto: Implement ltrans cache

2024-06-20 Thread Michal Jires
Outside of suggested changes, this version:
- uses #define INCLUDE_*
- is rebased onto current trunk - 'fprintf (mstream,' lines
- --verbose 'recompiling++' count is moved into correct if branch

___

This patch implements Incremental LTO as ltrans cache.

The cache is active when directory $GCC_LTRANS_CACHE is specified and exists.
Stored are pairs of ltrans input/output files and input file hash.
File locking is used to allow multiple GCC instances to use to same cache.

Bootstrapped/regtested on x86_64-pc-linux-gnu

gcc/ChangeLog:

* Makefile.in: Add lto-ltrans-cache.o.
* lto-wrapper.cc: Use ltrans cache.
* lto-ltrans-cache.cc: New file.
* lto-ltrans-cache.h: New file.
---
 gcc/Makefile.in |   5 +-
 gcc/lto-ltrans-cache.cc | 409 
 gcc/lto-ltrans-cache.h  | 147 +++
 gcc/lto-wrapper.cc  | 153 +--
 4 files changed, 699 insertions(+), 15 deletions(-)
 create mode 100644 gcc/lto-ltrans-cache.cc
 create mode 100644 gcc/lto-ltrans-cache.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 90ec59dca75..ab7335ed882 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1855,7 +1855,7 @@ ALL_HOST_BACKEND_OBJS = $(GCC_OBJS) $(OBJS) 
$(OBJS-libcommon) \
   $(OBJS-libcommon-target) main.o c-family/cppspec.o \
   $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS) \
   $(GCOV_TOOL_OBJS) $(GENGTYPE_OBJS) gcc-ar.o gcc-nm.o gcc-ranlib.o \
-  lto-wrapper.o collect-utils.o lockfile.o
+  lto-wrapper.o collect-utils.o lockfile.o lto-ltrans-cache.o
 
 # for anything that is shared use the cc1plus profile data, as that
 # is likely the most exercised during the build
@@ -2384,7 +2384,8 @@ collect2$(exeext): $(COLLECT2_OBJS) $(LIBDEPS)
 CFLAGS-collect2.o += -DTARGET_MACHINE=\"$(target_noncanonical)\" \
@TARGET_SYSTEM_ROOT_DEFINE@
 
-LTO_WRAPPER_OBJS = lto-wrapper.o collect-utils.o ggc-none.o lockfile.o
+LTO_WRAPPER_OBJS = lto-wrapper.o collect-utils.o ggc-none.o lockfile.o \
+  lto-ltrans-cache.o
 
 lto-wrapper$(exeext): $(LTO_WRAPPER_OBJS) libcommon-target.a $(LIBDEPS)
+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o T$@ \
diff --git a/gcc/lto-ltrans-cache.cc b/gcc/lto-ltrans-cache.cc
new file mode 100644
index 000..a6ff02afb58
--- /dev/null
+++ b/gcc/lto-ltrans-cache.cc
@@ -0,0 +1,409 @@
+/* File caching.
+   Copyright (C) 2023-2024 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#define INCLUDE_ALGORITHM
+#define INCLUDE_STRING
+#define INCLUDE_ARRAY
+#define INCLUDE_MAP
+#define INCLUDE_VECTOR
+#include "config.h"
+#include "system.h"
+#include "md5.h"
+#include "lto-ltrans-cache.h"
+
+static const md5_checksum_t INVALID_CHECKSUM = {
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+};
+
+/* Computes checksum for given file, returns INVALID_CHECKSUM if not
+   possible.
+ */
+static md5_checksum_t
+file_checksum (char const *filename)
+{
+  FILE *file = fopen (filename, "rb");
+
+  if (!file)
+return INVALID_CHECKSUM;
+
+  md5_checksum_t result;
+
+  int ret = md5_stream (file, &result);
+
+  if (ret)
+result = INVALID_CHECKSUM;
+
+  fclose (file);
+
+  return result;
+}
+
+/* Checks identity of two files byte by byte.  */
+static bool
+files_identical (char const *first_filename, char const *second_filename)
+{
+  FILE *f_first = fopen (first_filename, "rb");
+  if (!f_first)
+return false;
+
+  FILE *f_second = fopen (second_filename, "rb");
+  if (!f_second)
+{
+  fclose (f_first);
+  return false;
+}
+
+  bool ret = true;
+
+  for (;;)
+{
+  int c1, c2;
+  c1 = fgetc (f_first);
+  c2 = fgetc (f_second);
+
+  if (c1 != c2)
+   {
+ ret = false;
+ break;
+   }
+
+  if (c1 == EOF)
+   break;
+}
+
+  fclose (f_first);
+  fclose (f_second);
+  return ret;
+}
+
+/* Contructor of cache item.  */
+ltrans_file_cache::item::item (std::string input, std::string output,
+  md5_checksum_t input_checksum,
+  uint32_t last_used):
+  input (std::move (input)), output (std::move (output)),
+  input_checksum (input_checksum), last_used (last_used)
+{
+  lock = lockfile (this->input + ".lock");
+}
+/* Destructor of cache item.  */
+ltrans_file_cache::item::~item ()
+{
+  lock.unlock ();
+}
+
+/* Reads next cache item from cachedata file.
+   Ad

[PATCH 3/7 v2] Lockfile.

2024-06-20 Thread Michal Jires
This version differs by using INCLUDE_STRING instead of .
(+whitespace and year)

___

This patch implements lockfile used for incremental LTO.

Bootstrapped/regtested on x86_64-pc-linux-gnu

gcc/ChangeLog:

* Makefile.in: Add lockfile.o.
* lockfile.cc: New file.
* lockfile.h: New file.
---
 gcc/Makefile.in |   5 +-
 gcc/lockfile.cc | 136 
 gcc/lockfile.h  |  78 +++
 3 files changed, 217 insertions(+), 2 deletions(-)
 create mode 100644 gcc/lockfile.cc
 create mode 100644 gcc/lockfile.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index f5adb647d3f..90ec59dca75 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1855,7 +1855,7 @@ ALL_HOST_BACKEND_OBJS = $(GCC_OBJS) $(OBJS) 
$(OBJS-libcommon) \
   $(OBJS-libcommon-target) main.o c-family/cppspec.o \
   $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS) \
   $(GCOV_TOOL_OBJS) $(GENGTYPE_OBJS) gcc-ar.o gcc-nm.o gcc-ranlib.o \
-  lto-wrapper.o collect-utils.o
+  lto-wrapper.o collect-utils.o lockfile.o
 
 # for anything that is shared use the cc1plus profile data, as that
 # is likely the most exercised during the build
@@ -2384,7 +2384,8 @@ collect2$(exeext): $(COLLECT2_OBJS) $(LIBDEPS)
 CFLAGS-collect2.o += -DTARGET_MACHINE=\"$(target_noncanonical)\" \
@TARGET_SYSTEM_ROOT_DEFINE@
 
-LTO_WRAPPER_OBJS = lto-wrapper.o collect-utils.o ggc-none.o
+LTO_WRAPPER_OBJS = lto-wrapper.o collect-utils.o ggc-none.o lockfile.o
+
 lto-wrapper$(exeext): $(LTO_WRAPPER_OBJS) libcommon-target.a $(LIBDEPS)
+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o T$@ \
   $(LTO_WRAPPER_OBJS) libcommon-target.a $(LIBS)
diff --git a/gcc/lockfile.cc b/gcc/lockfile.cc
new file mode 100644
index 000..8ecb4dc2848
--- /dev/null
+++ b/gcc/lockfile.cc
@@ -0,0 +1,136 @@
+/* File locking.
+   Copyright (C) 2023-2024 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#define INCLUDE_STRING
+#include "config.h"
+#include "system.h"
+#include "lockfile.h"
+
+
+/* Unique write lock.  No other lock can be held on this lockfile.
+   Blocking call.  */
+int
+lockfile::lock_write ()
+{
+  fd = open (filename.c_str (), O_RDWR | O_CREAT, 0666);
+  if (fd < 0)
+return -1;
+
+#if HAVE_FCNTL_H
+  struct flock s_flock;
+
+  s_flock.l_whence = SEEK_SET;
+  s_flock.l_start = 0;
+  s_flock.l_len = 0;
+  s_flock.l_pid = getpid ();
+  s_flock.l_type = F_WRLCK;
+
+  while (fcntl (fd, F_SETLKW, &s_flock) && errno == EINTR)
+continue;
+#endif
+  return 0;
+}
+
+/* Unique write lock.  No other lock can be held on this lockfile.
+   Only locks if this filelock is not locked by any other process.
+   Return whether locking was successful.  */
+int
+lockfile::try_lock_write ()
+{
+  fd = open (filename.c_str (), O_RDWR | O_CREAT, 0666);
+  if (fd < 0)
+return -1;
+
+#if HAVE_FCNTL_H
+  struct flock s_flock;
+
+  s_flock.l_whence = SEEK_SET;
+  s_flock.l_start = 0;
+  s_flock.l_len = 0;
+  s_flock.l_pid = getpid ();
+  s_flock.l_type = F_WRLCK;
+
+  if (fcntl (fd, F_SETLK, &s_flock) == -1)
+{
+  close (fd);
+  fd = -1;
+  return 1;
+}
+#endif
+  return 0;
+}
+
+/* Shared read lock.  Only read lock can be held concurrently.
+   If write lock is already held by this process, it will be
+   changed to read lock.
+   Blocking call.  */
+int
+lockfile::lock_read ()
+{
+  fd = open (filename.c_str (), O_RDWR | O_CREAT, 0666);
+  if (fd < 0)
+return -1;
+
+#if HAVE_FCNTL_H
+  struct flock s_flock;
+
+  s_flock.l_whence = SEEK_SET;
+  s_flock.l_start = 0;
+  s_flock.l_len = 0;
+  s_flock.l_pid = getpid ();
+  s_flock.l_type = F_RDLCK;
+
+  while (fcntl (fd, F_SETLKW, &s_flock) && errno == EINTR)
+continue;
+#endif
+  return 0;
+}
+
+/* Unlock all previously placed locks.  */
+void
+lockfile::unlock ()
+{
+  if (fd < 0)
+{
+#if HAVE_FCNTL_H
+  struct flock s_flock;
+
+  s_flock.l_whence = SEEK_SET;
+  s_flock.l_start = 0;
+  s_flock.l_len = 0;
+  s_flock.l_pid = getpid ();
+  s_flock.l_type = F_UNLCK;
+
+  fcntl (fd, F_SETLK, &s_flock);
+#endif
+  close (fd);
+  fd = -1;
+}
+}
+
+/* Are lockfiles supported?  */
+bool
+lockfile::lockfile_supported ()
+{
+#if HAVE_FCNTL_H
+  return true;
+#else
+  return false;
+#endif
+}
diff --git a/gcc/lockfile.h b/gcc/lockfile.h
new file mode 100644
index 

Re: [PATCH 3/7] Lockfile.

2024-01-09 Thread Michal Jires
Hi,
> You do not implement GCOV_LINKED_WITH_LOCKING patch, does locking work
> with mingw? Or we only build gcc with cygwin emulation layer these days?

I tried to test _locking implementation with both mingw and msys2, in both
cases fcntl was present and _locking was not. Admittedly I was unable to
finish bootstrap without errors, so I might have been doing something wrong.

So I didn't include _locking implementation, because I was unable to test it,
and I am unsure whether we even have supported host which would require it.

Michal


[PATCH 2/7 v2] lto: Remove random_seed from section name.

2024-01-09 Thread Michal Jires
This patch removes suffixes from section names during LTO linking.

These suffixes were originally added for ld -r to work (PR lto/44992).
They were added to all LTO object files, but are only useful before WPA.
After that they waste space, and if kept random, make LTO caching impossible.

Bootstrapped/regtested on x86_64-pc-linux-gnu

gcc/ChangeLog:

* lto-streamer.cc (lto_get_section_name): Remove suffixes after WPA.

gcc/lto/ChangeLog:

* lto-common.cc (lto_section_with_id): Dont load suffix during LTRANS.
---
 gcc/lto-streamer.cc   | 11 +--
 gcc/lto/lto-common.cc |  7 +++
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/gcc/lto-streamer.cc b/gcc/lto-streamer.cc
index 8032bbf7108..61b5f8ed4dc 100644
--- a/gcc/lto-streamer.cc
+++ b/gcc/lto-streamer.cc
@@ -132,11 +132,18 @@ lto_get_section_name (int section_type, const char *name,
  doesn't confuse the reader with merged sections.
 
  For options don't add a ID, the option reader cannot deal with them
- and merging should be ok here. */
-  if (section_type == LTO_section_opts)
+ and merging should be ok here.
+
+ LTRANS files (output of wpa, input and output of ltrans) are handled
+ directly inside of linker/lto-wrapper, so name uniqueness for external
+ tools is not needed.
+ Randomness would inhibit incremental LTO.  */
+  if (section_type == LTO_section_opts || flag_ltrans)
 strcpy (post, "");
   else if (f != NULL) 
 sprintf (post, "." HOST_WIDE_INT_PRINT_HEX_PURE, f->id);
+  else if (flag_wpa)
+strcpy (post, "");
   else
 sprintf (post, "." HOST_WIDE_INT_PRINT_HEX_PURE, get_random_seed (false)); 
   char *res = concat (section_name_prefix, sep, add, post, NULL);
diff --git a/gcc/lto/lto-common.cc b/gcc/lto/lto-common.cc
index 11e7d63f1be..44aeeddf46f 100644
--- a/gcc/lto/lto-common.cc
+++ b/gcc/lto/lto-common.cc
@@ -2174,6 +2174,13 @@ lto_section_with_id (const char *name, unsigned 
HOST_WIDE_INT *id)
 
   if (strncmp (name, section_name_prefix, strlen (section_name_prefix)))
 return 0;
+
+  if (flag_ltrans)
+{
+  *id = 0;
+  return 1;
+}
+
   s = strrchr (name, '.');
   if (!s)
 return 0;
-- 
2.43.0



[PATCH 7/7] lto: partition specific lto_clone_numbers

2023-11-17 Thread Michal Jires
Replaces "lto_priv.$clone_number" by
"lto_priv.$partition_hash.$partition_specific_clone_number".
To reduce divergence for incremental LTO.

Bootstrapped/regtested on x86_64-pc-linux-gnu

gcc/lto/ChangeLog:

* lto-partition.cc (set_clone_partition_name_checksum): New.
(CHECKSUM_STRING): New.
(privatize_symbol_name_1): Use partition hash for lto_priv.
(lto_promote_cross_file_statics): Use set_clone_partition_name_checksum.
(lto_promote_statics_nonwpa): Changed clone_map type.
---
 gcc/lto/lto-partition.cc | 49 +++-
 1 file changed, 43 insertions(+), 6 deletions(-)

diff --git a/gcc/lto/lto-partition.cc b/gcc/lto/lto-partition.cc
index eb31ecba0d3..a2ce24eea23 100644
--- a/gcc/lto/lto-partition.cc
+++ b/gcc/lto/lto-partition.cc
@@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-fnsummary.h"
 #include "lto-partition.h"
 #include "sreal.h"
+#include "md5.h"
 
 #include 
 #include 
@@ -1516,8 +1517,36 @@ validize_symbol_for_target (symtab_node *node)
 }
 }
 
-/* Maps symbol names to unique lto clone counters.  */
-static hash_map *lto_clone_numbers;
+/* Maps symbol names with partition checksum to unique lto clone counters.  */
+using clone_map = hash_map>, unsigned>;
+static clone_map *lto_clone_numbers;
+uint64_t current_partition_checksum = 0;
+
+/* Computes a quick checksum to distinguish partitions of clone numbers.  */
+void
+set_clone_partition_name_checksum (ltrans_partition part)
+{
+#define CHECKSUM_STRING(FOO) md5_process_bytes ((FOO), strlen (FOO), &ctx)
+  struct md5_ctx ctx;
+  md5_init_ctx (&ctx);
+
+  CHECKSUM_STRING (part->name);
+
+  lto_symtab_encoder_iterator lsei;
+  lto_symtab_encoder_t encoder = part->encoder;
+
+  for (lsei = lsei_start (encoder); !lsei_end_p (lsei); lsei_next (&lsei))
+{
+  symtab_node *node = lsei_node (lsei);
+  CHECKSUM_STRING (node->name ());
+}
+
+  uint64_t checksum[2];
+  md5_finish_ctx (&ctx, checksum);
+  current_partition_checksum = checksum[0];
+#undef CHECKSUM_STRING
+}
 
 /* Helper for privatize_symbol_name.  Mangle NODE symbol name
represented by DECL.  */
@@ -1531,10 +1560,16 @@ privatize_symbol_name_1 (symtab_node *node, tree decl)
 return false;
 
   const char *name = maybe_rewrite_identifier (name0);
-  unsigned &clone_number = lto_clone_numbers->get_or_insert (name);
+
+  unsigned &clone_number = lto_clone_numbers->get_or_insert (
+std::pair {name, current_partition_checksum});
+
+  char lto_priv[32];
+  sprintf (lto_priv, "lto_priv.%lu", current_partition_checksum);
+
   symtab->change_decl_assembler_name (decl,
  clone_function_name (
- name, "lto_priv", clone_number));
+ name, lto_priv, clone_number));
   clone_number++;
 
   if (node->lto_file_data)
@@ -1735,11 +1770,13 @@ lto_promote_cross_file_statics (void)
   part->encoder = compute_ltrans_boundary (part->encoder);
 }
 
-  lto_clone_numbers = new hash_map;
+  lto_clone_numbers = new clone_map;
 
   /* Look at boundaries and promote symbols as needed.  */
   for (i = 0; i < n_sets; i++)
 {
+  set_clone_partition_name_checksum (ltrans_partitions[i]);
+
   lto_symtab_encoder_iterator lsei;
   lto_symtab_encoder_t encoder = ltrans_partitions[i]->encoder;
 
@@ -1778,7 +1815,7 @@ lto_promote_statics_nonwpa (void)
 {
   symtab_node *node;
 
-  lto_clone_numbers = new hash_map;
+  lto_clone_numbers = new clone_map;
   FOR_EACH_SYMBOL (node)
 {
   rename_statics (NULL, node);
-- 
2.42.1



[PATCH 6/7] lto: squash order of symbols in partitions

2023-11-17 Thread Michal Jires
This patch squashes order of symbols in individual partitions, so that
their relative order is conserved, but is not influenced by symbols in
other partitions.
Order of cloned symbols is set to 0. This should be fine because order
specifies order of symbols in input files, which cloned symbols are not
part of.

This is important for incremental LTO because if there is a new symbol,
it otherwise shifts order of all symbols with higher order, which would
diverge them all.

Bootstrapped/regtested on x86_64-pc-linux-gnu

gcc/ChangeLog:

* lto-cgraph.cc (lto_output_node): Add and use order_remap.
(lto_output_varpool_node): Likewise.
(output_symtab): Likewise.
* lto-streamer-out.cc (produce_asm): Likewise.
(output_function): Likewise.
(output_constructor): Likewise.
(copy_function_or_variable): Likewise.
(cmp_int): New.
(lto_output): Generate order_remap.
* lto-streamer.h (produce_asm): Add order_remap.
(output_symtab): Likewise.
---
 gcc/lto-cgraph.cc   | 20 
 gcc/lto-streamer-out.cc | 71 +
 gcc/lto-streamer.h  |  5 +--
 3 files changed, 73 insertions(+), 23 deletions(-)

diff --git a/gcc/lto-cgraph.cc b/gcc/lto-cgraph.cc
index 32c0f5ac6db..a7530290fba 100644
--- a/gcc/lto-cgraph.cc
+++ b/gcc/lto-cgraph.cc
@@ -381,7 +381,8 @@ reachable_from_this_partition_p (struct cgraph_node *node, 
lto_symtab_encoder_t
 
 static void
 lto_output_node (struct lto_simple_output_block *ob, struct cgraph_node *node,
-lto_symtab_encoder_t encoder)
+lto_symtab_encoder_t encoder,
+hash_map, int>* order_remap)
 {
   unsigned int tag;
   struct bitpack_d bp;
@@ -405,7 +406,9 @@ lto_output_node (struct lto_simple_output_block *ob, struct 
cgraph_node *node,
 
   streamer_write_enum (ob->main_stream, LTO_symtab_tags, LTO_symtab_last_tag,
   tag);
-  streamer_write_hwi_stream (ob->main_stream, node->order);
+
+  int order = flag_wpa ? *order_remap->get (node->order) : node->order;
+  streamer_write_hwi_stream (ob->main_stream, order);
 
   /* In WPA mode, we only output part of the call-graph.  Also, we
  fake cgraph node attributes.  There are two cases that we care.
@@ -585,7 +588,8 @@ lto_output_node (struct lto_simple_output_block *ob, struct 
cgraph_node *node,
 
 static void
 lto_output_varpool_node (struct lto_simple_output_block *ob, varpool_node 
*node,
-lto_symtab_encoder_t encoder)
+lto_symtab_encoder_t encoder,
+hash_map, int>* order_remap)
 {
   bool boundary_p = !lto_symtab_encoder_in_partition_p (encoder, node);
   bool encode_initializer_p
@@ -602,7 +606,8 @@ lto_output_varpool_node (struct lto_simple_output_block 
*ob, varpool_node *node,
 
   streamer_write_enum (ob->main_stream, LTO_symtab_tags, LTO_symtab_last_tag,
   LTO_symtab_variable);
-  streamer_write_hwi_stream (ob->main_stream, node->order);
+  int order = flag_wpa ? *order_remap->get (node->order) : node->order;
+  streamer_write_hwi_stream (ob->main_stream, order);
   lto_output_var_decl_ref (ob->decl_state, ob->main_stream, node->decl);
   bp = bitpack_create (ob->main_stream);
   bp_pack_value (&bp, node->externally_visible, 1);
@@ -967,7 +972,7 @@ compute_ltrans_boundary (lto_symtab_encoder_t in_encoder)
 /* Output the part of the symtab in SET and VSET.  */
 
 void
-output_symtab (void)
+output_symtab (hash_map, int>* order_remap)
 {
   struct cgraph_node *node;
   struct lto_simple_output_block *ob;
@@ -994,9 +999,10 @@ output_symtab (void)
 {
   symtab_node *node = lto_symtab_encoder_deref (encoder, i);
   if (cgraph_node *cnode = dyn_cast  (node))
-lto_output_node (ob, cnode, encoder);
+   lto_output_node (ob, cnode, encoder, order_remap);
   else
-   lto_output_varpool_node (ob, dyn_cast (node), encoder);
+   lto_output_varpool_node (ob, dyn_cast (node), encoder,
+order_remap);
 }
 
   /* Go over the nodes in SET again to write edges.  */
diff --git a/gcc/lto-streamer-out.cc b/gcc/lto-streamer-out.cc
index a1bbea8fc68..9448ab195d5 100644
--- a/gcc/lto-streamer-out.cc
+++ b/gcc/lto-streamer-out.cc
@@ -2212,7 +2212,8 @@ output_cfg (struct output_block *ob, struct function *fn)
a function, set FN to the decl for that function.  */
 
 void
-produce_asm (struct output_block *ob, tree fn)
+produce_asm (struct output_block *ob, tree fn,
+hash_map, int>* order_remap)
 {
   enum lto_section_type section_type = ob->section_type;
   struct lto_function_header header;
@@ -2221,9 +,11 @@ produce_asm (struct output_block *ob, tree fn)
   if (section_type == LTO_section_function_body)
 {
   const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (fn));
-  section_name = lto_get_section_name (section_type, name,
- 

[PATCH 5/7] lto: Implement cache partitioning

2023-11-17 Thread Michal Jires
This patch implements new cache partitioning. It tries to keep symbols
from single source file together to minimize propagation of divergence.

It starts with symbols already grouped by source files. If reasonably
possible it only either combines several files into one final partition,
or, if a file is large, split the file into several final partitions.

Intermediate representation is partition_set which contains set of
groups of symbols (each group corresponding to original source file) and
number of final partitions this partition_set should split into.

First partition_fixed_split splits partition_set into constant number of
partition_sets with equal number of symbols groups. If for example there
are 39 source files, the resulting partition_sets will contain 10, 10,
10, and 9 source files. This splitting intentionally ignores estimated
instruction counts to minimize propagation of divergence.

Second partition_over_target_split separates too large files and splits
them into individual symbols to be combined back into several smaller
files in next step.

Third partition_binary_split splits partition_set into two halves until
it should be split into only one final partition, at which point the
remaining symbols are joined into one final partition.

Bootstrapped/regtested on x86_64-pc-linux-gnu

gcc/ChangeLog:

* common.opt: Add cache partitioning.
* flag-types.h (enum lto_partition_model): Likewise.

gcc/lto/ChangeLog:

* lto-partition.cc (new_partition): Use new_partition_no_push.
(new_partition_no_push): New.
(free_ltrans_partition): New.
(free_ltrans_partitions): Use free_ltrans_partition.
(join_partitions): New.
(split_partition_into_nodes): New.
(is_partition_reorder): New.
(class partition_set): New.
(distribute_n_partitions): New.
(partition_over_target_split): New.
(partition_binary_split): New.
(partition_fixed_split): New.
(class partitioner_base): New.
(class partitioner_default): New.
(lto_cache_map): New.
* lto-partition.h (lto_cache_map): New.
* lto.cc (do_whole_program_analysis): Use lto_cache_map.

gcc/testsuite/ChangeLog:

* gcc.dg/completion-2.c: Add -flto-partition=cache.
---
 gcc/common.opt  |   3 +
 gcc/flag-types.h|   3 +-
 gcc/lto/lto-partition.cc| 605 +++-
 gcc/lto/lto-partition.h |   1 +
 gcc/lto/lto.cc  |   2 +
 gcc/testsuite/gcc.dg/completion-2.c |   1 +
 6 files changed, 605 insertions(+), 10 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 1cf3bdd3b51..fe5cf3c0a05 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2174,6 +2174,9 @@ Enum(lto_partition_model) String(1to1) 
Value(LTO_PARTITION_1TO1)
 EnumValue
 Enum(lto_partition_model) String(max) Value(LTO_PARTITION_MAX)
 
+EnumValue
+Enum(lto_partition_model) String(cache) Value(LTO_PARTITION_CACHE)
+
 flto-partition=
 Common Joined RejectNegative Enum(lto_partition_model) Var(flag_lto_partition) 
Init(LTO_PARTITION_BALANCED)
 Specify the algorithm to partition symbols and vars at linktime.
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index c1852cd810c..59b3c23081b 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -393,7 +393,8 @@ enum lto_partition_model {
   LTO_PARTITION_ONE = 1,
   LTO_PARTITION_BALANCED = 2,
   LTO_PARTITION_1TO1 = 3,
-  LTO_PARTITION_MAX = 4
+  LTO_PARTITION_MAX = 4,
+  LTO_PARTITION_CACHE = 5
 };
 
 /* flag_lto_linker_output initialization values.  */
diff --git a/gcc/lto/lto-partition.cc b/gcc/lto/lto-partition.cc
index e4c91213f4b..eb31ecba0d3 100644
--- a/gcc/lto/lto-partition.cc
+++ b/gcc/lto/lto-partition.cc
@@ -36,6 +36,9 @@ along with GCC; see the file COPYING3.  If not see
 #include "lto-partition.h"
 #include "sreal.h"
 
+#include 
+#include 
+
 vec ltrans_partitions;
 
 static void add_symbol_to_partition (ltrans_partition part, symtab_node *node);
@@ -59,20 +62,41 @@ cmp_partitions_order (const void *a, const void *b)
   return orderb - ordera;
 }
 
-/* Create new partition with name NAME.  */
-
+/* Create new partition with name NAME.
+   Does not push into ltrans_partitions.  */
 static ltrans_partition
-new_partition (const char *name)
+new_partition_no_push (const char *name)
 {
   ltrans_partition part = XCNEW (struct ltrans_partition_def);
   part->encoder = lto_symtab_encoder_new (false);
   part->name = name;
   part->insns = 0;
   part->symbols = 0;
+  return part;
+}
+
+/* Create new partition with name NAME.  */
+
+static ltrans_partition
+new_partition (const char *name)
+{
+  ltrans_partition part = new_partition_no_push (name);
   ltrans_partitions.safe_push (part);
   return part;
 }
 
+/* Free memory used by ltrans partition.
+   Encoder can be kept to be freed after streaming.  */
+static void
+free_ltrans_partition (ltrans_partition part, bool delete_encoder)
+  {
+if (part->init

[PATCH 3/7] Lockfile.

2023-11-17 Thread Michal Jires
This patch implements lockfile used for incremental LTO.

Bootstrapped/regtested on x86_64-pc-linux-gnu

gcc/ChangeLog:

* Makefile.in: Add lockfile.o.
* lockfile.cc: New file.
* lockfile.h: New file.
---
 gcc/Makefile.in |   5 +-
 gcc/lockfile.cc | 136 
 gcc/lockfile.h  |  85 ++
 3 files changed, 224 insertions(+), 2 deletions(-)
 create mode 100644 gcc/lockfile.cc
 create mode 100644 gcc/lockfile.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 7b7a4ff789a..2c527245c81 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1831,7 +1831,7 @@ ALL_HOST_BACKEND_OBJS = $(GCC_OBJS) $(OBJS) 
$(OBJS-libcommon) \
   $(OBJS-libcommon-target) main.o c-family/cppspec.o \
   $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS) \
   $(GCOV_TOOL_OBJS) $(GENGTYPE_OBJS) gcc-ar.o gcc-nm.o gcc-ranlib.o \
-  lto-wrapper.o collect-utils.o
+  lto-wrapper.o collect-utils.o lockfile.o
 
 # for anything that is shared use the cc1plus profile data, as that
 # is likely the most exercised during the build
@@ -2359,7 +2359,8 @@ collect2$(exeext): $(COLLECT2_OBJS) $(LIBDEPS)
 CFLAGS-collect2.o += -DTARGET_MACHINE=\"$(target_noncanonical)\" \
@TARGET_SYSTEM_ROOT_DEFINE@
 
-LTO_WRAPPER_OBJS = lto-wrapper.o collect-utils.o ggc-none.o
+LTO_WRAPPER_OBJS = lto-wrapper.o collect-utils.o ggc-none.o lockfile.o
+
 lto-wrapper$(exeext): $(LTO_WRAPPER_OBJS) libcommon-target.a $(LIBDEPS)
+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o T$@ \
   $(LTO_WRAPPER_OBJS) libcommon-target.a $(LIBS)
diff --git a/gcc/lockfile.cc b/gcc/lockfile.cc
new file mode 100644
index 000..9440e8938f3
--- /dev/null
+++ b/gcc/lockfile.cc
@@ -0,0 +1,136 @@
+/* File locking.
+   Copyright (C) 2009-2023 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+
+#include "lockfile.h"
+
+
+/* Unique write lock.  No other lock can be held on this lockfile.
+   Blocking call.  */
+int
+lockfile::lock_write ()
+{
+  fd = open (filename.c_str (), O_RDWR | O_CREAT, 0666);
+  if (fd < 0)
+return -1;
+
+#if HAVE_FCNTL_H
+  struct flock s_flock;
+
+  s_flock.l_whence = SEEK_SET;
+  s_flock.l_start = 0;
+  s_flock.l_len = 0;
+  s_flock.l_pid = getpid ();
+  s_flock.l_type = F_WRLCK;
+
+  while (fcntl (fd, F_SETLKW, &s_flock) && errno == EINTR)
+continue;
+#endif
+  return 0;
+}
+
+/* Unique write lock.  No other lock can be held on this lockfile.
+   Only locks if this filelock is not locked by any other process.
+   Return whether locking was successful.  */
+int
+lockfile::try_lock_write ()
+{
+  fd = open (filename.c_str (), O_RDWR | O_CREAT, 0666);
+  if (fd < 0)
+return -1;
+
+#if HAVE_FCNTL_H
+  struct flock s_flock;
+
+  s_flock.l_whence = SEEK_SET;
+  s_flock.l_start = 0;
+  s_flock.l_len = 0;
+  s_flock.l_pid = getpid ();
+  s_flock.l_type = F_WRLCK;
+
+  if (fcntl (fd, F_SETLK, &s_flock) == -1)
+{
+  close (fd);
+  fd = -1;
+  return 1;
+}
+#endif
+  return 0;
+}
+
+/* Shared read lock.  Only read lock can be held concurrently.
+   If write lock is already held by this process, it will be
+   changed to read lock.
+   Blocking call.  */
+int
+lockfile::lock_read ()
+{
+  fd = open (filename.c_str (), O_RDWR | O_CREAT, 0666);
+  if (fd < 0)
+return -1;
+
+#if HAVE_FCNTL_H
+  struct flock s_flock;
+
+  s_flock.l_whence = SEEK_SET;
+  s_flock.l_start = 0;
+  s_flock.l_len = 0;
+  s_flock.l_pid = getpid ();
+  s_flock.l_type = F_RDLCK;
+
+  while (fcntl (fd, F_SETLKW, &s_flock) && errno == EINTR)
+continue;
+#endif
+  return 0;
+}
+
+/* Unlock all previously placed locks.  */
+void
+lockfile::unlock ()
+{
+  if (fd < 0)
+{
+#if HAVE_FCNTL_H
+  struct flock s_flock;
+
+  s_flock.l_whence = SEEK_SET;
+  s_flock.l_start = 0;
+  s_flock.l_len = 0;
+  s_flock.l_pid = getpid ();
+  s_flock.l_type = F_UNLCK;
+
+  fcntl (fd, F_SETLK, &s_flock);
+#endif
+  close (fd);
+  fd = -1;
+}
+}
+
+/* Are lockfiles supported?  */
+bool
+lockfile::lockfile_supported ()
+{
+#if HAVE_FCNTL_H
+  return true;
+#else
+  return false;
+#endif
+}
diff --git a/gcc/lockfile.h b/gcc/lockfile.h
new file mode 100644
index 000..afcbaf599c1
--- /dev/null
+++ b/gcc/lockfile.h
@@ -0,0 +1,85 @@
+/* File locking.
+   Copyright (C) 2

[PATCH 4/7] lto: Implement ltrans cache

2023-11-17 Thread Michal Jires
This patch implements Incremental LTO as ltrans cache.

The cache is active when directory $GCC_LTRANS_CACHE is specified and exists.
Stored are pairs of ltrans input/output files and input file hash.
File locking is used to allow multiple GCC instances to use to same cache.

Bootstrapped/regtested on x86_64-pc-linux-gnu

gcc/ChangeLog:

* Makefile.in: Add lto-ltrans-cache.o.
* lto-wrapper.cc: Use ltrans cache.
* lto-ltrans-cache.cc: New file.
* lto-ltrans-cache.h: New file.
---
 gcc/Makefile.in |   5 +-
 gcc/lto-ltrans-cache.cc | 407 
 gcc/lto-ltrans-cache.h  | 164 
 gcc/lto-wrapper.cc  | 150 +--
 4 files changed, 711 insertions(+), 15 deletions(-)
 create mode 100644 gcc/lto-ltrans-cache.cc
 create mode 100644 gcc/lto-ltrans-cache.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 2c527245c81..495e5f3d069 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1831,7 +1831,7 @@ ALL_HOST_BACKEND_OBJS = $(GCC_OBJS) $(OBJS) 
$(OBJS-libcommon) \
   $(OBJS-libcommon-target) main.o c-family/cppspec.o \
   $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS) \
   $(GCOV_TOOL_OBJS) $(GENGTYPE_OBJS) gcc-ar.o gcc-nm.o gcc-ranlib.o \
-  lto-wrapper.o collect-utils.o lockfile.o
+  lto-wrapper.o collect-utils.o lockfile.o lto-ltrans-cache.o
 
 # for anything that is shared use the cc1plus profile data, as that
 # is likely the most exercised during the build
@@ -2359,7 +2359,8 @@ collect2$(exeext): $(COLLECT2_OBJS) $(LIBDEPS)
 CFLAGS-collect2.o += -DTARGET_MACHINE=\"$(target_noncanonical)\" \
@TARGET_SYSTEM_ROOT_DEFINE@
 
-LTO_WRAPPER_OBJS = lto-wrapper.o collect-utils.o ggc-none.o lockfile.o
+LTO_WRAPPER_OBJS = lto-wrapper.o collect-utils.o ggc-none.o lockfile.o \
+  lto-ltrans-cache.o
 
 lto-wrapper$(exeext): $(LTO_WRAPPER_OBJS) libcommon-target.a $(LIBDEPS)
+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o T$@ \
diff --git a/gcc/lto-ltrans-cache.cc b/gcc/lto-ltrans-cache.cc
new file mode 100644
index 000..0d43e548fb3
--- /dev/null
+++ b/gcc/lto-ltrans-cache.cc
@@ -0,0 +1,407 @@
+/* File caching.
+   Copyright (C) 2009-2023 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "md5.h"
+#include "lto-ltrans-cache.h"
+
+#include 
+#include 
+#include 
+
+const md5_checksum_t INVALID_CHECKSUM = {
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+};
+
+/* Computes checksum for given file, returns INVALID_CHECKSUM if not possible.
+ */
+static md5_checksum_t
+file_checksum (char const *filename)
+{
+  FILE *file = fopen (filename, "rb");
+
+  if (!file)
+return INVALID_CHECKSUM;
+
+  md5_checksum_t result;
+
+  int ret = md5_stream (file, &result);
+
+  if (ret)
+result = INVALID_CHECKSUM;
+
+  fclose (file);
+
+  return result;
+}
+
+/* Checks identity of two files byte by byte.  */
+static bool
+files_identical (char const *first_filename, char const *second_filename)
+{
+  FILE *f_first = fopen (first_filename, "rb");
+  if (!f_first)
+return false;
+
+  FILE *f_second = fopen (second_filename, "rb");
+  if (!f_second)
+{
+  fclose (f_first);
+  return false;
+}
+
+  bool ret = true;
+
+  for (;;)
+{
+  int c1, c2;
+  c1 = fgetc (f_first);
+  c2 = fgetc (f_second);
+
+  if (c1 != c2)
+   {
+ ret = false;
+ break;
+   }
+
+  if (c1 == EOF)
+   break;
+}
+
+  fclose (f_first);
+  fclose (f_second);
+  return ret;
+}
+
+/* Contructor of cache item.  */
+ltrans_file_cache::item::item (std::string input, std::string output,
+  md5_checksum_t input_checksum, uint32_t last_used):
+  input (std::move (input)), output (std::move (output)),
+  input_checksum (input_checksum), last_used (last_used)
+{
+  lock = lockfile (this->input + ".lock");
+}
+/* Destructor of cache item.  */
+ltrans_file_cache::item::~item ()
+{
+  lock.unlock ();
+}
+
+/* Reads next cache item from cachedata file.
+   Adds `dir/` prefix to filenames.  */
+static ltrans_file_cache::item*
+read_cache_item (FILE* f, const char* dir)
+{
+  md5_checksum_t checksum;
+  uint32_t last_used;
+
+  if (fread (&checksum, 1, checksum.size (), f) != checksum.size ())
+return NULL;
+  if (fread (&last_used, sizeof (last_used), 1, f) != 1)
+return NULL;
+
+  std::vector input

[PATCH 2/7] lto: Remove random_seed from section name.

2023-11-17 Thread Michal Jires
Bootstrapped/regtested on x86_64-pc-linux-gnu

gcc/ChangeLog:

* lto-streamer.cc (lto_get_section_name): Remove random_seed in WPA.
---
 gcc/lto-streamer.cc | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/lto-streamer.cc b/gcc/lto-streamer.cc
index 4968fd13413..53275e32618 100644
--- a/gcc/lto-streamer.cc
+++ b/gcc/lto-streamer.cc
@@ -132,11 +132,17 @@ lto_get_section_name (int section_type, const char *name,
  doesn't confuse the reader with merged sections.
 
  For options don't add a ID, the option reader cannot deal with them
- and merging should be ok here. */
+ and merging should be ok here.
+
+ WPA output is sent to LTRANS directly inside of lto-wrapper, so name
+ uniqueness for external tools is not needed.
+ Randomness would inhibit incremental LTO.  */
   if (section_type == LTO_section_opts)
 strcpy (post, "");
   else if (f != NULL) 
 sprintf (post, "." HOST_WIDE_INT_PRINT_HEX_PURE, f->id);
+  else if (flag_wpa)
+strcpy (post, ".0");
   else
 sprintf (post, "." HOST_WIDE_INT_PRINT_HEX_PURE, get_random_seed (false)); 
   char *res = concat (section_name_prefix, sep, add, post, NULL);
-- 
2.42.1



[PATCH 1/7] lto: Skip flag OPT_fltrans_output_list_.

2023-11-17 Thread Michal Jires
Bootstrapped/regtested on x86_64-pc-linux-gnu

gcc/ChangeLog:

* lto-opts.cc (lto_write_options): Skip OPT_fltrans_output_list_.
---
 gcc/lto-opts.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/lto-opts.cc b/gcc/lto-opts.cc
index c9bee9d4197..0451e290c75 100644
--- a/gcc/lto-opts.cc
+++ b/gcc/lto-opts.cc
@@ -152,6 +152,7 @@ lto_write_options (void)
case OPT_fprofile_prefix_map_:
case OPT_fcanon_prefix_map:
case OPT_fwhole_program:
+   case OPT_fltrans_output_list_:
  continue;
 
default:
-- 
2.42.1



[PATCH 0/7] lto: Incremental LTO.

2023-11-17 Thread Michal Jires
Hi,
these patches implement Incremental LTO, specifically by caching results of
ltrans phase. Secondarily these patches contain changes to reduce divergence of
ltrans partitions so that they can be cached.

The aim is to reduce compile times for quick edit-compile cycles while using
LTO. Even with these minimal changes to the rest of GCC it works surprisingly
well. Currently testing by self compiling cc1, with individual commits used as
incremental changes, on average only ~1/3 of partitions need to be recompiled
with `-O2 -g0` and ~1/2 with `-O2 -g`. Which directly reduces time spent in
ltrans phase of LTO.

Unfortunately larger gains are a bit fragile. You may remember that during my
Cauldron talk I claimed reduction to ~1/6 and ~1/3 recompilations. That was
achieved with branch from March. Since then there were at least two commits
which introduced new divergence of partitions, though they seem fixable in
future.


[COMMITTED] Add myself to write after approval

2023-11-16 Thread Michal Jires
ChangeLog:

* MAINTAINERS: Add myself.
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index c43167d9a75..f0112f5d029 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -486,6 +486,7 @@ Fariborz Jahanian   

 Surya Kumari Jangala   
 Haochen Jiang  
 Qian Jianhua   
+Michal Jires   
 Janis Johnson  
 Teresa Johnson 
 Kean Johnston  
@@ -753,6 +754,7 @@ information.
 
 Robin Dapp 
 Robin Dapp 
+Michal Jires   
 Matthias Kretz 
 Tim Lange  
 Jeff Law   
-- 
2.42.1