Re: [PATCH 4/7] lto: Implement ltrans cache

2024-05-14 Thread Jan Hubicka
> This patch implements Incremental LTO as ltrans cache.
> 
> The cache is active when directory $GCC_LTRANS_CACHE is specified and exists.
> Stored are pairs of ltrans input/output files and input file hash.
> File locking is used to allow multiple GCC instances to use to same cache.
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu
> 
> gcc/ChangeLog:
> 
>   * Makefile.in: Add lto-ltrans-cache.o.
>   * lto-wrapper.cc: Use ltrans cache.
>   * lto-ltrans-cache.cc: New file.
>   * lto-ltrans-cache.h: New file.
> diff --git a/gcc/lto-ltrans-cache.cc b/gcc/lto-ltrans-cache.cc
> new file mode 100644
> index 000..0d43e548fb3
> --- /dev/null
> +++ b/gcc/lto-ltrans-cache.cc
> @@ -0,0 +1,407 @@
> +/* File caching.
> +   Copyright (C) 2009-2023 Free Software Foundation, Inc.

Probably copyright should be 2023-2024
> +const md5_checksum_t INVALID_CHECKSUM = {
Maybe static here? Officially there should be comment before the
function.
> +  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> +};
> +
> +/* Computes checksum for given file, returns INVALID_CHECKSUM if not 
> possible.
> + */
comment would look more regular if linebreak is made before possible :)
> +
> +/* Checks identity of two files byte by byte.  */
> +static bool
> +files_identical (char const *first_filename, char const *second_filename)
> +{
> +  FILE *f_first = fopen (first_filename, "rb");
> +  if (!f_first)
> +return false;
> +
> +  FILE *f_second = fopen (second_filename, "rb");
> +  if (!f_second)
> +{
> +  fclose (f_first);
> +  return false;
> +}
> +
> +  bool ret = true;
> +
> +  for (;;)
> +{
> +  int c1, c2;
> +  c1 = fgetc (f_first);
> +  c2 = fgetc (f_second);

I guess reading by fgetc may get quite ineffecient here.  Comparing
bigger blocks is probably going to be faster.  We could also
(incrementally) use mmap where supported.
> +
> +/* Contructor of cache item.  */
> +ltrans_file_cache::item::item (std::string input, std::string output,
> +  md5_checksum_t input_checksum, uint32_t last_used):
Here should be enough whitespace so md5_checksum appears just after ( in
line above
  md5_checksum_t input_checksum, uint32_t 
last_used):
> +  input (std::move (input)), output (std::move (output)),
> +  input_checksum (input_checksum), last_used (last_used)
> +{
> +  lock = lockfile (this->input + ".lock");
> +}
> +/* Destructor of cache item.  */
> +ltrans_file_cache::item::~item ()
> +{
> +  lock.unlock ();
> +}
> +
> +/* Reads next cache item from cachedata file.
> +   Adds `dir/` prefix to filenames.  */
> +static ltrans_file_cache::item*
> +read_cache_item (FILE* f, const char* dir)
> +{
> +  md5_checksum_t checksum;
> +  uint32_t last_used;
> +
> +  if (fread (&checksum, 1, checksum.size (), f) != checksum.size ())
> +return NULL;
> +  if (fread (&last_used, sizeof (last_used), 1, f) != 1)
> +return NULL;
> +
> +  std::vector input (strlen (dir));
> +  memcpy (&input[0], dir, input.size ());
> +  input.push_back ('/');
Why this is not std::string?
> +  /* Loads data about previously cached items from cachedata file.
> +
> + Must be called with creation_lock or deletion_lock held to
> + prevent data race.  */
> +  void
> +  load_cache ();
There should be no newline between type and name.  It is there only when
defining function (so it is easy to use old-school grep to find where
function is defined.)

Looks good to me otherwise.
Honza


[PATCH 4/7] lto: Implement ltrans cache

2023-11-17 Thread Michal Jires
This patch implements Incremental LTO as ltrans cache.

The cache is active when directory $GCC_LTRANS_CACHE is specified and exists.
Stored are pairs of ltrans input/output files and input file hash.
File locking is used to allow multiple GCC instances to use to same cache.

Bootstrapped/regtested on x86_64-pc-linux-gnu

gcc/ChangeLog:

* Makefile.in: Add lto-ltrans-cache.o.
* lto-wrapper.cc: Use ltrans cache.
* lto-ltrans-cache.cc: New file.
* lto-ltrans-cache.h: New file.
---
 gcc/Makefile.in |   5 +-
 gcc/lto-ltrans-cache.cc | 407 
 gcc/lto-ltrans-cache.h  | 164 
 gcc/lto-wrapper.cc  | 150 +--
 4 files changed, 711 insertions(+), 15 deletions(-)
 create mode 100644 gcc/lto-ltrans-cache.cc
 create mode 100644 gcc/lto-ltrans-cache.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 2c527245c81..495e5f3d069 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1831,7 +1831,7 @@ ALL_HOST_BACKEND_OBJS = $(GCC_OBJS) $(OBJS) 
$(OBJS-libcommon) \
   $(OBJS-libcommon-target) main.o c-family/cppspec.o \
   $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS) \
   $(GCOV_TOOL_OBJS) $(GENGTYPE_OBJS) gcc-ar.o gcc-nm.o gcc-ranlib.o \
-  lto-wrapper.o collect-utils.o lockfile.o
+  lto-wrapper.o collect-utils.o lockfile.o lto-ltrans-cache.o
 
 # for anything that is shared use the cc1plus profile data, as that
 # is likely the most exercised during the build
@@ -2359,7 +2359,8 @@ collect2$(exeext): $(COLLECT2_OBJS) $(LIBDEPS)
 CFLAGS-collect2.o += -DTARGET_MACHINE=\"$(target_noncanonical)\" \
@TARGET_SYSTEM_ROOT_DEFINE@
 
-LTO_WRAPPER_OBJS = lto-wrapper.o collect-utils.o ggc-none.o lockfile.o
+LTO_WRAPPER_OBJS = lto-wrapper.o collect-utils.o ggc-none.o lockfile.o \
+  lto-ltrans-cache.o
 
 lto-wrapper$(exeext): $(LTO_WRAPPER_OBJS) libcommon-target.a $(LIBDEPS)
+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o T$@ \
diff --git a/gcc/lto-ltrans-cache.cc b/gcc/lto-ltrans-cache.cc
new file mode 100644
index 000..0d43e548fb3
--- /dev/null
+++ b/gcc/lto-ltrans-cache.cc
@@ -0,0 +1,407 @@
+/* File caching.
+   Copyright (C) 2009-2023 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "md5.h"
+#include "lto-ltrans-cache.h"
+
+#include 
+#include 
+#include 
+
+const md5_checksum_t INVALID_CHECKSUM = {
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+};
+
+/* Computes checksum for given file, returns INVALID_CHECKSUM if not possible.
+ */
+static md5_checksum_t
+file_checksum (char const *filename)
+{
+  FILE *file = fopen (filename, "rb");
+
+  if (!file)
+return INVALID_CHECKSUM;
+
+  md5_checksum_t result;
+
+  int ret = md5_stream (file, &result);
+
+  if (ret)
+result = INVALID_CHECKSUM;
+
+  fclose (file);
+
+  return result;
+}
+
+/* Checks identity of two files byte by byte.  */
+static bool
+files_identical (char const *first_filename, char const *second_filename)
+{
+  FILE *f_first = fopen (first_filename, "rb");
+  if (!f_first)
+return false;
+
+  FILE *f_second = fopen (second_filename, "rb");
+  if (!f_second)
+{
+  fclose (f_first);
+  return false;
+}
+
+  bool ret = true;
+
+  for (;;)
+{
+  int c1, c2;
+  c1 = fgetc (f_first);
+  c2 = fgetc (f_second);
+
+  if (c1 != c2)
+   {
+ ret = false;
+ break;
+   }
+
+  if (c1 == EOF)
+   break;
+}
+
+  fclose (f_first);
+  fclose (f_second);
+  return ret;
+}
+
+/* Contructor of cache item.  */
+ltrans_file_cache::item::item (std::string input, std::string output,
+  md5_checksum_t input_checksum, uint32_t last_used):
+  input (std::move (input)), output (std::move (output)),
+  input_checksum (input_checksum), last_used (last_used)
+{
+  lock = lockfile (this->input + ".lock");
+}
+/* Destructor of cache item.  */
+ltrans_file_cache::item::~item ()
+{
+  lock.unlock ();
+}
+
+/* Reads next cache item from cachedata file.
+   Adds `dir/` prefix to filenames.  */
+static ltrans_file_cache::item*
+read_cache_item (FILE* f, const char* dir)
+{
+  md5_checksum_t checksum;
+  uint32_t last_used;
+
+  if (fread (&checksum, 1, checksum.size (), f) != checksum.size ())
+return NULL;
+  if (fread (&last_used, sizeof (last_used), 1, f) != 1)
+return NULL;
+
+  std::vector input