Re: [ccache] Using a shared ccache in cmake environment (linux)

2020-03-24 Thread Joel Rosdahl via ccache
On Mon, 23 Mar 2020 at 17:11, Steffen Dettmer via ccache wrote:
> Just BTW, isn't it common to build in some $builddir different from top
> $srcdir (e.g. automake, cmake) and in that case couldn't the common case need
> two base directories?

Note that base_dir doesn't have to be the top source directory – it can be any
parent directory of the source and build directories, for instance /home. But
sure, it would make sense to be able to specify several base_dir directories,
for instance if you build in /top_level_1 and have the source code in
/top_level_2. (Using "base_dir = /" works as well but has the side effect of
making paths to system include files in /usr relative as well, which isn't
optimal.)
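
As a rough illustration of choosing such a parent directory, the following
Python sketch computes the deepest directory that contains both a source tree
and a build tree. The function name and the example paths are hypothetical;
nothing here is part of ccache itself.

```python
import os.path


def common_base_dir(source_dir, build_dir):
    """Return the deepest directory that is a parent of both trees."""
    return os.path.commonpath([os.path.abspath(source_dir),
                               os.path.abspath(build_dir)])


# Source and build trees below the same home directory share a useful base.
print(common_base_dir("/home/alice/src/proj", "/home/alice/build/proj"))

# Trees in unrelated top-level directories only share "/", which is the
# situation where a single base_dir is not a good fit.
print(common_base_dir("/top_level_1/build", "/top_level_2/src"))
```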

I could have sworn that there already exists an issue about implementing this
but I can't find it so it seems I've only thought about it without writing it
down.

> This again is a great idea. Will a cleanup recover from corrupted caches,
> or should I add some script like "when each cache value is zero, clear it"?

"ccache -c" first recalculates the size counters and then trims the cache if
needed, so it should be fine.

> Thank you for your great support again!!

You're welcome!

-- Joel

___
ccache mailing list
ccache@lists.samba.org
https://lists.samba.org/mailman/listinfo/ccache


Re: [ccache] Using a shared ccache in cmake environment (linux)

2020-03-23 Thread Steffen Dettmer via ccache
Hi,

thank you so much for your valuable input!!

On Sat, Mar 21, 2020 at 2:19 PM Joel Rosdahl  wrote:
>
> On Tue, 17 Mar 2020 at 10:06, Steffen Dettmer via ccache wrote:
> > As workaround for a special unrelated issue currently we redefine
> > __FILE__ (and try to remove that redefinition). I understand that
> > ccache still works thanks to CCACHE_BASEDIR even for __FILE__ usage
> > inside files. Is that correct?
>
> Yes, if basedir is a prefix of the source file path then __FILE__
> will expand to a relative path since ccache passes a relative source
> file path to the compiler.

Thanks for confirmation. For us this now seems to work fine!

Just BTW, isn't it common to build in some $builddir different from
top $srcdir (e.g. automake, cmake) and in that case couldn't the
common case need two base directories?
(For me it's no problem, I just build in a subdir below BASEDIR, but
there are teammates who hate this and build in a ramdisk or so.)

> > I understood that CCACHE_SLOPPINESS=file_macro means that cache
> > results may be used even if __FILE__ is different, i.e. using a
> > __FILE__ from another user (fine for our usecases), is this correct?
>
> That used to be the case, but the file_macro sloppiness was removed in
> 3.7.6; see .

Ahh, thanks for the pointer. I think I now remember that someone posted
about hunting some strange bug down to disassembly only to find something
like that as a cause. Indeed, the time lost on one such case can never be
won back by reduced compilation times. Good that you fixed it.

> > How to find a reasonable max_size? For now I just arbitrarily picked
> > 25 GB (approximately the build tree size) and I never saw it "full"
> > according to ccache -s.
>
> "cache size" will never reach "max cache size", so that is not a good
> indicator of whether "max cache size" is enough. See
>  for details on
> how automatic cache cleanup works. The TLDR is that "cache size" will
> stay around 90% (assuming limit_multiple is the default 0.8) of "max
> cache size" in steady state. This is because each of the 16
> subdirectories will be between 80% and 100% full with uniform
> probability.

Thanks for your great explanation! I read this (and almost understood it
right, but just almost) and with "full" I meant more than 80%
(I just saw slightly over 50%).

> Instead you can have a look at "cleanups performed". Each time a
> subdirectory gets full that counter will increase by 1.

Ahh, this is of course a great idea. I will watch it.
(Actually I wonder why I didn't think of it in the first place.)

> Especially with network caches it might be a good idea to disable
> automatic cleanup and instead perform explicit cleanup periodically on
> one server, preferably the server that hosts the filesystem. That way
> the cleanup won't happen over slow network and several clients won't
> compete to clean up. One way of doing this is to set an unlimited cache
> size and then run something like "CCACHE_MAXSIZE=25G ccache -c"
> periodically on the server.

This again is a great idea. Will a cleanup recover from corrupted caches,
or should I add some script like "when each cache value is zero, clear it"?
I think I could set a high "safety" value for the Jenkins user (run locally)
and add a smaller periodic cleanup after the nightly builds later.

> > Is sharing via CIFS possible at all or could it have bad effects?
>
> Don't know, but I wouldn't be surprised if ccache's locking doesn't work
> properly with SMB/CIFS. Locking is based on creating symlinks atomically
> and I guess that doesn't translate well to Windows filesystems.

Thanks for your clarification. I disabled the Samba share (just needed
to reconfigure two repositories driving auto-updating docker containers,
isn't it simple nowadays lol) and now it seems to run very well. I guess
CIFS was the root of all our issues.

> > Are cache and/or stats version dependent?
>
> The cache data and stats files are intended to be backward and forward
> compatible from ccache 3.2.

ahh that's good to know, so in case someone accidentally uses a wrong
version, we shouldn't face issues. Great work!

> > I'm also still facing scmake issues (using "physical" and "logical" in
> > several mixed combinations). Complex topic.
>
> What is scmake?

A typo! cmake is what I wanted to write :)

Our issue is somehow that there are cases where cmake
uses the physical path instead of the logical one, and then
BASEDIR doesn't match.
I don't yet understand the whole topic; I need to read up on it
when I have a bit more time
(https://discourse.cmake.org/t/when-does-cmake-current-binary-dir-resolve-symlinks/809)

Thank you for your great support again!!

Steffen



Re: [ccache] Using a shared ccache in cmake environment (linux)

2020-03-21 Thread Joel Rosdahl via ccache
On Tue, 17 Mar 2020 at 10:06, Steffen Dettmer via ccache wrote:
> As workaround for a special unrelated issue currently we redefine
> __FILE__ (and try to remove that redefinition). I understand that
> ccache still works thanks to CCACHE_BASEDIR even for __FILE__ usage
> inside files. Is that correct?

Yes, if basedir is a prefix of the source file path then __FILE__
will expand to a relative path since ccache passes a relative source
file path to the compiler.
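
The effect on a path can be sketched in Python as follows. This is a
simplified illustration, not ccache's actual code: the function name, the
example paths and the exact prefix test are assumptions made for the example.

```python
import os.path


def rewrite_path(path, base_dir, cwd):
    """Make an absolute `path` relative to `cwd` if it lies below `base_dir`.

    Paths outside `base_dir` are returned unchanged.
    """
    base_dir = os.path.normpath(base_dir)
    if os.path.commonpath([path, base_dir]) == base_dir:
        return os.path.relpath(path, cwd)
    return path


# A source file below base_dir becomes relative to the build directory,
# so __FILE__ no longer contains the user-specific prefix.
print(rewrite_path("/home/alice/proj/src/a.c", "/home", "/home/alice/proj/build"))

# A system header outside base_dir keeps its absolute path.
print(rewrite_path("/usr/include/stdio.h", "/home", "/home/alice/proj/build"))
```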

> I understood that CCACHE_SLOPPINESS=file_macro means that cache
> results may be used even if __FILE__ is different, i.e. using a
> __FILE__ from another user (fine for our usecases), is this correct?

That used to be the case, but the file_macro sloppiness was removed in
3.7.6; see .

> How to find a reasonable max_size? For now I just arbitrarily picked
> 25 GB (approximately the build tree size) and I never saw it "full"
> according to ccache -s.

"cache size" will never reach "max cache size", so that is not a good
indicator of whether "max cache size" is enough. See
 for details on
how automatic cache cleanup works. The TLDR is that "cache size" will
stay around 90% (assuming limit_multiple is the default 0.8) of "max
cache size" in steady state. This is because each of the 16
subdirectories will be between 80% and 100% full with uniform
probability.
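
The 90% figure is simply the mean of a uniform distribution between
limit_multiple and 1. A small Python sketch of that arithmetic (the function
names are made up for the example):

```python
def expected_fill(limit_multiple=0.8):
    """Mean fill level of a subdirectory that is uniformly distributed
    between limit_multiple and 1.0 of its share of the cache."""
    return (limit_multiple + 1.0) / 2.0


def steady_state_size(max_size_gb, limit_multiple=0.8):
    """Approximate "cache size" in steady state for a given "max cache size"."""
    return max_size_gb * expected_fill(limit_multiple)


# With max_size = 25 GB and the default limit_multiple this is about 22.5 GB.
print(expected_fill())
print(steady_state_size(25))
```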

Instead you can have a look at "cleanups performed". Each time a
subdirectory gets full that counter will increase by 1.

> On build servers we usually run "make -j 25" (24 cores). Often,
> several such jobs are run by different users (and Jenkins;
> sometimes 400 compiler processes or even more). I assume ccache of
> course safely handles parallel invocation, is this correct?

Yes, assuming that ccache's locking works on the filesystem in question.
(Locking is only used for the stats files; the actual cache data files
are handled via atomic renames.) The cache files can however get corrupt
if a server crashes, depending on how writeback/journaling of filesystem
metadata is configured.
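
For readers unfamiliar with the atomic rename technique, here is a minimal
Python sketch of the general idea: write to a temporary file in the same
directory, then rename it over the final name. It illustrates the technique
only and is not taken from ccache; the function name is hypothetical.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write bytes to `path` so that readers see either the old file,
    no file, or the complete new file, but never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target,
    # otherwise the rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Note that this protects against concurrent readers and writers, not against
a server crash before the data has reached the disk.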

Especially with network caches it might be a good idea to disable
automatic cleanup and instead perform explicit cleanup periodically on
one server, preferably the server that hosts the filesystem. That way
the cleanup won't happen over slow network and several clients won't
compete to clean up. One way of doing this is to set an unlimited cache
size and then run something like "CCACHE_MAXSIZE=25G ccache -c"
periodically on the server.
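
A periodic job for this could look roughly like the Python sketch below. The
function names and the 25G default are assumptions for the example; the only
ccache interface used is the "CCACHE_MAXSIZE=25G ccache -c" invocation
mentioned above.

```python
import os
import subprocess


def cleanup_command(max_size="25G", base_env=None):
    """Build the argv and environment for an explicit cache cleanup."""
    env = dict(os.environ if base_env is None else base_env)
    env["CCACHE_MAXSIZE"] = max_size
    return ["ccache", "-c"], env


def run_cleanup(max_size="25G"):
    """Run the cleanup; raises CalledProcessError if ccache fails."""
    argv, env = cleanup_command(max_size)
    subprocess.run(argv, env=env, check=True)
```

Calling run_cleanup() from a cron job or timer on the machine that hosts the
cache directory would then trim the cache locally instead of over the network.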

> We have some team mates that have slow (old) laptops only. They
> benefit from using a network shared ccache. Technically, they "mount
> -t cifs" the cache_dir (NFS is firewalled unfortunately). We have
> different Ubuntu/Mint/Debian/Devuan machines, but exactly the same
> compilers (own toolchains).
>
> Is sharing via CIFS possible at all or could it have bad effects?

Don't know, but I wouldn't be surprised if ccache's locking doesn't work
properly with SMB/CIFS. Locking is based on creating symlinks atomically
and I guess that doesn't translate well to Windows filesystems.
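
To show what symlink-based locking means in general, here is a minimal Python
sketch. It is an illustration of the technique under the assumption of a
POSIX filesystem, not ccache's implementation; the function names and the
content stored in the link are made up.

```python
import os
import socket


def try_lock(lock_path):
    """Try to take the lock. Returns True on success, False if it is held.

    Creating a symlink either succeeds or fails with EEXIST, so only one
    process can win, provided the filesystem implements this atomically.
    """
    owner = f"{socket.gethostname()}:{os.getpid()}"
    try:
        os.symlink(owner, lock_path)
        return True
    except FileExistsError:
        return False


def unlock(lock_path):
    """Release the lock by removing the symlink."""
    os.unlink(lock_path)
```

On a filesystem without proper symlink support, the os.symlink call may fail
or may not be atomic, which is the kind of problem suspected above.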

> One issue that occurs from time to time is that the ccache -s stats
> become zero (all values except max cache size are 0). I first didn't
> notice because stats are shared so I assume someone zeroed the stats,
> but with alternate directories we found that it sometimes happens
> without ccache -s. "du -hs $CCACHE_DIR" still shows gigabytes used. We
> didn't find a cause yet, but several candidates exist.

Sounds like the stats file locking fails...

> Can ccache be used on CIFS?

Answered above.

> Are cache and/or stats version dependent?

The cache data and stats files are intended to be backward and forward
compatible from ccache 3.2.

> A few times we noticed that ccache -s reports few GB size but "du -hs"
> reports 40 or 50 GB, although "max_size = 25.0G". Is this expected?

No. But if stats file locking doesn't work then this could happen.
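
To check for such a mismatch independently of the stats files, one can sum
the file sizes in the cache directory and compare the result with what
"ccache -s" reports. A small Python sketch (hypothetical helper; it sums
apparent file sizes, so it can differ somewhat from "du", which counts disk
blocks):

```python
import os


def directory_size(root):
    """Return the total apparent size in bytes of all files below `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # lstat so that symlinks are counted as links, not followed.
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total
```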

> I'm also still facing scmake issues (using "physical" and "logical" in
> several mixed combinations). Complex topic.

What is scmake?

-- Joel



Re: [ccache] Using a shared ccache in cmake environment (linux)

2020-03-17 Thread Steffen Dettmer via ccache
Hi,

* On Tue, Mar 17, 2020 at 3:54 PM Paul Smith  wrote:
> You don't say which compiler you're using but if you're using GCC you
> can consider using the -ffile-prefix-map option to avoid these issues.

Thanks for the tip! You are right, this is for gcc, sorry that I
didn't mention it.

Unfortunately only gcc-8 has it, and I also have to support 4.4.3, 4.5.1 and others.
These have only -fdebug-prefix-map (which I use).
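
A build wrapper that has to serve both old and new compilers could select the
option by version, roughly as in this Python sketch. The function name and
the example paths are hypothetical. Note that -fdebug-prefix-map only affects
debug information, not __FILE__, so it is not a full substitute.

```python
def prefix_map_flags(gcc_major, old_prefix, new_prefix):
    """Return the prefix-map option suitable for the given GCC major version."""
    mapping = f"{old_prefix}={new_prefix}"
    if gcc_major >= 8:
        # Available from GCC 8 on.
        return [f"-ffile-prefix-map={mapping}"]
    # Older compilers such as 4.4.x or 4.5.x only know the debug variant.
    return [f"-fdebug-prefix-map={mapping}"]


print(prefix_map_flags(8, "/home/alice/proj", "."))
print(prefix_map_flags(4, "/home/alice/proj", "."))
```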

Steffen



Re: [ccache] Using a shared ccache in cmake environment (linux)

2020-03-17 Thread Paul Smith via ccache
On Mon, 2020-03-16 at 20:26 +0100, Steffen Dettmer via ccache wrote:
> As workaround for a special unrelated issue currently we redefine
> __FILE__ (and try to remove that redefinition). I understand that
> ccache still works thanks to CCACHE_BASEDIR even for __FILE__ usage
> inside files. Is that correct?

You don't say which compiler you're using but if you're using GCC you
can consider using the -ffile-prefix-map option to avoid these issues.

For clang I think they have the -fmacro-prefix-map and
-fdebug-prefix-map options but not -ffile-prefix-map.




[ccache] Using a shared ccache in cmake environment (linux)

2020-03-17 Thread Steffen Dettmer via ccache
Hi,

I set up ccache with a shared cache for some of our projects and I would
like to learn how to do it correctly.

Projects here have 1-2 MLOC in 3-10k files (mostly C++) and are built
via cmake files. We have around 20 devs active typically plus Jenkins
and mostly they compile the same inputs (HEADs of active branches with
only their few changes) in various configurations (targets,
debug/release...), in total build output can be 5-30 GB. I made some
tests and found ccache to be very efficient (reducing total build
duration by ten or so, depends on many factors of course). However, I
still have some issues.

Setup:
using cmake wrapper scripts that export:
- CCACHE_BASEDIR
- CCACHE_SLOPPINESS=file_macro,time_macros
- CCACHE_CPP2=set

and having a ccache.conf like:
  max_size = 25.0G
  # find $CCACHE_DIR -type d | xargs chmod g+s
  cache_dir=/local/users/zcone-pisint/tmp/ccache
  hard_link=false
  umask=002

The straight case ("ccache -Cz && make clean all && make clean all &&
ccache -s") works as expected, first build slow, second very fast, 50%
hits.
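
The 50% follows from the arithmetic: after the stats are zeroed, the first
build consists only of misses and the second only of hits. As a small Python
sketch (the file count of 5000 is just an example value):

```python
def hit_rate(hits, misses):
    """Fraction of cacheable compilations that were served from the cache."""
    total = hits + misses
    return hits / total if total else 0.0


files = 5000                 # hypothetical number of compiled files
misses_first_build = files   # empty cache: every compilation is a miss
hits_second_build = files    # identical rebuild: every compilation is a hit
print(hit_rate(hits_second_build, misses_first_build))
```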

Is this so far reasonable?

As workaround for a special unrelated issue currently we redefine
__FILE__ (and try to remove that redefinition). I understand that
ccache still works thanks to CCACHE_BASEDIR even for __FILE__ usage
inside files. Is that correct?

I understood that CCACHE_SLOPPINESS=file_macro means that cache
results may be used even if __FILE__ is different, i.e. using a
__FILE__ from another user (fine for our usecases), is this correct?
NB: unfortunately cmake uses absolute paths, so __FILE__ contains user
specific information (currently we redefine it not to do so, but we
might drop this, because it harms other things).

How to find a reasonable max_size?
For now I just arbitrarily picked 25 GB (approximately the build tree
size) and I never saw it "full" according to ccache -s.

On build servers we usually run "make -j 25" (24 cores). Often,
several such jobs are run by different users (and Jenkins;
sometimes 400 compiler processes or even more). I assume ccache of
course safely handles parallel invocation, is this correct?

We have some team mates that have slow (old) laptops only. They
benefit from using a network shared ccache. Technically, they "mount
-t cifs" the cache_dir (NFS is firewalled unfortunately). We have
different Ubuntu/Mint/Debian/Devuan machines, but exactly the same
compilers (own toolchains).

Is sharing via CIFS possible at all or could it have bad effects?

One issue that occurs from time to time is that the ccache -s stats
become zero (all values except max cache size are 0). I first didn't
notice because stats are shared so I assume someone zeroed the stats,
but with alternate directories we found that it sometimes happens
without ccache -s. "du -hs $CCACHE_DIR" still shows gigabytes used. We
didn't find a cause yet, but several candidates exist.

Can ccache be used on CIFS?
Are cache and/or stats version dependent?
I tried to deploy the same ccache everywhere (3.7.7, now
3.7.7+8_ge65d6c92), but maybe there is some host somewhere with an
older version, hard to say.

A few times we noticed that ccache -s reports few GB size but "du -hs"
reports 40 or 50 GB, although "max_size = 25.0G". Is this expected?
Could be a follow up problem from the one before.

I'm also still facing scmake issues (using "physical" and "logical" in
several mixed combinations). Complex topic.

Any information / hints / pointers are appreciated!


Best regards,
Steffen
