(please keep Alexander in the loop)
On 27/05/2021 10:34, Loris Bennett wrote:
Ole Holm Nielsen <ole.h.niel...@fysik.dtu.dk> writes:
On 5/27/21 9:48 AM, Alexander Grund wrote:
The EB log file reports an error:
//tensorflow/core/common_runtime:graph_constructor_test FAILED TO BUILD
and the log file ends with:
Executed 137 out of 814 tests: 137 tests pass, 1 fails to build and 676 were
skipped.
FAILED: Build did NOT complete successfully
This is a build failure, so it is something we should fix, or at least find
the cause of.
Please check the log; there should be something about why/how it failed to
compile. Just search for the test name and scroll around a bit. If you attach
it, I can also take a look.
The EB log file is 205 MB, so it's hard to share :-(
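One way around the log size is to extract only the lines surrounding the failing target and share those. The following is a minimal sketch, not part of EasyBuild: the function name `context_around` and the window sizes are assumptions chosen for illustration, and the log path is taken from the command line.

```python
import sys
from collections import deque
from typing import Iterable, Iterator, List


def context_around(lines: Iterable[str], needle: str,
                   before: int = 20, after: int = 20) -> Iterator[List[str]]:
    """Yield a chunk of surrounding lines for each occurrence of `needle`.

    The input is streamed, so a log of several hundred MB is never held
    in memory as a whole. Matches that occur while a chunk is still open
    extend that chunk; chunks for separate matches may repeat some lines.
    """
    history = deque(maxlen=before)  # rolling window of preceding lines
    chunk = None                    # chunk currently being collected
    remaining = 0                   # trailing lines still to collect
    for line in lines:
        line = line.rstrip("\n")
        if needle in line:
            if chunk is None:
                chunk = list(history)
            chunk.append(line)
            remaining = after
        elif chunk is not None:
            chunk.append(line)
            remaining -= 1
        if chunk is not None and remaining == 0:
            yield chunk
            chunk = None
        history.append(line)
    if chunk is not None:
        # The log ended before the trailing context was complete.
        yield chunk


if __name__ == "__main__":
    # Usage: python extract_context.py <logfile> <search string>
    log_path, needle = sys.argv[1], sys.argv[2]
    with open(log_path, errors="replace") as handle:
        for chunk in context_around(handle, needle):
            print("\n".join(chunk))
            print("-" * 72)
```

Run against the EB log with `graph_constructor_test` as the search string, this should produce an excerpt small enough to attach to a mail.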
I have this environment:
export EASYBUILD_BUILDPATH=/run/user/$UID/eb_build
ulimit -s 2000240
export EASYBUILD_TMPDIR=/scratch/$USER
and there is quite a bit of space available:
$ df -h /run/user/$UID/eb_build /scratch
Filesystem                         Size  Used Avail Use% Mounted on
tmpfs                               19G   19G   30M 100% /run/user/983
/dev/mapper/VolGroup00-lv_scratch  850G  675M  849G   1% /scratch
...
/home/modules/software/binutils/2.35-GCCcore-10.2.0/bin/ld.gold: fatal error:
bazel-out/k8-opt/bin/tensorflow/core/common_runtime/graph_constructor_test: No
space left on device
What device might that be? As shown above, I have quite a bit of disk space.
Is /tmp being used and getting full?
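To see which location is actually running out of space, each candidate directory can be checked separately (the `df` output quoted above lists the tmpfs under /run/user at 100% use). The sketch below relies only on the standard library's `shutil.disk_usage`. The helper name `free_gib` is an assumption for illustration, and the list of candidates is simply the set of directories mentioned in this thread.

```python
import os
import shutil
import tempfile
from typing import Dict, Iterable


def free_gib(paths: Iterable[str]) -> Dict[str, float]:
    """Return the free space in GiB for every path that can be queried."""
    result = {}
    for path in paths:
        try:
            usage = shutil.disk_usage(path)
        except OSError:
            # Path does not exist or is not accessible: skip it.
            continue
        result[path] = usage.free / 2**30
    return result


if __name__ == "__main__":
    # Candidate locations taken from this thread; adjust as needed.
    candidates = [
        os.environ.get("EASYBUILD_BUILDPATH", ""),
        os.environ.get("EASYBUILD_TMPDIR", ""),
        tempfile.gettempdir(),
        "/dev/shm",
    ]
    for path, gib in free_gib(p for p in candidates if p).items():
        print(f"{gib:10.2f} GiB free  {path}")
```

Running this while the build is in progress, rather than afterwards, gives a better picture, since temporary files may already have been cleaned up by the time the build has failed.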
This might be the case. In the past I ran into this problem and solved
it with the following:
eb TensorFlow-1.15.0-fosscuda-2019b-Python-3.7.4.eb --robot \
    --cuda-compute-capabilities=6.1,7.5 --buildpath=/dev/shm \
    --tmpdir=/scratch/eb-build
Hmm, this surprises me a bit, because I think we make an effort to keep
Bazel from using /tmp for too many things, and we tell it to use the
build directory instead...
Please try using --tmpdir to specify a directory other than /tmp,
and see if that helps at all.
Alexander: should we look for patterns like "No space left on device" in
the Bazel output and highlight them better, perhaps with a concrete
suggestion to use --tmpdir to avoid the usage of /tmp?
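A rough sketch of what such a check could look like is given below. It is not existing EasyBuild code: the function name `no_space_hint` and the wording of the hint are assumptions, and where it would hook into the TensorFlow easyblock is left open.

```python
from typing import Iterable, Optional

NO_SPACE_PATTERN = "No space left on device"


def no_space_hint(output_lines: Iterable[str]) -> Optional[str]:
    """Return a hint for the first line reporting a full device, else None."""
    for number, line in enumerate(output_lines, start=1):
        if NO_SPACE_PATTERN in line:
            return (
                f"Line {number} of the build output reports "
                f"'{NO_SPACE_PATTERN}':\n"
                f"  {line.strip()}\n"
                "Consider using --tmpdir (and possibly --buildpath) to point "
                "to a location with more free space."
            )
    return None


if __name__ == "__main__":
    # Example input modelled on the linker error quoted in this thread.
    sample = [
        "Executed 137 out of 814 tests",
        "ld.gold: fatal error: graph_constructor_test: No space left on device",
    ]
    print(no_space_hint(sample))
```

Returning the offending line together with the suggestion would save the user from having to search a very large log for the actual cause.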
regards,
Kenneth
YMMV
Cheers,
Loris
I'd also suggest joining Slack, as discussions there are potentially faster.
I'll take a look - are there instructions for Slack?
Thanks,
Ole