Dear EasyBuilders,
I would like to build a TensorFlow module supporting GPUs. Currently, that
looks to be TensorFlow-1.12.0-fosscuda-2018b-Python-3.6.6.eb, but this requires
building a new toolchain (fosscuda), including rebuilding both OpenMPI and
Python with GPU support. In addition, any ot
Hi,
I made a TensorFlow easyconfig a while ago depending on Python with the foss
toolchain; and including a variant with GPU support (PR 4904). The latter has
not yet been merged, probably because it is annoying to have something that can
only build on a machine with a GPU (it fails the sanity
Howdy Jakob,
The primary difference between fosscuda and
foss+CUDA is that fosscuda has an OpenMPI built
with CUDA support whereas the latter does not.
We run with:
EASYBUILD_MINIMAL_TOOLCHAINS
which cuts down on the number of things that
have to be rebuilt here. For example, for
TensorFlow/1.10
Thank you!
I do not think OpenMPI with cuda support is particularly relevant for us. I
will read the docs and try to understand what --minimal-toolchains does.
Thanks for your suggestions
Jakob
> On 26 Mar 2019, at 14:41, Jack Perdue wrote:
>
> Howdy Jakob,
>
> The primary difference betwe
Hi Jakob,
I installed TensorFlow on my cluster a few days ago by modifying your
easyconfigs. I have just sent two PRs with the two easyconfigs I installed:
https://github.com/easybuilders/easybuild-easyconfigs/pull/5590
https://github.com/easybuilders/easybuild-easyconfigs/pull/5591
I used cuDNN 6.0
Dear Jakob,
On 04/01/2018 10:23, Jakob Schiøtz wrote:
Hi,
I made a TensorFlow easyconfig a while ago depending on Python with the foss
toolchain; and including a variant with GPU support (PR 4904). The latter has
not yet been merged, probably because it is annoying to have something that can
On 18-01-04 04:23, Jakob Schiøtz wrote:
Hi,
I made a TensorFlow easyconfig a while ago depending on Python with the foss
toolchain; and including a variant with GPU support (PR 4904). The latter has
not yet been merged, probably because it is annoying to have something that can
only build on
Dear Kenneth, Pablo and Maxime,
Thanks for your feedback. Yes, I will try to see if I can build from source,
but I will focus on the foss toolchain since we use that one for our Python
here (we do not have the Intel MPI license, and the iomkl toolchain could not
build Python last time I tried)
Hi Kenneth,
Is it possible that you forgot to check in the patches
TensorFlow-1.4.0_swig-env.patch and TensorFlow-1.4.0_no-enum34.patch in your
PR? Attempting to build TensorFlow fails because it cannot find these.
Best regards
Jakob
> On 4 Jan 2018, at 16:37, Jakob Schiøtz wrote:
>
> D
Hi Jakob,
On 05/01/2018 13:19, Jakob Schiøtz wrote:
Hi Kenneth,
Is it possible that you forgot to check in the patches
TensorFlow-1.4.0_swig-env.patch and TensorFlow-1.4.0_no-enum34.patch in your
PR? Attempting to build TensorFlow fails because it cannot find these.
The patch files are ava
On 04/01/2018 16:37, Jakob Schiøtz wrote:
Dear Kenneth, Pablo and Maxime,
Thanks for your feedback. Yes, I will try to see if I can build from source,
but I will focus on the foss toolchain since we use that one for our Python
here (we do not have the Intel MPI license, and the iomkl toolchai
Hi again,
Yes, I have overlooked that - I just switched my repo to your branch and tried
to build :-)
Now I get an error when building TensorFlow. It is a 502 Bad Gateway,
indicating that some server is down somewhere. But is it not a problem that
the build process itself tried to download e
On 05/01/2018 14:13, Jakob Schiøtz wrote:
Hi again,
Yes, I have overlooked that - I just switched my repo to your branch and tried
to build :-)
Now I get an error when building TensorFlow. It is a 502 Bad Gateway,
indicating that some server is down somewhere. But is it not a problem that
> On 5 Jan 2018, at 15:18, Kenneth Hoste wrote:
>
> On 05/01/2018 14:13, Jakob Schiøtz wrote:
>> Hi again,
>>
>> Yes, I have overlooked that - I just switched my repo to your branch and
>> tried to build :-)
>>
>> Now I get an error when building TensorFlow. It is a 502 Bad Gateway,
>> ind
Hi again, Kenneth.
It turns out that I was wrong about the lack of internet access from the
compute nodes. In principle, there should be nothing stopping me from testing
building with GPUs next week, except for my lack of knowledge :-)
I see this in the easyblock:
def extra_options():
Hi Kenneth,
I have now tested your TensorFlow 1.4.0 eb on our machines with a real-world
script. It works, but it runs three times slower than with the prebuilt
TensorFlow 1.2.1 :-(
The prebuilt version complains that it was built without AVX2 etc., so I do not
really understand why it is so
Hi Jakob,
On 05/01/2018 16:10, Jakob Schiøtz wrote:
On 5 Jan 2018, at 15:18, Kenneth Hoste wrote:
On 05/01/2018 14:13, Jakob Schiøtz wrote:
Hi again,
Yes, I have overlooked that - I just switched my repo to your branch and tried
to build :-)
Now I get an error when building TensorFlow.
On 05/01/2018 17:28, Jakob Schiøtz wrote:
Hi again, Kenneth.
It turns out that I was wrong about the lack of internet access from the
compute nodes. In principle, there should be nothing stopping me from testing
building with GPUs next week, except for my lack of knowledge :-)
I see this i
On 08/01/2018 15:48, Jakob Schiøtz wrote:
Hi Kenneth,
I have now tested your TensorFlow 1.4.0 eb on our machines with a real-world
script. It works, but it runs three times slower than with the prebuilt
TensorFlow 1.2.1 :-(
The prebuilt version complains that it was built without AVX2 etc.,
> On 8 Jan 2018, at 20:27, Kenneth Hoste wrote:
>
> On 08/01/2018 15:48, Jakob Schiøtz wrote:
>> Hi Kenneth,
>>
>> I have now tested your TensorFlow 1.4.0 eb on our machines with a real-world
>> script. It works, but it runs three times slower than with the prebuilt
>> TensorFlow 1.2.1 :-(
On 08/01/2018 21:28, Jakob Schiøtz wrote:
On 8 Jan 2018, at 20:27, Kenneth Hoste wrote:
On 08/01/2018 15:48, Jakob Schiøtz wrote:
Hi Kenneth,
I have now tested your TensorFlow 1.4.0 eb on our machines with a real-world
script. It works, but it runs three times slower than with the prebuilt
Dear Kenneth,
Now I have figured out what goes wrong, but not why.
I am running with Python 3.6.3 compiled with foss/2017b. I have two versions
of TensorFlow 1.4; the one built from source using your .eb, and a “binary”
variant which is installing from the binary package just like my 1.2.1
ea