Dear Åke and Kenneth,

Thank you very much for your replies.

On Thu, 3 Jun 2021 at 4:00, Kenneth Hoste (<kenneth.ho...@ugent.be>)
wrote:

> Dear Agustín,
>
> The fundamental problem is indeed that you're building software on one
> type of CPU, and then trying to run it on another.
>
> Can you share some more details on what type of CPU is in the master
> node and slave nodes?
>
> If you can, try using the archspec tool (see
> https://github.com/archspec/archspec, install with "pip3 install
> archspec", then run "archspec cpu").
>
> Or share the output of the following commands:
>
> grep 'model name' /proc/cpuinfo | head -1
> grep flags /proc/cpuinfo | head -1
>

Master node:

model name : Dual-Core AMD Opteron(tm) Processor 2214

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm
3dnowext 3dnow rep_good nopl cpuid extd_apicid pni cx16 lahf_lm cmp_legacy
svm extapic cr8_legacy 3dnowprefetch vmmcall


Slaves:

model name : Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx
est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe
popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm
3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin
ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase
tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx
smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local
dtherm ida arat pln pts md_clear flush_l1d
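
As a rough way to compare the two flag lists above, a minimal Python sketch
(not part of the original exchange; the names MASTER_FLAGS, SLAVE_FLAGS and
flag_diff are made up for illustration) can split each list into a set and
print what each CPU has that the other lacks:

```python
# Sketch: compare the "flags" lines from /proc/cpuinfo on two nodes.
# The two strings are copied from the output quoted in this message.

MASTER_FLAGS = """
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm
3dnowext 3dnow rep_good nopl cpuid extd_apicid pni cx16 lahf_lm cmp_legacy
svm extapic cr8_legacy 3dnowprefetch vmmcall
"""

SLAVE_FLAGS = """
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx
est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe
popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm
3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin
ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase
tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx
smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local
dtherm ida arat pln pts md_clear flush_l1d
"""


def flag_diff(first, second):
    """Return (flags only in first, flags only in second), each sorted."""
    set_first, set_second = set(first.split()), set(second.split())
    return sorted(set_first - set_second), sorted(set_second - set_first)


only_master, only_slave = flag_diff(MASTER_FLAGS, SLAVE_FLAGS)
print("only on master:", " ".join(only_master))
print("only on slaves:", " ".join(only_slave))
```

With the flags quoted above, the master-only set includes 3dnow, 3dnowext
and mmxext, while the slave-only set includes ssse3, sse4_1, sse4_2, popcnt,
fma, avx and avx2. Neither CPU's feature set contains the other's, so a
binary tuned for either one can in principle use instructions the other
lacks.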


> You can also try controlling the optimizations that EasyBuild does by
> default, to prevent it from building for the specific CPU on the build
> node, using "eb --optarch=GENERIC", see
>
> https://docs.easybuild.io/en/latest/Controlling_compiler_optimization_flags.html
> .
>

I tried doing

eb PySCF-2.0.0a-foss-2020b-Python-3.8.6.eb --optarch=GENERIC -r --force

but the problem is still the same. Maybe the problem is not in this
particular code (PySCF) but in some of its dependencies. Is there something
like a "--force" flag to force dependencies to recompile?
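
One way to check whether the crash comes from PySCF itself or from one of
its dependencies is to let Python report where the illegal instruction was
hit. The sketch below is only an illustration and was not part of this
thread: it runs the failing snippet in a child interpreter with the
standard-library faulthandler enabled (python -X faulthandler), which dumps
the Python traceback when the process receives SIGILL. The names
run_snippet and SNIPPET are hypothetical.

```python
import signal
import subprocess
import sys


def run_snippet(code):
    """Run `code` in a fresh interpreter with faulthandler enabled.

    Returns (died_of_sigill, stderr_text). A negative return code from
    subprocess means the child was terminated by that signal number.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )
    died_of_sigill = proc.returncode == -signal.SIGILL
    return died_of_sigill, proc.stderr


# The three statements from the original report.
SNIPPET = """
from pyscf import gto, scf
mol = gto.M(atom='H 0 0 0; H 0 0 1')
mf = scf.RHF(mol).run()
"""

if __name__ == "__main__":
    crashed, trace = run_snippet(SNIPPET)
    if crashed:
        print("Child died with SIGILL; traceback from faulthandler:")
    else:
        print("No SIGILL; stderr of the child (if any):")
    print(trace)
```

Run on a node where the crash occurs, after loading the module. The last
Python frames in the dumped traceback show which module was calling into
compiled code when the process was killed, which hints at which package
would need to be rebuilt.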



> George's suggestion is better/easier though: building on the oldest node
> should help you too...
>

I tried this a couple of days ago, but it didn't resolve the problem. In
fact, when doing so, I cannot run the code on the master (as expected), but
I cannot run it on the slaves either...

> regards,
>
> Kenneth
>


Thank you for your help!

Agustín



> On 02/06/2021 22:20, Agustín Aucar wrote:
> > Dear George,
> >
> > Thanks for your response. A few days ago, I tried to compile the code on
> > a slave node, but it didn't solve the problem...
> >
> > Best,
> > Agustín
> >
> > On Wed, 2 Jun 2021 at 11:41, George Tsouloupas
> > (<g.tsoulou...@cyi.ac.cy>) wrote:
> >
> >     Hi,
> >
> >     In a similar situation we ended up just building the software on the
> >     "older" cpu (i.e. the "slave" in your case)
> >
> >     G.
> >
> >
> >     George Tsouloupas, PhD
> >     HPC Facility Technical Director
> >     The Cyprus Institute
> >     tel: +357 22208688
> >
> >     On 6/2/21 4:22 PM, Agustín Aucar wrote:
> >>     Dear EasyBuild experts,
> >>
> >>     Firstly, thank you for your very nice work!
> >>
> >>     I'm trying to compile PySCF with the following *.eb file:
> >>
> >>     easyblock = 'CMakeMakeCp'
> >>
> >>     name = 'PySCF'
> >>     version = '2.0.0a'
> >>     versionsuffix = '-Python-%(pyver)s'
> >>
> >>     homepage = 'http://www.pyscf.org'
> >>     description = "PySCF is an open-source collection of electronic structure modules powered by Python."
> >>
> >>     toolchain = {'name': 'foss', 'version': '2020b'}
> >>
> >>     source_urls = ['https://github.com/pyscf/pyscf/archive/']
> >>     sources = ['v%(version)s.tar.gz']
> >>     checksums = ['20f4c9faf65436a97f9dfc8099d3c79b988b0a2c5374c701fbe35abc6fad4922']
> >>
> >>     builddependencies = [('CMake', '3.18.4')]
> >>
> >>     dependencies = [
> >>         ('Python', '3.8.6'),
> >>         ('SciPy-bundle', '2020.11'),  # for numpy, scipy
> >>         ('h5py', '3.1.0'),
> >>         ('qcint', '4.0.6', versionsuffix),
> >>         ('libxc', '5.1.3'),
> >>         ('XCFun', '2.1.1'),
> >>     ]
> >>
> >>     start_dir = 'pyscf/lib'
> >>
> >>     separate_build_dir = True
> >>
> >>     configopts = "-DBUILD_LIBCINT=OFF -DBUILD_LIBXC=OFF -DBUILD_XCFUN=OFF "
> >>
> >>     prebuildopts = "export PYSCF_INC_DIR=$EBROOTQCINT/include:$EBROOTLIBXC/lib && "
> >>
> >>     files_to_copy = ['pyscf']
> >>
> >>     sanity_check_paths = {
> >>         'files': ['pyscf/__init__.py'],
> >>         'dirs': ['pyscf/data', 'pyscf/lib'],
> >>     }
> >>
> >>     sanity_check_commands = ["python -c 'import pyscf'"]
> >>
> >>     modextrapaths = {'PYTHONPATH': '', 'PYSCF_EXT_PATH': ''}
> >>
> >>     moduleclass = 'chem'
> >>
> >>
> >>     Although the module is created, I am having trouble running it
> >>     on a node other than the master. In particular, when I load the
> >>     module and run the code on the master, everything works:
> >>
> >>     module load chem/PySCF/2.0.0a-foss-2020b-Python-3.8.6
> >>     python
> >>     from pyscf import gto, scf
> >>     mol = gto.M(atom='H 0 0 0; H 0 0 1')
> >>     mf = scf.RHF(mol).run()
> >>
> >>     but when I try to run it on a node different from the master, I get:
> >>
> >>     Python 3.8.6 (default, Jun  1 2021, 16:43:49)
> >>     [GCC 10.2.0] on linux
> >>     Type "help", "copyright", "credits" or "license" for more information.
> >>     >>> from pyscf import gto, scf
> >>     >>> mol = gto.M(atom='H 0 0 0; H 0 0 1')
> >>     >>> mf = scf.RHF(mol).run()
> >>     Illegal instruction (core dumped)
> >>
> >>     From what I have read in different places, it seems to be related
> >>     to the different architectures of our master and slave nodes.
> >>
> >>     If I execute
> >>
> >>     grep flags -m1 /proc/cpuinfo | cut -d ":" -f 2 | tr '[:upper:]' '[:lower:]' | {
> >>       read FLAGS; OPT="-march=native"
> >>       for flag in $FLAGS; do
> >>         case "$flag" in
> >>           "sse4_1" | "sse4_2" | "ssse3" | "fma" | "cx16" | "popcnt" | "avx" | "avx2") OPT+=" -m$flag";;
> >>         esac
> >>       done
> >>       MODOPT=${OPT//_/\.}; echo "$MODOPT"
> >>     }
> >>
> >>     on the slaves I get: -march=native -mssse3 -mfma -mcx16 -msse4.1
> >>     -msse4.2 -mpopcnt -mavx -mavx2
> >>
> >>     whereas on the master node we have: -march=native -mcx16
> >>
> >>     I tried to compile PySCF by adding these lines to my *.eb file:
> >>
> >>     configopts += "-DBUILD_FLAGS='-march=native -mssse3 -mfma -mcx16 -msse4.1 -msse4.2 -mpopcnt -mavx -mavx2' "
> >>     configopts += "-DCMAKE_C_FLAGS='-march=native -mssse3 -mfma -mcx16 -msse4.1 -msse4.2 -mpopcnt -mavx -mavx2' "
> >>     configopts += "-DCMAKE_CXX_FLAGS='-march=native -mssse3 -mfma -mcx16 -msse4.1 -msse4.2 -mpopcnt -mavx -mavx2' "
> >>     configopts += "-DCMAKE_FORTRAN_FLAGS='-march=native -mssse3 -mfma -mcx16 -msse4.1 -msse4.2 -mpopcnt -mavx -mavx2'"
> >>
> >>     but in that case the code runs neither on the master nor on the
> >>     slaves.
> >>
> >>
> >>     I'm sorry if it is a stupid question. I am far from being a system
> >>     admin...
> >>
> >>     Thanks a lot for your help.
> >>
> >>     Dr. Agustín Aucar
> >>     Institute for Modeling and Innovative Technologies - Argentina
> >
>
