Re: upgrade to jessie from wheezy with cuda problems
On Wed, Nov 13, 2013 at 10:32:26AM +0100, Francesco Pietra wrote:
> My answer seems to have disappeared. I summarize here.
>
> "modinfo nvidia-current" works well. CUDA libraries are installed.
>
> For nvidia-cuda-toolkit, nvidia offers SDK packages for Ubuntu, not for
> Debian. I don't like to get into trouble with Ubuntu, which, unlike
> LinuxMINT, is not compatible with Debian.
>
> I tried GNU "CUDA-Z-0.7.189.run" (don't remember from where it was
> downloaded). However it does not find the shared libXrender.so.1, even when
> made available in the same folder as CUDA-Z.
>
> Actually
>
> root@gig64:/home/francesco# apt-file search libXrender.so.1
> libxrender1: /usr/lib/x86_64-linux-gnu/libXrender.so.1
> libxrender1: /usr/lib/x86_64-linux-gnu/libXrender.so.1.3.0
> libxrender1-dbg: /usr/lib/debug/usr/lib/x86_64-linux-gnu/libXrender.so.1.3.0
> root@gig64:/home/francesco#
>
> francesco@gig64:~$ echo $PATH
> /opt/namd2.9_cuda4.0_2012-09-26/bin:/opt/namd2.9_cuda4.0_2012-09-26/bin:/opt/namd2.9_cuda4.0_2012-09-26/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/opt/amber12/bin:/opt/amber10/bin:/opt/UCSF/Chimera64-2012-10-10/bin:/opt/namd2.9_cuda4.0_2012-09-26/bin/namd2:/opt/amber12/bin:/opt/amber10/bin:/opt/UCSF/Chimera64-2012-10-10/bin:/opt/namd2.9_cuda4.0_2012-09-26/bin/namd2:/opt/amber12/bin:/opt/amber10/bin:/opt/UCSF/Chimera64-2012-10-10/bin:/opt/namd2.9_cuda4.0_2012-09-26/bin/namd2
> francesco@gig64:~$
>
> Should /usr/lib/x86_64-linux-gnu be put on my path explicitly?

The PATH is not for libraries. LD_LIBRARY_PATH is, as is /etc/ld.so.conf
stuff.

Also, is what you downloaded 32 or 64 bit?

Try:

ldd CUDA-Z-0.7.189.run

See what it is looking for.

--
Len Sorensen

--
To UNSUBSCRIBE, email to debian-amd64-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20131113171504.gh20...@csclub.uwaterloo.ca
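A minimal sketch of the "see what it is looking for" step, assuming a POSIX shell with awk. The helper name missing_libs is hypothetical and not part of any package; it only filters ldd-style output for entries the loader could not resolve. Judging from its own startup message, the .run file looks like a container that unpacks cuda-z into a temporary directory, so ldd is probably more informative on the unpacked cuda-z binary than on the container itself.

```shell
#!/bin/sh
# missing_libs (hypothetical helper): read `ldd`-style output on stdin and
# print the names of libraries reported as "not found", one per line.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Typical use against a real binary (the path is a placeholder):
#   ldd /path/to/cuda-z | missing_libs
```

An empty result means every listed dependency resolved; anything printed is a library the loader could not find in LD_LIBRARY_PATH, the ld.so.conf directories, or the default library directories.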
Re: upgrade to jessie from wheezy with cuda problems
On Wed, Nov 13, 2013 at 10:13:15AM +0100, Francesco Pietra wrote:
> > I think it was renamed. No idea why. modinfo nvidia-current should
> > work though.
>
> Yes, it does.
>
> > Do you have the cuda libraries for the 319 version installed?
>
> Yes
>
> > I don't play around with GPU computations, but from what I have read it
> > does need a certain size job before the overhead of transferring the
> > data and managing the GPU makes it worthwhile, but for large jobs the
> > high core count and memory bandwidth make a big difference.
>
> 500,000 atoms, as in my test, is a large system for unbiased molecular
> dynamics. In any event, I looked at the nvidia-cuda-toolkit version
> 5.0. For the GPU Computing SDK, needed to build the examples that should
> include a bandwidth test, nvidia offers linux packages for Fedora, RHEL,
> Ubuntu, OpenSUSE and SUSE. No Debian. I had unpleasant experiences with
> Ubuntu packages, and it is well known that Ubuntu, unlike LinuxMint, is
> not compatible with Debian. Therefore, I did not try the cuda toolkit. I
> wonder why Ubuntu has so widely replaced Debian among the mass. Sad, and
> somewhat irritating, for me.
>
> I tried
> francesco@gig64:~/tmp$ ls
> CUDA-Z-0.7.189.run
> francesco@gig64:~/tmp$ ./CUDA-Z-0.7.189.run
> CUDA-Z 0.7.189 Container
> Starting CUDA-Z...
> /home/francesco/tmp/CUDA-Z-657a-580e-a8aa-0faa/cuda-z: error while loading
> shared libraries: libXrender.so.1: cannot open shared object file: No such
> file or directory

Try:

LD_LIBRARY_PATH=. ./CUDA-Z-0.7.189.run

See if it finds that library then.

> francesco@gig64:~/tmp$ ls
> CUDA-Z-0.7.189.run libXrender.so.1
> francesco@gig64:~/tmp$ ./CUDA-Z-0.7.189.run
> CUDA-Z 0.7.189 Container
> Starting CUDA-Z...
> /home/francesco/tmp/CUDA-Z-a3db-49bf-8cb7-059d/cuda-z: error while loading
> shared libraries: libXrender.so.1: cannot open shared object file: No such
> file or directory
> francesco@gig64:~/tmp$
>
> Actually the required lib is available, as shown by my copy into tmp. I
> don't remember the source of this GNU CUDA-Z tool. Any experience with it?
>
> I have also met reports of unexciting experience with PCIe 3.0, that is
> meager or no gain over PCIe 2.0; however, those reports concern people
> running games, which is different from NAMD molecular dynamics, where most
> is done by the GPUs but AT EACH STEP energy has to be calculated by the CPU.

I see a package in Debian named 'nvidia-cuda-toolkit'. Does that include
what you were looking for? I guess the bandwidthtest isn't built normally.

--
Len Sorensen

Archive: http://lists.debian.org/20131113171328.gg20...@csclub.uwaterloo.ca
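One possible reason a library that is present still yields "cannot open shared object file" is an architecture mismatch: the loader skips libraries whose ELF class differs from the binary's. The sketch below, assuming a POSIX shell with od and tr, reads the ELF header directly. The function name elf_class is hypothetical; run it on both the unpacked cuda-z binary and the copied libXrender.so.1 to compare them. This is only one candidate explanation, not a confirmed diagnosis.

```shell
#!/bin/sh
# elf_class (hypothetical helper): print "32-bit" or "64-bit" for an ELF
# file, based on the EI_CLASS byte at offset 4 of the ELF header.
elf_class() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || { echo "not ELF"; return 1; }
    # EI_CLASS: 1 means 32-bit objects, 2 means 64-bit objects.
    case $(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n') in
        1) echo "32-bit" ;;
        2) echo "64-bit" ;;
        *) echo "unknown" ;;
    esac
}

# Example (paths are placeholders):
#   elf_class /path/to/cuda-z
#   elf_class ./libXrender.so.1
```

A shell-script container such as a .run installer would be reported as "not ELF", which is expected; the check is meaningful only on the binary it unpacks.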
Re: upgrade to jessie from wheezy with cuda problems
My answer seems to have disappeared. I summarize here.

"modinfo nvidia-current" works well. CUDA libraries are installed.

For nvidia-cuda-toolkit, nvidia offers SDK packages for Ubuntu, not for
Debian. I don't like to get into trouble with Ubuntu, which, unlike
LinuxMINT, is not compatible with Debian.

I tried GNU "CUDA-Z-0.7.189.run" (don't remember from where it was
downloaded). However it does not find the shared libXrender.so.1, even when
made available in the same folder as CUDA-Z.

Actually

root@gig64:/home/francesco# apt-file search libXrender.so.1
libxrender1: /usr/lib/x86_64-linux-gnu/libXrender.so.1
libxrender1: /usr/lib/x86_64-linux-gnu/libXrender.so.1.3.0
libxrender1-dbg: /usr/lib/debug/usr/lib/x86_64-linux-gnu/libXrender.so.1.3.0
root@gig64:/home/francesco#

francesco@gig64:~$ echo $PATH
/opt/namd2.9_cuda4.0_2012-09-26/bin:/opt/namd2.9_cuda4.0_2012-09-26/bin:/opt/namd2.9_cuda4.0_2012-09-26/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/opt/amber12/bin:/opt/amber10/bin:/opt/UCSF/Chimera64-2012-10-10/bin:/opt/namd2.9_cuda4.0_2012-09-26/bin/namd2:/opt/amber12/bin:/opt/amber10/bin:/opt/UCSF/Chimera64-2012-10-10/bin:/opt/namd2.9_cuda4.0_2012-09-26/bin/namd2:/opt/amber12/bin:/opt/amber10/bin:/opt/UCSF/Chimera64-2012-10-10/bin:/opt/namd2.9_cuda4.0_2012-09-26/bin/namd2
francesco@gig64:~$

Should /usr/lib/x86_64-linux-gnu be put on my path explicitly?

Thanks
francesco pietra
Re: upgrade to jessie from wheezy with cuda problems
> I think it was renamed. No idea why. modinfo nvidia-current should
> work though.

Yes, it does.

> Do you have the cuda libraries for the 319 version installed?

Yes

> I don't play around with GPU computations, but from what I have read it
> does need a certain size job before the overhead of transferring the
> data and managing the GPU makes it worthwhile, but for large jobs the
> high core count and memory bandwidth make a big difference.

500,000 atoms, as in my test, is a large system for unbiased molecular
dynamics. In any event, I looked at the nvidia-cuda-toolkit version
5.0. For the GPU Computing SDK, needed to build the examples that should
include a bandwidth test, nvidia offers linux packages for Fedora, RHEL,
Ubuntu, OpenSUSE and SUSE. No Debian. I had unpleasant experiences with
Ubuntu packages, and it is well known that Ubuntu, unlike LinuxMint, is
not compatible with Debian. Therefore, I did not try the cuda toolkit. I
wonder why Ubuntu has so widely replaced Debian among the mass. Sad, and
somewhat irritating, for me.

I tried
francesco@gig64:~/tmp$ ls
CUDA-Z-0.7.189.run
francesco@gig64:~/tmp$ ./CUDA-Z-0.7.189.run
CUDA-Z 0.7.189 Container
Starting CUDA-Z...
/home/francesco/tmp/CUDA-Z-657a-580e-a8aa-0faa/cuda-z: error while loading
shared libraries: libXrender.so.1: cannot open shared object file: No such
file or directory

francesco@gig64:~/tmp$ ls
CUDA-Z-0.7.189.run libXrender.so.1
francesco@gig64:~/tmp$ ./CUDA-Z-0.7.189.run
CUDA-Z 0.7.189 Container
Starting CUDA-Z...
/home/francesco/tmp/CUDA-Z-a3db-49bf-8cb7-059d/cuda-z: error while loading
shared libraries: libXrender.so.1: cannot open shared object file: No such
file or directory
francesco@gig64:~/tmp$

Actually the required lib is available, as shown by my copy into tmp. I
don't remember the source of this GNU CUDA-Z tool. Any experience with it?

I have also met reports of unexciting experience with PCIe 3.0, that is
meager or no gain over PCIe 2.0; however, those reports concern people
running games, which is different from NAMD molecular dynamics, where most
is done by the GPUs but AT EACH STEP energy has to be calculated by the CPU.

thanks
francesco pietra
Re: upgrade to jessie from wheezy with cuda problems
On Tue, Nov 12, 2013 at 10:35:53PM +0100, Francesco Pietra wrote:
> # apt-get --purge remove *legacy*
> did the job.
>
> I wonder how these legacy packages entered the scene while
> updating/upgrading from a clean wheezy.
>
> The bad news is that with the new driver 319.60 there was no acceleration
> of molecular dynamics for a job of modest size (150K atoms) and slight
> acceleration (0.12 s/step vs 0.14 s/step) for a heavy job (500K atoms).
> Either moving from PCIe 2.0 (with the 304.xx driver of wheezy) to PCIe
> 3.0 (with driver 319.60 of jessie) (increasing the bandwidth from GPUs to
> RAM from 5 to 8GB/s) does not have the effect that I hoped for on the
> calculations, or PCIe is still 2.0 with jessie.
>
> Now, with cuda 5.0, it should be easy to measure the bandwidth directly. I
> have to learn how and I'll report about it in due course.
>
>
> Now
> nvidia-smi activates the GPUs for normal work,
> nvidia-smi -L tells about the GPUs,
> dpkg -l |grep nvidia shows all 319.60 or 5.0.35-8,
> the X-server can be started and gnome loaded (startx, gnome-session),
> nvcc --version gives 5.0, however
>
>
> # modinfo nvidia
> ERROR: module nvidia not found
>
> In analogy with wheezy 3.2.0-4, I expected
> /lib/modules/3.10-3-amd64/updates/dkms/nvidia.ko
>
> Instead, there is
>
> /lib/modules/3.10-3-amd64/nvidia/nvidia-current.ko
>
> is that a feature of jessie or something wrong?

I think it was renamed. No idea why. modinfo nvidia-current should
work though.

Do you have the cuda libraries for the 319 version installed?

I don't play around with GPU computations, but from what I have read it
does need a certain size job before the overhead of transferring the
data and managing the GPU makes it worthwhile, but for large jobs the
high core count and memory bandwidth make a big difference.

--
Len Sorensen

Archive: http://lists.debian.org/20131112223724.gf20...@csclub.uwaterloo.ca
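When modinfo fails on the expected module name, it can help to look at which nvidia module files actually exist for the running kernel. A small sketch, assuming a POSIX shell with find; the function name list_nvidia_modules is hypothetical, and the default directory is simply the conventional /lib/modules/<kernel release> location.

```shell
#!/bin/sh
# list_nvidia_modules (hypothetical helper): print every kernel module file
# whose name starts with "nvidia" under a modules directory.
# With no argument it looks at the directory for the running kernel.
list_nvidia_modules() {
    moddir=${1:-/lib/modules/$(uname -r)}
    find "$moddir" -type f -name 'nvidia*.ko*' 2>/dev/null | sort
}

# Example:
#   list_nvidia_modules
#   list_nvidia_modules /lib/modules/3.10-3-amd64
```

The base name of each file found, without the .ko suffix, is the name to pass to modinfo; in the case reported above that would be nvidia-current.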
Re: upgrade to jessie from wheezy with cuda problems
# apt-get --purge remove *legacy*
did the job.

I wonder how these legacy packages entered the scene while
updating/upgrading from a clean wheezy.

The bad news is that with the new driver 319.60 there was no acceleration
of molecular dynamics for a job of modest size (150K atoms) and slight
acceleration (0.12 s/step vs 0.14 s/step) for a heavy job (500K atoms).
Either moving from PCIe 2.0 (with the 304.xx driver of wheezy) to PCIe
3.0 (with driver 319.60 of jessie) (increasing the bandwidth from GPUs to
RAM from 5 to 8GB/s) does not have the effect that I hoped for on the
calculations, or PCIe is still 2.0 with jessie.

Now, with cuda 5.0, it should be easy to measure the bandwidth directly. I
have to learn how and I'll report about it in due course.

Now
nvidia-smi activates the GPUs for normal work,
nvidia-smi -L tells about the GPUs,
dpkg -l |grep nvidia shows all 319.60 or 5.0.35-8,
the X-server can be started and gnome loaded (startx, gnome-session),
nvcc --version gives 5.0, however

# modinfo nvidia
ERROR: module nvidia not found

In analogy with wheezy 3.2.0-4, I expected
/lib/modules/3.10-3-amd64/updates/dkms/nvidia.ko

Instead, there is

/lib/modules/3.10-3-amd64/nvidia/nvidia-current.ko

is that a feature of jessie or something wrong?

Thanks a lot for advice.

francesco pietra.
Re: upgrade to jessie from wheezy with cuda problems
On Tue, Nov 12, 2013 at 05:22:18PM +0100, Francesco Pietra wrote:
> Yes. Also,
>
> # apt-get remove nvidia-kernel-dkms
>
> # apt-get install nvidia-kernel-dkms
>
> (which, in the year 2011, served to clear the driver at
> /lib/modules/2.6.38-2-amd64/updates/dkms. But now the kernel was 3.2.) left
> the issue unaltered.
>
> # modinfo nvidia
> ERROR: module nvidia not found
>
> $ dpkg -l |grep nvidia |less
>
> shows
>
> libgl1-nvidia-glx:amd64 319.60
>
> and
>
> libgl1-nvidia-legacy-304xx-glx:amd64 304.108-4
>
> NVIDIA metapackage rc nvidia-glx 304.88-1-deb7u1
>
> nvidia-legacy-304xx-driver 304.108-4
>
> nvidia-legacy-304xx-kernel-dkms 304.108-4
>
> nvidia-settings-legacy-304xx 304.108-2
>
> xserver-xorg-video-nvidia-legacy-304xx 304.108-4
>
>
> Everything else 319.60-1 and cuda 5.0
>
> I don't understand why these 304xx are threatening.
>
> I had also run
> # nvidia-xconfig

I think you should remove all packages with legacy-304xx in the name,
and install the current ones (nvidia-kernel-dkms, nvidia-glx, etc).

legacy-304xx will never move beyond version 304.xx, after all, as the
name implies.

--
Len Sorensen

Archive: http://lists.debian.org/20131112165947.ge20...@csclub.uwaterloo.ca
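A sketch of how the legacy-304xx packages could be listed before removing them, assuming a POSIX shell with awk. The function name legacy_packages is hypothetical; it only filters `dpkg -l` style output, where the second column is the package name, and removes nothing by itself.

```shell
#!/bin/sh
# legacy_packages (hypothetical helper): read `dpkg -l` style output on
# stdin and print the names of packages containing "legacy-304xx".
legacy_packages() {
    awk '$2 ~ /legacy-304xx/ { print $2 }'
}

# Example: review the list first, then decide what to purge.
#   dpkg -l | legacy_packages
```

If a wildcard is passed to apt-get instead, quoting it (for example '*legacy*') keeps the shell from expanding it against file names in the current directory before apt-get sees it.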
Re: upgrade to jessie from wheezy with cuda problems
Yes. Also,

# apt-get remove nvidia-kernel-dkms

# apt-get install nvidia-kernel-dkms

(which, in the year 2011, served to clear the driver at
/lib/modules/2.6.38-2-amd64/updates/dkms. But now the kernel was 3.2.) left
the issue unaltered.

# modinfo nvidia
ERROR: module nvidia not found

$ dpkg -l |grep nvidia |less

shows

libgl1-nvidia-glx:amd64 319.60

and

libgl1-nvidia-legacy-304xx-glx:amd64 304.108-4

NVIDIA metapackage rc nvidia-glx 304.88-1-deb7u1

nvidia-legacy-304xx-driver 304.108-4

nvidia-legacy-304xx-kernel-dkms 304.108-4

nvidia-settings-legacy-304xx 304.108-2

xserver-xorg-video-nvidia-legacy-304xx 304.108-4

Everything else 319.60-1 and cuda 5.0

I don't understand why these 304xx are threatening.

I had also run
# nvidia-xconfig

thanks
francesco pietra
Re: upgrade to jessie from wheezy with cuda problems
On Tue, Nov 12, 2013 at 03:54:32PM +0100, Francesco Pietra wrote:
> Hello:
> I decided to try jessie to get PCIe 3.0 with a recent nvidia driver, thus
> upgrading from wheezy.
>
> wheezy was
> uname -r
> 3.2.0-4-amd64
>
> nvidia-smi
> 304.88
>
> nvcc --version
> 4.2
>
> (the latter is also the version with which the molecular dynamics code was
> compiled, and used without calling the X-server)
>
>
> Following aptitude update
>
> aptitude-upgrade
>
> a number of dependencies related to gnome were not met (evolution-common
> libfolks25 gnome-panel gnome-shell gnome-theme-extras gnome-theme-standard
> libreoffice-evolution). This notwithstanding, I decided to upgrade.
>
> After rebooting to get linux matching with nvidia:
>
> nvcc --version
> 5.0
>
> uname -r
> 3.10-3-amd64
>
> nvidia-smi
> the nvidia kernel module has version 304.108 but the nvidia driver
> component has version 319.60.
>
>
> Driver 319.60 is just what I wanted. Now, how best to fix the problems?
> Install linux image 3.2?
>
> In the past I tried dist-upgrade, getting into devastating problems.

Do you have nvidia-kernel-dkms installed?

--
Len Sorensen

Archive: http://lists.debian.org/20131112145937.gc20...@csclub.uwaterloo.ca
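The nvidia-smi message quoted above describes a kernel module and a userspace driver of different versions. A rough sketch of checking for that mismatch by hand, assuming a POSIX shell with sed. Both function names are hypothetical, and the parser assumes that the loaded module reports itself in /proc/driver/nvidia/version with a "Kernel Module <version>" field; that format is an assumption here, not something shown in this thread.

```shell
#!/bin/sh
# kernel_module_version (hypothetical helper): extract the version number
# that follows "Kernel Module" from text read on stdin.
# Assumed input format: the contents of /proc/driver/nvidia/version.
kernel_module_version() {
    sed -n 's/.*Kernel Module  *\([0-9][0-9.]*\).*/\1/p'
}

# compare_versions (hypothetical helper): report whether the kernel module
# version ($1) and the userspace driver version ($2) agree.
compare_versions() {
    if [ "$1" = "$2" ]; then
        echo "match: $1"
    else
        echo "mismatch: kernel module $1, userspace driver $2"
    fi
}

# Example (the userspace version is typed in by hand here):
#   kmod=$(kernel_module_version < /proc/driver/nvidia/version)
#   compare_versions "$kmod" 319.60
```

A mismatch like the one in the report suggests that the module built for the running kernel comes from a different driver package than the userspace libraries, which is consistent with the question about nvidia-kernel-dkms.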