Re: Fwd: upgrade to jessie from wheezy with cuda problems
> > Might need nvidia-current instead of nvidia.

It failed to bring the link to PCIe 3.0 when inserted into nvidia.conf:

francesco@gig64:/etc/modprobe.d$ cat nvidia.conf
alias nvidia nvidia-current
remove nvidia-current rmmod nvidia
# 1. options nvidia-current NVreg_EnablePCIeGen3=1

(of course the last line was not commented out when the test was carried out)

However, it did bring the link to PCIe 3.0 when passed on the kernel command line. GREAT SUGGESTION.

Thus, I added it (temporarily) in GRUB by

1) typing 'e' at the grub prompt,
2) adding the option to the END OF the linux line,
3) Ctrl-x to boot,

verifying that it was taken into account:

~$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-3.10-3-amd64 root=/dev/mapper/vg1-root ro quiet 1. nvidia-current.NVreg_EnablePCIeGen3=1

#lspci -
...
02:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 680] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: NVIDIA Corporation Device 0969
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
        Capabilities: [100 v1] Virtual Channel
                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed- WRR32- WRR64- WRR128-
                Ctrl:   ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01
                        Status: NegoPending- InProgress-
        Capabilities: [128 v1] Power Budgeting
        Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024
        Capabilities: [900 v1] #19
        Kernel driver in use: nvidia
...

The same for the other GPU.

*

Well, the surprise was that molecular dynamics for a large system (500K atoms) was only very modestly accelerated. From the simulation log file:

Info: Benchmark time: 6 CPUs 0.123387 s/step 1.4281 days/ns 1171.53 MB memory

From the same simulation with the same motherboard and GTX-680, but with sandy bridge i7-3930 and 1066MHz RAM:

Info: Benchmark time: 6 CPUs 0.138832 s/step 1.60686 days/ns 1161.23 MB memory

The better performance of the ivy bridge might result from the higher clock of both the CPU and the RAM (1866MHz).

A variety of interpretations of these observations are possible, taking into account, however, that with simple machines such as the one used here it would be difficult to run MD with systems much bigger than 500K atoms.

In the end we succeeded in getting PCIe 3.0, and the PCIe 3.0 setting can now be passed permanently to the kernel. I have to learn how.

Thanks a lot
francesco pietra

On Mon, Nov 18, 2013 at 6:02 PM, Lennart Sorensen <lsore...@csclub.uwaterloo.ca> wrote:
> On Sun, Nov 17, 2013 at 10:45:58AM +0100, Francesco Pietra wrote:
> > I am attacking the problem from another side, directly from within the OS
> > itself:
> >
> > #lspci -
> >
> > tells that the link speed (= link status) "LnkSta" is at 5Gb/s, no matter
> > whether the system is at number crunching or not. I.e., my system is at
> > PCIe 2.0. This might explain why upgrading from sandy bridge to ivy bridge
> > gave no speed gain of molecular dynamics. PCIe 3.0 was not achieved.
> >
> > As far as I could investigate, nvidia suggests to either:
> > (1) Modify /etc/modprobe.d/local.conf (which does not exist on jessie) or
> > create a new
> >
> > /etc/modprobe.d/nvidia.conf, adding to that
> >
> > 1. options nvidia NVreg_EnablePCIeGen3=1
>
> Might need nvidia-current instead of nvidia.
>
> > Actually, on my jessie, nvidia.conf reads
> >
> > alias nvidia nvidia-current
> > remove nvidia-current rmmod nvidia
> >
> >
> > Some guys found that useless, even when both grub-efi and initramfs are
> > edited accordingly, so that nvidia offered a different move, updating the
> > kernel boot string, by appending this:
> >
> > 1. options nvidia NVreg_EnablePCIeGen3=1
> > ***
>
> That is NOT the syntax for a kernel command line. It is the syntax for
> the modprobe config.
>
> Something like nvidia.NVreg_EnablePCIeGen3=1 or
> nvidia-current.NVreg_EnablePCIeGen3=1 (depending on the name of the
> module as far as the module is concerned).
>
> > I did nothing, as I hope that the best adaptation to jessie may be
> > suggested by those who know the OS better than me.
> > The kind of information about links includes:
> >
> > LnkSta: the actual speed
> >
> > LnkCap: the capacity of the specific port, as far as I can understand.
> >
> > LnkCtl: ??
> >
> >
> > One could also run
> >
> > #lspci -vt
> >
> > to determine the bus where the GPU card is located, then running
> >
> > # lspci -vv -s ##
> >
> > where "##" is the location.
> > **
> >
> > So, it is a tricky matter, but perhaps not so much when one knows where to
> > put the hands. At any event, being unable to go to 8GT/s, as fr
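On the open question of passing the option permanently: a minimal sketch, assuming a stock Debian GRUB 2 setup where the default kernel options live in the GRUB_CMDLINE_LINUX_DEFAULT line of /etc/default/grub. The helper name add_grub_param is hypothetical, and sed -i assumes GNU sed. The function only edits the file it is given; the boot menu still has to be regenerated afterwards with update-grub.

```shell
#!/bin/sh
# Hypothetical helper: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT
# unless it is already there.
# Usage: add_grub_param PARAM [FILE]   (FILE defaults to /etc/default/grub)
# Assumes PARAM contains no '|' or '&' characters (they are special to sed here).
add_grub_param() {
    param=$1
    file=${2:-/etc/default/grub}

    # Already present on the default command line: nothing to do.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi

    # Insert the parameter just before the closing double quote.
    sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"|GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $param\"|" "$file"
}

# Intended use, as root, on a copy you have backed up first:
#   add_grub_param nvidia-current.NVreg_EnablePCIeGen3=1
#   update-grub
```

After the next reboot, cat /proc/cmdline should show the parameter without any editing at the grub prompt.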
Re: Fwd: upgrade to jessie from wheezy with cuda problems
On Sun, Nov 17, 2013 at 10:45:58AM +0100, Francesco Pietra wrote:
> I am attacking the problem from another side, directly from within the OS
> itself:
>
> #lspci -
>
> tells that the link speed (= link status) "LnkSta" is at 5Gb/s, no matter
> whether the system is at number crunching or not. I.e., my system is at
> PCIe 2.0. This might explain why upgrading from sandy bridge to ivy bridge
> gave no speed gain of molecular dynamics. PCIe 3.0 was not achieved.
>
> As far as I could investigate, nvidia suggests to either:
> (1) Modify /etc/modprobe.d/local.conf (which does not exist on jessie) or
> create a new
>
> /etc/modprobe.d/nvidia.conf, adding to that
>
> 1. options nvidia NVreg_EnablePCIeGen3=1

Might need nvidia-current instead of nvidia.

> Actually, on my jessie, nvidia.conf reads
>
> alias nvidia nvidia-current
> remove nvidia-current rmmod nvidia
>
>
> Some guys found that useless, even when both grub-efi and initramfs are
> edited accordingly, so that nvidia offered a different move, updating the
> kernel boot string, by appending this:
>
> 1. options nvidia NVreg_EnablePCIeGen3=1
> ***

That is NOT the syntax for a kernel command line. It is the syntax for
the modprobe config.

Something like nvidia.NVreg_EnablePCIeGen3=1 or
nvidia-current.NVreg_EnablePCIeGen3=1 (depending on the name of the
module as far as the module is concerned).

> I did nothing, as I hope that the best adaptation to jessie may be
> suggested by those who know the OS better than me.
> The kind of information about links includes:
>
> LnkSta: the actual speed
>
> LnkCap: the capacity of the specific port, as far as I can understand.
>
> LnkCtl: ??
>
>
> One could also run
>
> #lspci -vt
>
> to determine the bus where the GPU card is located, then running
>
> # lspci -vv -s ##
>
> where "##" is the location.
> **
>
> So, it is a tricky matter, but perhaps not so much when one knows where to
> put the hands. At any event, being unable to go to 8GT/s, as from PCIe 3.0,
> means losing time and energy (=money and pollution), at least when the
> GPUs are used for long number crunching.

Well, it means slower transfers of data to and from the card. If the
data set fits in the card entirely during a long number crunch, then
bandwidth would not matter much at all. So it depends on the size of the
data set and how often data has to be moved in and out of the card.

> I'll continue investigating. The above seems to be promising. Hope to get
> help.

--
Len Sorensen

--
To UNSUBSCRIBE, email to debian-amd64-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20131118170238.gm20...@csclub.uwaterloo.ca
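The distinction drawn in the reply above (modprobe "options" syntax versus kernel command-line syntax) can be sketched as a small conversion. The function name modprobe_to_cmdline is made up for illustration; it only rewrites text and touches no system files.

```shell
#!/bin/sh
# Hypothetical helper: turn one modprobe.d "options" line into the
# equivalent kernel command-line tokens (MODULE.PARAM=VALUE).
modprobe_to_cmdline() {
    # Split the single argument on whitespace into positional parameters.
    set -- $1
    # Expect: options MODULE PARAM=VALUE [PARAM=VALUE ...]
    [ "$#" -ge 3 ] || return 1
    [ "$1" = "options" ] || return 1
    module=$2
    shift 2
    out=
    for opt in "$@"; do
        out="${out:+$out }$module.$opt"
    done
    printf '%s\n' "$out"
}

# Example:
#   modprobe_to_cmdline 'options nvidia-current NVreg_EnablePCIeGen3=1'
# prints
#   nvidia-current.NVreg_EnablePCIeGen3=1
```

Which module name to use (nvidia or nvidia-current) is exactly the point left open in the message, so the sketch takes it from the input line rather than guessing.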
Re: Fwd: upgrade to jessie from wheezy with cuda problems
On 18-11-2013 13:13, Francesco Pietra wrote:
> It is getting hard, unless I am mistaken about what was suggested by nvidia.
>
> Thus, following what was suggested by nvidia as a no-barrier solution,
>
> https://devtalk.nvidia.com/default/topic/545186/enabling-pcie-3-0-with-nvreg_enablepciegen3-on-titan/?offset=10#4021328
>
> I updated the kernel boot string by
>
> 1) typing 'e' at grub prompt,
> 2) adding the option to the linux line,
> 3) Ctrl-x to boot
>
> If that procedure is correct (probably it is, as
>
> francesco@gig64:~$ cat /proc/cmdline
> BOOT_IMAGE=/boot/vmlinuz-3.10-3-amd64 root=/dev/mapper/vg1-root ro 1. nvidia.NVreg_EnablePCIeGen3=1 quiet
> francesco@gig64:~$
>
> ) then, no luck. Both LnkCap and LnkSta were at 5GT/s, as for PCIe 2.0.
> Molecular dynamics, accordingly, was not accelerated.
>
> I wonder whether "1." preceding "nvidia..." is what is needed for a grub
> bootloader option. I did not find any other instance of that nvidia
> suggestion on the internet.

I may be wrong, but it seems that there is a hardware bump in your
pci-express 3.0 road, Francesco:

http://www.anandtech.com/show/7521/nvidia-launches-tesla-k40

Close to the end it says that nvidia "... is finally enabling full
pci-express 3.0 speeds ...", so it may be that your card suffers from
this issue too.

Hope it helps.

Archive: http://lists.debian.org/528a3dcd.1090...@gmail.com
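To check what actually reached the kernel, a small sketch that looks for a parameter as a whole token on the command line. The name has_cmdline_param is hypothetical. Run against the /proc/cmdline quoted above, it shows that "1." arrived as a separate token of its own; reading it as a list number carried over from the forum post is an interpretation, not something the thread confirms.

```shell
#!/bin/sh
# Hypothetical helper: succeed if PARAM appears as a whole token in a
# kernel command line.
# Usage: has_cmdline_param PARAM [CMDLINE]
# CMDLINE defaults to the contents of /proc/cmdline.
has_cmdline_param() {
    want=$1
    line=${2-$(cat /proc/cmdline)}
    # Pad with spaces so only complete, space-delimited tokens match.
    case " $line " in
        *" $want "*) return 0 ;;
    esac
    return 1
}

# Example with the command line from the message above:
#   cmd='BOOT_IMAGE=/boot/vmlinuz-3.10-3-amd64 root=/dev/mapper/vg1-root ro 1. nvidia.NVreg_EnablePCIeGen3=1 quiet'
#   has_cmdline_param nvidia.NVreg_EnablePCIeGen3=1 "$cmd" && echo "option present"
#   has_cmdline_param 1. "$cmd" && echo "stray token present"
```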
Re: Fwd: upgrade to jessie from wheezy with cuda problems
I am attacking the problem from another side, directly from within the OS itself:

#lspci -

tells that the link speed (= link status) "LnkSta" is at 5Gb/s, no matter whether the system is at number crunching or not. I.e., my system is at PCIe 2.0. This might explain why upgrading from sandy bridge to ivy bridge gave no speed gain of molecular dynamics. PCIe 3.0 was not achieved.

As far as I could investigate, nvidia suggests to either:

(1) Modify /etc/modprobe.d/local.conf (which does not exist on jessie) or create a new

/etc/modprobe.d/nvidia.conf, adding to that

1. options nvidia NVreg_EnablePCIeGen3=1

Actually, on my jessie, nvidia.conf reads

alias nvidia nvidia-current
remove nvidia-current rmmod nvidia

Some guys found that useless, even when both grub-efi and initramfs are edited accordingly, so that nvidia offered a different move, updating the kernel boot string, by appending this:

1. options nvidia NVreg_EnablePCIeGen3=1

***

I did nothing, as I hope that the best adaptation to jessie may be suggested by those who know the OS better than me.

The kind of information about links includes:

LnkSta: the actual speed

LnkCap: the capacity of the specific port, as far as I can understand.

LnkCtl: ??

One could also run

#lspci -vt

to determine the bus where the GPU card is located, then running

# lspci -vv -s ##

where "##" is the location.

**

So, it is a tricky matter, but perhaps not so much when one knows where to put the hands. At any event, being unable to go to 8GT/s, as from PCIe 3.0, means losing time and energy (=money and pollution), at least when the GPUs are used for long number crunching.

I'll continue investigating. The above seems to be promising. Hope to get help.

francesco pietra

PS With my jessie /etc/modprobe.d includes the following files:

alsa-base.conf
alsa-case-blacklist.conf
dkms.conf (which has no active statements)
fbdev-blacklist.conf
i915-kms.conf
nvidia.conf
nvidia-blacklist-nouveau.conf
radeon-kms.conf

**

On Thu, Nov 14, 2013 at 3:33 AM, Lennart Sorensen <lsore...@csclub.uwaterloo.ca> wrote:
> On Wed, Nov 13, 2013 at 05:43:47PM -0500, Lennart Sorensen wrote:
> > On Wed, Nov 13, 2013 at 10:53:30PM +0100, Francesco Pietra wrote:
> > > francesco@gig64:~/tmp$ file ./CUDA-Z-0.7.189.run
> > > ./CUDA-Z-0.7.189.run: data
> > > francesco@gig64:~/tmp$
> >
> > OK that's weird. I expected to see x86 32 or 64bit binary.
> >
> > Seems to be a shell script with compressed code in it. Yuck. :)
>
> OK I got it running. It is a 32bit binary.
>
> I had to install these:
>
> ii  libcuda1:i386           331.20-1        i386  NVIDIA CUDA runtime library
> ii  libcudart5.0:i386       5.0.35-8        i386  NVIDIA CUDA runtime library
> ii  libgl1-nvidia-glx:i386  331.20-1        i386  NVIDIA binary OpenGL libraries
> ii  libstdc++6:i386         4.8.2-4         i386  GNU Standard C++ Library v3
> ii  libxrender1:i386        1:0.9.8-1       i386  X Rendering Extension client library
> ii  zlib1g:i386             1:1.2.8.dfsg-1  i386  compression library - runtime
>
> Then I was able to run it. No messing with LD_LIBRARY_PATH or anything.
>
> To install :i386 packages you first have to enable multiarch support
> with dpkg and run apt-get update. So something like:
>
> dpkg --add-architecture i386
> apt-get update
> apt-get install libcuda1:i386 libcudart5.0:i386 libgl1-nvidia-glx:i386 libstdc++6:i386 libxrender1:i386 zlib1g:i386
>
> Don't worry about the exact versions, since I am running
> unstable+experimental. You don't need to do that to get it working.
>
> For your 64bit code you probably need libcuda1 libcudart5.0 and such
> installed in the 64bit version.
>
> --
> Len Sorensen
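The link fields listed in the message above can be pulled out of lspci -vv output with a short filter. This is a sketch, not a tested tool for every lspci version: the function name link_speeds is hypothetical, and it assumes the usual "LnkCap: ... Speed 8GT/s ..." and "LnkSta: Speed 5GT/s ..." line layout. For the field left as "??": LnkCap is the maximum the port supports, LnkSta is what was actually negotiated, and LnkCtl shows the link control settings.

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -vv` text on stdin and print one line
# per link field, e.g. "LnkCap 8GT/s" and "LnkSta 5GT/s".
link_speeds() {
    sed -nE 's/^[[:space:]]*(LnkCap|LnkSta):.*Speed ([0-9.]+GT\/s).*/\1 \2/p'
}

# Intended use (as root, so the capability list is readable), with the
# bus address found via `lspci -vt`:
#   lspci -vv -s 02:00.0 | link_speeds
```

If LnkCap reports 8GT/s while LnkSta stays at 5GT/s, the slot and card are capable of PCIe 3.0 but the link was negotiated at PCIe 2.0 speed.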
Re: Fwd: upgrade to jessie from wheezy with cuda problems
On Wed, Nov 13, 2013 at 05:43:47PM -0500, Lennart Sorensen wrote:
> On Wed, Nov 13, 2013 at 10:53:30PM +0100, Francesco Pietra wrote:
> > francesco@gig64:~/tmp$ file ./CUDA-Z-0.7.189.run
> > ./CUDA-Z-0.7.189.run: data
> > francesco@gig64:~/tmp$
>
> OK that's weird. I expected to see x86 32 or 64bit binary.
>
> Seems to be a shell script with compressed code in it. Yuck. :)

OK I got it running. It is a 32bit binary.

I had to install these:

ii  libcuda1:i386           331.20-1        i386  NVIDIA CUDA runtime library
ii  libcudart5.0:i386       5.0.35-8        i386  NVIDIA CUDA runtime library
ii  libgl1-nvidia-glx:i386  331.20-1        i386  NVIDIA binary OpenGL libraries
ii  libstdc++6:i386         4.8.2-4         i386  GNU Standard C++ Library v3
ii  libxrender1:i386        1:0.9.8-1       i386  X Rendering Extension client library
ii  zlib1g:i386             1:1.2.8.dfsg-1  i386  compression library - runtime

Then I was able to run it. No messing with LD_LIBRARY_PATH or anything.

To install :i386 packages you first have to enable multiarch support
with dpkg and run apt-get update. So something like:

dpkg --add-architecture i386
apt-get update
apt-get install libcuda1:i386 libcudart5.0:i386 libgl1-nvidia-glx:i386 libstdc++6:i386 libxrender1:i386 zlib1g:i386

Don't worry about the exact versions, since I am running
unstable+experimental. You don't need to do that to get it working.

For your 64bit code you probably need libcuda1 libcudart5.0 and such
installed in the 64bit version.

--
Len Sorensen

Archive: http://lists.debian.org/20131114023334.gk20...@csclub.uwaterloo.ca
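Before running the apt-get install line above, it can help to see which of the :i386 packages are still missing. A minimal sketch: the function name missing_packages is hypothetical, and it expects the installed list on stdin as "name:arch" lines, which dpkg-query can produce with the format string shown in the usage comment.

```shell
#!/bin/sh
# Hypothetical helper: print every wanted "name:arch" entry that does not
# appear in the installed-package list read from stdin.
missing_packages() {
    installed=$(cat)
    for want in "$@"; do
        # -x: whole-line match, -F: fixed string (no regex surprises
        # from "+" or "." in package names).
        printf '%s\n' "$installed" | grep -qxF -- "$want" || printf '%s\n' "$want"
    done
}

# Intended use on the Debian machine:
#   dpkg-query -W -f='${Package}:${Architecture}\n' |
#       missing_packages libcuda1:i386 libcudart5.0:i386 libgl1-nvidia-glx:i386 \
#                        libstdc++6:i386 libxrender1:i386 zlib1g:i386
```

An empty result means all six are already installed; anything printed still needs apt-get install after multiarch has been enabled.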
Re: Fwd: upgrade to jessie from wheezy with cuda problems
On Wed, Nov 13, 2013 at 10:53:30PM +0100, Francesco Pietra wrote:
> francesco@gig64:~/tmp$ file ./CUDA-Z-0.7.189.run
> ./CUDA-Z-0.7.189.run: data
> francesco@gig64:~/tmp$

OK that's weird. I expected to see x86 32 or 64bit binary.

Seems to be a shell script with compressed code in it. Yuck. :)

--
Len Sorensen

Archive: http://lists.debian.org/20131113224347.gj20...@csclub.uwaterloo.ca
Re: Fwd: upgrade to jessie from wheezy with cuda problems
> What does 'file ./CUDA-Z-0.7.189.run' say?

francesco@gig64:~/tmp$ file ./CUDA-Z-0.7.189.run
./CUDA-Z-0.7.189.run: data
francesco@gig64:~/tmp$

On Wed, Nov 13, 2013 at 7:52 PM, Lennart Sorensen <lsore...@csclub.uwaterloo.ca> wrote:
> On Wed, Nov 13, 2013 at 07:40:10PM +0100, Francesco Pietra wrote:
> > francesco@gig64:~/tmp$ export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/
>
> That is unnecessary. That is already in the library path. The local
> directory is not. Windows implicitly looks in the current directory
> for files, linux (and almost all other systems) does not.
>
> hence: export LD_LIBRARY_PATH=. (. for current directory), or
> LD_LIBRARY_PATH=/tmp if that is where you put the library you were trying.
>
> > francesco@gig64:~/tmp$ ./CUDA-Z-0.7.189.run
> > CUDA-Z 0.7.189 Container
> > Starting CUDA-Z...
> > /home/francesco/tmp/CUDA-Z-95b0-7943-3edd-827e/cuda-z: error while loading
> > shared libraries: libXrender.so.1: wrong ELF class: ELFCLASS64
> > francesco@gig64:~/tmp$
>
> What does 'file ./CUDA-Z-0.7.189.run' say?
>
> --
> Len Sorensen
Re: Fwd: upgrade to jessie from wheezy with cuda problems
francesco@gig64:~$ file /home/francesco/tmp/CUDA-Z-95b0-7943-3edd-827e/cuda-z
/home/francesco/tmp/CUDA-Z-95b0-7943-3edd-827e/cuda-z: ERROR: cannot open `/home/francesco/tmp/CUDA-Z-95b0-7943-3edd-827e/cuda-z' (No such file or directory)
francesco@gig64:~$

On Wed, Nov 13, 2013 at 8:18 PM, Fabricio Cannini wrote:
> On 13-11-2013 16:40, Francesco Pietra wrote:
>
> > francesco@gig64:~/tmp$ export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/
> > francesco@gig64:~/tmp$ ./CUDA-Z-0.7.189.run
> > CUDA-Z 0.7.189 Container
> > Starting CUDA-Z...
> > /home/francesco/tmp/CUDA-Z-95b0-7943-3edd-827e/cuda-z: error while
> > loading shared libraries: libXrender.so.1: wrong ELF class: ELFCLASS64
> > francesco@gig64:~/tmp$
>
> Hi Francesco.
>
> Is "CUDA-Z" a 32-bit binary? What is the output of this command:
>
> $ file /home/francesco/tmp/CUDA-Z-95b0-7943-3edd-827e/cuda-z
Re: Fwd: upgrade to jessie from wheezy with cuda problems
On 13-11-2013 16:40, Francesco Pietra wrote:
> francesco@gig64:~/tmp$ export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/
> francesco@gig64:~/tmp$ ./CUDA-Z-0.7.189.run
> CUDA-Z 0.7.189 Container
> Starting CUDA-Z...
> /home/francesco/tmp/CUDA-Z-95b0-7943-3edd-827e/cuda-z: error while
> loading shared libraries: libXrender.so.1: wrong ELF class: ELFCLASS64
> francesco@gig64:~/tmp$

Hi Francesco.

Is "CUDA-Z" a 32-bit binary? What is the output of this command:

$ file /home/francesco/tmp/CUDA-Z-95b0-7943-3edd-827e/cuda-z

Archive: http://lists.debian.org/5283d078.6010...@gmail.com
Re: Fwd: upgrade to jessie from wheezy with cuda problems
On Wed, Nov 13, 2013 at 07:40:10PM +0100, Francesco Pietra wrote:
> francesco@gig64:~/tmp$ export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/

That is unnecessary. That is already in the library path. The local
directory is not. Windows implicitly looks in the current directory
for files, linux (and almost all other systems) does not.

hence: export LD_LIBRARY_PATH=. (. for current directory), or
LD_LIBRARY_PATH=/tmp if that is where you put the library you were trying.

> francesco@gig64:~/tmp$ ./CUDA-Z-0.7.189.run
> CUDA-Z 0.7.189 Container
> Starting CUDA-Z...
> /home/francesco/tmp/CUDA-Z-95b0-7943-3edd-827e/cuda-z: error while loading
> shared libraries: libXrender.so.1: wrong ELF class: ELFCLASS64
> francesco@gig64:~/tmp$

What does 'file ./CUDA-Z-0.7.189.run' say?

--
Len Sorensen

Archive: http://lists.debian.org/20131113185253.gi20...@csclub.uwaterloo.ca
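The "wrong ELF class: ELFCLASS64" error above means the loader found a 64-bit libXrender.so.1 while starting a 32-bit program. When file is not conclusive (as with the .run container), the class of any extracted binary or library can be read straight from the ELF header: the fifth byte (EI_CLASS) is 1 for 32-bit and 2 for 64-bit. A minimal sketch, with elf_class as a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: print ELFCLASS32 or ELFCLASS64 for an ELF file.
# Returns non-zero for anything that is not a recognisable ELF file.
elf_class() {
    # First four bytes must be the ELF magic: 7f 45 4c 46 ("\177ELF").
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not an ELF file"
        return 1
    fi
    # Byte at offset 4 is EI_CLASS.
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case $class in
        1) echo ELFCLASS32 ;;
        2) echo ELFCLASS64 ;;
        *) echo "unknown ELF class"; return 1 ;;
    esac
}

# Intended use:
#   elf_class /usr/lib/x86_64-linux-gnu/libXrender.so.1
#   elf_class path/to/extracted/cuda-z
```

A program and every library it loads must report the same class, which is why the i386 versions of the libraries were needed here rather than a change to LD_LIBRARY_PATH.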