Re: [openib-general] mstflint error on ppc64 bug fix
Michael,

The attached error file shows big-endian data displayed in the wrong (little-endian) byte order. The attached file mstflint2.patch fixes this problem.

mstflint2.patch has to be applied after the mstflint.patch that is packed in the OFED-1.1 openib.tgz (user_patches...).

The patch changes only the displayed data and not the internal structures used by the program: I found that the data write is performed o.k., and I did not want to cause errors in the writing process.

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: Moshe Kazir
Sent: Monday, October 30, 2006 4:00 PM
To: 'Michael S. Tsirkin'
Cc: openib-general@openib.org; [EMAIL PROTECTED]
Subject: mstflint error on ppc64

Hi Michael,

The output of mstflint is changed on ppc64 as a result of byte ordering issues.

If you take an HCA that was burned using x86_64 or Mellanox manufacturing and perform mstflint -d ... q on ppc64, you'll find that the values of PSID, VSD and Board Id were changed.

I tried to look at the code to find the error, but then I saw that vsd is defined twice in the code according to its usage (char[205] or unsigned int[52]).

Can you please look and help?

Best regards,
Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

Script started on Thu Nov 16 11:17:12 2006
js21-sles10:~ # mstflint -d 0c:00.0 -i /usr/voltaire/fw/BC-HSEC-128-SDR-Rev1-25208-4_7_927.img -vsd 1 01234567 -vsd2 0123456789012345 -nofs burn

Current FW version on flash: N/A
New FW version: N/A

Burn image with the following GUIDs:
Node: 0008f1040398047c
Port1: 0008f1040398047d
Port2: 0008f1040398047e
Sys.Image: 0008f1040398047f

You are about to replace current PSID in the image file - "0123456789012345" with a different PSID - "0TLV0073".
Note: It is highly recommended not to change the image PSID.
Do you want to continue ? (y/n) [n] : y

You are about to replace current PSID on flash - "3210765410985432" with a different PSID - "0123456789012345".
Note: It is highly recommended not to change the PSID.
Do you want to continue ? (y/n) [n] : y

Burn process will not be failsafe. No checks are performed.
ALL flash, including Invariant Sector will be overwritten.
If this process fails computer may remain in inoperable state.
Do you want to continue ? (y/n) [n] : y

000%000%001%002%003%004%005%006%007%008%009%010%011%012%013%014%015%016%017%018%019%020%021%022%023%024%025%026%027%028%029%030%031%032%033%034%035%036%037%038%039%040%041%042%043%044%045%046%047%048%049%050%051%052%053%054%055%056%057%058%059%060%061%062%063%064%065%066%067%068%069%070%071%072%073%074%075%076%077%078%079%080%081%082%083%084%085%086%087%088%089%090%091%092%093%094%095%096%097%098%099%100%100%

js21-sles10:~ # mstflint -d 0c:00.0 q
Image type: Failsafe
I.S. Version: 1
Chip Revision: A0
GUID Des: Node Port1 Port2 Sys image
GUIDs: 0008f1040398047c 0008f1040398047d 0008f1040398047e 0008f1040398047f
Board ID: 32107654 (3210765410985432)
VSD: 32107654
PSID: 3210765410985432
js21-sles10:~ #
js21-sles10:~ # exit
Script done on Thu Nov 16 11:28:46 2006

mstflint2.patch
Description: mstflint2.patch

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
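The burn transcript shows the PSID "0123456789012345" coming back from the query as "3210765410985432", which is what results when the bytes inside each 4-byte group are reversed. The sketch below only reproduces that relationship; it is not the contents of mstflint2.patch (the attachment is not part of this archive), and the helper name swap_bytes_in_words is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper (not taken from mstflint): reverse the bytes inside
 * each complete 32-bit word of a buffer. Applying it twice restores the
 * original contents. A trailing partial word (len % 4 bytes) is untouched. */
void swap_bytes_in_words(char *buf, size_t len)
{
    size_t i;

    for (i = 0; i + 4 <= len; i += 4) {
        char b0 = buf[i];
        char b1 = buf[i + 1];

        buf[i]     = buf[i + 3];
        buf[i + 1] = buf[i + 2];
        buf[i + 2] = b1;
        buf[i + 3] = b0;
    }
}
```

Running the helper over the 16 characters of the burned PSID gives the string that the ppc64 query printed, and running it again gives back the burned value, which fits the report that only the display is affected.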
[openib-general] mstflint error on ppc64
Hi Michael,

The output of mstflint is changed on ppc64 as a result of byte ordering issues.

If you take an HCA that was burned using x86_64 or Mellanox manufacturing and perform mstflint -d ... q on ppc64, you'll find that the values of PSID, VSD and Board Id were changed.

I tried to look at the code to find the error, but then I saw that vsd is defined twice in the code according to its usage (char[205] or unsigned int[52]).

Can you please look and help?

Best regards,
Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire – The Grid Backbone
www.voltaire.com
[openib-general] mpitests-2.0-0.src.rpm compile error on ppc64 sles10 js21
Has anyone seen this error?

Moshe

/usr/local/ofed/mpi/gcc/openmpi-1.1.1-1/bin/mpicc -I/usr/local/ofed/mpi/gcc/openmpi-1.1.1-1/include -DMPI1 -O -g -c IMB_cpu_exploit.c
/usr/local/ofed/mpi/gcc/openmpi-1.1.1-1/bin/mpicc -o IMB-MPI1 IMB.o IMB_declare.o IMB_init.o IMB_mem_manager.o IMB_parse_name_mpi1.o IMB_benchlist.o IMB_strgs.o IMB_err_handler.o IMB_g_info.o IMB_warm_up.o IMB_output.o IMB_pingpong.o IMB_pingping.o IMB_allreduce.o IMB_reduce_scatter.o IMB_reduce.o IMB_exchange.o IMB_bcast.o IMB_barrier.o IMB_allgather.o IMB_allgatherv.o IMB_alltoall.o IMB_sendrecv.o IMB_init_transfer.o IMB_chk_diff.o IMB_cpu_exploit.o -L/usr/local/ofed/mpi/gcc/openmpi-1.1.1-1/lib/shared -L/usr/local/ofed/mpi/gcc/openmpi-1.1.1-1/lib -L/var/tmp/OFED//usr/local/lib64 -L/var/tmp/OFED//usr/local/lib
/usr/bin/ld: skipping incompatible /usr/local/ofed/mpi/gcc/openmpi-1.1.1-1/lib64/libmpi.so when searching for -lmpi
/usr/bin/ld: cannot find -lmpi
collect2: ld returned 1 exit status
make[2]: *** [MPI1] Error 1
make[2]: Leaving directory `/var/tmp/OFEDRPM/BUILD/mpitests-2.0/IMB-2.3/src'
make[1]: *** [IMB-MPI1] Error 2
make[1]: Leaving directory `/var/tmp/OFEDRPM/BUILD/mpitests-2.0/IMB-2.3/src'
make: *** [pmb] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.81774 (%install)

RPM build errors:
    user pasha does not exist - using root
    user pasha does not exist - using root
    Bad exit status from /var/tmp/rpm-tmp.81774 (%install)
ERROR: Failed executing "rpmbuild --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_name mpitests_openmpi_gcc' --define 'path_to_mpihome /usr/local/ofed/mpi/gcc/openmpi-1.1.1-1' --define 'root_path /var/tmp/OFED' /tmp/GridStack-4.1.1_OFED_1.1_rc6_js21/OFED-1.1-rc6/SRPMS/mpitests-2.0-0.src.rpm"

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire – The Grid Backbone
www.voltaire.com
Re: [openib-general] FW: FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD
> So there's a work around.
>
> Could we go deeper into the driver loaded/unloaded issue though? It looks like another kernel bug and it'd be nice to fix it. Do you know the root cause? If not,
> could you pls describe the symptoms and on what systems they occur?

I'll try to understand the "mstflint not working when driver is not loaded" problem reported by Or and see how to go on.

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 05, 2006 9:06 AM
To: Moshe Kazir
Cc: Tseng-hui Lin; [EMAIL PROTECTED]; openib-general@openib.org
Subject: Re: FW: FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD

Quoting r. Moshe Kazir <[EMAIL PROTECTED]>:
> Subject: Re: FW: FW: Mstflint - not working on ppc64 and when driver is
> not loaded on AMD
>
> Michael,
>
> In case you missed Frank's signature,
>
> Look at the attached message

Got that, no problem, I think it's fine for RC7.

So there's a work around.

Could we go deeper into the driver loaded/unloaded issue though? It looks like another kernel bug and it'd be nice to fix it. Do you know the root cause? If not, could you pls describe the symptoms and on what systems they occur?

--
MST
Re: [openib-general] FW: FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD
Michael,

In case you missed Frank's signature, look at the attached message.

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Wednesday, October 04, 2006 10:44 PM
To: Moshe Kazir
Cc: Tseng-hui Lin; [EMAIL PROTECTED]; openib-general@openib.org
Subject: Re: FW: [openib-general] FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD

Quoting r. Michael S. Tsirkin <[EMAIL PROTECTED]>:
> Subject: Re: FW: [openib-general] FW: Mstflint - not working on ppc64
> and when driver is not loaded on AMD
>
> Quoting r. Moshe Kazir <[EMAIL PROTECTED]>:
> > Subject: FW: [openib-general] FW: Mstflint - not working on ppc64
> > and when driver is not loaded on AMD
> >
> > Michael,
> >
> > I received the attached files from Frank. They look small, easy to
> > understand, and change almost nothing in the code.
> >
> > The patch solves the ppc64 problems.
> >
> > Please approve the patch and integrate it into OFED-1.1-rc7.
> >
> > I tested it. It's working o.k. on JS21 ppc64 sles 10, JS21
> > ppc64 sles9, redhat as4 u3 x86_64, redhat as4 u3 i386.
> > Frank also tested it on AMD and JS21 PPC and MAC PPC64.
> >
> > Best regards,
> >
> > Moshe
>
> OK, not sure what's in a tarball, but the patch looks small and safe
> enough to go in. But, we need the Signed-off-by line from the patch
> author, certifying to the Developer's Certificate of Origin 1.1:

Please note RC7 is closing tomorrow, so we need to get the signature stuff out of the way by then if the patch is to make it into OFED 1.1.

--
MST

--- Begin Message ---
mmap() does not work on ppc64. The 64-bit machines with 32-bit I/O need ioremap in the device driver to allow mmap access to the I/O memory. This patch checks for the above situation and tries to use PCI config to do the firmware update when mmap() fails.

Signed-off-by: Tseng-Hui (Frank) Lin <[EMAIL PROTECTED]>
===
diff -uPr mstflint.ofed-1.1r6/mtcr.h mstflint/mtcr.h
--- mstflint.ofed-1.1r6/mtcr.h 2006-09-17 10:46:21.0 -0500
+++ mstflint/mtcr.h 2006-10-03 10:29:38.0 -0500
@@ -294,6 +294,9 @@
     int err;
     char buf[]=":00:00.0";
     char path[]="/sys/bus/pci/devices/:00:00.0/resource0";
+    unsigned domain, bus, dev, func;
+    struct stat dummybuf;
+    char file_name[]="/proc/bus/pci/:00/00.0";
 
     mf=(mfile*)malloc(sizeof(mfile));
     if (!mf) return 0;
@@ -338,13 +341,14 @@
         mf->ptr = mmap(NULL, 0x10, PROT_READ | PROT_WRITE,
                        MAP_SHARED, mf->fd, 0);
 
-        if ( (! mf->ptr) || (mf->ptr == MAP_FAILED) ) goto map_failed;
+        if ( (! mf->ptr) || (mf->ptr == MAP_FAILED) ||
+             (__be32_to_cpu(*((u_int32_t *) ((char *) mf->ptr + 0xF0014))) == 0x) )
+            goto map_failed_try_pciconf;
     }
 #endif
     else
     {
 #if CONFIG_ENABLE_MMAP
-        unsigned bus, dev, func;
         if (mfind(name,&offset,&bus,&dev,&func)) goto find_failed;
 
 #if CONFIG_USE_DEV_MEM
@@ -352,8 +356,6 @@
         if (mf->fd<0) goto open_failed;
 #else
         {
-            struct stat dummybuf;
-            char file_name[]="/proc/bus/pci/:00/00.0";
             sprintf(file_name,"/proc/bus/pci/%2.2x/%2.2x.%1.1x",
                     bus,dev,func);
             if (stat(file_name,&dummybuf))
@@ -369,7 +371,9 @@
         mf->ptr = mmap(NULL, 0x10, PROT_READ | PROT_WRITE,
                        MAP_SHARED, mf->fd, offset);
 
-        if ( (! mf->ptr) || (mf->ptr == MAP_FAILED) ) goto map_failed;
+        if ( (! mf->ptr) || (mf->ptr == MAP_FAILED) ||
+             (__be32_to_cpu(*((u_int32_t *) ((char *) mf->ptr + 0xF0014))) == 0x) )
+            goto map_failed_try_pciconf;
 
 #else
         goto open_failed;
@@ -379,6 +383,20 @@
 
 #if CONFIG_ENABLE_MMAP
 
+map_failed_try_pciconf:
+#if CONFIG_ENABLE_PCICONF
+    mf->ptr = NULL;
+    close(mf->fd);
+    if (sscanf(name, "%x:%x:%x.%x", &domain, &bus, &dev, &func) != 4) {
+        domain = 0;
+        if (sscanf(name, "%x:%x.%x", &bus, &dev, &func) != 3) goto map_failed;
+    }
+    snprintf(file_name, sizeof file_name, "/proc/bus/pci/%2.2x/%2.2x.%1.1x", bus, dev, func);
+    if (stat(file_name,&dummybuf))
+        snprintf(file_name, sizeof file_name, "/proc/bus/pci/%4.4x:%2.2x/%2.2x.%1.1x", domain, bus,dev,func);
+    if ((mf->fd = open(file_name, O_RDWR | O_SYNC)) >= 0) return mf;
+#endif
+
 map_failed:
 #if !CONFIG_USE_DEV_MEM
 ioctl_failed:
___
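The fallback path in the patch turns a device name such as 0c:00.0 into a /proc/bus/pci file name, trying the domain-qualified form first and the short form second. Pulled out of the surrounding goto logic, that step can be sketched on its own as below. The struct and function names (pci_addr, parse_pci_name, format_proc_path) are hypothetical and do not appear in mtcr.h; only the sscanf and snprintf format strings are taken from the patch.

```c
#include <stdio.h>

/* Hypothetical container for a parsed PCI address (not part of mtcr.h). */
struct pci_addr {
    unsigned domain, bus, dev, func;
};

/* Parse "domain:bus:dev.func" or "bus:dev.func" (all hexadecimal).
 * Returns 0 on success, -1 if the name matches neither form. */
int parse_pci_name(const char *name, struct pci_addr *a)
{
    if (sscanf(name, "%x:%x:%x.%x", &a->domain, &a->bus, &a->dev, &a->func) == 4)
        return 0;

    /* No domain given: assume domain 0 and retry with the short form. */
    a->domain = 0;
    if (sscanf(name, "%x:%x.%x", &a->bus, &a->dev, &a->func) == 3)
        return 0;

    return -1;
}

/* Build the /proc/bus/pci file name. with_domain selects the
 * domain-qualified directory form. Returns 0 on success, -1 if the
 * result did not fit in the output buffer. */
int format_proc_path(char *out, size_t outlen, const struct pci_addr *a,
                     int with_domain)
{
    int n;

    if (with_domain)
        n = snprintf(out, outlen, "/proc/bus/pci/%4.4x:%2.2x/%2.2x.%1.1x",
                     a->domain, a->bus, a->dev, a->func);
    else
        n = snprintf(out, outlen, "/proc/bus/pci/%2.2x/%2.2x.%1.1x",
                     a->bus, a->dev, a->func);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

A caller following the patch would format the short path first, stat() it, and switch to the domain-qualified path only if the short one does not exist. A name like mthca0 is rejected by both sscanf patterns, so it never reaches this step.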
Re: [openib-general] FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD
Michael,

> Not sure what's miss match FWR, but if you can't boot
> the need to specify /proc/bus/pci/ is going to be the least of your problems.

Are you sure that we will never face a situation where the mthca driver is not loaded and we need to burn new HCA FWR? What do we expect the user to do in this case? Send the HCA to Mellanox?

> Please understand, I'd like people to start using mthca0 for device name rather than defaulting to lspci as the first resort.

I think that having "mstflint -d mthca0 ..." is really good and user friendly.

BUT please notice that plenty of the mstflint uses are done when a customer buys / installs new IB lab equipment. New users who do not know IB yet do know lspci. With lspci it is easy to find the info, and it is very convenient for script writing.

So, can you explain what's wrong with "mstflint -d .." and why you don't want users to use it?

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 03, 2006 9:45 AM
To: Moshe Kazir
Cc: Tseng-Hui (Frank) Lin; [EMAIL PROTECTED]; openib-general@openib.org
Subject: Re: FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD

Quoting r. Moshe Kazir <[EMAIL PROTECTED]>:
> A work-around that enable the use of mstflint only when the driver is
> loaded is not sufficient

I think I somewhat understand the mmap related kernel bug thing (although I'd like to see this discussed on lkml), but I still don't understand where the "driver is loaded" thing comes from, and I'd like to.

> What you plan to do when you have system error ->
> - boot fail when IB started,
> - driver loading fail as result of driver error / miss match FWR, etc.

Not sure what's miss match FWR, but if you can't boot the need to specify /proc/bus/pci/ is going to be the least of your problems.

Please understand, I'd like people to start using mthca0 for device name rather than defaulting to lspci as the first resort.

--
MST
Re: [openib-general] FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD
Michael wrote:

> OK, so for OFED just mmap from /proc/bus/pci/ should be sufficient
> work-around - it will make things work when driver is loaded. Correct?

No!

A work-around that enables the use of mstflint only when the driver is loaded is not sufficient.

What do you plan to do when you have a system error ->
- boot fails when IB is started,
- driver loading fails as a result of driver error / miss match FWR, etc.

When the driver is not loaded / operating o.k., we must be able to check the HCA FWR version and reload the FWR if needed.

Having mstflint working only when the driver is loaded o.k. will not permit us to access the HCA in this case!!

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Sunday, October 01, 2006 9:51 AM
To: Tseng-Hui (Frank) Lin
Cc: Moshe Kazir; [EMAIL PROTECTED]; openib-general@openib.org
Subject: Re: FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD

Quoting r. Tseng-Hui (Frank) Lin <[EMAIL PROTECTED]>:
> Subject: RE: FW: Mstflint - not working on ppc64 and when driver is
> not loaded on AMD
>
> The ppc64 problem is actually in pci_64.c. Here is the patch:
>
> cut here =
> diff --git a/arch/powerpc/kernel/pci_64.c b/arch/powerpc/kernel/pci_64.c
> index 4c4449b..490403c 100644
> --- a/arch/powerpc/kernel/pci_64.c
> +++ b/arch/powerpc/kernel/pci_64.c
> @@ -734,9 +734,7 @@ static struct resource *__pci_mmap_make_
>      if (hose == 0)
>          return NULL;    /* should never happen */
>
> -    /* If memory, add on the PCI bridge address offset */
>      if (mmap_state == pci_mmap_mem) {
> -        *offset += hose->pci_mem_offset;
>          res_bit = IORESOURCE_MEM;
>      } else {
>          io_offset = (unsigned long)hose->io_base_virt - pci_io_base;
> = end cut =
>
> The mmap() system call on resource0 does not work on ppc64 without
> this patch. PowerMAC G5 got away with this because its
> hose->pci_mem_offset was set to 0.
>
> The fix was made on 8/21. It may be able to make it into 2.6.19. But it
> certainly won't get into SLES10, SLES9-SP3, or RHEL4-U4, which have
> already been released.
>
> To cover both cases, with and without the fix, my patch tries to mmap
> /sys/bus/pci//resource0 first. If that fails, it tries to mmap
> /proc/bus/pci/. If that fails again, we have no choice but to fall back
> to using PCI config space.

OK, so for OFED just mmap from /proc/bus/pci/ should be sufficient work-around - it will make things work when driver is loaded. Correct?

--
MST
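Frank's description is an ordered list of access methods, fastest first, where the first one that works is used: mmap of the sysfs resource0 file, then mmap through /proc/bus/pci/, then PCI config space. The selection logic alone can be sketched as below. All names here (access_method, pick_access_method, the stub functions) are hypothetical; the real open and mmap attempts are replaced by stubs, so this shows only the ordering and is not the code of the patch.

```c
#include <stddef.h>

/* Hypothetical descriptor for one way of reaching the device. try_open
 * returns 0 when the method works and non-zero when it does not. */
typedef int (*open_fn)(const char *dev);

struct access_method {
    const char *label;
    open_fn     try_open;
};

/* Walk the methods in the order given (fastest first) and return the
 * index of the first one that succeeds, or -1 if none does. */
int pick_access_method(const struct access_method *m, size_t n, const char *dev)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (m[i].try_open(dev) == 0)
            return (int)i;

    return -1;
}

/* Stubs standing in for the real open/mmap attempts (illustration only). */
int stub_fail(const char *dev) { (void)dev; return -1; }
int stub_ok(const char *dev)   { (void)dev; return 0; }
```

With a table of { sysfs resource0, /proc/bus/pci mmap, PCI config space }, an unpatched ppc64 kernel would correspond to the first entries failing and a later one being picked, while a fixed kernel would stop at the first entry.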
Re: [openib-general] FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD
Michael,

Frank found the cause of the problem in the implementation of arch/ppc/kernel/pci.c, and asked the IBM kernel group to send a bug fix to the Linux kernel group.

The problem is:
1. This bug fix will not enter SLES10, as it is closed.
2. It also will not enter SLES9 :-) or Redhat as4 u4.

So we need a bug fix that will enable the use of mstflint on js21 PPC64 + backport to old systems.

Frank's fix is based on two points (if I understand the code with no errors) -
1. It opens /proc/bus/pci... and not /sys/bus/pci/...
2. It performs an ioctl(fd, PCIIOC_MMAP_IS_MEM);

Frank - am I right?

Can we enter these two small changes into mstflint to have it working on the PPC64 js21?

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Thursday, September 28, 2006 4:41 PM
To: Moshe Kazir
Cc: Tseng-Hui (Frank) Lin; [EMAIL PROTECTED]; openib-general@openib.org
Subject: Re: FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD

Quoting r. Moshe Kazir <[EMAIL PROTECTED]>:
> > Quoting r. Moshe Kazir <[EMAIL PROTECTED]>:
> > Subject: RE: FW: Mstflint - not working on ppc64 and when driver is
> > not loaded on AMD
> >
> > # ls /sys/class/infiniband/mthca0/device/resource0
> > /sys/class/infiniband/mthca0/device/resource0
>
> OK, so can you try this please:
>
> strace -f -v -o log mstflint -d /sys/class/infiniband/mthca0/device/resource0 q
> cat log
>
> --
> MST

> 30463 open("/sys/class/infiniband/mthca0/device/resource0", O_RDWR|O_SYNC|O_LARGEFILE) = 3
> 30463 mmap2(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = -1 EINVAL (Invalid argument)

So we see that mmap is failing with EINVAL. But why? We seem to be passing all valid parameters to it.

I'm looking at arch/ppc/kernel/pci.c at the moment.
It seems that EINVAL is returned if __pci_mmap_make_offset fails, and that seems to be only looking for a valid resource size.

Are you up to finding the root cause of the problem in arch/ppc/kernel/pci.c?

Maybe the resource offsets are wrong? What does
cat /sys/class/infiniband/mthca0/device/resource
show?

Maybe there's some problem to map a full megabyte? Here's a test that only maps 4K. Could you strace it please?

>>>>>>>>>>>

#define _XOPEN_SOURCE 500
#define _FILE_OFFSET_BITS 64
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
/* #include <sys/ioctl.h>
 * #include <linux/pci.h> */

int main()
{
    int fd;
    unsigned value;
    volatile void *ptr;

    /* Open the PCI device file under /proc/bus/pci. */
    fd = open("/proc/bus/pci/00/00.0", O_RDWR | O_SYNC);
    /* ioctl(fd, PCIIOC_MMAP_IS_MEM); */

    /* Map a single 4K page instead of the full megabyte. */
    ptr = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0xf);

    /* Read one 32-bit word at offset 0x14 of the mapping and print it. */
    memcpy(&value, (char *)ptr + 0x14, sizeof value);
    printf("0x%x\n", value);
    return 0;
}

--
MST
Re: [openib-general] FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD
See attached files. Moshe Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m) Voltaire - The Grid Backbone www.voltaire.com -Original Message- From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED] Sent: Thursday, September 28, 2006 2:41 PM To: Moshe Kazir Cc: Tseng-Hui (Frank) Lin; [EMAIL PROTECTED]; openib-general@openib.org Subject: Re: FW: Mstflint - not working on ppc64 andwhendriver is not loaded on AMD Quoting r. Moshe Kazir <[EMAIL PROTECTED]>: > Subject: RE: FW: Mstflint - not working on ppc64 andwhendriver is not > loaded on AMD > > > # ls /sys/class/infiniband/mthca0/device/resource0 > /sys/class/infiniband/mthca0/device/resource0 OK, so can you try this please: strace -f -v -o log mstflint -d /sys/class/infiniband/mthca0/device/resource0 q cat log -- MST 30423 execve("./mstflint", ["./mstflint", "-d", "/sys/class/infiniband/mthca0/dev"..., "q"], ["LESSKEY=/etc/lesskey.bin", "NNTPSERVER=news", "INFODIR=/usr/local/info:/usr/sha"..., "MANPATH=/usr/share/man:/usr/loca"..., "HOSTNAME=js21-sles10", "GNOME2_PATH=/usr/local:/opt/gnom"..., "XKEYSYMDB=/usr/X11R6/lib/X11/XKe"..., "HOST=js21-sles10", "TERM=xterm", "SHELL=/bin/bash", "PROFILEREAD=true", "HISTSIZE=1000", "SSH_CLIENT=172.25.1.70 59870 22", "PERL5LIB=:/regtest/lib/perl", "MORE=-sl", "SSH_TTY=/dev/pts/0", "GROFF_NO_SGR=yes", "USER=root", "LS_COLORS=no=00:fi=00:di=01;34:l"..., "XNLSPATH=/usr/X11R6/lib/X11/nls", "ENV=/etc/bash.bashrc", "HOSTTYPE=ppc64", "FROM_HEADER=", "PAGER=less", "CSHEDIT=emacs", "XDG_CONFIG_DIRS=/usr/local/etc/x"..., "MINICOM=-c on", "MAIL=/var/mail/root", "PATH=/opt/vltmpi/OPENIB/mpi/bin:"..., "CPU=ppc64", "INPUTRC=/etc/inputrc", "PWD=/home/moshek/mstflint.270920"..., "LANG=POSIX", "PYTHONSTARTUP=/etc/pythonstart", "TEXINPUTS=:/root/.TeX:/usr/share"..., "QT_SYSTEM_DIR=/usr/share/desktop"..., "SHLVL=1", "HOME=/root", "LESS_ADVANCED_PREPROCESSOR=no", "OSTYPE=linux", "LS_OPTIONS=-A -N --color=tty -T "..., "XCURSOR_THEME=Industrial", "WINDOWMANAGER=/usr/X11R6/bin/gno"..., 
"GTK_PATH=/usr/local/lib/gtk-2.0:"..., "LESS=-M -I", "MACHTYPE=ppc64-suse-linux", "LOGNAME=root", "XDG_DATA_DIRS=/usr/local/share/:"..., "LC_CTYPE=en_US.UTF-8", "ACLOCAL_FLAGS=-I /opt/gnome/shar"..., "SSH_CONNECTION=172.25.1.70 59870"..., "PKG_CONFIG_PATH=/usr/local/lib/p"..., "LESSOPEN=lessopen.sh %s", "INFOPATH=/usr/local/info:/usr/sh"..., "DISPLAY=localhost:10.0", "XAUTHLOCALHOSTNAME=js21-sles10", "LESSCLOSE=lessclose.sh %s %s", "G_BROKEN_FILENAMES=1", "COLORTERM=1", "_=/usr/bin/strace", "OLDPWD=/tmp"]) = 0 30423 brk(0)= 0x10045000 30423 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xf7fe 30423 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) 30423 open("/etc/ld.so.cache", O_RDONLY) = 3 30423 fstat64(3, {st_dev=makedev(8, 3), st_ino=178774, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=224, st_size=114346, st_atime=2006/09/28-14:50:42, st_mtime=2006/09/28-13:52:05, st_ctime=2006/09/28-13:52:05}) = 0 30423 mmap(NULL, 114346, PROT_READ, MAP_PRIVATE, 3, 0) = 0xf7fc4000 30423 close(3) = 0 30423 open("/lib/libgcc_s.so.1", O_RDONLY) = 3 30423 read(3, "\177ELF\1\2\1\0\0\0\0\0\0\0\0\0\0\3\0\24\0\0\0\1\0\0 `"..., 512) = 512 30423 fstat64(3, {st_dev=makedev(8, 3), st_ino=16322, st_mode=S_IFREG|0755, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=160, st_size=81408, st_atime=2006/09/28-14:50:42, st_mtime=2006/06/17-11:47:21, st_ctime=2006/08/20-00:00:22}) = 0 30423 mmap(0xffcd000, 142200, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xffcd000 30423 madvise(0xffcd000, 142200, MADV_SEQUENTIAL|0x1) = 0 30423 mprotect(0xffe, 61440, PROT_NONE) = 0 30423 mmap(0xffef000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x12000) = 0xffef000 30423 close(3) = 0 30423 open("/lib/power4/libc.so.6", O_RDONLY) = 3 30423 read(3, "\177ELF\1\2\1\0\0\0\0\0\0\0\0\0\0\3\0\24\0\0\0\1\0\1\353"..., 512) = 512 30423 fstat64(3, {st_dev=makedev(8, 3), st_ino=12506, 
st_mode=S_IFREG|0755, st_nlink=1,
Re: [openib-general] FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD
# ls /sys/class/infiniband/mthca0/device/resource0 /sys/class/infiniband/mthca0/device/resource0 # ls -ald /sys/class/infiniband/mthca0/device/* lrwxrwxrwx 1 root root 0 Sep 27 11:33 /sys/class/infiniband/mthca0/device/bus -> ../../../../bus/pci -r--r--r-- 1 root root 4096 Sep 27 11:33 /sys/class/infiniband/mthca0/device/class -rw-r--r-- 1 root root 256 Sep 28 14:17 /sys/class/infiniband/mthca0/device/config -r--r--r-- 1 root root 4096 Sep 27 11:33 /sys/class/infiniband/mthca0/device/device -r--r--r-- 1 root root 4096 Sep 27 11:33 /sys/class/infiniband/mthca0/device/devspec lrwxrwxrwx 1 root root 0 Sep 28 11:43 /sys/class/infiniband/mthca0/device/driver -> ../../../../bus/pci/drivers/ib_mthca lrwxrwxrwx 1 root root 0 Sep 28 11:43 /sys/class/infiniband/mthca0/device/infiniband:mthca0 -> ../../../../class/infiniband/mthca0 lrwxrwxrwx 1 root root 0 Sep 28 11:43 /sys/class/infiniband/mthca0/device/infiniband_mad:issm0 -> ../../../../class/infiniband_mad/issm0 lrwxrwxrwx 1 root root 0 Sep 28 11:43 /sys/class/infiniband/mthca0/device/infiniband_mad:issm1 -> ../../../../class/infiniband_mad/issm1 lrwxrwxrwx 1 root root 0 Sep 28 11:43 /sys/class/infiniband/mthca0/device/infiniband_mad:umad0 -> ../../../../class/infiniband_mad/umad0 lrwxrwxrwx 1 root root 0 Sep 28 11:43 /sys/class/infiniband/mthca0/device/infiniband_mad:umad1 -> ../../../../class/infiniband_mad/umad1 lrwxrwxrwx 1 root root 0 Sep 28 11:43 /sys/class/infiniband/mthca0/device/infiniband_verbs:uverbs0 -> ../../../../class/infiniband_verbs/uverbs0 -r--r--r-- 1 root root 4096 Sep 28 14:17 /sys/class/infiniband/mthca0/device/irq -r--r--r-- 1 root root 4096 Sep 27 11:33 /sys/class/infiniband/mthca0/device/local_cpus -r--r--r-- 1 root root 4096 Sep 27 11:33 /sys/class/infiniband/mthca0/device/modalias lrwxrwxrwx 1 root root 0 Sep 28 11:43 /sys/class/infiniband/mthca0/device/net:ib0 -> ../../../../class/net/ib0 lrwxrwxrwx 1 root root 0 Sep 28 11:43 /sys/class/infiniband/mthca0/device/net:ib1 -> 
../../../../class/net/ib1 -r--r--r-- 1 root root 4096 Sep 28 11:43 /sys/class/infiniband/mthca0/device/pools -r--r--r-- 1 root root 4096 Sep 28 14:17 /sys/class/infiniband/mthca0/device/resource -rw--- 1 root root 1048576 Sep 28 14:17 /sys/class/infiniband/mthca0/device/resource0 -rw--- 1 root root 8388608 Sep 27 11:33 /sys/class/infiniband/mthca0/device/resource2 -rw--- 1 root root 134217728 Sep 27 11:33 /sys/class/infiniband/mthca0/device/resource4 -r--r--r-- 1 root root 4096 Sep 27 11:33 /sys/class/infiniband/mthca0/device/subsystem_device -r--r--r-- 1 root root 4096 Sep 27 11:33 /sys/class/infiniband/mthca0/device/subsystem_vendor --w--- 1 root root 4096 Sep 27 11:33 /sys/class/infiniband/mthca0/device/uevent -r--r--r-- 1 root root 4096 Sep 27 11:33 /sys/class/infiniband/mthca0/device/vendor # Moshe Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m) Voltaire - The Grid Backbone www.voltaire.com -Original Message- From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED] Sent: Thursday, September 28, 2006 12:48 PM To: Moshe Kazir Cc: Tseng-Hui (Frank) Lin; [EMAIL PROTECTED]; openib-general@openib.org Subject: Re: FW: Mstflint - not working on ppc64 andwhendriver is not loaded on AMD Quoting r. Moshe Kazir <[EMAIL PROTECTED]>: > > When mthca is loaded, what does > > mstflint -d /sys/class/infiniband/mthca0/device/resource0 q do on > > PPC? > > > On PPC64 sles10 with Franks last fix > # ./mstflint -d /sys/class/infiniband/mthca0/device/resource0 q > *** ERROR *** Can not open Does /sys/class/infiniband/mthca0/device/resource0 exist on this system? Pls send output of ls /sys/class/infiniband/mthca0/device/ -- MST ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD
O.k.

mstflint -d `lspci | grep Mellanox | grep -v Bridge | cut -f1 -d" "` q

will do the job.

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Thursday, September 28, 2006 12:53 PM
To: Moshe Kazir
Cc: openib-general@openib.org; [EMAIL PROTECTED]
Subject: Re: FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD

Quoting r. Moshe Kazir <[EMAIL PROTECTED]>:
> Subject: Re: FW: Mstflint - not working on ppc64 and when driver is not
> loaded on AMD
>
> I prefer the "mstflint -d 0c:00.0 q " format

BTW, this won't work on systems with multiple domains - you must add the domain as well:

mstflint -d :0c:00.0 q

--
MST
Re: [openib-general] FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD
I prefer the "mstflint -d 0c:00.0 q" format, as it enables writing scripts that extract the lspci info and get results ->

# mstflint -d `lspci | grep Mellanox | grep -v Bridge | cut -f1 -d" "` q
Image type: Failsafe
I.S. Version: 1
Chip Revision: A0
GUID Des: Node Port1 Port2 Sys image
GUIDs: 0008f1040398047c 0008f1040398047d 0008f1040398047e 0008f1040398047f
Board ID: (0TLV0073)
VSD:
PSID: 0TLV0073
#

The format "mstflint -d mthca0" is good but not sufficient. When the HCA is old/wrong/damaged, insmod may fail. In this case we'll need mstflint to fix the problems. We must have a way to operate mstflint when the driver is not loaded.

> When mthca is loaded, what does
> mstflint -d /sys/class/infiniband/mthca0/device/resource0 q
> do on PPC?

On PPC64 sles10 with Frank's last fix:

# ./mstflint -d /sys/class/infiniband/mthca0/device/resource0 q
*** ERROR *** Can not open /sys/class/infiniband/mthca0/device/resource0: Invalid argument
*** ERROR *** Can not get flash type using device /sys/class/infiniband/mthca0/device/resource0
#

On PPC64 with the OFED-1.1 rc6 original sources:

# ./mstflint -d /sys/class/infiniband/mthca0/device/resource0 q
*** ERROR *** Can not open /sys/class/infiniband/mthca0/device/resource0: Invalid argument
*** ERROR *** Can not get flash type using device /sys/class/infiniband/mthca0/device/resource0
#

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Thursday, September 28, 2006 9:35 AM
To: Moshe Kazir
Cc: Tseng-Hui (Frank) Lin; [EMAIL PROTECTED]; openib-general@openib.org
Subject: Re: FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD

Quoting r. Moshe Kazir <[EMAIL PROTECTED]>:
> The mstflint operated in the "classic way" in OFED-1.1 is not working
> on PPC64 sles10 !!!

I consider the classic way to be
-d /sys/class/infiniband/mthca0/device/resource0

It does seem a bit verbose now that you mention this - would a shortcut to allow just -d mthca0 help a lot?

--
MST
Re: [openib-general] FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD
Michael wrote:
> Since I don't consider this a critical fix (there's no reason driver
> won't go up, and if it does not, there's a simple workaround by

Michael,

mstflint operated in the "classic way" in OFED-1.1 does not work on PPC64 SLES10!
Telling the customer to use a workaround (open /proc...) if their platform is PPC64 is not nice! We need to fix the bug in the code!

Frank wrote:
> The patch can be enabled by defining CONFIG_MOPEN_FALL_BACK to 1.
> CONFIG_MOPEN_FALL_BACK is defined to 1 for ppc64 and x86_64 and 0 for others

This define keeps the program from being damaged when running on other platforms.

Can you have a look at the code once more and write how you want us (Frank and me) to refine it?
It's OK for us if the fix goes into OFED-1.2, but we need it in the code!

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: Tseng-Hui (Frank) Lin [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 27, 2006 7:46 PM
To: Michael S. Tsirkin
Cc: Moshe Kazir; Tseng-hui Lin; openib-general@openib.org
Subject: Re: [openib-general] FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD

On Wed, 2006-09-27 at 18:19 +0300, Michael S. Tsirkin wrote:
> Quoting r. Moshe Kazir <[EMAIL PROTECTED]>:
> > Subject: FW: [openib-general] Mstflint - not working on ppc64 and
> > when driver is not loaded on AMD
> >
> > Michael,
> >
> > Frank's new version was tested once more at Voltaire and works
> > OK. I tested `./mstflint -d q` when drivers are
> > loaded and when drivers are not loaded. In all cases it worked OK.
>
> Thanks for testing, but I'd like to get a handle on what's going on
> first.
>
> First, I'm pretty sure when driver is loaded things work OK on all
> systems. When driver is not loaded - could you please answer whether
> using /sys/bus/pci/devices/\:03\:00.0/resource0
> works for you (on systems that have resource0)?
>

It doesn't work.

> >
> > Test was performed on the following environments:
> >
> > -IBM js21 ppc64 sles10 PCI-E
> > -IBM js21 ppc64 sles9 sp3 PCI-E
> > -IBM hs21 em64t redhat as 4 u3 PCI-E
> > -IBM hs21 em64t sles 9 sp3 PCI-E
> > -x86_64 sles10 PCI-E
> > -MAC ppc64 sles10 PCI-X
> > -MAC ppc64 sles10 PCI-E
> >
> > Please consider inserting the patch to OFED.
> >
> > Moshe
>
> Since I don't consider this a critical fix (there's no reason driver
> won't go up, and if it does not, there's a simple workaround by
> specifying the /proc interface, that is slower but works), I don't
> think this should go into OFED 1.1.
>
> Unfortunately, I never got a small bugfix patch against the latest
> mstflint - the patch I saw posted touches all kind of things all over
> the code - so I can't insert it in trunk, either.
>

I agree this is not critical. The patch changes nothing but the way of opening the device.

On some ppc64 and x86_64 machines, the I/O memory mapped by mmap() is not accessible (return 0x) unless kernel code (usually the device driver) does an ioremap. This is why mmap of resource0 does not work on these machines. There is no way that I am aware of to do an ioremap from user-space code like mstflint. The only thing I can think of is to fall back to the config space file in /proc/bus/pci/.

The (big) patch I made checks whether the faster way (mmap resource0) works. If it doesn't, the patch tries other, slower ways and uses the fastest working way it can find. That's all the patch does. It does not make a big fix. It just saves users the trouble of manually trying all possible ways of opening a device.

I understand that applying a big patch is risky unless it can be thoroughly tested. Unfortunately, no one has all the machines needed to test the patch. Moshe and I have tested the patch on Power MAC, Squadrons, JS20, and JS21 (almost all living ppc64 machines) as well as a few x86_64 machines. We believe this patch is safe for these machines.

The patch can be enabled by defining CONFIG_MOPEN_FALL_BACK to 1. CONFIG_MOPEN_FALL_BACK is defined to 1 for ppc64 and x86_64 and 0 for others. We can enable this patch on other machines once people who have these machines have tested the patch.

I agree this is not a critical patch, but it is a useful one. Moreover, it is well tested on the machines with the patch enabled and changes nothing on the machines with the patch disabled. I believe this is a safe patch. Please reconsider adding it.

Thanks.

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
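[Editorial sketch] The fall-back order Frank describes in the message above (try the fastest access method first, then slower ones, and keep the first that really works) might look roughly like the following. This is only an illustration and not code from the patch: `access_method`, `pick_access_method`, `dev_id_is_valid` and the two stub probes are hypothetical names, and the all-ones check is an assumption about how an unreachable device reads back.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical probe: returns 0 and stores the device ID if the access
 * method could be opened and read, nonzero if it could not be opened. */
typedef int (*probe_fn)(uint32_t *dev_id);

struct access_method {
    const char *name;   /* label only, e.g. "mmap resource0" */
    probe_fn    probe;
};

/* Assumption: a device that is not reachable reads back as all-ones,
 * either as a 32-bit or as a 16-bit value. */
int dev_id_is_valid(uint32_t dev_id)
{
    return dev_id != 0xffffffffu && dev_id != 0xffffu;
}

/* Walk the methods in the given order (fastest first) and return the
 * first one that opens AND returns a plausible device ID, else NULL. */
const struct access_method *
pick_access_method(const struct access_method *methods, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t dev_id = 0;

        if (methods[i].probe(&dev_id) == 0 && dev_id_is_valid(dev_id))
            return &methods[i];
    }
    return NULL;
}

/* Stand-ins for real hardware access so the sketch can be exercised
 * without an HCA: one mapping that "opens" but reads all-ones, and one
 * slower path that returns a placeholder ID. */
int stub_probe_unreachable(uint32_t *dev_id)
{
    *dev_id = 0xffffffffu;
    return 0;
}

int stub_probe_ok(uint32_t *dev_id)
{
    *dev_id = 0x1234u;  /* placeholder value, not a real device ID */
    return 0;
}
```

With a table ordered as { "mmap resource0", "/proc/bus/pci config space" }, the selection skips the first entry when its probe reads all-ones and settles on the second, so the user does not have to try each path by hand.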
Re: [openib-general] [openfabrics-ewg] OpenSm on sles10 ppc64 OFED 1.0 - bug ?
See attached file.

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: Hal Rosenstock [mailto:[EMAIL PROTECTED]
Sent: Sunday, September 17, 2006 5:50 PM
To: Moshe Kazir
Cc: openib-general@openib.org; OpenFabricsEWG; Sasha Khapyorsky
Subject: Re: [openfabrics-ewg] OpenSm on sles10 ppc64

Hi Moshe,

On Sun, 2006-09-17 at 10:41, Moshe Kazir wrote:
> /etc/init.d/opensm start produce an error on my JS21 ppc64 SLES10
> OFED 1.0 .

What error ?

> Should ppc64 SLES10 OFED 1.0 work ?

I don't think so.

> Anyone tried it ?

OFED 1.0 OpenSM release notes say:
* PPC support: No PPC QA was performed.

There was an issue with PPC64 that Sasha fixed post OFED 1.0. It's in OFED 1.1 and could easily be retrofitted to OFED 1.0 if needed. Contact Sasha or me if you are interested in doing this.

-- Hal

>
> Moshe
>
> Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
> Voltaire - The Grid Backbone
> www.voltaire.com
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of
> [EMAIL PROTECTED]
> Sent: Thursday, September 14, 2006 7:39 PM
> To: [EMAIL PROTECTED]
> Cc: openib-general@openib.org
> Subject: [openfabrics-ewg] OFED-1.1-RC5 is ready
>
> Hi,
>
> OFED-1.1-rc5 is available on
> https://openib.org/svn/gen2/branches/1.1/ofed/releases/
> File: OFED-1.1-rc5.tgz
> Please report any issues in bugzilla http://openib.org/bugzilla/
>
> Release details:
>
> Build_id:
> OFED-1.1-rc5
>
> openib-1.1 (REV=9485)
> # User space https://openib.org/svn/gen2/branches/1.1/src/userspace
> Git: git://www.mellanox.co.il/~git/infiniband ref: refs/heads/ofed_1_1
> commit 18c1cb87c4b16f1a1577807077bbdcba3f446f09
>
> # MPI
> mpi_osu-0.9.7-mlx2.2.0.tgz
> openmpi-1.1.1-1.src.rpm
> mpitests-2.0-0.src.rpm
>
> OS support:
> ===
> Novell:
> - SLES 9.0 SP3
> - SLES10
> Redhat:
> - Redhat EL4 up3
> - Redhat EL4 up4
> kernel.org:
> - Kernel 2.6.17
>
> Bug fixes from OFED-1.1-rc4:
> ==
> 1. ISER compilation fixed on SLES10
> 2. Fixed build on SLES9 PPC64
> 3. Updated libehca
> 4. OpenSM fixes
> 5. Added tavor_quirk option to rdma_cm module (disabled by default):
>    Tavor performance quirk: limit MTU to 1K if > 0 (int)
>
> Known issues:
> =
> libipathverbs compilation fails on SLES10 (Bug:204)
>
> OFED-1.1-rc6 (hopefully the last one) planned to be released on Monday
> or Tuesday.
>
> Regards,
> Vladimir
>
> > Hi,
> >
> > The plan is to issue OFED RC5 on Thursday 9/14 and final release
> > next week. I am aware of the following issues:
> >
> > 1) Compilation on SLES9 on PPC - Jack Morgenstein
> > 2) Huge pages on PPC - Eli Cohen
> > 3) libipathverbs: - Qlogic
> >    a) libipathverbs ABI issue
> >    b) libipathverbs build on SLES10
> > 4) SDP performance on Tavor - Michael Tsirkin
> > 5) iSER issue on SLES10 - Voltaire
> >
> > In order to meet tomorrow's RC5 release all owners please send your
> > patches by end of today.
> >
> > Regards,
> >
> > Aviram
> >
> > ___
> > openfabrics-ewg mailing list
> > [EMAIL PROTECTED]
> > http://openib.org/mailman/listinfo/openfabrics-ewg
>
> ___
> openfabrics-ewg mailing list
> [EMAIL PROTECTED]
> http://openib.org/mailman/listinfo/openfabrics-ewg

js21-sles10:~ # /etc/init.d/opensmd start
*** glibc detected *** /usr/local/ofed/bin/opensm: realloc(): invalid next size: 0x10076ea0 ***
=== Backtrace: =
/lib64/power4/libc.so.6[0x41bceb4]
/lib64/power4/libc.so.6[0x41c0708]
/lib64/power4/libc.so.6(__libc_realloc-0xccd40)[0x41c2028]
/lib64/power4/libc.so.6(__libc_realloc-0xcce24)[0x41c1f44]
/lib64/power4/libc.so.6[0x41b37a8]
/lib64/power4/libc.so.6(fclose-0xe54cc)[0x41a7a24]
/lib64/power4/libc.so.6(__vsyslog_chk-0x78d28)[0x421a9f0]
/lib64/power4/libc.so.6(syslog-0x78894)[0x40
Re: [openib-general] Any chance to get 32-Bit libraries on SLES9 x86_64?
I had the opposite problem (trying to find the 64-bit rpm).

In SLES9, sysfsutils is part of the udev rpm. Therefore I think you may try udev...rpm for the 32-bit version of sysfsutils and udev-64bit...rpm for the 64-bit version.

After install, the 32-bit libraries are located under /usr/lib and the 64-bit libraries under /usr/lib64.

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire – The Grid Backbone
www.voltaire.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bub Thomas
Sent: Friday, September 15, 2006 9:24 AM
To: openib-general@openib.org; Bub Thomas
Subject: [openib-general] Any chance to get 32-Bit libraries on SLES9 x86_64?

Is there any chance/trick to get 32-bit libraries built and usable on SLES9 x86_64?

When I installed OFED-1.1-rc4 I got:

WARNING: sysfsutils 32-bit version is required to build 32-bit libibverbs package.
WARNING: Skiping build of 32-bit libraries.

I googled around and didn't find any 32-bit sysfsutils for SLES9.
I know that it is working under SLES10, but our customer base is on SLES9 and very conservative when it comes to using the latest and greatest OS/distribution.

Thomas

Thomas Bub
Grass Valley Germany GmbH
Brunnenweg 9
64331 Weiterstadt, Germany
Tel: +49 6150 104 147
Fax: +49 6150 104 656
Email: [EMAIL PROTECTED]
www.GrassValley.com
[openib-general] OpenSm on sles10 ppc64
/etc/init.d/opensm start produces an error on my JS21 ppc64 SLES10 OFED 1.0.

Should ppc64 SLES10 OFED 1.0 work? Has anyone tried it?

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Thursday, September 14, 2006 7:39 PM
To: [EMAIL PROTECTED]
Cc: openib-general@openib.org
Subject: [openfabrics-ewg] OFED-1.1-RC5 is ready

Hi,

OFED-1.1-rc5 is available on
https://openib.org/svn/gen2/branches/1.1/ofed/releases/
File: OFED-1.1-rc5.tgz
Please report any issues in bugzilla http://openib.org/bugzilla/

Release details:

Build_id:
OFED-1.1-rc5

openib-1.1 (REV=9485)
# User space https://openib.org/svn/gen2/branches/1.1/src/userspace
Git: git://www.mellanox.co.il/~git/infiniband ref: refs/heads/ofed_1_1
commit 18c1cb87c4b16f1a1577807077bbdcba3f446f09

# MPI
mpi_osu-0.9.7-mlx2.2.0.tgz
openmpi-1.1.1-1.src.rpm
mpitests-2.0-0.src.rpm

OS support:
===
Novell:
- SLES 9.0 SP3
- SLES10
Redhat:
- Redhat EL4 up3
- Redhat EL4 up4
kernel.org:
- Kernel 2.6.17

Bug fixes from OFED-1.1-rc4:
==
1. ISER compilation fixed on SLES10
2. Fixed build on SLES9 PPC64
3. Updated libehca
4. OpenSM fixes
5. Added tavor_quirk option to rdma_cm module (disabled by default):
   Tavor performance quirk: limit MTU to 1K if > 0 (int)

Known issues:
=
libipathverbs compilation fails on SLES10 (Bug:204)

OFED-1.1-rc6 (hopefully the last one) planned to be released on Monday or Tuesday.

Regards,
Vladimir

> Hi,
>
> The plan is to issue OFED RC5 on Thursday 9/14 and final release next
> week. I am aware of the following issues:
>
> 1) Compilation on SLES9 on PPC - Jack Morgenstein
> 2) Huge pages on PPC - Eli Cohen
> 3) libipathverbs: - Qlogic
>    a) libipathverbs ABI issue
>    b) libipathverbs build on SLES10
> 4) SDP performance on Tavor - Michael Tsirkin
> 5) iSER issue on SLES10 - Voltaire
>
> In order to meet tomorrow's RC5 release all owners please send your
> patches by end of today.
>
> Regards,
>
> Aviram
>
> ___
> openfabrics-ewg mailing list
> [EMAIL PROTECTED]
> http://openib.org/mailman/listinfo/openfabrics-ewg
Re: [openib-general] getting LOC_QP_OP_ERR with IPoIB - mstflint question
Let's assume that the HCA has wrong firmware, or that some other problem causes the driver load to fail.

We then have to check what's going on, and mstflint is one of our tools for that.

Moshe.

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 06, 2006 4:25 PM
To: Moshe Kazir
Cc: Or Gerlitz; Roland Dreier; openib-general@openib.org; Yiftah Shahar; Tseng-hui Lin
Subject: Re: [openib-general] getting LOC_QP_OP_ERR with IPoIB - mstflint question

Quoting r. Moshe Kazir <[EMAIL PROTECTED]>:
> Is it time to create a workaround that opens /proc/bus/pci/
> and always works ?

But why isn't the driver loaded?

-- MST
Re: [openib-general] problems to regiser memory as a reglar user on SLES9 SP3
Hi Tziporet,

I'm trying OFED 1.1 rc3 on an IBM js21 with SLES9 SP3 ppc64.

The install stops at the very beginning because 64-bit udev is missing. I tried to compile the udev...src.rpm supplied on SLES9 SP3 CD3, and it failed with a compilation error.

Did you test OFED 1.1 rc3 on ppc64? Can you advise me how to get 64-bit udev?

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren
Sent: Tuesday, August 29, 2006 5:50 PM
To: OPENIB
Subject: [openib-general] problems to regiser memory as a reglar user on SLES9 SP3

Hi All,

In testing today we found that on SLES9 SP3 memory locking as a regular user fails, although I changed /etc/security/limits.conf and added the following two lines:

* soft memlock
* hard memlock

Note that the same change does work in SLES10.

Another change I tried (that worked in gen1) was to add the following line to the file /etc/sysctl.conf:

vm.disable_cap_mlock=1

However, nothing helped in SLES9.

Does anyone have any idea how to solve this?

Thanks,
Tziporet
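[Editorial sketch] One way to check from a regular user's shell session whether the memlock settings discussed above actually took effect is to query RLIMIT_MEMLOCK and try to lock a small buffer. The helper names `get_memlock_limit` and `try_lock_bytes` are hypothetical and this is only a minimal diagnostic under those assumptions; it does not reproduce what the verbs library does when registering memory.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Fetch the soft and hard RLIMIT_MEMLOCK values for this process.
 * Returns 0 on success or the errno value from getrlimit(). */
int get_memlock_limit(rlim_t *soft, rlim_t *hard)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_MEMLOCK, &lim) != 0)
        return errno;
    *soft = lim.rlim_cur;
    *hard = lim.rlim_max;
    return 0;
}

/* Allocate a page-aligned buffer of len bytes and try to mlock() it.
 * Returns 0 if locking worked, otherwise an errno value
 * (typically ENOMEM or EPERM when the limit is too low). */
int try_lock_bytes(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    void *buf = NULL;
    int rc;

    if (page <= 0 || len == 0)
        return EINVAL;

    rc = posix_memalign(&buf, (size_t)page, len);
    if (rc != 0)
        return rc;

    rc = (mlock(buf, len) == 0) ? 0 : errno;
    if (rc == 0)
        munlock(buf, len);
    free(buf);
    return rc;
}
```

Calling `get_memlock_limit()` after logging in again shows whether the new limits reached the session at all; a nonzero result from `try_lock_bytes()` for a size below the reported soft limit would point at something other than limits.conf.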
Re: [openib-general] getting LOC_QP_OP_ERR with IPoIB - mstflint question
I have tested the mstflint problem on two different ppc64 machines:

- On SLES10 PPC64 PowerMac G5 -> mstflint -d 0001:07:00.0 q works OK with and without ib_mthca loaded.

- On SLES10 PPC64 IBM JS21 -> mstflint -d 0001:07:00.0 q DOESN'T work with or without ib_mthca loaded, and I have to use /proc/bus/pci/.

Is it time to create a workaround that opens /proc/bus/pci/ and always works?

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michael S. Tsirkin
Sent: Tuesday, September 05, 2006 4:37 PM
To: Or Gerlitz
Cc: Roland Dreier; openib-general@openib.org
Subject: Re: [openib-general] getting LOC_QP_OP_ERR with IPoIB - mstflint question

Quoting r. Or Gerlitz <[EMAIL PROTECTED]>:
> Subject: Re: getting LOC_QP_OP_ERR with IPoIB - mstflint question
>
> Michael S. Tsirkin wrote:
> > Donnu, it looks really weird. Could you try firmware 3.5.0 please?
>
> I just noted that you can not work with mstflint if the mthca driver
> is not loaded, i think it was not the case in the gen1 tools, am i correct.

Yes, recent kernels disable device access once driver is unloaded:

mstflint -d 08:00.0 q
*** ERROR *** Read a corrupted device id (0x). Probably HW/PCI access problem
*** ERROR *** Device type 65535 not supported.
*** ERROR *** Can not get flash type using device 08:00.0

mstflint should work without driver using /proc:

mstflint -d /proc/bus/pci/08/00.0 q
Image type: Failsafe
I.S. Version:1
Chip Revision: A0

In gen1 flint had a separate driver which you had to load. I am not sure whether this would work on 2.6.18

> Is this connected to this print
>
> ACPI: PCI interrupt for device :02:00.0 disabled
>
> i see once the mthca driver is unloaded?
>
> Or.

Probably not.

-- MST
Re: [openib-general] [openfabrics-ewg] new user level branch for OFED 1.1
Please look at Bugzilla bugs: 169, 170, 171, 174

Best regards,

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: Tziporet Koren [mailto:[EMAIL PROTECTED]
Sent: Sunday, July 30, 2006 3:50 PM
To: Moshe Kazir
Cc: EWG; OPENIB
Subject: Re: [openfabrics-ewg] new user level branch for OFED 1.1

Moshe Kazir wrote:
> Does the new planned OFED 1.1 resolve the open 64 bit compilation
> problems over PPC 64 ?
>
> Moshe

Which issues do you refer to?

Tziporet
Re: [openib-general] [openfabrics-ewg] new user level branch for OFED 1.1
Does the newly planned OFED 1.1 resolve the open 64-bit compilation problems on PPC64?

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren
Sent: Wednesday, July 26, 2006 5:42 PM
To: EWG; OPENIB
Subject: [openfabrics-ewg] new user level branch for OFED 1.1

Hi All,

Toward the OFED 1.1 release I have created the 1.1 branch:
https://openib.org/svn/gen2/branches/1.1/

This branch includes src/userspace/ based on trunk r8680, and all the other OFED stuff.

Tziporet
Re: [openib-general] SLES9 SP3 support was added
I'm trying to use OFED 1.0 on SLES9 SP3 PPC64.

OFED 1.0 requires sysfsutils to be installed. I tried to compile and install sysfsutils-1.2.0-4.src.rpm, but found that it conflicts with the udev--021-36 rpm.

Does anyone know how to work around this problem?

Also, OFED 1.0 is svn rev 8031, while patches/2.6.5-7.244 includes some svn 8111 files. Was OFED 1.0 (svn 8031) tested on SLES9 SP3?

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren
Sent: Thursday, June 22, 2006 4:20 PM
To: [EMAIL PROTECTED]; OPENIB
Subject: [openib-general] SLES9 SP3 support was added

Hi All,

We have added support for SLES9 SP3 that can be used with OFED 1.0.

The kernel modules supported are:
* mthca
* core
* CM & CMA
* IPoIB
* SRP

All user level apps and libraries are working too.

CPU Architectures supported:
* x86
* x86_64
* ia64

The backport patches are available at:
https://openib.org/svn/gen2/branches/1.0/ofed/patches/2.6.5-7.244/

There is also a need to take the updated configure and install.sh that add SLES9 specific support. There are no other changes in the package besides these.

Is there a need to create a package (1.0.1) with SLES9 support?

Tziporet
Re: [openib-general] Compilation issues on rhel4 u3 ppc64 sysfs.o
Vladimir,

I'm trying the latest tarball, OFED svn Rev=8031, on IBM ppc64 Red Hat AS4 U3.

I compiled the pciutils and sysfsutils rpms with -m64 as you wrote.

I now face a tvflash.c compile error. Do I need to compile more rpms as 64-bit?

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Vladimir Sokolovsky
Sent: Thursday, May 25, 2006 12:49 PM
To: Scott Weitzenkamp (sweitzen)
Cc: openib-general@openib.org
Subject: Re: [openib-general] Compilation issues on rhel4 u3 ppc64 sysfs.o

In OFED-1.0-rc5 all binaries and libraries will be compiled on ppc64 with the -m64 flag.

This requires the sysfsutils and sysfsutils-devel 64-bit RPMs to be installed (in order to build libibverbs). Also, pciutils and pciutils-devel 64-bit are required for the tvflash package. libsdp will be built as both 32-bit and 64-bit libraries.

Note: in order to build the sysfsutils 64-bit RPM run:

CC="gcc -m64" rpmbuild --rebuild sysfsutils-1.3.0-1.2.1.src.rpm

(This was tested on Fedora C4 PPC64)

Regards,
Vladimir

Scott Weitzenkamp (sweitzen) wrote:
> I know Vlad made some changes for rc5 in this area, at least for
> libsdp, not sure if other libs got changed as well.
>
> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems
>
> From: Paul [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, May 24, 2006 11:00 AM
> To: Scott Weitzenkamp (sweitzen)
> Cc: openib-general@openib.org
> Subject: Re: [openib-general] Compilation issues on rhel4 u3
> ppc64 sysfs.o
>
> Scott,
> Upon further inspection the build.sh and install.sh scripts
> built 32bit libraries and binaries. If I export CFLAGS (and the
> like) to include -m64 then the build dies while looking for a
> 64bit libsysfs. rhel4 u3 does not include a ppc64 sysfsutils, nor
> have I been able to find an actual 64bit version of it. Is there a
> workaround for getting things to build actual ppc64
> binaries/libraries ?
>
> The actual error is:
> checking for dlsym in -ldl... yes
> checking for pthread_mutex_init in -lpthread... yes
> checking for sysfs_open_class in -lsysfs... no
> configure: error: sysfs_open_class() not found. libibverbs
> requires libsysfs.
[openib-general] RE: [openfabrics-ewg] FW: IBED-1.0-rc3 is available
When trying to build IBED-1.0-rc3 on 2.6.9-34.EL-smp-x86_64 I got the following error:

In file included from /var/tmp/IBED/tmp/openib/openib/src/linux-kernel/infiniband/hw/ipath/ipath_cq.c:36:
/var/tmp/IBED/tmp/openib/openib/src/linux-kernel/infiniband/hw/ipath/ipath_verbs.h:395: error: `BITS_PER_BYTE' undeclared here (not in a function)
make[3]: *** [/var/tmp/IBED/tmp/openib/openib/src/linux-kernel/infiniband/hw/ipath/ipath_cq.o] Error 1
make[2]: *** [/var/tmp/IBED/tmp/openib/openib/src/linux-kernel/infiniband/hw/ipath] Error 2
make[1]: *** [_module_/var/tmp/IBED/tmp/openib/openib/src/linux-kernel/infiniband] Error 2
make[1]: Leaving directory `/usr/src/kernels/2.6.9-34.EL-smp-x86_64'
make: *** [kernel] Error 2
ERROR: Failed to execute: make kernel

It looks as if, in the file ipath_cq.c, the line that defines BITS_PER_BYTE (#define BITS_PER_BYTE) is located two lines after the #include line for the header that uses the definition.

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire – The Grid Backbone
www.voltaire.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren
Sent: Monday, April 10, 2006 10:17 PM
To: Tziporet Koren; Matters, Todd; Kamen Bodourov; Moni Levy; Vladimir Sokolovsky; Amit Krig; Bryan O'Sullivan; Jeff Squyres (jsquyres); Matt Leininger
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; IBCG
Subject: [openfabrics-ewg] FW: IBED-1.0-rc3 is available

It seems that this mail was not delivered by the openfabrics-ewg list, thus I forward it to direct emails.

Matt - can you add a link in the news page of OpenFabrics for this release?

Thanks,
Tziporet

-----Original Message-----
From: Vladimir Sokolovsky
Sent: Monday, April 10, 2006 7:55 PM
To: [EMAIL PROTECTED]
Cc: openib-general; Tziporet Koren
Subject: IBED-1.0-rc3 is available

Hi All,

We have prepared IBED 1.0 RC3.

Release location: https://openib.org/svn/gen2/branches/1.0/ibed/releases
File: IBED-1.0-rc3.tgz
md5sum: 8e143fd4b63646ebc9f5c9f73d18394b

BUILD_ID:
IBED-1.0-rc3:
OpenIB: openib_branch1.0-20060410-1551 (REV=6367)
Userspace SVN path: https://openib.org/svn/gen2/branches/1.0/src/userspace
IB Kernel modules SVN path: https://openib.org/svn/gen2/branches/1.0/ibed/tags/rc3/linux-kernel

MPI:
openmpi-1.0.2a12-1
mpi_osu-0.9.7-mlx2.1.0
mpitests-1.0-0

OSes:
* RH EL4 up2: 2.6.9-22.ELsmp
* RH EL4 up3: 2.6.9-34.ELsmp
* Fedora C4: 2.6.11-1.1369_FC4
* SLES10 beta 7: 2.6.16-rc5-git9-2-smp
* SUSE 10 Pro: 2.6.13-15-smp
* kernel.org: 2.6.16

Systems:
* x86_64
* x86
* ia64
* ppc64

Main changes from RC2:
1. Added support for RH EL4 up3
2. Added Open MPI package
3. OSU MPI is now based on 0.97 release (was 0.95 in RC2)
4. Added Pathscale (ipath) driver
5. Added uDapl
6. Build based on the new method: userlevel from openib branch 1.0 and kernel from openib trunk (will be from the git in RC4)
7. Added ibutils package
8. Bug fixes

Package limitations:
1. iSER is working on SuSE SLES 10 Beta8 only
2. MPI OSU and Open MPI compilation fails on PPC64
3. uDAPL is not supported on RH EL4 (up2 and up3) since the rdma_ucm module does not work on 2.6.9* kernels. If someone has a patch we will use it.
4. ipath driver compilation fails on RH EL4 and FedoraC4.

Please send me and Vlad any issue you encounter and testing results.

Thanks
Tziporet & Vlad
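[Editorial sketch] The ordering problem Moshe describes in the message above (a file-scope declaration using BITS_PER_BYTE before the macro is defined) can be illustrated outside the kernel tree as follows. This is an assumption-laden stand-in and not the ipath code: `struct example_table` is a hypothetical type, and whether a guarded fallback define is the right fix for the real driver was not checked against the sources.

```c
/* Fallback for environments that do not already provide BITS_PER_BYTE.
 * It has to appear BEFORE the first declaration that needs the value;
 * placing it after that point gives the "undeclared here (not in a
 * function)" error quoted above. */
#ifndef BITS_PER_BYTE
#define BITS_PER_BYTE 8
#endif

/* Hypothetical file-scope declaration that needs the macro at the point
 * where it is parsed, the way a header-level array size does. */
struct example_table {
    unsigned char bits[4 * BITS_PER_BYTE];
};
```

The `#ifndef` guard keeps the fallback from clashing with a kernel that already defines the macro, which is why the same source can build on both older and newer kernels under this assumption.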
RE: [openib-general] RC2 delayed a bit
Bob Woodruff wrote:
> BTW. I built some kernel RPMs based on the 1.0 branch kernel code and the backport patches for RedHat EL4.0 U3. If someone wants me to post them somewhere, I will.

I'll be glad to test them on a PPC machine. Can you put the rpm on an ftp somewhere?

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bob Woodruff
Sent: Wednesday, April 05, 2006 9:10 PM
To: 'Sean Hefty'; Bryan O'Sullivan
Cc: [EMAIL PROTECTED]; Openib-general@openib.org
Subject: RE: [openib-general] RC2 delayed a bit

Bryan wrote,
>So, we went from having no openib release to now having two? That's confusing.
>Are these vendors members of openib?
>- Sean

I know that I am confused. Can someone from the ibed (openfabrics-ewg) people please enlighten us ?

BTW. I built some kernel RPMs based on the 1.0 branch kernel code and the backport patches for RedHat EL4.0 U3. If someone wants me to post them somewhere, I will.

woody
[openib-general] openIB backport to sles9 sp3 kernel 2.6.5-7
All,

Does anyone know of an existing effort, or of anyone starting work, on an OpenIB backport to the SLES9 SP3 kernel 2.6.5-7?

Moshe

Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire – The Grid Backbone
www.voltaire.com