[openib-general] Problem configuring ipath_ether

2006-03-23 Thread Matt Leininger
I have the ipath driver up and running, and it works with IPoIB.  I'm
using 2.6.16 with svn 5938.  The ipath_ether device comes up as eth2.  I
can set the netmask and broadcast address, but when I try to set the IP
address for this device I get the following error:


[EMAIL PROTECTED] infiniband]# ifconfig eth2 10.128.20.103
SIOCSIFFLAGS: Operation not permitted


Here is the same command run under strace.


[EMAIL PROTECTED] infiniband]# strace ifconfig eth2 10.128.20.103
execve("/sbin/ifconfig", ["ifconfig", "eth2", "10.128.20.103"], [/* 31
vars */]) = 0
uname({sys="Linux", node="opt1", ...})  = 0
brk(0)  = 0x61
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x2b6edea52000
access("/etc/ld.so.preload", R_OK)  = -1 ENOENT (No such file or
directory)
open("/etc/ld.so.cache", O_RDONLY)  = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=145885, ...}) = 0
mmap(NULL, 145885, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2b6edea53000
close(3)= 0
open("/lib64/tls/libc.so.6", O_RDONLY)  = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\305A|<"...,
640) = 640
lseek(3, 624, SEEK_SET) = 624
read(3, "\4\0\0\0\20\0\0\0\1\0\0\0GNU\0\0\0\0\0\2\0\0\0\4\0\0\0"..., 32)
= 32
fstat(3, {st_mode=S_IFREG|0755, st_size=1489988, ...}) = 0
mmap(0x3c7c40, 2301864, PROT_READ|PROT_EXEC, MAP_PRIVATE|
MAP_DENYWRITE, 3, 0) = 0x3c7c40
mprotect(0x3c7c529000, 1085352, PROT_NONE) = 0
mmap(0x3c7c628000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|
MAP_DENYWRITE, 3, 0x128000) = 0x3c7c628000
mmap(0x3c7c62e000, 16296, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|
MAP_ANONYMOUS, -1, 0) = 0x3c7c62e000
close(3)= 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x2b6edea77000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x2b6edea78000
mprotect(0x3c7c628000, 12288, PROT_READ) = 0
arch_prctl(0x1002, 0x2b6edea77b00)  = 0
munmap(0x2b6edea53000, 145885)  = 0
brk(0)  = 0x61
brk(0x631000)   = 0x631000
open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=39550704, ...}) = 0
mmap(NULL, 39550704, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2b6edea79000
close(3)= 0
uname({sys="Linux", node="opt1", ...})  = 0
access("/proc/net", R_OK)   = 0
access("/proc/net/unix", R_OK)  = 0
socket(PF_FILE, SOCK_DGRAM, 0)  = 3
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
access("/proc/net/if_inet6", R_OK)  = 0
socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 5
access("/proc/net/ax25", R_OK)  = -1 ENOENT (No such file or
directory)
access("/proc/net/nr", R_OK)= -1 ENOENT (No such file or
directory)
access("/proc/net/rose", R_OK)  = -1 ENOENT (No such file or
directory)
access("/proc/net/ipx", R_OK)   = -1 ENOENT (No such file or
directory)
access("/proc/net/appletalk", R_OK) = -1 ENOENT (No such file or
directory)
access("/proc/sys/net/econet", R_OK)= -1 ENOENT (No such file or
directory)
access("/proc/sys/net/ash", R_OK)   = -1 ENOENT (No such file or
directory)
access("/proc/net/x25", R_OK)   = -1 ENOENT (No such file or
directory)
open("/usr/share/locale/locale.alias", O_RDONLY) = 6
fstat(6, {st_mode=S_IFREG|0644, st_size=2528, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x2b6ee1031000
read(6, "# Locale name alias data base.\n#"..., 4096) = 2528
read(6, "", 4096)   = 0
close(6)= 0
munmap(0x2b6ee1031000, 4096)= 0
open("/usr/share/locale/en_US.UTF-8/LC_MESSAGES/net-tools.mo", O_RDONLY)
= -1 ENOENT (No such file or directory)
open("/usr/share/locale/en_US.utf8/LC_MESSAGES/net-tools.mo", O_RDONLY)
= -1 ENOENT (No such file or directory)
open("/usr/share/locale/en_US/LC_MESSAGES/net-tools.mo", O_RDONLY) = -1
ENOENT (No such file or directory)
open("/usr/share/locale/en.UTF-8/LC_MESSAGES/net-tools.mo", O_RDONLY) =
-1 ENOENT (No such file or directory)
open("/usr/share/locale/en.utf8/LC_MESSAGES/net-tools.mo", O_RDONLY) =
-1 ENOENT (No such file or directory)
open("/usr/share/locale/en/LC_MESSAGES/net-tools.mo", O_RDONLY) = -1
ENOENT (No such file or directory)
ioctl(4, SIOCSIFADDR, 0x7fbc8620)   = 0
ioctl(4, SIOCGIFFLAGS, 0x7fbc8550)  = 0
ioctl(4, SIOCSIFFLAGS, 0x7fbc8550)  = -1 EPERM (Operation not
permitted)
dup(2)  = 6
fcntl(6, F_GETFL)   = 0x8002 (flags O_RDWR|
O_LARGEFILE|0x8000)
fstat(6, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x2b6ee1031000
lseek(6, 0, SEEK_CUR)   = -1 ESPIPE (Illegal seek)
open("/usr/share/locale/en_US.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -1
ENOENT (No such file or directory)
o

[openib-general] Question on get_dma_mr()

2006-03-23 Thread Devesh Sharma
Hello all,

Could anybody please explain the functionality of the verb ib_get_dma_mr()?
Why is this function needed?
What is a driver implementer supposed to implement in this function?

Thanks
Devesh

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

[openib-general] [PATCH 18 of 18] ipath - kbuild infrastructure

2006-03-23 Thread Bryan O'Sullivan
Integrate the ipath core and OpenIB drivers into the kernel build
infrastructure.  Add entry to MAINTAINERS.

Signed-off-by: Bryan O'Sullivan <[EMAIL PROTECTED]>

diff -r fd3710b1b069 -r d303ccc5870e MAINTAINERS
--- a/MAINTAINERS   Thu Mar 23 20:27:45 2006 -0800
+++ b/MAINTAINERS   Thu Mar 23 20:27:45 2006 -0800
@@ -1404,6 +1404,12 @@ P:   Juanjo Ciarlante
 P: Juanjo Ciarlante
 M: [EMAIL PROTECTED]
 S: Maintained
+
+IPATH DRIVER:
+P: Bryan O'Sullivan
+M: [EMAIL PROTECTED]
+L: openib-general@openib.org
+S: Supported
 
 IPX NETWORK LAYER
 P: Arnaldo Carvalho de Melo
diff -r fd3710b1b069 -r d303ccc5870e drivers/infiniband/Kconfig
--- a/drivers/infiniband/KconfigThu Mar 23 20:27:45 2006 -0800
+++ b/drivers/infiniband/KconfigThu Mar 23 20:27:45 2006 -0800
@@ -30,6 +30,7 @@ config INFINIBAND_USER_ACCESS
  .
 
 source "drivers/infiniband/hw/mthca/Kconfig"
+source "drivers/infiniband/hw/ipath/Kconfig"
 
 source "drivers/infiniband/ulp/ipoib/Kconfig"
 
diff -r fd3710b1b069 -r d303ccc5870e drivers/infiniband/Makefile
--- a/drivers/infiniband/Makefile   Thu Mar 23 20:27:45 2006 -0800
+++ b/drivers/infiniband/Makefile   Thu Mar 23 20:27:45 2006 -0800
@@ -1,4 +1,5 @@ obj-$(CONFIG_INFINIBAND)+= core/
 obj-$(CONFIG_INFINIBAND)   += core/
 obj-$(CONFIG_INFINIBAND_MTHCA) += hw/mthca/
+obj-$(CONFIG_IPATH_CORE)   += hw/ipath/
 obj-$(CONFIG_INFINIBAND_IPOIB) += ulp/ipoib/
 obj-$(CONFIG_INFINIBAND_SRP)   += ulp/srp/
diff -r fd3710b1b069 -r d303ccc5870e drivers/infiniband/hw/ipath/Kconfig
--- /dev/null   Thu Jan  1 00:00:00 1970 +
+++ b/drivers/infiniband/hw/ipath/Kconfig   Thu Mar 23 20:27:45 2006 -0800
@@ -0,0 +1,18 @@
+config IPATH_CORE
+   tristate "PathScale InfiniPath Driver"
+   depends on 64BIT && PCI_MSI
+   ---help---
+   This is a low-level driver for PathScale InfiniPath host channel
+   adapters (HCAs) based on the HT-400 and PE-800 chips, including
+   the InfiniPath HT-460, the small form factor InfiniPath HT-460,
+   the InfiniPath HT-470 and the Linux Networx LS/X.
+
+config INFINIBAND_IPATH
+   tristate "PathScale InfiniPath Verbs Driver"
+   depends on IPATH_CORE && INFINIBAND
+   ---help---
+   This is a driver that provides InfiniBand verbs support for
+   PathScale InfiniPath host channel adapters (HCAs).  This
+   allows these devices to be used with both kernel upper level
+   protocols such as IP-over-InfiniBand as well as with userspace
+   applications (in conjunction with InfiniBand userspace access).
diff -r fd3710b1b069 -r d303ccc5870e drivers/infiniband/hw/ipath/Makefile
--- /dev/null   Thu Jan  1 00:00:00 1970 +
+++ b/drivers/infiniband/hw/ipath/Makefile  Thu Mar 23 20:27:45 2006 -0800
@@ -0,0 +1,41 @@
+EXTRA_CFLAGS += -DIPATH_IDSTR='"PathScale kernel.org driver"' \
+   -DIPATH_KERN_TYPE=0
+
+obj-$(CONFIG_IPATH_CORE) += ipath_core.o
+obj-$(CONFIG_INFINIBAND_IPATH) += ib_ipath.o
+obj-$(CONFIG_IPATH_ETHER) += ipath_ether.o
+
+ipath_core-y := \
+   ipath_copy.o \
+   ipath_diag.o \
+   ipath_driver.o \
+   ipath_eeprom.o \
+   ipath_file_ops.o \
+   ipath_fs.o \
+   ipath_ht400.o \
+   ipath_init_chip.o \
+   ipath_intr.o \
+   ipath_layer.o \
+   ipath_pe800.o \
+   ipath_sma.o \
+   ipath_stats.o \
+   ipath_sysfs.o \
+   ipath_user_pages.o
+
+ipath_core-$(CONFIG_X86_64) += ipath_wc_x86_64.o
+
+ib_ipath-y := \
+   ipath_cq.o \
+   ipath_keys.o \
+   ipath_mad.o \
+   ipath_mr.o \
+   ipath_qp.o \
+   ipath_rc.o \
+   ipath_ruc.o \
+   ipath_srq.o \
+   ipath_uc.o \
+   ipath_ud.o \
+   ipath_verbs.o \
+   ipath_verbs_mcast.o
+
+ipath_ether-y := ipath_eth.o

[openib-general] [PATCH 15 of 18] ipath - misc infiniband code, part 1

2006-03-23 Thread Bryan O'Sullivan
Completion queues, local and remote memory keys, and memory region
support.

Signed-off-by: Bryan O'Sullivan <[EMAIL PROTECTED]>

diff -r c57bdb16 -r 281189953c6f drivers/infiniband/hw/ipath/ipath_cq.c
--- /dev/null   Thu Jan  1 00:00:00 1970 +
+++ b/drivers/infiniband/hw/ipath/ipath_cq.cThu Mar 23 20:27:45 2006 -0800
@@ -0,0 +1,295 @@
+/*
+ * Copyright (c) 2005, 2006 PathScale, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include 
+#include 
+
+#include "ipath_verbs.h"
+
+/**
+ * ipath_cq_enter - add a new entry to the completion queue
+ * @cq: completion queue
+ * @entry: work completion entry to add
+ * @solicited: true if @entry is a solicited entry
+ *
+ * This may be called with one of the qp->s_lock or qp->r_rq.lock held.
+ */
+void ipath_cq_enter(struct ipath_cq *cq, struct ib_wc *entry, int solicited)
+{
+   unsigned long flags;
+   u32 next;
+
+   spin_lock_irqsave(&cq->lock, flags);
+
+   if (cq->head == cq->ibcq.cqe)
+   next = 0;
+   else
+   next = cq->head + 1;
+   if (unlikely(next == cq->tail)) {
+   spin_unlock_irqrestore(&cq->lock, flags);
+   if (cq->ibcq.event_handler) {
+   struct ib_event ev;
+
+   ev.device = cq->ibcq.device;
+   ev.element.cq = &cq->ibcq;
+   ev.event = IB_EVENT_CQ_ERR;
+   cq->ibcq.event_handler(&ev, cq->ibcq.cq_context);
+   }
+   return;
+   }
+   cq->queue[cq->head] = *entry;
+   cq->head = next;
+
+   if (cq->notify == IB_CQ_NEXT_COMP ||
+   (cq->notify == IB_CQ_SOLICITED && solicited)) {
+   cq->notify = IB_CQ_NONE;
+   cq->triggered++;
+   /*
+* This will cause send_complete() to be called in
+* another thread.
+*/
+   tasklet_hi_schedule(&cq->comptask);
+   }
+
+   spin_unlock_irqrestore(&cq->lock, flags);
+
+   if (entry->status != IB_WC_SUCCESS)
+   to_idev(cq->ibcq.device)->n_wqe_errs++;
+}
+
+/**
+ * ipath_poll_cq - poll for work completion entries
+ * @ibcq: the completion queue to poll
+ * @num_entries: the maximum number of entries to return
+ * @entry: pointer to array where work completions are placed
+ *
+ * Returns the number of completion entries polled.
+ *
+ * This may be called from interrupt context.  Also called by ib_poll_cq()
+ * in the generic verbs code.
+ */
+int ipath_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *entry)
+{
+   struct ipath_cq *cq = to_icq(ibcq);
+   unsigned long flags;
+   int npolled;
+
+   spin_lock_irqsave(&cq->lock, flags);
+
+   for (npolled = 0; npolled < num_entries; ++npolled, ++entry) {
+   if (cq->tail == cq->head)
+   break;
+   *entry = cq->queue[cq->tail];
+   if (cq->tail == cq->ibcq.cqe)
+   cq->tail = 0;
+   else
+   cq->tail++;
+   }
+
+   spin_unlock_irqrestore(&cq->lock, flags);
+
+   return npolled;
+}
+
+static void send_complete(unsigned long data)
+{
+   struct ipath_cq *cq = (struct ipath_cq *)data;
+
+   /*
+* The completion handler will most likely rearm the notification
+* and poll for all pending entries.  If a new completion entry
+* is added while we are in this routine, tasklet_hi_schedule()
+* won't call us again until we return so we check triggered to
+* see if we need to call the handler again
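
The head/tail handling in ipath_cq_enter() and ipath_poll_cq() above can be modelled in user space as a plain ring buffer. The sketch below is only an illustration: the struct and function names are invented, entries are ints rather than struct ib_wc, there is no locking, and it assumes the queue array holds cqe + 1 slots (inferred from the index being allowed to reach cq->ibcq.cqe before wrapping). One slot always stays empty so that head == tail can mean "empty" and next == tail can mean "full".

```c
#define CQE 3	/* usable entries; the array has CQE + 1 slots (assumption) */

struct ring {
	unsigned int head;	/* next slot to write */
	unsigned int tail;	/* next slot to read */
	int queue[CQE + 1];
};

/* Advance an index, wrapping to 0 after slot CQE as the driver does. */
static unsigned int ring_next(unsigned int i)
{
	return i == CQE ? 0 : i + 1;
}

/*
 * Add one entry.  Returns 0 on success or -1 when the ring is full;
 * at that point the driver reports IB_EVENT_CQ_ERR and drops the entry.
 */
static int ring_enter(struct ring *r, int entry)
{
	unsigned int next = ring_next(r->head);

	if (next == r->tail)
		return -1;
	r->queue[r->head] = entry;
	r->head = next;
	return 0;
}

/* Copy out up to num_entries entries; returns how many were copied. */
static int ring_poll(struct ring *r, int num_entries, int *out)
{
	int npolled;

	for (npolled = 0; npolled < num_entries; ++npolled, ++out) {
		if (r->tail == r->head)
			break;
		*out = r->queue[r->tail];
		r->tail = ring_next(r->tail);
	}
	return npolled;
}
```

With CQE set to 3, three calls to ring_enter() succeed and a fourth fails until ring_poll() has drained at least one entry.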

[openib-general] [PATCH 17 of 18] ipath - infiniband verbs support

2006-03-23 Thread Bryan O'Sullivan
The ipath_verbs.c file implements the driver-specific components of the
kernel's InfiniBand verbs layer.

Signed-off-by: Bryan O'Sullivan <[EMAIL PROTECTED]>

diff -r e230510a56f7 -r fd3710b1b069 drivers/infiniband/hw/ipath/ipath_verbs.c
--- /dev/null   Thu Jan  1 00:00:00 1970 +
+++ b/drivers/infiniband/hw/ipath/ipath_verbs.c Thu Mar 23 20:27:45 2006 -0800
@@ -0,0 +1,1220 @@
+/*
+ * Copyright (c) 2005, 2006 PathScale, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+
+#include "ipath_kernel.h"
+#include "ipath_verbs.h"
+#include "ips_common.h"
+
+/* Not static, because we don't want the compiler removing it */
+const char ipath_verbs_version[] = "ipath_verbs " IPATH_IDSTR;
+
+unsigned int ib_ipath_qp_table_size = 251;
+module_param_named(qp_table_size, ib_ipath_qp_table_size, uint, S_IRUGO);
+MODULE_PARM_DESC(qp_table_size, "QP table size");
+
+unsigned int ib_ipath_lkey_table_size = 12;
+module_param_named(lkey_table_size, ib_ipath_lkey_table_size, uint,
+  S_IRUGO);
+MODULE_PARM_DESC(lkey_table_size,
+"LKEY table size in bits (2^n, 1 <= n <= 23)");
+
+unsigned int ib_ipath_debug;   /* debug mask */
+module_param_named(debug, ib_ipath_debug, uint, S_IWUSR | S_IRUGO);
+MODULE_PARM_DESC(debug, "Verbs debug mask");
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("PathScale <[EMAIL PROTECTED]>");
+MODULE_DESCRIPTION("Pathscale InfiniPath driver");
+
+const int ib_ipath_state_ops[IB_QPS_ERR + 1] = {
+   [IB_QPS_RESET] = 0,
+   [IB_QPS_INIT] = IPATH_POST_RECV_OK,
+   [IB_QPS_RTR] = IPATH_POST_RECV_OK | IPATH_PROCESS_RECV_OK,
+   [IB_QPS_RTS] = IPATH_POST_RECV_OK | IPATH_PROCESS_RECV_OK |
+   IPATH_POST_SEND_OK | IPATH_PROCESS_SEND_OK,
+   [IB_QPS_SQD] = IPATH_POST_RECV_OK | IPATH_PROCESS_RECV_OK |
+   IPATH_POST_SEND_OK,
+   [IB_QPS_SQE] = IPATH_POST_RECV_OK | IPATH_PROCESS_RECV_OK,
+   [IB_QPS_ERR] = 0,
+};
+
+/*
+ * Translate ib_wr_opcode into ib_wc_opcode.
+ */
+const enum ib_wc_opcode ib_ipath_wc_opcode[] = {
+   [IB_WR_RDMA_WRITE] = IB_WC_RDMA_WRITE,
+   [IB_WR_RDMA_WRITE_WITH_IMM] = IB_WC_RDMA_WRITE,
+   [IB_WR_SEND] = IB_WC_SEND,
+   [IB_WR_SEND_WITH_IMM] = IB_WC_SEND,
+   [IB_WR_RDMA_READ] = IB_WC_RDMA_READ,
+   [IB_WR_ATOMIC_CMP_AND_SWP] = IB_WC_COMP_SWAP,
+   [IB_WR_ATOMIC_FETCH_AND_ADD] = IB_WC_FETCH_ADD
+};
+
+/*
+ * System image GUID.
+ */
+__be64 sys_image_guid;
+
+/**
+ * ipath_copy_sge - copy data to SGE memory
+ * @ss: the SGE state
+ * @data: the data to copy
+ * @length: the length of the data
+ */
+void ipath_copy_sge(struct ipath_sge_state *ss, void *data, u32 length)
+{
+   struct ipath_sge *sge = &ss->sge;
+
+   while (length) {
+   u32 len = sge->length;
+
+   BUG_ON(len == 0);
+   if (len > length)
+   len = length;
+   memcpy(sge->vaddr, data, len);
+   sge->vaddr += len;
+   sge->length -= len;
+   sge->sge_length -= len;
+   if (sge->sge_length == 0) {
+   if (--ss->num_sge)
+   *sge = *ss->sg_list++;
+   } else if (sge->length == 0 && sge->mr != NULL) {
+   if (++sge->n >= IPATH_SEGSZ) {
+   if (++sge->m >= sge->mr->mapsz)
+   break;
+   sge->n = 0;
+   }
+   sge->vaddr =
+   sge->mr->map[sge->m]->segs[sge->n].vaddr;
+   sge->l
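
The core idea of ipath_copy_sge(), spreading one flat buffer across a list of scatter/gather elements, can be sketched in user space as follows. This is a simplified illustration under stated assumptions, not the driver's code: the types and the function name are invented, the memory-region segment walking (sge->mr, IPATH_SEGSZ) is left out, and unlike the driver it stops quietly and returns the byte count when the list runs out.

```c
#include <stddef.h>
#include <string.h>

/* One scatter/gather element: where to write and how much room is left. */
struct seg {
	char *vaddr;
	size_t length;
};

/* Copy state: the current element plus the elements not yet started. */
struct sg_state {
	struct seg sge;			/* element being filled */
	const struct seg *sg_list;	/* elements after the current one */
	unsigned int num_sge;		/* elements left, including current */
};

/* Copy length bytes from data into the list; returns bytes copied. */
static size_t copy_to_sg(struct sg_state *ss, const void *data, size_t length)
{
	const char *src = data;
	size_t copied = 0;

	while (length && ss->num_sge) {
		size_t len = ss->sge.length;

		if (len > length)
			len = length;
		memcpy(ss->sge.vaddr, src, len);
		ss->sge.vaddr += len;
		ss->sge.length -= len;
		src += len;
		length -= len;
		copied += len;
		/* Current element full: move on to the next one, if any. */
		if (ss->sge.length == 0 && --ss->num_sge)
			ss->sge = *ss->sg_list++;
	}
	return copied;
}
```

Copying the five bytes of "hello" into a 3-byte element followed by a 4-byte element leaves "hel" in the first and "lo" in the second, with two bytes of room remaining.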

[openib-general] [PATCH 11 of 18] ipath - layering interfaces used by higher-level driver code

2006-03-23 Thread Bryan O'Sullivan
The layering interfaces are used to implement the InfiniBand protocols
and the Ethernet emulation driver.

Signed-off-by: Bryan O'Sullivan <[EMAIL PROTECTED]>

diff -r 1ba29b921199 -r 038c26041d01 drivers/infiniband/hw/ipath/ipath_layer.c
--- /dev/null   Thu Jan  1 00:00:00 1970 +
+++ b/drivers/infiniband/hw/ipath/ipath_layer.c Thu Mar 23 20:27:45 2006 -0800
@@ -0,0 +1,1514 @@
+/*
+ * Copyright (c) 2003, 2004, 2005, 2006 PathScale, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/*
+ * These are the routines used by layered drivers, currently just the
+ * layered ethernet driver and verbs layer.
+ */
+
+#include 
+#include 
+#include 
+
+#include "ipath_kernel.h"
+#include "ips_common.h"
+#include "ipath_layer.h"
+
+/* Acquire before ipath_devs_lock. */
+static DEFINE_MUTEX(ipath_layer_mutex);
+
+u16 ipath_layer_rcv_opcode;
+static int (*layer_intr)(void *, u32);
+static int (*layer_rcv)(void *, void *, struct sk_buff *);
+static int (*layer_rcv_lid)(void *, void *);
+static int (*verbs_piobufavail)(void *);
+static void (*verbs_rcv)(void *, void *, void *, u32);
+int ipath_verbs_registered;
+
+static void *(*layer_add_one)(int, struct ipath_devdata *);
+static void (*layer_remove_one)(void *);
+static void *(*verbs_add_one)(int, struct ipath_devdata *);
+static void (*verbs_remove_one)(void *);
+static void (*verbs_timer_cb)(void *);
+
+int __ipath_layer_intr(struct ipath_devdata *dd, u32 arg)
+{
+   int ret = -ENODEV;
+
+   if (dd->ipath_layer.l_arg && layer_intr)
+   ret = layer_intr(dd->ipath_layer.l_arg, arg);
+
+   return ret;
+}
+
+int ipath_layer_intr(struct ipath_devdata *dd, u32 arg)
+{
+   int ret;
+
+   mutex_lock(&ipath_layer_mutex);
+
+   ret = __ipath_layer_intr(dd, arg);
+
+   mutex_unlock(&ipath_layer_mutex);
+
+   return ret;
+}
+
+int __ipath_layer_rcv(struct ipath_devdata *dd, void *hdr,
+ struct sk_buff *skb)
+{
+   int ret = -ENODEV;
+
+   if (dd->ipath_layer.l_arg && layer_rcv)
+   ret = layer_rcv(dd->ipath_layer.l_arg, hdr, skb);
+
+   return ret;
+}
+
+int __ipath_layer_rcv_lid(struct ipath_devdata *dd, void *hdr)
+{
+   int ret = -ENODEV;
+
+   if (dd->ipath_layer.l_arg && layer_rcv_lid)
+   ret = layer_rcv_lid(dd->ipath_layer.l_arg, hdr);
+
+   return ret;
+}
+
+int __ipath_verbs_piobufavail(struct ipath_devdata *dd)
+{
+   int ret = -ENODEV;
+
+   if (dd->verbs_layer.l_arg && verbs_piobufavail)
+   ret = verbs_piobufavail(dd->verbs_layer.l_arg);
+
+   return ret;
+}
+
+int __ipath_verbs_rcv(struct ipath_devdata *dd, void *rc, void *ebuf,
+ u32 tlen)
+{
+   int ret = -ENODEV;
+
+   if (dd->verbs_layer.l_arg && verbs_rcv) {
+   verbs_rcv(dd->verbs_layer.l_arg, rc, ebuf, tlen);
+   ret = 0;
+   }
+
+   return ret;
+}
+
+int ipath_layer_set_linkstate(struct ipath_devdata *dd, u8 newstate)
+{
+   u32 lstate;
+   int ret;
+
+   switch (newstate) {
+   case IPATH_IB_LINKDOWN:
+   ipath_set_ib_lstate(dd, INFINIPATH_IBCC_LINKINITCMD_POLL <<
+   INFINIPATH_IBCC_LINKINITCMD_SHIFT);
+   /* don't wait */
+   ret = 0;
+   goto bail;
+
+   case IPATH_IB_LINKDOWN_SLEEP:
+   ipath_set_ib_lstate(dd, INFINIPATH_IBCC_LINKINITCMD_SLEEP <<
+   INFINIPATH_IBCC_LINKINITCMD_SHIFT);
+   /* don't wait */
+   ret = 0;
+   goto bail;
+
+   case IPATH_IB_LINKDOWN_DISABLE:
+   
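
The __ipath_layer_*() wrappers above all follow one pattern: a file-scope function pointer is called only if a layered driver has registered it and the device carries that driver's private argument; otherwise the call returns -ENODEV. A minimal user-space sketch of that pattern is shown below. The registration helpers and the example handler are hypothetical, since the real registration code is not visible in the part of the patch quoted here, and the sketch leaves out the mutex.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for struct ipath_devdata; only the per-device cookie is kept. */
struct dev {
	void *l_arg;
};

/* Callback slot, empty until a layered driver registers. */
static int (*layer_intr)(void *, unsigned int);

/* Hypothetical registration helpers (not taken from the patch). */
static void layer_register(int (*intr)(void *, unsigned int))
{
	layer_intr = intr;
}

static void layer_unregister(void)
{
	layer_intr = NULL;
}

/* Same shape as __ipath_layer_intr(): -ENODEV unless both are set. */
static int dispatch_intr(struct dev *dd, unsigned int arg)
{
	int ret = -ENODEV;

	if (dd->l_arg && layer_intr)
		ret = layer_intr(dd->l_arg, arg);

	return ret;
}

/* Example handler: adds arg to the int that the cookie points at. */
static int count_intr(void *cookie, unsigned int arg)
{
	*(int *)cookie += (int)arg;
	return 0;
}
```

The design keeps the core driver loadable without any upper layer: callers always get a defined error instead of dereferencing a NULL callback.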

[openib-general] [PATCH 14 of 18] ipath - infiniband RC protocol support

2006-03-23 Thread Bryan O'Sullivan
This is an implementation of the InfiniBand RC ("reliable connection")
protocol.

Signed-off-by: Bryan O'Sullivan <[EMAIL PROTECTED]>

diff -r 572e99c2ab63 -r c57bdb16 drivers/infiniband/hw/ipath/ipath_rc.c
--- /dev/null   Thu Jan  1 00:00:00 1970 +
+++ b/drivers/infiniband/hw/ipath/ipath_rc.cThu Mar 23 20:27:45 2006 -0800
@@ -0,0 +1,1864 @@
+/*
+ * Copyright (c) 2005, 2006 PathScale, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "ipath_verbs.h"
+#include "ips_common.h"
+
+/* cut down ridiculously long IB macro names */
+#define OP(x) IB_OPCODE_RC_##x
+
+/**
+ * ipath_init_restart- initialize the qp->s_sge after a restart
+ * @qp: the QP whose SGE we're restarting
+ * @wqe: the work queue to initialize the QP's SGE from
+ *
+ * The QP s_lock should be held.
+ */
+static void ipath_init_restart(struct ipath_qp *qp, struct ipath_swqe *wqe)
+{
+   struct ipath_ibdev *dev;
+   u32 len;
+
+   len = ((qp->s_psn - wqe->psn) & IPS_PSN_MASK) *
+   ib_mtu_enum_to_int(qp->path_mtu);
+   qp->s_sge.sge = wqe->sg_list[0];
+   qp->s_sge.sg_list = wqe->sg_list + 1;
+   qp->s_sge.num_sge = wqe->wr.num_sge;
+   ipath_skip_sge(&qp->s_sge, len);
+   qp->s_len = wqe->length - len;
+   dev = to_idev(qp->ibqp.device);
+   spin_lock(&dev->pending_lock);
+   if (qp->timerwait.next == LIST_POISON1)
+   list_add_tail(&qp->timerwait,
+ &dev->pending[dev->pending_index]);
+   spin_unlock(&dev->pending_lock);
+}
+
+/**
+ * ipath_make_rc_ack - construct a response packet (ACK, NAK, or RDMA read)
+ * @qp: a pointer to the QP
+ * @ohdr: a pointer to the IB header being constructed
+ * @pmtu: the path MTU
+ *
+ * Return bth0 if constructed; otherwise, return 0.
+ * Note the QP s_lock must be held.
+ */
+static inline u32 ipath_make_rc_ack(struct ipath_qp *qp,
+   struct ipath_other_headers *ohdr,
+   u32 pmtu)
+{
+   struct ipath_sge_state *ss;
+   u32 hwords;
+   u32 len;
+   u32 bth0;
+
+   /* header size in 32-bit words LRH+BTH = (8+12)/4. */
+   hwords = 5;
+
+   /*
+* Send a response.  Note that we are in the responder's
+* side of the QP context.
+*/
+   switch (qp->s_ack_state) {
+   case OP(RDMA_READ_REQUEST):
+   ss = &qp->s_rdma_sge;
+   len = qp->s_rdma_len;
+   if (len > pmtu) {
+   len = pmtu;
+   qp->s_ack_state = OP(RDMA_READ_RESPONSE_FIRST);
+   }
+   else
+   qp->s_ack_state = OP(RDMA_READ_RESPONSE_ONLY);
+   qp->s_rdma_len -= len;
+   bth0 = qp->s_ack_state << 24;
+   ohdr->u.aeth = ipath_compute_aeth(qp);
+   hwords++;
+   break;
+
+   case OP(RDMA_READ_RESPONSE_FIRST):
+   qp->s_ack_state = OP(RDMA_READ_RESPONSE_MIDDLE);
+   /* FALLTHROUGH */
+   case OP(RDMA_READ_RESPONSE_MIDDLE):
+   ss = &qp->s_rdma_sge;
+   len = qp->s_rdma_len;
+   if (len > pmtu)
+   len = pmtu;
+   else {
+   ohdr->u.aeth = ipath_compute_aeth(qp);
+   hwords++;
+   qp->s_ack_state = OP(RDMA_READ_RESPONSE_LAST);
+   }
+   qp->s_rdma_len -= len;
+   bth0 = qp->s_ack_state << 24;
+   break;
+
+   case OP(RDMA_READ_RESPONSE_LAST):
+   case OP(RDMA_READ_RESPONSE_ONLY):
+   
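
The visible part of the switch in ipath_make_rc_ack() splits an RDMA read response into path-MTU-sized packets and picks the opcode for each: ONLY if everything fits in one packet, otherwise FIRST, zero or more MIDDLE, then LAST. The user-space sketch below models just that opcode and length selection. The enum, struct and function names are invented, header construction (bth0, AETH, hwords) is omitted, and returning -1 after LAST or ONLY is an assumption because that part of the patch is not shown.

```c
enum ack_state {
	READ_REQUEST,	/* request received, nothing sent yet */
	RESP_FIRST,
	RESP_MIDDLE,
	RESP_LAST,
	RESP_ONLY
};

struct responder {
	enum ack_state state;
	unsigned int rdma_len;	/* bytes still to be returned */
};

/*
 * Pick the next response packet.  On return r->state is the opcode the
 * packet carries and the return value is its payload length.  Returns
 * -1 once LAST or ONLY has already been sent (assumption, see above).
 */
static int next_response(struct responder *r, unsigned int pmtu)
{
	unsigned int len = r->rdma_len;

	switch (r->state) {
	case READ_REQUEST:
		if (len > pmtu) {
			len = pmtu;
			r->state = RESP_FIRST;
		} else
			r->state = RESP_ONLY;
		break;

	case RESP_FIRST:
		r->state = RESP_MIDDLE;
		/* fall through */
	case RESP_MIDDLE:
		if (len > pmtu)
			len = pmtu;
		else
			r->state = RESP_LAST;
		break;

	case RESP_LAST:
	case RESP_ONLY:
		return -1;
	}
	r->rdma_len -= len;
	return (int)len;
}
```

For a 5000-byte read with a 2048-byte path MTU this yields FIRST (2048 bytes), MIDDLE (2048 bytes) and LAST (904 bytes); a 100-byte read yields a single ONLY packet.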

[openib-general] [PATCH 13 of 18] ipath - infiniband UC and UD protocol support

2006-03-23 Thread Bryan O'Sullivan
These files implement the InfiniBand UC ("unreliable connection") and UD
("unreliable datagram") protocols.

Signed-off-by: Bryan O'Sullivan <[EMAIL PROTECTED]>

diff -r 131fb4befa93 -r 572e99c2ab63 drivers/infiniband/hw/ipath/ipath_uc.c
--- /dev/null   Thu Jan  1 00:00:00 1970 +
+++ b/drivers/infiniband/hw/ipath/ipath_uc.cThu Mar 23 20:27:45 2006 -0800
@@ -0,0 +1,645 @@
+/*
+ * Copyright (c) 2005, 2006 PathScale, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "ipath_verbs.h"
+#include "ips_common.h"
+
+/* cut down ridiculously long IB macro names */
+#define OP(x) IB_OPCODE_UC_##x
+
+static void complete_last_send(struct ipath_qp *qp, struct ipath_swqe *wqe,
+  struct ib_wc *wc)
+{
+   if (++qp->s_last == qp->s_size)
+   qp->s_last = 0;
+   if (!test_bit(IPATH_S_SIGNAL_REQ_WR, &qp->s_flags) ||
+   (wqe->wr.send_flags & IB_SEND_SIGNALED)) {
+   wc->wr_id = wqe->wr.wr_id;
+   wc->status = IB_WC_SUCCESS;
+   wc->opcode = ib_ipath_wc_opcode[wqe->wr.opcode];
+   wc->vendor_err = 0;
+   wc->byte_len = wqe->length;
+   wc->qp_num = qp->ibqp.qp_num;
+   wc->src_qp = qp->remote_qpn;
+   wc->pkey_index = 0;
+   wc->slid = qp->remote_ah_attr.dlid;
+   wc->sl = qp->remote_ah_attr.sl;
+   wc->dlid_path_bits = 0;
+   wc->port_num = 0;
+   ipath_cq_enter(to_icq(qp->ibqp.send_cq), wc, 0);
+   }
+   wqe = get_swqe_ptr(qp, qp->s_last);
+}
+
+/**
+ * ipath_do_uc_send - do a send on a UC queue
+ * @data: contains a pointer to the QP to send on
+ *
+ * Process entries in the send work queue until the queue is exhausted.
+ * Only allow one CPU to send a packet per QP (tasklet).
+ * Otherwise, after we drop the QP lock, two threads could send
+ * packets out of order.
+ * This is similar to ipath_do_rc_send() below except we don't have
+ * timeouts or resends.
+ */
+void ipath_do_uc_send(unsigned long data)
+{
+   struct ipath_qp *qp = (struct ipath_qp *)data;
+   struct ipath_ibdev *dev = to_idev(qp->ibqp.device);
+   struct ipath_swqe *wqe;
+   unsigned long flags;
+   u16 lrh0;
+   u32 hwords;
+   u32 nwords;
+   u32 extra_bytes;
+   u32 bth0;
+   u32 bth2;
+   u32 pmtu = ib_mtu_enum_to_int(qp->path_mtu);
+   u32 len;
+   struct ipath_other_headers *ohdr;
+   struct ib_wc wc;
+
+   if (test_and_set_bit(IPATH_S_BUSY, &qp->s_flags))
+   goto bail;
+
+   if (unlikely(qp->remote_ah_attr.dlid ==
+ipath_layer_get_lid(dev->dd))) {
+   /* Pass in an uninitialized ib_wc to save stack space. */
+   ipath_ruc_loopback(qp, &wc);
+   clear_bit(IPATH_S_BUSY, &qp->s_flags);
+   goto bail;
+   }
+
+   ohdr = &qp->s_hdr.u.oth;
+   if (qp->remote_ah_attr.ah_flags & IB_AH_GRH)
+   ohdr = &qp->s_hdr.u.l.oth;
+
+again:
+   /* Check for a constructed packet to be sent. */
+   if (qp->s_hdrwords != 0) {
+   /*
+* If no PIO bufs are available, return.
+* An interrupt will call ipath_ib_piobufavail()
+* when one is available.
+*/
+   if (ipath_verbs_send(dev->dd, qp->s_hdrwords,
+(u32 *) &qp->s_hdr,
+qp->s_cur_size,
+qp->s_cur_sge)) 
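
The test_and_set_bit(IPATH_S_BUSY, ...) check at the top of ipath_do_uc_send() is what enforces "only one CPU sends per QP": whoever sets the bit first proceeds, and anyone who finds it already set bails out. The same guard can be sketched in portable C11 with atomic_flag. The struct and function names below are invented for illustration, and the sketch says nothing about how the driver reschedules a sender that bailed out.

```c
#include <stdatomic.h>

/* Minimal stand-in for the QP: a busy flag and a packet counter. */
struct qp {
	atomic_flag busy;
	int sent;
};

/*
 * Try to become the single sender.  Returns 1 if the caller now owns
 * the QP's send path, 0 if another sender already holds it.
 */
static int send_begin(struct qp *qp)
{
	return !atomic_flag_test_and_set(&qp->busy);
}

/* Release the send path, like clear_bit(IPATH_S_BUSY, ...) in the driver. */
static void send_end(struct qp *qp)
{
	atomic_flag_clear(&qp->busy);
}

/* Send one packet if nobody else is sending; returns 1 if it was sent. */
static int try_send(struct qp *qp)
{
	if (!send_begin(qp))
		return 0;
	qp->sent++;		/* only the flag owner touches this */
	send_end(qp);
	return 1;
}
```

Because the flag is taken with a single atomic read-modify-write, two callers racing on the same QP cannot both see it clear, so packets for one QP are never built by two senders at once.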

[openib-general] [PATCH 8 of 18] ipath - sysfs and ipathfs support for core driver

2006-03-23 Thread Bryan O'Sullivan
The ipathfs filesystem contains files that are not appropriate for
sysfs, because they contain binary data.  The hierarchy is simple; the
top-level directory contains driver-wide attribute files, while numbered
subdirectories contain per-device attribute files.

Our userspace code currently expects this filesystem to be mounted on
/ipathfs, but a final location has not yet been chosen.

Signed-off-by: Bryan O'Sullivan <[EMAIL PROTECTED]>

diff -r 161d9363d755 -r d408c327e562 drivers/infiniband/hw/ipath/ipath_fs.c
--- /dev/null   Thu Jan  1 00:00:00 1970 +
+++ b/drivers/infiniband/hw/ipath/ipath_fs.c Thu Mar 23 20:27:45 2006 -0800
@@ -0,0 +1,601 @@
+/*
+ * Copyright (c) 2006 PathScale, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ipath_kernel.h"
+
+#define IPATHFS_MAGIC 0x726a77
+
+static struct super_block *ipath_super;
+
+static int ipathfs_mknod(struct inode *dir, struct dentry *dentry,
+int mode, struct file_operations *fop,
+void *data)
+{
+   int error;
+   struct inode *inode = new_inode(dir->i_sb);
+
+   if (!inode) {
+   error = -EPERM;
+   goto bail;
+   }
+
+   inode->i_mode = mode;
+   inode->i_uid = 0;
+   inode->i_gid = 0;
+   inode->i_blksize = PAGE_CACHE_SIZE;
+   inode->i_blocks = 0;
+   inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
+   inode->u.generic_ip = data;
+   if ((mode & S_IFMT) == S_IFDIR) {
+   inode->i_op = &simple_dir_inode_operations;
+   inode->i_nlink++;
+   dir->i_nlink++;
+   }
+
+   inode->i_fop = fop;
+
+   d_instantiate(dentry, inode);
+   error = 0;
+
+bail:
+   return error;
+}
+
+static int create_file(const char *name, mode_t mode,
+  struct dentry *parent, struct dentry **dentry,
+  struct file_operations *fop, void *data)
+{
+   int error;
+
+   *dentry = NULL;
+   mutex_lock(&parent->d_inode->i_mutex);
+   *dentry = lookup_one_len(name, parent, strlen(name));
+   if (!IS_ERR(*dentry))
+       error = ipathfs_mknod(parent->d_inode, *dentry,
+                             mode, fop, data);
+   else
+       error = PTR_ERR(*dentry);
+   mutex_unlock(&parent->d_inode->i_mutex);
+
+   return error;
+}
+
+static ssize_t atomic_stats_read(struct file *file, char __user *buf,
+size_t count, loff_t *ppos)
+{
+   return simple_read_from_buffer(buf, count, ppos, &ipath_stats,
+  sizeof ipath_stats);
+}
+
+static struct file_operations atomic_stats_ops = {
+   .read = atomic_stats_read,
+};
+
+#define NUM_COUNTERS (sizeof(struct infinipath_counters) / sizeof(u64))
+
+static ssize_t atomic_counters_read(struct file *file, char __user *buf,
+   size_t count, loff_t *ppos)
+{
+   u64 counters[NUM_COUNTERS];
+   u16 i;
+   struct ipath_devdata *dd;
+
+   dd = file->f_dentry->d_inode->u.generic_ip;
+
+   for (i = 0; i < NUM_COUNTERS; i++)
+   counters[i] = ipath_snap_cntr(dd, i);
+
+   return simple_read_from_buffer(buf, count, ppos, counters,
+  sizeof counters);
+}
+
+static struct file_operations atomic_counters_ops = {
+   .read = atomic_counters_read,
+};
+
+static ssize_t atomic_node_info_read(struct file *file, char __user *buf,
+

[openib-general] [PATCH 9 of 18] ipath - char devices for diagnostics and lightweight subnet management

2006-03-23 Thread Bryan O'Sullivan
The ipath_diag.c file permits userspace diagnostic tools to read and
write a chip's registers.  It is different in purpose from the mmap
interfaces to the /sys/bus/pci resource files.

The ipath_sma.c file supports a lightweight userspace subnet management
agent (SMA).  This is used in deployments (such as HPC clusters) where
a full InfiniBand protocol stack is not needed.  The facilities provided
by ipath_sma.c are also used by userspace hardware diagnostic tools.

Signed-off-by: Bryan O'Sullivan <[EMAIL PROTECTED]>

diff -r d408c327e562 -r 26cb45d8d61c drivers/infiniband/hw/ipath/ipath_diag.c
--- /dev/null   Thu Jan  1 00:00:00 1970 +
+++ b/drivers/infiniband/hw/ipath/ipath_diag.c  Thu Mar 23 20:27:45 2006 -0800
@@ -0,0 +1,379 @@
+/*
+ * Copyright (c) 2003, 2004, 2005, 2006 PathScale, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/*
+ * This file contains support for diagnostic functions.  It is accessed by
+ * opening the ipath_diag device, normally minor number 129.  Diagnostic use
+ * of the InfiniPath chip may render the chip or board unusable until the
+ * driver is unloaded, or in some cases, until the system is rebooted.
+ *
+ * Accesses to the chip through this interface are not similar to going
+ * through the /sys/bus/pci resource mmap interface.
+ */
+
+#include 
+#include 
+
+#include "ipath_common.h"
+#include "ipath_kernel.h"
+#include "ips_common.h"
+#include "ipath_layer.h"
+
+int ipath_diag_inuse;
+static int diag_set_link;
+
+static int ipath_diag_open(struct inode *in, struct file *fp);
+static int ipath_diag_release(struct inode *in, struct file *fp);
+static ssize_t ipath_diag_read(struct file *fp, char __user *data,
+  size_t count, loff_t *off);
+static ssize_t ipath_diag_write(struct file *fp, const char __user *data,
+   size_t count, loff_t *off);
+
+static struct file_operations diag_file_ops = {
+   .owner = THIS_MODULE,
+   .write = ipath_diag_write,
+   .read = ipath_diag_read,
+   .open = ipath_diag_open,
+   .release = ipath_diag_release
+};
+
+static struct cdev *diag_cdev;
+static struct class_device *diag_class_dev;
+
+int ipath_diag_init(void)
+{
+   return ipath_cdev_init(IPATH_DIAG_MINOR, "ipath_diag",
+  &diag_file_ops, &diag_cdev, &diag_class_dev);
+}
+
+void ipath_diag_cleanup(void)
+{
+   ipath_cdev_cleanup(&diag_cdev, &diag_class_dev);
+}
+
+/**
+ * ipath_read_umem64 - read a 64-bit quantity from the chip into user space
+ * @dd: the infinipath device
+ * @uaddr: the location to store the data in user memory
+ * @caddr: the source chip address (full pointer, not offset)
+ * @count: number of bytes to copy (multiple of 32 bits)
+ *
+ * This function also localizes all chip memory accesses.
+ * The copy should be written such that we read full cacheline packets
+ * from the chip.  This is usually used for a single qword.
+ *
+ * NOTE:  This assumes the chip address is 64-bit aligned.
+ */
+static int ipath_read_umem64(struct ipath_devdata *dd, void __user *uaddr,
+const void __iomem *caddr, size_t count)
+{
+   const u64 __iomem *reg_addr = caddr;
+   const u64 __iomem *reg_end = reg_addr + (count / sizeof(u64));
+   int ret;
+
+   /* not very efficient, but it works for now */
+   if (reg_addr < dd->ipath_kregbase ||
+   reg_end > dd->ipath_kregend) {
+   ret = -EINVAL;
+   goto bail;
+   }
+   while (reg_addr < reg_end) {
+   u64 data = readq(reg_addr);
+   if (copy_to_user(uaddr, &data, si

[openib-general] [PATCH 12 of 18] ipath - infiniband header files

2006-03-23 Thread Bryan O'Sullivan
These header files are used by the layered InfiniBand driver.

Signed-off-by: Bryan O'Sullivan <[EMAIL PROTECTED]>

diff -r 038c26041d01 -r 131fb4befa93 drivers/infiniband/hw/ipath/ipath_verbs.h
--- /dev/null   Thu Jan  1 00:00:00 1970 +
+++ b/drivers/infiniband/hw/ipath/ipath_verbs.h Thu Mar 23 20:27:45 2006 -0800
@@ -0,0 +1,697 @@
+/*
+ * Copyright (c) 2005, 2006 PathScale, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef IPATH_VERBS_H
+#define IPATH_VERBS_H
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ipath_layer.h"
+#include "verbs_debug.h"
+
+#define QPN_MAX (1 << 24)
+#define QPNMAP_ENTRIES  (QPN_MAX / PAGE_SIZE / BITS_PER_BYTE)
+
+/*
+ * Increment this value if any changes that break userspace ABI
+ * compatibility are made.
+ */
+#define IPATH_UVERBS_ABI_VERSION   1
+
+/*
+ * Define an ib_cq_notify value that is not valid so we know when CQ
+ * notifications are armed.
+ */
+#define IB_CQ_NONE (IB_CQ_NEXT_COMP + 1)
+
+#define IB_RNR_NAK 0x20
+#define IB_NAK_PSN_ERROR   0x60
+#define IB_NAK_INVALID_REQUEST 0x61
+#define IB_NAK_REMOTE_ACCESS_ERROR 0x62
+#define IB_NAK_REMOTE_OPERATIONAL_ERROR 0x63
+#define IB_NAK_INVALID_RD_REQUEST  0x64
+
+#define IPATH_POST_SEND_OK 0x01
+#define IPATH_POST_RECV_OK 0x02
+#define IPATH_PROCESS_RECV_OK  0x04
+#define IPATH_PROCESS_SEND_OK  0x08
+
+/* IB Performance Manager status values */
+#define IB_PMA_SAMPLE_STATUS_DONE  0x00
+#define IB_PMA_SAMPLE_STATUS_STARTED   0x01
+#define IB_PMA_SAMPLE_STATUS_RUNNING   0x02
+
+/* Mandatory IB performance counter select values. */
+#define IB_PMA_PORT_XMIT_DATA  __constant_htons(0x0001)
+#define IB_PMA_PORT_RCV_DATA   __constant_htons(0x0002)
+#define IB_PMA_PORT_XMIT_PKTS  __constant_htons(0x0003)
+#define IB_PMA_PORT_RCV_PKTS   __constant_htons(0x0004)
+#define IB_PMA_PORT_XMIT_WAIT  __constant_htons(0x0005)
+
+struct ib_reth {
+   __be64 vaddr;
+   __be32 rkey;
+   __be32 length;
+} __attribute__ ((packed));
+
+struct ib_atomic_eth {
+   __be64 vaddr;
+   __be32 rkey;
+   __be64 swap_data;
+   __be64 compare_data;
+} __attribute__ ((packed));
+
+struct ipath_other_headers {
+   __be32 bth[3];
+   union {
+   struct {
+   __be32 deth[2];
+   __be32 imm_data;
+   } ud;
+   struct {
+   struct ib_reth reth;
+   __be32 imm_data;
+   } rc;
+   struct {
+   __be32 aeth;
+   __be64 atomic_ack_eth;
+   } at;
+   __be32 imm_data;
+   __be32 aeth;
+   struct ib_atomic_eth atomic_eth;
+   } u;
+} __attribute__ ((packed));
+
+/*
+ * Note that UD packets with a GRH header are 8+40+12+8 = 68 bytes
+ * long (72 w/ imm_data).  Only the first 56 bytes of the IB header
+ * will be in the eager header buffer.  The remaining 12 or 16 bytes
+ * are in the data buffer.
+ */
+struct ipath_ib_header {
+   __be16 lrh[4];
+   union {
+   struct {
+   struct ib_grh grh;
+   struct ipath_other_headers oth;
+   } l;
+   struct ipath_other_headers oth;
+   } u;
+} __attribute__ ((packed));
+
+/*
+ * There is one struct ipath_mcast for each multicast GID.
+ * All attached QPs are then stored as a list of
+ * struct ipath_mcast_qp.
+ */
+struct ipath_mcast_qp {
+   struct list_head list;
+   

[openib-general] [PATCH 7 of 18] ipath - misc driver support code

2006-03-23 Thread Bryan O'Sullivan
EEPROM support, interrupt handling, statistics gathering, and write
combining management for x86_64.

A note regarding i2c: The Atmel EEPROM hardware we use looks like an
i2c device electrically, but is not i2c compliant at all from a
functional perspective.  We tried using the kernel's i2c support to
talk to it, but failed.

Normal i2c devices have a single 7-bit or 10-bit i2c address that they
respond to.  Valid 7-bit addresses range from 0x03 to 0x77.  Addresses
0x00 to 0x02 and 0x78 to 0x7F are special reserved addresses
(e.g. 0x00 is the "general call" address.)  The Atmel device, on the
other hand, responds to ALL addresses.  It's designed to be the only
device on a given i2c bus.  A given i2c device address corresponds to
the memory address within the i2c device itself.

One reason the Linux core i2c code won't work here is that it
prohibits access to reserved addresses such as 0x00, which are valid
memory addresses on the Atmel devices.
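
To make the address ranges above concrete, here is a small sketch that classifies 7-bit addresses the way a conventional i2c bus would, and counts how many EEPROM memory locations would become unreachable as a result. The macro and function names are hypothetical; only the ranges come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the ranges are the ones quoted in the note above. */
#define I2C_ADDR_FIRST_VALID 0x03
#define I2C_ADDR_LAST_VALID  0x77
#define I2C_ADDR_MAX_7BIT    0x7F

/* True if a conventional i2c bus treats this 7-bit address as reserved. */
static bool i2c_addr_is_reserved(uint8_t addr)
{
	assert(addr <= I2C_ADDR_MAX_7BIT);
	return addr < I2C_ADDR_FIRST_VALID || addr > I2C_ADDR_LAST_VALID;
}

/*
 * Count the 7-bit addresses that generic i2c code would refuse to use.
 * On a device that maps every address to a memory location, each of
 * these is a location that could not be reached.
 */
static unsigned count_reserved_addrs(void)
{
	unsigned a, n = 0;

	for (a = 0; a <= I2C_ADDR_MAX_7BIT; a++)
		if (i2c_addr_is_reserved((uint8_t)a))
			n++;
	return n;
}
```

With the ranges given, 0x00 to 0x02 and 0x78 to 0x7F add up to 11 addresses that generic i2c code would reject.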

Signed-off-by: Bryan O'Sullivan <[EMAIL PROTECTED]>

diff -r 4a34e16a8d87 -r 161d9363d755 drivers/infiniband/hw/ipath/ipath_eeprom.c
--- /dev/null   Thu Jan  1 00:00:00 1970 +
+++ b/drivers/infiniband/hw/ipath/ipath_eeprom.c Thu Mar 23 20:27:45 2006 -0800
@@ -0,0 +1,613 @@
+/*
+ * Copyright (c) 2003, 2004, 2005, 2006 PathScale, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+
+#include "ipath_kernel.h"
+
+/*
+ * InfiniPath I2C driver for a serial eeprom.  This is not a generic
+ * I2C interface.  For a start, the device we're using (Atmel AT24C11)
+ * doesn't work like a regular I2C device.  It looks like one
+ * electrically, but not logically.  Normal I2C devices have a single
+ * 7-bit or 10-bit I2C address that they respond to.  Valid 7-bit
+ * addresses range from 0x03 to 0x77.  Addresses 0x00 to 0x02 and 0x78
+ * to 0x7F are special reserved addresses (e.g. 0x00 is the "general
+ * call" address.)  The Atmel device, on the other hand, responds to ALL
+ * 7-bit addresses.  It's designed to be the only device on a given I2C
+ * bus.  A 7-bit address corresponds to the memory address within the
+ * Atmel device itself.
+ *
+ * Also, the timing requirements mean more than simple software
+ * bitbanging, with readbacks from chip to ensure timing (simple udelay
+ * is not enough).
+ *
+ * This all means that accessing the device is specialized enough
+ * that using the standard kernel I2C bitbanging interface would be
+ * impossible.  For example, the core I2C eeprom driver expects to find
+ * a device at one or more of a limited set of addresses only.  It doesn't
+ * allow writing to an eeprom.  It also doesn't provide any means of
+ * accessing eeprom contents from within the kernel, only via sysfs.
+ */
+
+enum i2c_type {
+   i2c_line_scl = 0,
+   i2c_line_sda
+};
+
+enum i2c_state {
+   i2c_line_low = 0,
+   i2c_line_high
+};
+
+#define READ_CMD 1
+#define WRITE_CMD 0
+
+static int eeprom_init;
+
+/*
+ * The gpioval manipulation really should be protected by spinlocks
+ * or be converted to use atomic operations.
+ */
+
+/**
+ * i2c_gpio_set - set a GPIO line
+ * @dd: the infinipath device
+ * @line: the line to set
+ * @new_line_state: the state to set
+ *
+ * Returns 0 if the line was set to the new state successfully, non-zero
+ * on error.
+ */
+static int i2c_gpio_set(struct ipath_devdata *dd,
+   enum i2c_type line,
+   enum i2c_state new_line_state)
+{
+   u64 read_val, write_val, mask, *gpioval;
+
+   gpioval = &dd->ipath_gpio_out;
+   read_val = ipath_read_kreg64(dd, dd->ipath_kr

[openib-general] [PATCH 6 of 18] ipath - chip initialisation code

2006-03-23 Thread Bryan O'Sullivan
ipath_init_chip.c sets up an InfiniPath device for use.

Signed-off-by: Bryan O'Sullivan <[EMAIL PROTECTED]>

diff -r 057022f0645a -r 4a34e16a8d87 drivers/infiniband/hw/ipath/ipath_init_chip.c
--- /dev/null   Thu Jan  1 00:00:00 1970 +
+++ b/drivers/infiniband/hw/ipath/ipath_init_chip.c Thu Mar 23 20:27:45 2006 -0800
@@ -0,0 +1,963 @@
+/*
+ * Copyright (c) 2003, 2004, 2005, 2006 PathScale, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+
+#include "ipath_kernel.h"
+#include "ips_common.h"
+
+/*
+ * min buffers we want to have per port, after driver
+ */
+#define IPATH_MIN_USER_PORT_BUFCNT 8
+
+/*
+ * Number of ports we are configured to use (to allow for more pio
+ * buffers per port, etc.)  Zero means use chip value.
+ */
+static ushort ipath_cfgports;
+
+module_param_named(cfgports, ipath_cfgports, ushort, S_IRUGO);
+MODULE_PARM_DESC(cfgports, "Set max number of ports to use");
+
+/*
+ * Number of buffers reserved for driver (layered drivers and SMA
+ * send).  Reserved at end of buffer list.
+ */
+static ushort ipath_kpiobufs = 32;
+
+static int ipath_set_kpiobufs(const char *val, struct kernel_param *kp);
+
+module_param_call(kpiobufs, ipath_set_kpiobufs, param_get_ushort,
+ &ipath_kpiobufs, S_IWUSR | S_IRUGO);
+MODULE_PARM_DESC(kpiobufs, "Set number of PIO buffers for driver");
+
+/**
+ * create_port0_egr - allocate the eager TID buffers
+ * @dd: the infinipath device
+ *
+ * This code is now quite different for user and kernel, because
+ * the kernel uses skb's, for the accelerated network performance.
+ * This is the kernel (port0) version.
+ *
+ * Allocate the eager TID buffers and program them into infinipath.
+ * We use the network layer alloc_skb() allocator to allocate the
+ * memory, and either use the buffers as is for things like SMA
+ * packets, or pass the buffers up to the ipath layered driver and
+ * thence the network layer, replacing them as we do so (see
+ * ipath_rcv_layer()).
+ */
+static int create_port0_egr(struct ipath_devdata *dd)
+{
+   unsigned e, egrcnt;
+   struct sk_buff **skbs;
+   int ret;
+
+   egrcnt = dd->ipath_rcvegrcnt;
+
+   skbs = vmalloc(sizeof(*dd->ipath_port0_skbs) * egrcnt);
+   if (skbs == NULL) {
+   ipath_dev_err(dd, "allocation error for eager TID "
+ "skb array\n");
+   ret = -ENOMEM;
+   goto bail;
+   }
+   for (e = 0; e < egrcnt; e++) {
+   /*
+* This is a bit tricky in that we allocate extra
+* space for 2 bytes of the 14 byte ethernet header.
+* These two bytes are passed in the ipath header so
+* the rest of the data is word aligned.  We allocate
+* 4 bytes so that the data buffer stays word aligned.
+* See ipath_kreceive() for more details.
+*/
+   skbs[e] = ipath_alloc_skb(dd, GFP_KERNEL);
+   if (!skbs[e]) {
+   ipath_dev_err(dd, "SKB allocation error for "
+ "eager TID %u\n", e);
+   while (e != 0)
+   dev_kfree_skb(skbs[--e]);
+   ret = -ENOMEM;
+   goto bail;
+   }
+   }
+   /*
+* After loop above, so we can test non-NULL to see if ready
+* to use at receive, etc.
+*/
+   dd->ipath_port0_skbs = skbs;
+
+   for (e = 0; e < egrcnt; e++) {
+   unsigned long phys =
+   virt_to_phys(

[openib-general] [PATCH 4 of 18] ipath - support for HyperTransport devices

2006-03-23 Thread Bryan O'Sullivan
The ipath_ht400.c file contains routines and definitions specific to
HyperTransport-based InfiniPath devices.

Signed-off-by: Bryan O'Sullivan <[EMAIL PROTECTED]>

diff -r 5685fc1cd481 -r 77e660c4cb59 drivers/infiniband/hw/ipath/ipath_ht400.c
--- /dev/null   Thu Jan  1 00:00:00 1970 +
+++ b/drivers/infiniband/hw/ipath/ipath_ht400.c Thu Mar 23 20:27:44 2006 -0800
@@ -0,0 +1,1585 @@
+/*
+ * Copyright (c) 2003, 2004, 2005, 2006 PathScale, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/*
+ * This file contains all of the code that is specific to the InfiniPath
+ * HT-400 chip.
+ */
+
+#include 
+#include 
+
+#include "ipath_kernel.h"
+#include "ipath_registers.h"
+
+/*
+ * This lists the InfiniPath HT400 registers, in the actual chip layout.
+ * This structure should never be directly accessed.
+ *
+ * The names are in InterCap form because they're taken straight from
+ * the chip specification.  Since they're only used in this file, they
+ * don't pollute the rest of the source.
+*/
+
+struct _infinipath_do_not_use_kernel_regs {
+   unsigned long long Revision;
+   unsigned long long Control;
+   unsigned long long PageAlign;
+   unsigned long long PortCnt;
+   unsigned long long DebugPortSelect;
+   unsigned long long DebugPort;
+   unsigned long long SendRegBase;
+   unsigned long long UserRegBase;
+   unsigned long long CounterRegBase;
+   unsigned long long Scratch;
+   unsigned long long ReservedMisc1;
+   unsigned long long InterruptConfig;
+   unsigned long long IntBlocked;
+   unsigned long long IntMask;
+   unsigned long long IntStatus;
+   unsigned long long IntClear;
+   unsigned long long ErrorMask;
+   unsigned long long ErrorStatus;
+   unsigned long long ErrorClear;
+   unsigned long long HwErrMask;
+   unsigned long long HwErrStatus;
+   unsigned long long HwErrClear;
+   unsigned long long HwDiagCtrl;
+   unsigned long long MDIO;
+   unsigned long long IBCStatus;
+   unsigned long long IBCCtrl;
+   unsigned long long ExtStatus;
+   unsigned long long ExtCtrl;
+   unsigned long long GPIOOut;
+   unsigned long long GPIOMask;
+   unsigned long long GPIOStatus;
+   unsigned long long GPIOClear;
+   unsigned long long RcvCtrl;
+   unsigned long long RcvBTHQP;
+   unsigned long long RcvHdrSize;
+   unsigned long long RcvHdrCnt;
+   unsigned long long RcvHdrEntSize;
+   unsigned long long RcvTIDBase;
+   unsigned long long RcvTIDCnt;
+   unsigned long long RcvEgrBase;
+   unsigned long long RcvEgrCnt;
+   unsigned long long RcvBufBase;
+   unsigned long long RcvBufSize;
+   unsigned long long RxIntMemBase;
+   unsigned long long RxIntMemSize;
+   unsigned long long RcvPartitionKey;
+   unsigned long long ReservedRcv[10];
+   unsigned long long SendCtrl;
+   unsigned long long SendPIOBufBase;
+   unsigned long long SendPIOSize;
+   unsigned long long SendPIOBufCnt;
+   unsigned long long SendPIOAvailAddr;
+   unsigned long long TxIntMemBase;
+   unsigned long long TxIntMemSize;
+   unsigned long long ReservedSend[9];
+   unsigned long long SendBufferError;
+   unsigned long long SendBufferErrorCONT1;
+   unsigned long long SendBufferErrorCONT2;
+   unsigned long long SendBufferErrorCONT3;
+   unsigned long long ReservedSBE[4];
+   unsigned long long RcvHdrAddr0;
+   unsigned long long RcvHdrAddr1;
+   unsigned long long RcvHdrAddr2;
+   unsigned long long RcvHdrAddr3;
+   unsigned long long R

[openib-general] [PATCH 5 of 18] ipath - support for PCI Express devices

2006-03-23 Thread Bryan O'Sullivan
This file contains routines and definitions specific to InfiniPath
devices that have PCI Express interfaces.

Signed-off-by: Bryan O'Sullivan <[EMAIL PROTECTED]>

diff -r 77e660c4cb59 -r 057022f0645a drivers/infiniband/hw/ipath/ipath_pe800.c
--- /dev/null   Thu Jan  1 00:00:00 1970 +
+++ b/drivers/infiniband/hw/ipath/ipath_pe800.c Thu Mar 23 20:27:45 2006 -0800
@@ -0,0 +1,1243 @@
+/*
+ * Copyright (c) 2003, 2004, 2005, 2006 PathScale, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+/*
+ * This file contains all of the code that is specific to the
+ * InfiniPath PE-800 chip.
+ */
+
+#include 
+#include 
+#include 
+
+
+#include "ipath_kernel.h"
+#include "ipath_registers.h"
+
+/*
+ * This file contains all the chip-specific register information and
+ * access functions for the PathScale PE800, the PCI-Express chip.
+ *
+ * This lists the InfiniPath PE800 registers, in the actual chip layout.
+ * This structure should never be directly accessed.
+ */
+struct _infinipath_do_not_use_kernel_regs {
+   unsigned long long Revision;
+   unsigned long long Control;
+   unsigned long long PageAlign;
+   unsigned long long PortCnt;
+   unsigned long long DebugPortSelect;
+   unsigned long long Reserved0;
+   unsigned long long SendRegBase;
+   unsigned long long UserRegBase;
+   unsigned long long CounterRegBase;
+   unsigned long long Scratch;
+   unsigned long long Reserved1;
+   unsigned long long Reserved2;
+   unsigned long long IntBlocked;
+   unsigned long long IntMask;
+   unsigned long long IntStatus;
+   unsigned long long IntClear;
+   unsigned long long ErrorMask;
+   unsigned long long ErrorStatus;
+   unsigned long long ErrorClear;
+   unsigned long long HwErrMask;
+   unsigned long long HwErrStatus;
+   unsigned long long HwErrClear;
+   unsigned long long HwDiagCtrl;
+   unsigned long long MDIO;
+   unsigned long long IBCStatus;
+   unsigned long long IBCCtrl;
+   unsigned long long ExtStatus;
+   unsigned long long ExtCtrl;
+   unsigned long long GPIOOut;
+   unsigned long long GPIOMask;
+   unsigned long long GPIOStatus;
+   unsigned long long GPIOClear;
+   unsigned long long RcvCtrl;
+   unsigned long long RcvBTHQP;
+   unsigned long long RcvHdrSize;
+   unsigned long long RcvHdrCnt;
+   unsigned long long RcvHdrEntSize;
+   unsigned long long RcvTIDBase;
+   unsigned long long RcvTIDCnt;
+   unsigned long long RcvEgrBase;
+   unsigned long long RcvEgrCnt;
+   unsigned long long RcvBufBase;
+   unsigned long long RcvBufSize;
+   unsigned long long RxIntMemBase;
+   unsigned long long RxIntMemSize;
+   unsigned long long RcvPartitionKey;
+   unsigned long long Reserved3;
+   unsigned long long RcvPktLEDCnt;
+   unsigned long long Reserved4[8];
+   unsigned long long SendCtrl;
+   unsigned long long SendPIOBufBase;
+   unsigned long long SendPIOSize;
+   unsigned long long SendPIOBufCnt;
+   unsigned long long SendPIOAvailAddr;
+   unsigned long long TxIntMemBase;
+   unsigned long long TxIntMemSize;
+   unsigned long long Reserved5;
+   unsigned long long PCIeRBufTestReg0;
+   unsigned long long PCIeRBufTestReg1;
+   unsigned long long Reserved51[6];
+   unsigned long long SendBufferError;
+   unsigned long long SendBufferErrorCONT1;
+   unsigned long long Reserved6SBE[6];
+   unsigned long long RcvHdrAddr0;
+   unsigned long long RcvHdrAddr1;
+   unsigned long long RcvHdrAddr2;
+   uns

[openib-general] [PATCH 3 of 18] ipath - copy and send routines for sending an skb

2006-03-23 Thread Bryan O'Sullivan
These routines handle the access and alignment patterns required by the
hardware, so that skbs, which have looser requirements on alignment and
sizing, can be copied, checksummed, and sent efficiently.

Signed-off-by: Bryan O'Sullivan <[EMAIL PROTECTED]>

diff -r 4b2debbcae33 -r 5685fc1cd481 drivers/infiniband/hw/ipath/ipath_copy.c
--- /dev/null   Thu Jan  1 00:00:00 1970 +
+++ b/drivers/infiniband/hw/ipath/ipath_copy.c  Thu Mar 23 20:27:44 2006 -0800
@@ -0,0 +1,521 @@
+/*
+ * Copyright (c) 2003, 2004, 2005, 2006 PathScale, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+/*
+ * This file provides support for doing sk_buff buffer swapping between
+ * the low level driver eager buffers, and the network layer.  It's part
+ * of the core driver, rather than the ether driver, because it relies
+ * on variables and functions in the core driver.  It exports a single
+ * entry point for use in the ipath_ether module.
+ */
+
+#include 
+#include 
+#include 
+
+#include "ipath_kernel.h"
+#include "ips_common.h"
+
+/**
+ * layer_send_getpiobuf - allocate, setup and copy out a PIO send buffer
+ * @dd: the infinipath device
+ * @cdp: the data to copy
+ *
+ * Allocate a PIO send buffer, initialize the header and copy it out.
+ */
+static int layer_send_getpiobuf(struct ipath_devdata *dd,
+   struct copy_data_s *cdp)
+{
+   u32 extra_bytes;
+   u32 len, nwords, hdrwords;
+   u32 __iomem *piobuf;
+   int ret;
+
+   piobuf = ipath_getpiobuf(dd, NULL);
+   if (!piobuf) {
+   cdp->error = -EBUSY;
+   ret = cdp->error;
+   goto bail;
+   }
+
+   /*
+* Compute the max amount of data that can fit into a PIO buffer.
+* buffer size - header size - trigger qword length & flags - CRC
+*/
+   len = dd->ipath_ibmaxlen -
+   sizeof(struct ether_header) - 8 - (SIZE_OF_CRC << 2);
+   if (len > dd->ipath_rcvegrbufsize)
+   len = dd->ipath_rcvegrbufsize;
+   if (len > (cdp->len + cdp->extra))
+   len = (cdp->len + cdp->extra);
+   /* Compute word alignment (i.e., (len & 3) ? 4 - (len & 3) : 0) */
+   extra_bytes = (4 - len) & 3;
+   nwords = (sizeof(struct ether_header) + len + extra_bytes) >> 2;
+   cdp->hdr->lrh[2] = htons(nwords + SIZE_OF_CRC);
+   cdp->hdr->bth[0] = htonl((OPCODE_ITH4X << 24) +
+(extra_bytes << 20) +
+IPS_DEFAULT_P_KEY);
+   cdp->hdr->sub_opcode = OPCODE_ENCAP;
+
+   cdp->hdr->bth[2] = 0;
+   /*
+* Generate an interrupt on the receive side for the last
+* fragment.
+*/
+   cdp->hdr->iph.pkt_flags = ((cdp->len + cdp->extra) == len)
+   ? __cpu_to_le16(INFINIPATH_KPF_INTR) : 0;
+   cdp->hdr->iph.chksum = __cpu_to_le16(
+   (u16) IPS_LRH_BTH + (u16) (nwords + SIZE_OF_CRC) -
+   (u16) ((__le32_to_cpu(cdp->hdr->iph.ver_port_tid_offset)
+   >> 16) & 0xFFFF) -
+   (u16) (__le32_to_cpu(cdp->hdr->iph.ver_port_tid_offset)
+  & 0xFFFF) -
+   (u16) __le16_to_cpu(cdp->hdr->iph.pkt_flags));
+
+   ipath_cdbg(VERBOSE, "send %d (%x %x %x %x %x %x %x)\n", nwords,
+  cdp->hdr->lrh[0], cdp->hdr->lrh[1],
+  cdp->hdr->lrh[2], cdp->hdr->lrh[3],
+  cdp->hdr->bth[0], cdp->hdr->bth[1], cdp->hdr->bth[2]);
+   /*
+* Write len to control qword, no flags.
+* +1 is for the qword padding of pbc.
+*/
+   writeq(nwords + 1ULL, (u6
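
The padding arithmetic used in layer_send_getpiobuf() can be checked in isolation. The following is a standalone sketch, not part of the patch: the helper names are hypothetical, and the header size passed to total_words() is an arbitrary assumption rather than sizeof(struct ether_header).

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of padding needed to round len up to a multiple of 4. */
static uint32_t pad_to_word(uint32_t len)
{
    /* Unsigned wraparound makes (4 - len) & 3 equal to (-len) mod 4. */
    return (4 - len) & 3;
}

/* The same value written the long way, as spelled out in the patch comment. */
static uint32_t pad_to_word_explicit(uint32_t len)
{
    return (len & 3) ? 4 - (len & 3) : 0;
}

/* Number of 32-bit words occupied by a header plus a padded payload. */
static uint32_t total_words(uint32_t hdr_bytes, uint32_t len)
{
    /* hdr_bytes is assumed to be a multiple of 4 already. */
    return (hdr_bytes + len + pad_to_word(len)) >> 2;
}
```

For example, with an assumed 12-byte header, a 5-byte payload needs 3 bytes of padding, so total_words(12, 5) gives 5 words.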

[openib-general] [PATCH 0 of 18] ipath driver - for inclusion in 2.6.17

2006-03-23 Thread Bryan O'Sullivan
Hi -

This is a submission of the ipath driver for inclusion in 2.6.17.
Andrew, if this looks good to you, please apply.

We have addressed all earlier rounds of feedback; the driver is stable;
it compiles with no compiler or sparse warnings against current -git (it's
comprehensively annotated for sparse); and I think it's in good shape.
We have gone to great lengths over the past several months to make it
an exemplary kernel citizen.

Changes since the last round of review comments:

  - We have rewritten some code in ipath_rc.c to make it more
comprehensible and maintainable.

  - The ipathfs filesystem now handles hotplugged devices.

  - Miscellaneous fixes requested by Greg and Andrew.

If you have any comments or suggestions, please let me know.

The ipath driver is a driver for PathScale InfiniPath host channel
adapters (HCAs) based on the HT-400 and PE-800 chips, including the
InfiniPath HT-460, the small form factor InfiniPath HT-460, the InfiniPath
HT-470 and the Linux Networx LS/X.

The core driver manages the hardware, and provides a fast memory-mapped
interface to the hardware for userspace networking applications.
Our implementation of the Infiniband protocols and integration into the
kernel's Infiniband stack is written as a layer on top of the core driver.

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Re: [openib-general] Re: patch request: openib 1.0-rc1 IPoIB destructor patch

2006-03-23 Thread Bryan O'Sullivan
On Thu, 2006-03-23 at 17:28 -0800, Roland Dreier wrote:

> My understanding is that the kernel will be outside of the openib 1.0
> release.

That's correct.



[openib-general] Re: patch request: openib 1.0-rc1 IPoIB destructor patch

2006-03-23 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> I think you're in trouble then.  Basically the old IPoIB driver in the
> kernel has a bug, and if you can't change the kernel, then you're
> stuck with the bug.

Shirley, I think what you are doing is replacing the kernel IB stack
with the development version from svn. Right?
In this case, just take svn trunk from r5875 or later, and it has
the neighbour problem work-around.
I'll be interested to know whether it works fine for you.

-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies


Re: [openib-general] Re: [PATCH 9 of 18] ipath - char devices for diagnostics and lightweight subnet management

2006-03-23 Thread Robert Walsh
> I'm
> talking about all the kernel code like the following (and similar
> stuff for guidinfo, nodedescription, portinfo, pkeytable).
> 
> You must have nearly identical code in your userspace SMA, since it
> also has to respond to the same SM queries, right?
> 
> I'm trying to understand why you can't get down to one implementation
> of these functions.

Why does that make a difference?  The way I see it, we handle MAD
packets by either diverting them somewhere or passing them through the
normal ib_mad channel.  We divert them somewhere because we find it
convenient to do so: it allows us to provide an SMA to our customers
without them having to have the full IB stack running.  The SMA we
provide for these circumstances runs in userspace.  It doesn't make use
of the existing ipath_mad.c code because that's tailored to: 1) run in
the kernel; and 2) deal with the IB stack.  Even if we ripped out the
guts of ipath_mad.c and had it pass the requests to the userspace SMA,
we'd still have to have the diversion path in there for cases where the
IB stack isn't around.

-- 
Robert Walsh Email: [EMAIL PROTECTED]
PathScale, Inc.  Phone: +1 650 934 8117
2071 Stierlin Court, Suite 200 Fax: +1 650 428 1969
Mountain View, CA 94043.




[openib-general] Re: patch request: openib 1.0-rc1 IPoIB destructor patch

2006-03-23 Thread Roland Dreier
Shirley> I thought the IPoIB destructor problem could be addressed
Shirley> either with or without a kernel change. Maybe I am
Shirley> wrong. Can this problem not be addressed without changing
Shirley> the kernel? So the openib 1.0 release has a kernel
Shirley> dependency here. How can we handle this problem in an
Shirley> environment that doesn't allow any kernel changes?

I think you're in trouble then.  Basically the old IPoIB driver in the
kernel has a bug, and if you can't change the kernel, then you're
stuck with the bug.

 - R.


[openib-general] Re: patch request: openib 1.0-rc1 IPoIB destructor patch

2006-03-23 Thread Michael S. Tsirkin
Quoting r. Shirley Ma <[EMAIL PROTECTED]>:
> Subject: Re: patch request: openib 1.0-rc1 IPoIB destructor patch
> 
> 
> I thought the IPoIB destructor problem could be addressed either with or
> without a kernel change. Maybe I am wrong. Can this problem not be
> addressed without changing the kernel?

svn trunk includes an ipoib work-around for 2.6.16. This might be what you want.

> So the openib 1.0 release has a kernel dependency here.
> How can we handle this problem in an environment that doesn't allow any
> kernel changes?

ipoib is not part of the openib releases; it is part of the kernel releases.  To
handle bugs in released kernels, you backport fixes from the development kernel
and push them to the stable team.

We handle the kernel dependency problem by making userspace compatible with
older kernels.


-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies


[openib-general] Re: patch request: openib 1.0-rc1 IPoIB destructor patch

2006-03-23 Thread Shirley Ma

I thought the IPoIB destructor problem
could be addressed either with or without a kernel change. Maybe I am wrong.
Can this problem not be addressed without changing the kernel? So the openib
1.0 release has a kernel dependency here. How can we handle this problem
in an environment that doesn't allow any kernel changes?

Michael,

I remember you previously provided me a workaround
patch from your own directory that required no changes in the kernel.
Does that patch apply to openib 1.0 rc1? If not, would you please create
one?

Thanks in advance!
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638


[openib-general] Re: patch request: openib 1.0-rc1 IPoIB destructor patch

2006-03-23 Thread Roland Dreier
Michael> Shirley, I agree there's a problem in 2.6.16, but openib
Michael> does not make kernel releases. That is managed by Linus
Michael> and the stable team at http://kernel.org/.

And further -- nothing is preventing you from pulling the relevant
changes from Linus's tree (they have all been merged into the 2.6.17
tree) and sending them to [EMAIL PROTECTED] for inclusion in 2.6.16.1.

I will do this eventually but I might not get a chance in time for 2.6.16.1.

 - R.


[openib-general] Re: patch request: openib 1.0-rc1 IPoIB destructor patch

2006-03-23 Thread Roland Dreier
Shirley> Roland, do I need to submit a bug against openib 1.0 rc1?
Shirley> Will anyone provide a patch for openib 1.0 rc1? How are
Shirley> these kinds of problems handled in the openib 1.0 release
Shirley> cycle?

My understanding is that the kernel will be outside of the openib 1.0
release.  But you would have to ask Bryan about the details of the 1.0
process.

 - R.


[openib-general] Re: patch request: openib 1.0-rc1 IPoIB destructor patch

2006-03-23 Thread Michael S. Tsirkin
Quoting r. Shirley Ma <[EMAIL PROTECTED]>:
> Subject: Re: patch request: openib 1.0-rc1 IPoIB destructor patch
> 
> 
> Roland,
> 
> Do I need to submit a bug against openib 1.0 rc1? Will anyone provide a patch
> for openib 1.0 rc1? How are these kinds of problems handled in the openib 1.0
> release cycle?

Shirley, I agree there's a problem in 2.6.16, but openib does not make kernel
releases. That is managed by Linus and the stable team at http://kernel.org/.

-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies


[openib-general] Re: [PATCH 9 of 18] ipath - char devices for diagnostics and lightweight subnet management

2006-03-23 Thread Roland Dreier
Bryan> I'm a bit confused by your question.  We only have one SMA
Bryan> implementation, which is in userspace.  The stuff that's in
Bryan> our core driver is purely for supporting it.  That same
Bryan> code is also used during diags, too, to let userspace send
Bryan> and receive low-level packets.

We seem to be having a problem with the definition of an SMA.  I'm
talking about all the kernel code like the following (and similar
stuff for guidinfo, nodedescription, portinfo, pkeytable).

You must have nearly identical code in your userspace SMA, since it
also has to respond to the same SM queries, right?

I'm trying to understand why you can't get down to one implementation
of these functions.

 > +struct nodeinfo {
 > +u8 base_version;
 > +u8 class_version;
 > +u8 node_type;
 > +u8 num_ports;
 > +__be64 sys_guid;
 > +__be64 node_guid;
 > +__be64 port_guid;
 > +__be16 partition_cap;
 > +__be16 device_id;
 > +__be32 revision;
 > +u8 local_port_num;
 > +u8 vendor_id[3];
 > +} __attribute__ ((packed));
 > +
 > +static inline int recv_subn_get_nodeinfo(struct ib_smp *smp,
 > + struct ib_device *ibdev, u8 port)
 > +{
 > +struct nodeinfo *nip = (struct nodeinfo *)&smp->data;
 > +struct ipath_devdata *dd = to_idev(ibdev)->dd;
 > +u32 vendor, boardid, majrev, minrev;
 > +
 > +if (smp->attr_mod)
 > +smp->status |= IB_SMP_INVALID_FIELD;
 > +
 > +nip->base_version = 1;
 > +nip->class_version = 1;
 > +nip->node_type = 1; /* channel adapter */
 > +/*
 > + * XXX The num_ports value will need a layer function to get
 > + * the value if we ever have more than one IB port on a chip.
 > + * We will also need to get the GUID for the port.
 > + */
 > +nip->num_ports = ibdev->phys_port_cnt;
 > +/* This is already in network order */
 > +nip->sys_guid = to_idev(ibdev)->sys_image_guid;
 > +nip->node_guid = ipath_layer_get_guid(dd);
 > +nip->port_guid = nip->sys_guid;
 > +nip->partition_cap = cpu_to_be16(ipath_layer_get_npkeys(dd));
 > +nip->device_id = cpu_to_be16(ipath_layer_get_deviceid(dd));
 > +ipath_layer_query_device(dd, &vendor, &boardid, &majrev, &minrev);
 > +nip->revision = cpu_to_be32((majrev << 16) | minrev);
 > +nip->local_port_num = port;
 > +nip->vendor_id[0] = 0;
 > +nip->vendor_id[1] = vendor >> 8;
 > +nip->vendor_id[2] = vendor;
 > +
 > +return reply(smp);
 > +}
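
The last few assignments in the quoted function pack a vendor ID into three bytes and two revision numbers into one big-endian 32-bit word. A standalone sketch of that packing follows. The helper names and sample values are hypothetical and are not taken from the driver; unlike the quoted code, the sketch fills the first vendor byte from bits 16-23 (the quoted code hard-codes it to 0, which is equivalent when the vendor ID fits in 16 bits) and masks the minor revision to 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Store the low 24 bits of vendor in out[0..2], most significant byte first. */
static void pack_vendor_id(uint32_t vendor, uint8_t out[3])
{
    out[0] = (uint8_t)(vendor >> 16);
    out[1] = (uint8_t)(vendor >> 8);
    out[2] = (uint8_t)vendor;
}

/* Combine major and minor revision as (major << 16) | minor, in host order. */
static uint32_t pack_revision(uint32_t majrev, uint32_t minrev)
{
    return (majrev << 16) | (minrev & 0xFFFF);
}

/* Write a 32-bit value to buf in big-endian (network) byte order. */
static void store_be32(uint32_t v, uint8_t buf[4])
{
    buf[0] = (uint8_t)(v >> 24);
    buf[1] = (uint8_t)(v >> 16);
    buf[2] = (uint8_t)(v >> 8);
    buf[3] = (uint8_t)v;
}
```

Writing the bytes out explicitly, as store_be32() does, gives the same wire layout on little-endian and big-endian hosts, which is what cpu_to_be32() provides in the kernel.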


[openib-general] [PATCH] clean up node descriptors for printing in ibnetdiscover

2006-03-23 Thread Ira Weiny
Our ibnetdiscover output was getting hosed by a '\r' in the switch's nodedesc.

Before:

# odev0 /home/weiny2 > ibswitches 
" port 0 lid 208f104003f00a1 ports 8 "SW-12CIB4 Voltaire
...

After:

# odev0 /home/weiny2 > ./ibswitches 
Switch  : 0x0008f104003f00a1 ports 8 "SW-12CIB4 Voltaire" port 0 lid 2
...

This patch is against the 1.0 Branch.

Ira



clean_nodedesc.patch
Description: Binary data
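
The attached patch is not reproduced in the archive, so its contents are unknown. Purely as an illustration of the idea, one way to make a node description safe to print is to replace non-printable bytes such as '\r' before output. The function name clean_nodedesc() below is hypothetical (borrowed from the attachment's filename), replacing with a space is an assumed policy, and the input is assumed to be a NUL-terminated copy of the descriptor.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Replace every non-printable byte in a NUL-terminated string with a
 * space, in place, and return the string.
 */
static char *clean_nodedesc(char *desc)
{
    char *p;

    assert(desc != NULL);
    for (p = desc; *p != '\0'; p++) {
        /* Cast to unsigned char: isprint() is undefined for negative values. */
        if (!isprint((unsigned char)*p))
            *p = ' ';
    }
    return desc;
}
```

With this sketch, a descriptor ending in a carriage return prints with a trailing space instead of sending the cursor back to the start of the line.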

[openib-general] ehca error message translation request..

2006-03-23 Thread Troy Benjegerdes

Can someone please translate? babelfish doesn't talk ibmese..

[8270280.043608] eHCA Infiniband Device Driver (Rel.: SVNEHCA_0002)
[8297399.067840] PU0002 000e0139:ehca_hcall_7arg_7ret HCAD_ERROR
opcode=168 ret=ffd3 arg1=10010304
arg2=2009 arg3=ac0 arg4=7c46000 arg5=0 arg6=0
arg7=0 out1=0 out2=0 out3=0 out4=0 out5=0 out6=8005aa18 out7=0
[8297399.067914] PU0002 000b04a7:internal_modify_qp HCAD_ERROR
hipz_h_modify_qp() failed rc=ffd3 ehca_qp=c001dae4ec80
qp_num=9
[8297447.131758] eHCA Infiniband Device Driver (Rel.: SVNEHCA_0002)
[8297454.299214] PU0002 00060100:parse_ec  ehca0: port 1 is active.
[8297479.282491] PU0002 000e0139:ehca_hcall_7arg_7ret HCAD_ERROR
opcode=160 ret=ffd4 arg1=10010304 arg2=5
arg3=1001dbb0 arg4=1 arg5=c0 arg6=7be03e0 arg7=0 out1=0
out2=0 out3=0 out4=0 out5=0 out6=0 out7=0
[8297479.282531] PU0002 00090443:ehca_reg_mr HCAD_ERROR  hipz_alloc_mr
failed, rc=ffd4 hca_hndl=10010304 mr_hndl=0
[8297479.282561] PU0002 00090463:ehca_reg_mr <<< retcode=ffea
shca=c003cbcad000 e_mr=c001dac7ee80 iova_start=1001dbb0
size=1 acl=3 e_pd=c7be03e0 pginfo=c001d8287a90 num_pages=1
[8297479.282595] PU0002 00090173:ehca_reg_user_mr <<<
rc=ffea pd=c7be03e0 region=c71e7aa8
mr_access_flags=3 udata=c001d8287bb0
[8297610.812988] PU0007 000e0139:ehca_hcall_7arg_7ret HCAD_ERROR
opcode=160 ret=ffd4 arg1=10010304 arg2=5
arg3=1001b000 arg4=1000 arg5=80 arg6=b178f420 arg7=0 out1=0
out2=0 out3=0 out4=0 out5=0 out6=0 out7=0
[8297610.813031] PU0007 00090443:ehca_reg_mr HCAD_ERROR  hipz_alloc_mr
failed, rc=ffd4 hca_hndl=10010304 mr_hndl=0
[8297610.813061] PU0007 00090463:ehca_reg_mr <<< retcode=ffea
shca=c003cbcad000 e_mr=c003af268080 iova_start=1001b000
size=1000 acl=1 e_pd=c003b178f420 pginfo=c001db31ba90
num_pages=1
[8297610.813097] PU0007 00090173:ehca_reg_user_mr <<<
rc=ffea pd=c003b178f420 region=c003cbe08d28
mr_access_flags=1 udata=c001db31bbb0
[8297633.828665] PU0007 000e0139:ehca_hcall_7arg_7ret HCAD_ERROR
opcode=160 ret=ffd4 arg1=10010304 arg2=5
arg3=1001b000 arg4=1000 arg5=80 arg6=b178f3a0 arg7=0 out1=0
out2=0 out3=0 out4=0 out5=0 out6=0 out7=0
[8297633.828703] PU0007 00090443:ehca_reg_mr HCAD_ERROR  hipz_alloc_mr
failed, rc=ffd4 hca_hndl=10010304 mr_hndl=0
[8297633.828733] PU0007 00090463:ehca_reg_mr <<< retcode=ffea
shca=c003cbcad000 e_mr=c003af268a80 iova_start=1001b000
size=1000 acl=1 e_pd=c003b178f3a0 pginfo=c001dac77a90
num_pages=1
[8297633.828768] PU0007 00090173:ehca_reg_user_mr <<<
rc=ffea pd=c003b178f3a0 region=c003b2b38928
mr_access_flags=1 udata=c001dac77bb0
[8297638.644845] PU0007 000e0139:ehca_hcall_7arg_7ret HCAD_ERROR
opcode=160 ret=ffd4 arg1=10010304 arg2=5
arg3=1001b000 arg4=1000 arg5=80 arg6=b178f3a0 arg7=0 out1=0
out2=0 out3=0 out4=0 out5=0 out6=0 out7=0
[8297638.644883] PU0007 00090443:ehca_reg_mr HCAD_ERROR  hipz_alloc_mr
failed, rc=ffd4 hca_hndl=10010304 mr_hndl=0
[8297638.644912] PU0007 00090463:ehca_reg_mr <<< retcode=ffea
shca=c003cbcad000 e_mr=c003af268a80 iova_start=1001b000
size=1000 acl=1 e_pd=c003b178f3a0 pginfo=c001dac77a90
num_pages=1
[8297638.644947] PU0007 00090173:ehca_reg_user_mr <<<
rc=ffea pd=c003b178f3a0 region=c003b2b38928
mr_access_flags=1 udata=c001dac77bb0
[8297641.252159] PU0007 000e0139:ehca_hcall_7arg_7ret HCAD_ERROR
opcode=160 ret=ffd4 arg1=10010304 arg2=5
arg3=1001b000 arg4=1000 arg5=80 arg6=b178f3a0 arg7=0 out1=0
out2=0 out3=0 out4=0 out5=0 out6=0 out7=0
[8297641.252197] PU0007 00090443:ehca_reg_mr HCAD_ERROR  hipz_alloc_mr
failed, rc=ffd4 hca_hndl=10010304 mr_hndl=0
[8297641.252226] PU0007 00090463:ehca_reg_mr <<< retcode=ffea
shca=c003cbcad000 e_mr=c003af268a80 iova_start=1001b000
size=1000 acl=1 e_pd=c003b178f3a0 pginfo=c001dac77a90
num_pages=1
[8297641.252263] PU0007 00090173:ehca_reg_user_mr <<<
rc=ffea pd=c003b178f3a0 region=c003b2b38928
mr_access_flags=1 udata=c001dac77bb0


[openib-general] Re: patch request: openib 1.0-rc1 IPoIB destructor patch

2006-03-23 Thread Shirley Ma

Roland,

Do I need to submit a bug against openib
1.0 rc1? Will anyone provide a patch for openib 1.0 rc1? How are these kinds
of problems handled in the openib 1.0 release cycle?

Thanks
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638

Re: [openib-general] ehca ipz_qeit_reset???

2006-03-23 Thread Troy Benjegerdes
Okay guys, what gives here..

src/ehca_umain.c: In function 'ehcau_modify_qp':
src/ehca_umain.c:407: warning: implicit declaration of function
'ipz_qeit_reset'
 gcc -DHAVE_CONFIG_H -I. -I. -I. -O2 -g -Wall -D_GNU_SOURCE -DP_SERIES
 -I../libibverbs/include -Isrc -g -O2 -MT src_libehca_la-ehca_umain.lo
 -MD -MP -MF .deps/src_libehca_la-ehca_umain.Tpo -c src/ehca_umain.c -o
 src_libehca_la-ehca_umain.o >/dev/null 2>&1


Why do we have functions that don't exist anywhere in the svn source?


Re: [openib-general] ehca weirdness??

2006-03-23 Thread Troy Benjegerdes
On Thu, Mar 23, 2006 at 01:35:50PM -0800, Roland Dreier wrote:
> Troy> Okay, this is hokey. Both drivers should be able to
> Troy> coexist. Here is a full strace with the libmthca.so removed,
> Troy> which still doesn't seem to work right.
> 
> Yes, the drivers should be able to coexist.  I just meant that
> libmthca was the one looking for the "vendor" attribute -- it won't
> find it, and (correctly) conclude that it can't drive that device.
> 
> I'm not sure why libehca doesn't think it owns the device.  I would
> suggest putting some printf() calls in the openib_driver_init()
> function in ehca_uinit.c to make sure it's getting called, and figure
> out where it's bailing out if it is.

We should probably check the result of dlopen..

libibverbs/src$ svn diff
Index: init.c
===
--- init.c  (revision 5988)
+++ init.c  (working copy)
@@ -62,10 +62,16 @@
void *dlhandle;
ibv_driver_init_func init_func;
struct ibv_driver *driver;
+   char * err;

+   printf("init.c: load_driver( %s )\n", so_path);
+
dlhandle = dlopen(so_path, RTLD_NOW);
-   if (!dlhandle)
+   if (!dlhandle){
+   err = dlerror();
+   printf("load_driver: %s\n", err);
return;
+   }

dlerror();
init_func = dlsym(dlhandle, "openib_driver_init");



p5l3:/usr/src/openib-src/userspace/libibverbs/examples# ibv_devices
init.c: load_driver( /usr/lib/infiniband/libehca.so )
load_driver: /usr/lib/infiniband/libehca.so: undefined symbol:
ipz_qeit_reset
init.c: load_driver( (null) )
init_drivers
ibdev_name: ehca0
libibverbs: Warning: no userspace device-specific driver found for
uverbs0
driver search path: /usr/lib/infiniband
device node GUID
--  
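
The debugging change above boils down to reporting dlerror() whenever dlopen() or dlsym() fails, instead of returning silently. A minimal standalone sketch of that pattern follows. try_load_driver() is a hypothetical helper, not the libibverbs load_driver() function, and on some systems the program must be linked with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/*
 * Try to open a shared object and look up one symbol in it.
 * Returns 0 on success, -1 on failure; the reason is printed to stderr.
 */
static int try_load_driver(const char *so_path, const char *init_sym)
{
    void *handle;
    const char *err;

    assert(so_path != NULL && init_sym != NULL);

    handle = dlopen(so_path, RTLD_NOW);
    if (!handle) {
        /* With RTLD_NOW, unresolved symbols in the library show up here. */
        err = dlerror();
        fprintf(stderr, "dlopen(%s) failed: %s\n", so_path,
                err ? err : "unknown error");
        return -1;
    }

    dlerror(); /* clear any stale error before calling dlsym() */
    (void)dlsym(handle, init_sym);
    err = dlerror();
    if (err) {
        fprintf(stderr, "dlsym(%s) failed: %s\n", init_sym, err);
        dlclose(handle);
        return -1;
    }

    dlclose(handle);
    return 0;
}
```

Because RTLD_NOW resolves every symbol at load time, a library that references a function defined nowhere fails in dlopen() itself, which matches the "undefined symbol" message in the output above.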



RE: [openib-general] question related to rdma_bind_addr

2006-03-23 Thread Sean Hefty
>What does it mean to bind to a remote address? What functionality
>would that enable? Spoofing?

I think that Or is just exploring the idea of synchronously binding to a local
*device* based on a remote address.  This would allow an application to bind,
then allocate PDs, CQs, QPs, etc. up front, rather than deferring resource
allocation until address resolution completes.  A ULP may be able to take
advantage of this, but I can't personally say that I know what benefit it would
provide.  (Maybe avoid the need to keep track of everything that must be
allocated once address resolution completes?)

>When I think of bind(2), I only think of binding to local addresses.

Yes - this is what rdma_bind_addr(src_addr) does.  But I can envision adding a
new call, rdma_bind_device(dst_addr), provided some use for it can be found.

- Sean


[openib-general] Re: [PATCH 9 of 18] ipath - char devices for diagnostics and lightweight subnet management

2006-03-23 Thread Bryan O'Sullivan
On Thu, 2006-03-23 at 11:18 -0800, Roland Dreier wrote:

> But I still (after all this discussion)
> don't understand why you need to have two SMA implementations to
> handle this along with the code to switch between the two modes like:

I'm a bit confused by your question.  We only have one SMA
implementation, which is in userspace.  The stuff that's in our core
driver is purely for supporting it.  That same code is also used during
diags, too, to let userspace send and receive low-level packets.

The code in ipath_mad.c simply handles requests for ib_mad, if the
in-kernel SMA is being used.

> You also have all the functions like recv_subn_get_nodeinfo() etc. for
> handling SM queries.  Presumably all this is duplicated in the
> userspace SMA.

Only a very small subset of SMA functionality is present in the
userspace SMA.



[openib-general] Re: [PATCH 8 of 18] ipath - sysfs and ipathfs support for core driver

2006-03-23 Thread Greg KH
On Thu, Mar 23, 2006 at 12:44:45AM -0800, Bryan O'Sullivan wrote:
> On Wed, 2006-03-22 at 21:49 -0800, Greg KH wrote:
> > Oh, and I like your new filesystem, but where do you propose that it be
> > mounted?
> 
> I don't have any good candidates in mind.  In our development
> environment, we're mounting it in /ipath, but that doesn't seem like a
> good long-term name.  Do you have any suggestions?

Nope, sorry.  At least /ipath is LSB compliant :)

thanks,

greg k-h


RE: [openib-general] question related to rdma_bind_addr

2006-03-23 Thread James Lentini

On Thu, 23 Mar 2006, Sean Hefty wrote:

> >I could not confirm my assumptions by looking at the cma/addr
> >code, but if I am correct this opens the door for a future
> >enhancement of rdma_bind_addr() to work on non-local addresses.
> 
> I believe that could be the case.

What does it mean to bind to a remote address? What functionality 
would that enable? Spoofing?

When I think of bind(2), I only think of binding to local addresses. 


[openib-general] Re: patch request: openib 1.0-rc1 IPoIB destructor patch

2006-03-23 Thread Michael S. Tsirkin
Quoting r. Shirley Ma <[EMAIL PROTECTED]>:
> Subject: patch request: openib 1.0-rc1 IPoIB destructor patch
> 
> 
> Hello Michael,
> 
> I couldn't see the IPoIB destructor patch in openib 1.0-rc1. There is only a 
> linux-2.6.16 kernel neighbour->destructor patch there.
> 
> Could you please create one? I need the work around patch for kernel < 
> 2.6.17. My test has been blocked.
> 

IPoIB is a kernel component. As such, I don't plan to do anything with it
with respect to openib 1.0.

Roland asked me to backport fixes for the 2.6.16 stable tree, but unfortunately
it looks like I will only get to it in April.  So I think your best bet now is to
use ipoib from svn trunk, which already includes the fixes and no new features.

HTH,
MST

-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies


Re: [openib-general] ehca weirdness??

2006-03-23 Thread Roland Dreier
Troy> Okay, this is hokey. Both drivers should be able to
Troy> coexist. Here is a full strace with the libmthca.so removed,
Troy> which still doesn't seem to work right.

Yes, the drivers should be able to coexist.  I just meant that
libmthca was the one looking for the "vendor" attribute -- it won't
find it, and (correctly) conclude that it can't drive that device.

I'm not sure why libehca doesn't think it owns the device.  I would
suggest putting some printf() calls in the openib_driver_init()
function in ehca_uinit.c to make sure it's getting called, and figure
out where it's bailing out if it is.

 - R.


[openib-general] patch request: openib 1.0-rc1 IPoIB destructor patch

2006-03-23 Thread Shirley Ma

Hello Michael,

I couldn't see the IPoIB destructor patch in openib 1.0-rc1. There is only a
linux-2.6.16 kernel neighbour->destructor patch there.

Could you please create one? I need the workaround patch for kernel < 2.6.17.
My test has been blocked.

Thanks
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638

Re: [openib-general] Re: RFC: e2e credits

2006-03-23 Thread James Lentini


On Thu, 23 Mar 2006, Michael S. Tsirkin wrote:

mst> Guys, please review the verbs.h extension below - I'll be 
mst> submitting a patch including CM update and mthca implementation 
mst> next week. Anyone has a problem with this?
mst> 
mst> Index: openib/drivers/infiniband/include/rdma/ib_user_verbs.h
mst> ===
mst> --- openib.orig/drivers/infiniband/include/rdma/ib_user_verbs.h 2006-03-02 21:41:01.0 +0200
mst> +++ openib/drivers/infiniband/include/rdma/ib_user_verbs.h 2006-03-23 00:50:48.0 +0200
mst> @@ -378,7 +378,8 @@ struct ib_uverbs_qp_attr {
mst>    __u8    rnr_retry;
mst>    __u8    alt_port_num;
mst>    __u8    alt_timeout;
mst> -  __u8    reserved[5];
mst> +  __u8    flow_control;
mst> +  __u8    reserved[4];
mst>  };
mst>  
mst>  struct ib_uverbs_create_qp {
mst> Index: openib/drivers/infiniband/include/rdma/ib_verbs.h
mst> ===
mst> --- openib.orig/drivers/infiniband/include/rdma/ib_verbs.h 2006-03-02 21:41:01.0 +0200
mst> +++ openib/drivers/infiniband/include/rdma/ib_verbs.h 2006-03-23 00:50:28.0 +0200
mst> @@ -573,6 +573,7 @@ struct ib_qp_attr {
mst>    u8      rnr_retry;
mst>    u8      alt_port_num;
mst>    u8      alt_timeout;
mst> +  u8      flow_control;
mst>  };
mst>  
mst>  enum ib_wr_opcode {

Will there be a default value? If so, what will it be?

My vote would be for off.

Does all hardware support the ability to turn this on/off?
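
As a side note on the quoted ib_user_verbs.h hunk: flow_control is carved out of
the reserved bytes, so the userspace ABI struct should keep its size. A minimal
sketch of that idea, using cut-down stand-in structs (tail_old, tail_new and
layout_is_compatible are hypothetical names, and only the fields visible in the
diff are modelled):

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the tail of struct ib_uverbs_qp_attr before the patch.
 * Hypothetical name; only the fields shown in the diff are modelled. */
struct tail_old {
	uint8_t rnr_retry;
	uint8_t alt_port_num;
	uint8_t alt_timeout;
	uint8_t reserved[5];
};

/* The same tail after the patch. */
struct tail_new {
	uint8_t rnr_retry;
	uint8_t alt_port_num;
	uint8_t alt_timeout;
	uint8_t flow_control;	/* takes over the first reserved byte */
	uint8_t reserved[4];
};

/* Returns 1 if the size is unchanged, existing fields keep their offsets,
 * and the new field sits where the reserved area used to start. */
int layout_is_compatible(void)
{
	return sizeof(struct tail_old) == sizeof(struct tail_new) &&
	       offsetof(struct tail_old, alt_timeout) ==
	       offsetof(struct tail_new, alt_timeout) &&
	       offsetof(struct tail_new, flow_control) ==
	       offsetof(struct tail_old, reserved);
}
```

If old binaries always zeroed the reserved bytes, they would pass 0 in the new
field, which is one reason the choice of default matters; whether that holds
for all callers is an assumption here.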


Re: [openib-general] ehca weirdness??

2006-03-23 Thread Troy Benjegerdes
On Thu, Mar 23, 2006 at 11:10:19AM -0800, Roland Dreier wrote:
>  > libibverbs: Warning: no userspace device-specific driver found for uverbs0
>  > driver search path: /usr/lib/infiniband
> 
> Is the ehca driver in that directory?  As far as I can tell from the
> strace and the libehca source, you have another driver (probably
> libmthca) looking for a vendor attribute; but ehca doesn't try to
> check the vendor attribute.

Okay, this is hokey. Both drivers should be able to coexist. Here is
a full strace with the libmthca.so removed, which still doesn't seem
to work right.


execve("/usr/bin/ibv_devices", ["ibv_devices"], [/* 14 vars */]) = 0
uname({sys="Linux", node="p5l3.fast", ...}) = 0
brk(0)  = 0x10012000
access("/etc/ld.so.nohwcap", F_OK)  = -1 ENOENT (No such file or directory)
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x401e000
access("/etc/ld.so.preload", R_OK)  = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)  = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=16780, ...}) = 0
mmap(NULL, 16780, PROT_READ, MAP_PRIVATE, 3, 0) = 0x401f000
close(3)= 0
access("/etc/ld.so.nohwcap", F_OK)  = -1 ENOENT (No such file or directory)
open("/usr/lib/libibverbs.so.1", O_RDONLY) = 3
read(3, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0\0\3\0\25\0\0\0\1\0\0\0"..., 640) = 640
fstat(3, {st_mode=S_IFREG|0644, st_size=44552, ...}) = 0
mmap(NULL, 109488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x4032000
mprotect(0x403c000, 68528, PROT_NONE) = 0
mmap(0x404b000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x9000) = 0x404b000
close(3)= 0
access("/etc/ld.so.nohwcap", F_OK)  = -1 ENOENT (No such file or directory)
open("/lib/libsysfs.so.1", O_RDONLY)= 3
read(3, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0\0\3\0\25\0\0\0\1\0\0\0"..., 640) = 640
fstat(3, {st_mode=S_IFREG|0644, st_size=71736, ...}) = 0
mmap(NULL, 72168, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x404d000
mmap(0x405d000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1) = 0x405d000
close(3)= 0
access("/etc/ld.so.nohwcap", F_OK)  = -1 ENOENT (No such file or directory)
open("/lib/libpthread.so.0", O_RDONLY)  = 3
read(3, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0\0\3\0\25\0\0\0\1\0\0\0"..., 640) = 640
fstat(3, {st_mode=S_IFREG|0644, st_size=157478, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x4024000
mmap(NULL, 189056, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x405f000
mprotect(0x4077000, 90752, PROT_NONE) = 0
mmap(0x4086000, 12288, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x17000) = 0x4086000
mmap(0x4089000, 17024, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4089000
close(3)= 0
access("/etc/ld.so.nohwcap", F_OK)  = -1 ENOENT (No such file or directory)
open("/lib/libdl.so.2", O_RDONLY)   = 3
read(3, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0\0\3\0\25\0\0\0\1\0\0\0"..., 640) = 640
fstat(3, {st_mode=S_IFREG|0644, st_size=15352, ...}) = 0
mmap(NULL, 79304, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x408e000
mprotect(0x4091000, 67016, PROT_NONE) = 0
mmap(0x40a, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x40a
close(3)= 0
access("/etc/ld.so.nohwcap", F_OK)  = -1 ENOENT (No such file or directory)
open("/lib/libc.so.6", O_RDONLY)= 3
read(3, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0\0\3\0\25\0\0\0\1\0\0\0"..., 640) = 640
fstat(3, {st_mode=S_IFREG|0755, st_size=1604256, ...}) = 0
mmap(NULL, 1679664, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x40a2000
mprotect(0x4212000, 172336, PROT_NONE) = 0
mmap(0x4221000, 98304, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16f000) = 0x4221000
mmap(0x4239000, 12592, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4239000
close(3)= 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x4025000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x4026000
mprotect(0x4221000, 8192, PROT_READ) = 0
mprotect(0x40a, 4096, PROT_READ) = 0
mprotect(0x4086000, 4096, PROT_READ) = 0
mprotect(0x402e000, 4096, PROT_READ) = 0
munmap(0x401f000, 16780)= 0
set_tid_address(0x40258a0)  = 31310
rt_sigaction(SIGRTMIN, {0x40870b8, [], SA_SIGINFO}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {0x40870d0, [], SA_RESTART|SA_SIGINFO}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
_sys

[openib-general] Re: [PATCH 8 of 18] ipath - sysfs and ipathfs support for core driver

2006-03-23 Thread Robert Walsh
> > Oh, and I like your new filesystem, but where do you propose that it be
> > mounted?
> 
> I don't have any good candidates in mind.  In our development
> environment, we're mounting it in /ipath, but that doesn't seem like a
> good long-term name.  Do you have any suggestions?

Actually, I've been mounting it on /ipathfs.  Not that the difference is
terribly important.  But if you do have suggestions on where this kind
of thing should go, I'd love to hear it.

Regards,
 Robert.

-- 
Robert Walsh Email: [EMAIL PROTECTED]
PathScale, Inc.  Phone: +1 650 934 8117
2071 Stierlin Court, Suite 200 Fax: +1 650 428 1969
Mountain View, CA 94043.



[openib-general] [PATCH v1] opensm: Disregard subn->min_ca_rate/mtu during MC Group creation.

2006-03-23 Thread Sasha Khapyorsky
Hello,

Here is the updated patch.

Sasha.


Disregard subn->min_ca_mtu and subn->min_ca_rate when a new MC group is
created and exact MTU and/or rate values are specified.

Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>
---

 osm/include/iba/ib_types.h  |9 +++--
 osm/opensm/osm_sa_mcmember_record.c |   12 ++--
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/osm/include/iba/ib_types.h b/osm/include/iba/ib_types.h
index d30b547..83bb55e 100644
--- a/osm/include/iba/ib_types.h
+++ b/osm/include/iba/ib_types.h
@@ -1479,7 +1479,10 @@ ib_class_is_vendor_specific(
 #define IB_MTU_LEN_1024                3
 #define IB_MTU_LEN_2048                4
 #define IB_MTU_LEN_4096                5
-#define IB_MAX_MTU                     5
+
+#define IB_MIN_MTU                     IB_MTU_LEN_256
+#define IB_MAX_MTU                     IB_MTU_LEN_4096
+
 /**/
 
 /d* IBA Base: Constants/IB_PATH_SELECTOR_TYPE
@@ -4363,7 +4366,6 @@ ib_port_info_get_link_speed_active(
 #define IB_LINK_SPEED_ACTIVE_10        4
 
 /* following v1 ver1.2 p901 */
-#define IB_MAX_RATE                    10
 #define IB_PATH_RECORD_RATE_2_5_GBS    2
 #define IB_PATH_RECORD_RATE_10_GBS     3
 #define IB_PATH_RECORD_RATE_30_GBS     4
@@ -4374,6 +4376,9 @@ ib_port_info_get_link_speed_active(
 #define IB_PATH_RECORD_RATE_80_GBS     9
 #define IB_PATH_RECORD_RATE_120_GBS    10
 
+#define IB_MIN_RATE                    IB_PATH_RECORD_RATE_2_5_GBS
+#define IB_MAX_RATE                    IB_PATH_RECORD_RATE_120_GBS
+
 /f* IBA Base: Types/ib_port_info_compute_rate
 * NAME
 *  ib_port_info_compute_rate
diff --git a/osm/opensm/osm_sa_mcmember_record.c b/osm/opensm/osm_sa_mcmember_record.c
index ce1d036..f720440 100644
--- a/osm/opensm/osm_sa_mcmember_record.c
+++ b/osm/opensm/osm_sa_mcmember_record.c
@@ -1121,12 +1121,12 @@ __mgrp_request_is_realizable(
   break;
 case 2: /* Exactly MTU specified */
   /* make sure it is in the range */
-  if ((1 > mtu_required) || (mtu_required > p_rcv->p_subn->min_ca_mtu))
+  if (mtu_required < IB_MIN_MTU || mtu_required > IB_MAX_MTU)
   {
 osm_log( p_log, OSM_LOG_DEBUG,
  "__mgrp_request_is_realizable: "
- "Requested MTU %x out of range: 1 .. %x\n",
- mtu_required, p_rcv->p_subn->min_ca_mtu);
+ "Requested MTU %x is out of range\n",
+ mtu_required);
 return FALSE;
   }
   break;
@@ -1198,12 +1198,12 @@ __mgrp_request_is_realizable(
   break;
 case 2: /* Exactly RATE specified */
   /* make sure it is in the range */
-  if ((2 > rate_required) || (rate_required > p_rcv->p_subn->min_ca_rate))
+  if (rate_required < IB_MIN_RATE || rate_required > IB_MAX_RATE)
   {
 osm_log( p_log, OSM_LOG_DEBUG,
  "__mgrp_request_is_realizable: "
- "Requested RATE %x out of range: 2 .. %x\n",
- rate_required, p_rcv->p_subn->min_ca_rate);
+ "Requested RATE %x is out of range\n",
+ rate_required);
 return FALSE;
   }
   break;
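
For readers skimming the diff, the new "exactly" checks boil down to a fixed
range test on the encoded value, independent of what is currently on the
subnet. A standalone sketch, outside of OpenSM: the rate values are taken from
the hunk above, IB_MTU_LEN_256 = 1 is an assumption inferred from the old
"1 .. %x" check, and the two helper function names are hypothetical.

```c
#include <stdint.h>

/* Encoded values; 256 -> 1 is assumed from the old lower bound in the check. */
#define IB_MTU_LEN_256               1
#define IB_MTU_LEN_4096              5
#define IB_PATH_RECORD_RATE_2_5_GBS  2
#define IB_PATH_RECORD_RATE_120_GBS  10

#define IB_MIN_MTU   IB_MTU_LEN_256
#define IB_MAX_MTU   IB_MTU_LEN_4096
#define IB_MIN_RATE  IB_PATH_RECORD_RATE_2_5_GBS
#define IB_MAX_RATE  IB_PATH_RECORD_RATE_120_GBS

/* Hypothetical helper: 1 if an exactly-requested MTU code is a legal value. */
int exact_mtu_in_range(uint8_t mtu_required)
{
	return !(mtu_required < IB_MIN_MTU || mtu_required > IB_MAX_MTU);
}

/* Hypothetical helper: 1 if an exactly-requested rate code is a legal value. */
int exact_rate_in_range(uint8_t rate_required)
{
	return !(rate_required < IB_MIN_RATE || rate_required > IB_MAX_RATE);
}
```

Unlike the old comparison against min_ca_mtu / min_ca_rate, the result here
does not change when ports join or leave the subnet.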


[openib-general] Re: [PATCH] Disregard subn->min_ca_rate/mtu during MCGroupcreation.

2006-03-23 Thread Sasha Khapyorsky
On 13:16 Thu 23 Mar , Hal Rosenstock wrote:
> > 
> > > The realizability is when the port joins not when the group is created.
> > > This is significant for the precreated groups (as other groups are
> > > created when the first port joins).
> > > 
> > > Is min rate/MTU needed for anything ?
> > 
> > Currently it is used when the rate is not requested.
> 
> for what ?

As rate value for created MC group.

> > Also for cases when the rate is requested as greater than specified value.
> > There is such check: if (rate_requested >= min_ca_rate) error... -  the
> > same problem I think.
> 
> So disregard what ?

min_ca_rate value as failure criteria. There may be something like:

  rate = rate_requested + 1;
  if (rate < min_ca_rate)
 rate = min_ca_rate;

Also, in the IBTA spec there is the option RateSelector=3, where the maximum
available rate is desired (the rate value specified in the request is ignored).
I don't see that it is handled anywhere. I think we may use the port's rate
here.
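
A rough sketch of how the selector cases discussed above could be put
together. This is only an illustration of the idea in this mail, not OpenSM
code: pick_group_rate() and the enum names are hypothetical, the selector
encoding (0 = greater than, 1 = less than, 2 = exactly, 3 = largest available)
follows the IBTA selector definition, and comparing raw encoded rate values
simply mirrors the snippet above.

```c
#include <stdint.h>

/* Selector encoding; enum names are hypothetical. */
enum rate_selector {
	RATE_GREATER_THAN      = 0,
	RATE_LESS_THAN         = 1,
	RATE_EXACTLY           = 2,
	RATE_LARGEST_AVAILABLE = 3
};

/* Hypothetical helper: choose the rate code for a newly created MC group.
 * Returns 0 for cases that are not covered by this discussion. */
uint8_t pick_group_rate(int selector, uint8_t rate_requested,
			uint8_t min_ca_rate, uint8_t port_rate)
{
	uint8_t rate;

	switch (selector) {
	case RATE_GREATER_THAN:
		/* min_ca_rate is a floor here, not a failure criterion */
		rate = (uint8_t)(rate_requested + 1);
		if (rate < min_ca_rate)
			rate = min_ca_rate;
		return rate;
	case RATE_EXACTLY:
		return rate_requested;
	case RATE_LARGEST_AVAILABLE:
		/* the requested value is ignored; use the joining port's rate */
		return port_rate;
	default:
		return 0;
	}
}
```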

> Part of that patch is still valid, right ?

Right, it is still valid. However, I will add the discussed IB_MIN/MAX_RATE
range check for the requested rate and reissue.

I think the rest of the issues are beyond the scope of this patch and should be
handled in a separate patch.

Sasha.


[openib-general] Re: [PATCH 9 of 18] ipath - char devices for diagnostics and lightweight subnet management

2006-03-23 Thread Roland Dreier
Bryan> We have customers who use our driver who do not want a full
Bryan> IB stack present, for example in embedded environments.

I think it's fine that your low-level driver can work without ib_core,
ib_mad and the rest loaded.  But I still (after all this discussion)
don't understand why you need to have two SMA implementations to
handle this along with the code to switch between the two modes like:

 > +list_for_each_entry(dd, &ipath_dev_list, ipath_list) {
 > +if (!(dd->ipath_flags & IPATH_INITTED))
 > +continue;
 > +*dd->ipath_statusp &= ~IPATH_STATUS_SMA;
 > +if (ipath_verbs_registered)
 > +*dd->ipath_statusp |= IPATH_STATUS_OIB_SMA;
 > +}

You also have all the functions like recv_subn_get_nodeinfo() etc. for
handling SM queries.  Presumably all this is duplicated in the
userspace SMA.  Why can't you get down to one NodeInfo query handler?

 - R.


Re: [openib-general] ehca weirdness??

2006-03-23 Thread Roland Dreier
 > libibverbs: Warning: no userspace device-specific driver found for uverbs0
 > driver search path: /usr/lib/infiniband

Is the ehca driver in that directory?  As far as I can tell from the
strace and the libehca source, you have another driver (probably
libmthca) looking for a vendor attribute; but ehca doesn't try to
check the vendor attribute.

 - R.


[openib-general] ehca weirdness??

2006-03-23 Thread Troy Benjegerdes
I just built a fresh 2.6.16 kernel, and the ehca (and associated ibverbs
stuff) from the latest subversion (5988), and things like
'ibv_devices' fail..

p5l3:~# ibv_devices
libibverbs: Warning: no userspace device-specific driver found for
uverbs0
driver search path: /usr/lib/infiniband
device node GUID
--  


strace seems to indicate it can't find a 'vendor' file for the device in
sysfs?

lstat("/sys/devices/ibmebus/B.001.DNW4776-P1-C6/adapter_handle",
{st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
stat("/sys/devices/ibmebus/B.001.DNW4776-P1-C6/adapter_handle",
{st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
open("/sys/devices/ibmebus/B.001.DNW4776-P1-C6/adapter_handle",
O_RDONLY) = 4
read(4, "10010304\n", 4096) = 17
close(4)= 0
lstat("/sys/devices/ibmebus/B.001.DNW4776-P1-C6/uevent",
{st_mode=S_IFREG|0200, st_size=4096, ...}) = 0
stat("/sys/devices/ibmebus/B.001.DNW4776-P1-C6/uevent",
{st_mode=S_IFREG|0200, st_size=4096, ...}) = 0
getdents64(3, /* 0 entries */, 4096)= 0
close(3)= 0


lstat("/sys/devices/ibmebus/B.001.DNW4776-P1-C6/vendor", 0xf847220)
= -1 ENOENT (No such file or directory)
write(2, "libibverbs: Warning: no userspac"..., 96libibverbs: Warning:


no userspace device-specific driver found for uverbs0
driver search path: ) = 96
write(2, "/usr/lib/infiniband\n", 20/usr/lib/infiniband
)   = 20
fstat(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 1), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x401f000
write(1, "device  \t   node GUI"..., 34device
node GUID
) = 34
write(1, "--  \t---"..., 38--

) = 38
munmap(0x401f000, 4096) = 0
exit_group(0)   = ?



Re: [openib-general] mthca FMR correctness (and memory windows)

2006-03-23 Thread Michael Krause


At 04:30 PM 3/20/2006, Talpey, Thomas wrote:
> At 06:00 PM 3/20/2006, Sean Hefty wrote:
> >Can you provide more details on this statement?  When are you fencing
> >the send queue when using memory windows?
>
> Infiniband 101, and VI before it. Memory windows fence later operations
> on the send queue until the bind completes. It's a misguided attempt to
> make upper layers' job "easier" because they can post a bind and then
> immediately post a send carrying the rkey. In reality, it introduces
> bubbles in the send pipeline and reduces op rates dramatically.

The requirement / semantics were derived from the ULP being used to
construct the technology.  The combination of a bind-n-send
operation was to reduce the software interactions with the device by
consolidating this into a combo operation.  I do not follow your
logic that this creates a bubble in the send pipeline as there were also
ordering and correctness issues w.r.t. subsequent operations to the
send.  The bind-n-send is a single operation and its fence semantics
were required to allow the bind to complete before informing the remote
of the subsequent information in order to avoid race conditions.

> I argued against them in iWARP verbs, and lost. If Linux could introduce
> a way to make the fencing behavior optional, I would lead the parade.
> I fear most hardware is implemented otherwise.

Hardware generally implements operations in the order they are posted to
a given QP, i.e. it is a serial execution flow that allows pipelined
operations to be posted and executed by the hardware.  Scaling is
achieved by executing across a set of QP and thus a set of
resources.  The ordering domain requirements are kept simple to
allow low-cost hardware implementations.  This does not preclude
software from executing across a set of QP in any order that it
desires.

> Yes, I know about binding on a separate queue. That doesn't work,
> because windows are semantically not fungible (for security reasons).

You could always simply allow a region to be accessible across multiple
operations but then again storage argued that it must only be accessible
for a single op thus things like FMR, bind-n-send, etc. were all
created.  To say that storage was not listened to or their needs
were not met or balanced against what is practical to implement in either
the creation of IB or iWARP is simply incorrect.

Mike

> Tom.

[openib-general] Re: [PATCH] Disregard subn->min_ca_rate/mtu during MCGroupcreation.

2006-03-23 Thread Hal Rosenstock
Hi again Sasha,

On Thu, 2006-03-23 at 13:17, Sasha Khapyorsky wrote:
> On 11:48 Thu 23 Mar , Hal Rosenstock wrote:
> > > > But is that field set to the max rate/MTU ? (I didn't check the code for
> > > > this). Is it just a name thing or more ?
> > > 
> > > This is more than just a name. Currently we only have min_ca_rate which
> > > stores value of slowest port's rate. What we will need is similar
> > > variable for fastest port's rate, and then to check against it. I like
> > > this idea and this should be easy enough to do.
> > 
> > As the max rate/MTU port on the subnet can change, is this worth it ?
> 
> This is true, this may change (the same is for min rate/MTU).
> 
> Then there may be just basic range check like this:
> 
>  if (rate_required < IB_MIN_RATE || rate_required > IB_MAX_RATE)
>error...;
> 
> But it is likely useless (later we are checking the port's physical
> capability to support such a rate).
> 
> > The realizability is when the port joins not when the group is created.
> > This is significant for the precreated groups (as other groups are
> > created when the first port joins).
> > 
> > Is min rate/MTU needed for anything ?
> 
> Currently it is used when the rate is not requested.

for what ?

> Also for cases when the rate is requested as greater than specified value.
> There is such check: if (rate_requested >= min_ca_rate) error... -  the
> same problem I think.

So disregard what ?

Part of that patch is still valid, right ?

Can you reissue ? Thanks.

-- Hal



[openib-general] Re: [PATCH] Disregard subn->min_ca_rate/mtu during MCGroupcreation.

2006-03-23 Thread Sasha Khapyorsky
On 11:48 Thu 23 Mar , Hal Rosenstock wrote:
> > > But is that field set to the max rate/MTU ? (I didn't check the code for
> > > this). Is it just a name thing or more ?
> > 
> > This is more than just a name. Currently we only have min_ca_rate which
> > stores value of slowest port's rate. What we will need is similar
> > variable for fastest port's rate, and then to check against it. I like
> > this idea and this should be easy enough to do.
> 
> As the max rate/MTU port on the subnet can change, is this worth it ?

This is true, this may change (the same is for min rate/MTU).

Then there may be just basic range check like this:

 if (rate_required < IB_MIN_RATE || rate_required > IB_MAX_RATE)
   error...;

But it is likely useless (later we are checking the port's physical
capability to support such a rate).

> The realizability is when the port joins not when the group is created.
> This is significant for the precreated groups (as other groups are
> created when the first port joins).
> 
> Is min rate/MTU needed for anything ?

Currently it is used when the rate is not requested.

Also for cases when the rate is requested as greater than specified value.
There is such check: if (rate_requested >= min_ca_rate) error... -  the
same problem I think.

Sasha.


Re: [openib-general] Re: [iproute2] IPoIB link layer address bug

2006-03-23 Thread Mark Butler




James Lentini wrote:
> On Tue, 21 Mar 2006, Jason Gunthorpe wrote:
>
> > On Tue, Mar 21, 2006 at 03:56:17PM -0800, Stephen Hemminger wrote:
> >
> > > Okay, but there are a number of other places in iproute2 that call
> > > ll_addr_a2n() with ifr.ifr_hwaddr.sa_data. And that is 14 bytes.
> > > If you want to fix those it will be harder since it would increase
> > > the sizeof(struct sockaddr) and potentially break compatibility.
> >
> > Maybe the best thing is to upgrade ip (and or netlink?) to use
> > netlink messages instead of ioctls for the remaining problematic
> > operations. Since netlink already supports an arbitrary length hwaddr
> > there should be no compatibility problem.
> >
> > Just browsing I see usages of SIOCSIFHWBROADCAST, SIOCSIFHWADDR,
> > SIOCADDMULTI, SIOCDELMULTI and SIOCGIFHWADDR that use a struct
> > ifreq..
> >
> > I know SIOCGIFHWADDR can be done over netlink, but I'm not too
> > familiar with the others..
>
> Making ip neighbor work with IPoIB address is what I'm interested in
> now.
>
> As you and Jason point out there are a lot of places where ifreqs are
> used and hence options that will not support IPoIB addresses.

The sockaddr union is at the end of struct ifreq.  Couldn't the union
sockaddr members be changed to sockaddr_storage, and the SIOC
encoded size bits be changed?  dev_ifsioc() would just need to mask out
(or substitute) the size bits before the switch statement.

 - Mark
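
To make the size problem concrete, here is a small userspace sketch of the
comparison behind this suggestion. It is only an illustration: the helper
names are hypothetical, and the 20-byte figure is the IPoIB link-layer address
length (4 bytes of flags/QPN plus a 16-byte GID).

```c
#include <stddef.h>
#include <sys/socket.h>

/* IPoIB link-layer address length: 4 bytes flags/QPN + 16-byte GID */
#define IPOIB_HWADDR_LEN 20

/* Hypothetical helper: does a hardware address of this length fit into
 * the sa_data[] array of a plain struct sockaddr (as used inside ifreq)? */
int hwaddr_fits_sa_data(size_t hwaddr_len)
{
	return hwaddr_len <= sizeof(((struct sockaddr *)0)->sa_data);
}

/* Hypothetical helper: would it fit if the union member were a
 * struct sockaddr_storage, leaving room for the address family field? */
int hwaddr_fits_storage(size_t hwaddr_len)
{
	return hwaddr_len <=
	       sizeof(struct sockaddr_storage) - offsetof(struct sockaddr, sa_data);
}
```

A 6-byte Ethernet address passes the first check, while an IPoIB address only
passes the second; the cost of the second option is that sizeof(struct ifreq)
changes, which is why the ioctl size bits come into it.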






RE: [openib-general] [PATCH] IPoIB PKEY in/out event handling fix

2006-03-23 Thread Leonid Arsh

Roland,

I'm resending you the patch. Hope it will work fine now.

The patch was tested on an earlier version on the kernel 2.6.15.
I checked that it can also be applied to the latest revision (rev 5943).

This patch requires my previous small patch to be applied
([openib-general] [PATCH] IPoIB network interface "RUNNING" status with the
cable disconnected - fix).
Although you succeeded in applying the small patch manually, I'm resending it
too. I generated it against the latest version, so you shouldn't have any
problems applying it, if you need.

Leonid wrote:
>   the patch causes the network interface to respond to PKEY in/out events 
> correctly.
>  As a result, you'll see a child interface in the "RUNNING" state 
> (netif_carrier_on()) only when the corresponding PKEY is configured by the SM.
>   When SM removes a PKEY, the "RUNNING" state will be disabled for the 
> corresponding network interface (netif_carrier_off().)
>   (To do it,  I added IB_EVENT_PKEY_CHANGE event handling.  
>   To prevent flushing the device before the device is open by the "delay 
> open" mechanism,  I added an additional device flag called 
> IPOIB_FLAG_INITIALIZED.)
>
>  The patch also prevents the child network interface from trying to join to 
> multicast groups until the PKEY is configured.
>  We used to get error messages like:
>   "ib0.f2f2: couldn't attach QP to multicast group 
> ff12:401b:f2f2:0:0:0::"
>  in this case.
>  (To do it, I just check  IPOIB_FLAG_OPER_UP flag in
ipoib_set_mcast_list() )
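
The behaviour described above can be modelled in userspace roughly as follows.
This is a sketch under stated assumptions, not the driver code: the bit numbers
are taken from the ipoib.h hunk in the patch, while struct fake_priv and all
fake_* functions are hypothetical stand-ins. The P_Key lookup is simplified to
an exact match, the return values are simplified to 1/0, and setting
IPOIB_FLAG_OPER_UP at the end of fake_dev_up() is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit numbers after the patch (from the ipoib.h hunk). */
enum {
	IPOIB_FLAG_OPER_UP     = 0,
	IPOIB_FLAG_INITIALIZED = 1,
	IPOIB_FLAG_ADMIN_UP    = 2,
	IPOIB_PKEY_ASSIGNED    = 3
};

/* Hypothetical userspace stand-in for struct ipoib_dev_priv. */
struct fake_priv {
	unsigned long   flags;
	uint16_t        pkey;        /* P_Key of this (child) interface */
	const uint16_t *pkey_table;  /* stand-in for the port's P_Key table */
	size_t          pkey_count;
};

/* Non-atomic stand-ins for the kernel's set_bit/clear_bit/test_bit. */
void fake_set_bit(int nr, struct fake_priv *p)   { p->flags |=  (1UL << nr); }
void fake_clear_bit(int nr, struct fake_priv *p) { p->flags &= ~(1UL << nr); }
int  fake_test_bit(int nr, const struct fake_priv *p)
{
	return (int)((p->flags >> nr) & 1UL);
}

/* Models ipoib_pkey_dev_check_presence(): record whether the SM has
 * configured our P_Key (simplified to an exact match). */
void fake_pkey_check_presence(struct fake_priv *p)
{
	size_t i;

	for (i = 0; i < p->pkey_count; i++) {
		if (p->pkey_table[i] == p->pkey) {
			fake_set_bit(IPOIB_PKEY_ASSIGNED, p);
			return;
		}
	}
	fake_clear_bit(IPOIB_PKEY_ASSIGNED, p);
}

/* Models the start of ipoib_ib_dev_up(): stay down until the P_Key exists.
 * Returns 1 if the interface was marked operationally up, 0 otherwise. */
int fake_dev_up(struct fake_priv *p)
{
	fake_pkey_check_presence(p);
	if (!fake_test_bit(IPOIB_PKEY_ASSIGNED, p))
		return 0;
	fake_set_bit(IPOIB_FLAG_OPER_UP, p);	/* assumed */
	return 1;
}

/* Models a P_Key change event: drop the oper state, then re-evaluate. */
int fake_pkey_change_event(struct fake_priv *p)
{
	fake_clear_bit(IPOIB_FLAG_OPER_UP, p);
	return fake_dev_up(p);
}

/* Models the check added to ipoib_set_mcast_list(): returns 1 if the
 * multicast restart task would be queued, 0 if it is skipped. */
int fake_set_mcast_list(const struct fake_priv *p)
{
	return fake_test_bit(IPOIB_FLAG_OPER_UP, p);
}
```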


This is the main patch:

Signed-off-by: Leonid Arsh <[EMAIL PROTECTED]>

Index: linux-kernel/infiniband/ulp/ipoib/ipoib_verbs.c
===
--- linux-kernel/infiniband/ulp/ipoib/ipoib_verbs.c (revision 8503)
+++ linux-kernel/infiniband/ulp/ipoib/ipoib_verbs.c (working copy)
@@ -252,6 +252,7 @@
container_of(handler, struct ipoib_dev_priv, event_handler);
 
if (record->event == IB_EVENT_PORT_ERR||
+   record->event == IB_EVENT_PKEY_CHANGE ||
record->event == IB_EVENT_PORT_ACTIVE ||
record->event == IB_EVENT_LID_CHANGE  ||
record->event == IB_EVENT_SM_CHANGE) {
Index: linux-kernel/infiniband/ulp/ipoib/ipoib_main.c
===
--- linux-kernel/infiniband/ulp/ipoib/ipoib_main.c  (revision 8502)
+++ linux-kernel/infiniband/ulp/ipoib/ipoib_main.c  (working copy)
@@ -737,6 +737,11 @@
 {
struct ipoib_dev_priv *priv = netdev_priv(dev);
 
+   if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags)) {
+   ipoib_dbg(priv, "IPOIB_FLAG_OPER_UP not set");
+   return;
+   }
+
queue_work(ipoib_workqueue, &priv->restart_task);
 }
 
Index: linux-kernel/infiniband/ulp/ipoib/ipoib.h
===
--- linux-kernel/infiniband/ulp/ipoib/ipoib.h   (revision 8502)
+++ linux-kernel/infiniband/ulp/ipoib/ipoib.h   (working copy)
@@ -79,13 +79,14 @@
IPOIB_MAX_MCAST_QUEUE = 3,
 
IPOIB_FLAG_OPER_UP= 0,
-   IPOIB_FLAG_ADMIN_UP   = 1,
-   IPOIB_PKEY_ASSIGNED   = 2,
-   IPOIB_PKEY_STOP   = 3,
-   IPOIB_FLAG_SUBINTERFACE   = 4,
-   IPOIB_MCAST_RUN   = 5,
-   IPOIB_STOP_REAPER = 6,
-   IPOIB_MCAST_STARTED   = 7,
+   IPOIB_FLAG_INITIALIZED= 1,
+   IPOIB_FLAG_ADMIN_UP   = 2,
+   IPOIB_PKEY_ASSIGNED   = 3,
+   IPOIB_PKEY_STOP   = 4,
+   IPOIB_FLAG_SUBINTERFACE   = 5,
+   IPOIB_MCAST_RUN   = 6,
+   IPOIB_STOP_REAPER = 7,
+   IPOIB_MCAST_STARTED   = 8,
 
IPOIB_MAX_BACKOFF_SECONDS = 16,
 
Index: linux-kernel/infiniband/ulp/ipoib/ipoib_ib.c
===
--- linux-kernel/infiniband/ulp/ipoib/ipoib_ib.c(revision 8502)
+++ linux-kernel/infiniband/ulp/ipoib/ipoib_ib.c(working copy)
@@ -422,13 +422,33 @@
clear_bit(IPOIB_STOP_REAPER, &priv->flags);
queue_delayed_work(ipoib_workqueue, &priv->ah_reap_task, HZ);
 
+   set_bit(IPOIB_FLAG_INITIALIZED, &priv->flags);
return 0;
 }
 
+static void ipoib_pkey_dev_check_presence(struct net_device *dev)
+{
+   struct ipoib_dev_priv *priv = netdev_priv(dev);
+   u16 pkey_index = 0;
+
+   if (ib_find_cached_pkey(priv->ca, priv->port, priv->pkey, &pkey_index))
+   clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
+   else
+   set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
+}
+
 int ipoib_ib_dev_up(struct net_device *dev)
 {
struct ipoib_dev_priv *priv = netdev_priv(dev);
 
+   ipoib_pkey_dev_check_presence(dev);
+
+   if (!test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags)) {
+   ipoib_dbg(priv, "PKEY is not assigned.\n");
+   return 0;
+   }
+
+
set_bit(I

[openib-general] Re: RFC: e2e credits

2006-03-23 Thread Michael S. Tsirkin
Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> > > > > > > > >Sean, just to wrap it up, the API at the verbs layer will look
> > > > > > > > >like the below, and then ULPs just put the value they want in
> > > > > > > > >the CM and CM will pass it in to low level.

So this is our question, right?


CM REQ and REP messages include the following field:

---
12.7.26 END-TO-END FLOW CONTROL
Signifies whether the local CA actually implements End-to-End Flow Control
(1), or instead always advertises 'invalid credits' (0). See section
9.7.7.2 End-to-End (Message Level) Flow Control for more detail.
---

Consider an implementation that advertises valid credits for some
connections, and always advertises invalid credits for other connections.
This is compliant since the IB spec says (end-to-end (message level) flow
control, Requester Behaviour):
"Even a responder which does generate end-to-end credits may choose to send the
'invalid' code in the AETH"

Is it compliant for CM implementations to set/clear the End-to-End Flow Control
field accordingly, taking it to mean

"whether the local CA actually implements End-to-End Flow Control
(1), or instead always advertises 'invalid credits'(0)"
*for the specific connection*


-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies


[openib-general] Re: RFC: e2e credits

2006-03-23 Thread Hal Rosenstock
On Thu, 2006-03-23 at 12:09, Michael S. Tsirkin wrote:
> Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> > Subject: Re: RFC: e2e credits
> > 
> > On Thu, 2006-03-23 at 11:41, Michael S. Tsirkin wrote:
> > > Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> > > > Subject: Re: RFC: e2e credits
> > > > 
> > > > On Thu, 2006-03-23 at 11:13, Michael S. Tsirkin wrote:
> > > > > Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> > > > > > > >Sean, just to wrap it up, the API at the verbs layer will look 
> > > > > > > >like the
> > > > > > > >below, and then ULPs just put the value they want in the CM and 
> > > > > > > >CM will
> > > > > > > >pass it in to low level.
> > > > > > > 
> > > > > > > I'm fine with this, but I do think that it's a minor spec
> > > > > > > violation/enhancement, so I'd like to get agreement with Hal, 
> > > > > > > Roland, and
> > > > > > > other HCA vendors about this change.
> > > > > > 
> > > > > > I too think this is a (minor) spec change. I recommended checking 
> > > > > > with
> > > > > > the CM authors on it.
> > > > > 
> > > > > Why are you saying its a spec change? Because now the same HCA might 
> > > > > return
> > > > > different values at different times?
> > > > 
> > > > I think that's a creative interpretation of what the spec says (or
> > > > perhaps doesn't say) but that's just my $0.02 worth.
> > > 
> > > OK, could you check this with the relevant people please?
> > 
> > Is there a reason you can't ? Mellanox has people who participate in the
> > relevant IBTA WGs.
> 
> Yes but it's weekend here :)
> BTW, which WG would that be?

SWG




[openib-general] Re: Open IB stack for Switch

2006-03-23 Thread Michael S. Tsirkin
Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> > Unfortunately, this is all on 2.6.12, and I am not sure when we will move to
> > a newer kernel.  
> 
> I will look at the changes and try to merge them up to the latest
> versions but will have no way to test on a switch (only an HCA). Is
> there a way to accomplish this ?

It's not hard to build svn trunk on 2.6.12 if you want to:
just apply the backport patches from
https://openib.org/svn/gen2/branches/backport/2.6.12

-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies


Re: [openib-general] Re: [iproute2] IPoIB link layer address bug

2006-03-23 Thread James Lentini

On Tue, 21 Mar 2006, Jason Gunthorpe wrote:

> On Tue, Mar 21, 2006 at 03:56:17PM -0800, Stephen Hemminger wrote:
> 
> > Okay, but there are number of other places in iproute2 that call 
> > ll_addr_a2n() with ifr.ifr_hwaddr.sa_data. And that is 14 bytes.  
> > If you want to fix those it will be harder since it would increase 
> > the sizeof(struct sockaddr) and potentially break compatibility.
> 
> Maybe the best thing is to upgrade ip (and/or netlink?) to use 
> netlink messages instead of ioctls for the remaining problematic 
> operations. Since netlink already supports an arbitrary length hwaddr 
> there should be no compatibility problem.
> 
> Just browsing I see usages of SIOCSIFHWBROADCAST, SIOCSIFHWADDR, 
> SIOCADDMULTI, SIOCDELMULTI and SIOCGIFHWADDR that use a struct 
> ifreq..
> 
> I know SIOCGIFHWADDR can be done over netlink, but I'm not too 
> familiar with the others..

Making ip neighbor work with IPoIB addresses is what I'm interested in 
now.

As you and Jason point out, there are a lot of places where ifreqs are 
used, and hence options that will not support IPoIB addresses.

Do you agree with Jason's strategy of moving the ioctls to netlink 
messages (if netlink analogs exist)?
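
The size limit under discussion can be checked directly. A minimal
sketch, assuming a system where struct sockaddr carries a fixed sa_data
array, and taking 20 bytes as the IPoIB link-layer address length (the
value Linux exposes as INFINIBAND_ALEN); the helper name is made up for
illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* IPoIB link-layer address length in bytes (assumed: 4 bytes of
 * flags/QPN followed by a 16-byte GID). */
#define IPOIB_HWADDR_LEN 20

/* Hypothetical helper: can a hardware address of `len` bytes be
 * carried in the sa_data field that ifreq-based ioctls use? */
static int hwaddr_fits_sockaddr(size_t len)
{
    return len <= sizeof(((struct sockaddr *)0)->sa_data);
}
```

With the 14-byte sa_data mentioned above, a 6-byte Ethernet address
fits and a 20-byte IPoIB address does not, which is why the
ifreq-based paths cannot carry it.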


[openib-general] Re: RFC: e2e credits

2006-03-23 Thread Michael S. Tsirkin
Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> Subject: Re: RFC: e2e credits
> 
> On Thu, 2006-03-23 at 11:41, Michael S. Tsirkin wrote:
> > Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> > > Subject: Re: RFC: e2e credits
> > > 
> > > On Thu, 2006-03-23 at 11:13, Michael S. Tsirkin wrote:
> > > > Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> > > > > > >Sean, just to wrap it up, the API at the verbs layer will look 
> > > > > > >like the
> > > > > > >below, and then ULPs just put the value they want in the CM and CM 
> > > > > > >will
> > > > > > >pass it in to low level.
> > > > > > 
> > > > > > I'm fine with this, but I do think that it's a minor spec
> > > > > > violation/enhancement, so I'd like to get agreement with Hal, 
> > > > > > Roland, and
> > > > > > other HCA vendors about this change.
> > > > > 
> > > > > I too think this is a (minor) spec change. I recommended checking with
> > > > > the CM authors on it.
> > > > 
> > > > Why are you saying its a spec change? Because now the same HCA might 
> > > > return
> > > > different values at different times?
> > > 
> > > I think that's a creative interpretation of what the spec says (or
> > > perhaps doesn't say) but that's just my $0.02 worth.
> > 
> > OK, could you check this with the relevant people please?
> 
> Is there a reason you can't ? Mellanox has people who participate in the
> relevant IBTA WGs.

Yes but it's weekend here :)
BTW, which WG would that be?

-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies


[openib-general] Re: RFC: e2e credits

2006-03-23 Thread Hal Rosenstock
On Thu, 2006-03-23 at 11:41, Michael S. Tsirkin wrote:
> Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> > Subject: Re: RFC: e2e credits
> > 
> > On Thu, 2006-03-23 at 11:13, Michael S. Tsirkin wrote:
> > > Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> > > > > >Sean, just to wrap it up, the API at the verbs layer will look like 
> > > > > >the
> > > > > >below, and then ULPs just put the value they want in the CM and CM 
> > > > > >will
> > > > > >pass it in to low level.
> > > > > 
> > > > > I'm fine with this, but I do think that it's a minor spec
> > > > > violation/enhancement, so I'd like to get agreement with Hal, Roland, 
> > > > > and
> > > > > other HCA vendors about this change.
> > > > 
> > > > I too think this is a (minor) spec change. I recommended checking with
> > > > the CM authors on it.
> > > 
> > > Why are you saying its a spec change? Because now the same HCA might 
> > > return
> > > different values at different times?
> > 
> > I think that's a creative interpretation of what the spec says (or
> > perhaps doesn't say) but that's just my $0.02 worth.
> 
> OK, could you check this with the relevant people please?

Is there a reason you can't ? Mellanox has people who participate in the
relevant IBTA WGs.

-- Hal



RE: [openib-general] Re: [PATCH 3/3] iWARP CM - iWARPConnectionManager

2006-03-23 Thread Steve Wise
On Wed, 2006-03-22 at 18:10 -0800, Sean Hefty wrote:
> This is fairly abstract, so please let me know if my understanding or logic is
> off anywhere.
> 
> Do we really have two conceptual contexts here?  One that tracks the user's
> information and another that interacts with the provider?
> 
> The user context is created / destroyed by calls to iw_create_cm_id() /
> iw_destroy_cm_id().
> 
> The provider context is "created" before calling connect() or accept().  It's
> "destroyed" if connect() or accept() fail, CONNECT_REPLY returns a failure, 
> or a
> CLOSE event occurs.  For the events, the context can be considered "destroyed"
> after the callback that reported these events has returned to the provider.
> 
> The issue is that we cannot touch either context from the other after it has
> been destroyed.  In turn, this implies that we need to acquire a reference on
> each context before accessing it.
> 
> The good news is that acquiring and releasing references is simple and doable.
> More good news is that destruction of each context will only be initiated 
> once.
> 
> The bad news is that we cannot block in the thread that reports events, which
> potentially complicates destruction of the provider context.  So, while we are
> safe having the provider context reference/dereference the user context.  The
> user context may never reference the provider context, because it can be
> destroyed at anytime after calling connect() or accept().
> 
> So, what if we viewed the cm_id as having two pieces, or gave it multiple
> reference counts, or split it into two structures?  Could any of that be used 
> to
> simplify / validate the design?
> 

IMO, doing this will result in another IWCM implementation that is about
the same complexity...
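
For reference, the "multiple reference counts" variant Sean describes
could look roughly like the sketch below. This is not the IWCM code:
the struct, field, and function names are hypothetical, and C11
atomics stand in for whatever the kernel implementation would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout: one object, two independent reference counts. */
struct cm_id_sketch {
    atomic_int user_refs;      /* held between create and destroy of the user context */
    atomic_int provider_refs;  /* held while the provider may still deliver events */
};

static void cm_id_init(struct cm_id_sketch *id)
{
    atomic_init(&id->user_refs, 1);      /* reference owned by the creator */
    atomic_init(&id->provider_refs, 0);  /* no provider context until connect()/accept() */
}

/* Take a reference only if the count is still non-zero, so a context
 * that has already been torn down is never touched again. */
static bool ref_get_unless_zero(atomic_int *refs)
{
    int old = atomic_load(refs);
    while (old != 0) {
        /* On failure, `old` is reloaded with the current value. */
        if (atomic_compare_exchange_weak(refs, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true when the caller released the last one
 * and is therefore responsible for tearing the context down. */
static bool ref_put(atomic_int *refs)
{
    return atomic_fetch_sub(refs, 1) == 1;
}
```

The sketch only covers the counting. It does not address the
non-blocking requirement in the event thread, which is where the
complexity discussed above comes from.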

Stevo.




[openib-general] Re: [PATCH] Disregard subn->min_ca_rate/mtu during MCGroupcreation.

2006-03-23 Thread Hal Rosenstock
On Thu, 2006-03-23 at 11:22, Sasha Khapyorsky wrote:
> On 10:24 Thu 23 Mar , Hal Rosenstock wrote:
> > 
> > On Thu, 2006-03-23 at 09:57, Eitan Zahavi wrote:
> > > Now I get it. It's my bug.
> > > What I meant was to check that the request is realizable:
> > > if ((2 > rate_required) || (rate_required > p_rcv->p_subn->min_ca_rate))
> > > Should catch the case where the rate required is too slow to be valid or
> > > if it is faster than the MAX rate of the fabric. But the use of
> > > min_ca_rate is incorrect; it should be a new variable (probably named:
> > > max_ca_rate) that would hold the MAX rate of all the CA ports...
> > 
> > Yes, that's what I said in a separate email. 
> > 
> > if ((2 > rate_required) || (rate_required > p_rcv->p_subn->max_ca_rate))
> > makes sense
> > 
> > But is that field set to the max rate/MTU ? (I didn't check the code for
> > this). Is it just a name thing or more ?
> 
> This is more than just a name. Currently we only have min_ca_rate, which
> stores the rate of the slowest port. What we will need is a similar
> variable for the fastest port's rate, and then to check against it. I like
> this idea, and it should be easy enough to do.

As the max rate/MTU port on the subnet can change, is this worth it ?
Realizability is evaluated when the port joins, not when the group is created.
This is significant for the precreated groups (as other groups are
created when the first port joins).

Is min rate/MTU needed for anything ?

-- Hal



Re: [openib-general] Open IB stack for Switch

2006-03-23 Thread Hal Rosenstock
Hi Suri,

On Thu, 2006-03-23 at 11:04, Suresh Shelvapille wrote:
> Folks:
> 
> I am happy to announce that the Switch implementation of OpenIB stack is
> working very well.

Great!

>  I have tested this with Voltaire and Silver Storm SM.

What about OpenSM ?

> Of course, this is a very limited implementation in that I am only using the
> core-stack and nothing else. 

> Also, I have changed mad.c and agent.c to make this work, and I will send it
> to Hal as he made the initial changes. 
> 
> Unfortunately, this is all on 2.6.12, and I am not sure when we will move to
> a newer kernel.  

I will look at the changes and try to merge them up to the latest
versions but will have no way to test on a switch (only an HCA). Is
there a way to accomplish this ?

> Many thanks to this group!
> 
> Suri
> 


Re: FW: [openib-general] [PATCH] osm_sa_mcmember_record : MCMemberGet/GetTable Trusted mode

2006-03-23 Thread Hal Rosenstock
Hi Ofer,

On Thu, 2006-03-23 at 11:04, Ofer Gigi wrote:
> Hi Hal,
> 
> The fix below fixes the retrieve of the mcmember records according to
> the 
> Errata MGTWG3280.
> 
> Quoting from MGTWG3280:
> 
> SA can be queried for multicast groups by sending a SubnAdmGet() or a
>  SubnAdmGetTable() request to it using the SA query mechanism (see
> 15.4.4 Administration Query Subsystem on page 923).
> 
> What SA returns in response to a query of multicast groups depends
> strongly on whether the request is or is not a trusted request; the
> degree of trust affects both the data returned in each attribute and
> the set of attributes that are returned. See .
> 
> o15-0.2.5 is made obsolete.
> 
> So we need to implement the following descriptive text:
> 
> SA can be queried for multicast groups by sending a SubnAdmGet() or a
> SubnAdmGetTable() request to it using the SA query mechanism (see
> 15.4.4 Administration Query Subsystem on page 923). SA will return one
> MCMemberRecord per multicast group matching the query, except in
> cases where trust is specified as indicated in 15.4.1.2 Access
> Restrictions
> For Other Attributes on page 922; in that case all the MCMemberRecords
> associated with the multicast group are returned. The MCMemberRecord
> will be returned with the PortGID, ProxyJoin, and the JoinState
> components
> set to 0, except where trust is specified as indicated above, in that
> case the actual contents for the above components will be provided.
> Thanks

Is this the same as the one which was sent earlier, or different ?

-- Hal

> 
> Ofer G.
> 
> Signed-off-by:  Ofer Gigi <[EMAIL PROTECTED]>
> Index: osm_sa_mcmember_record.c
> ===
> --- osm_sa_mcmember_record.c  (revision 5919)
> +++ osm_sa_mcmember_record.c  (working copy)
> @@ -91,6 +91,7 @@ typedef  struct   osm_sa_mcmr_search_ctx
>cl_qlist_t  *p_list; /*  hold results */
>ib_net64_t  comp_mask;
>const osm_physp_t*p_req_physp;
> +  boolean_t   trusted_req;
>  } osm_sa_mcmr_search_ctxt_t;
>  
>  /**
> @@ -1918,6 +1919,9 @@ __osm_sa_mcm_by_comp_mask_cb(
>/* will be used for group or port info */
>uint8_t scope_state; 
>uint8_t scope_state_mask = 0;
> +  cl_map_item_t *p_item;
> +  ib_gid_t   port_gid;
> +  boolean_t proxy_join;
>  
>OSM_LOG_ENTER( p_rcv->p_log, __osm_sa_mcm_by_comp_mask_cb );
>  
> @@ -1938,39 +1942,28 @@ __osm_sa_mcm_by_comp_mask_cb(
>  
>/* first try to eliminate the group by MGID, MLID, or P_Key */
>if ((IB_MCR_COMPMASK_MGID & comp_mask) &&
> -  cl_memcmp(&p_rcvd_rec->mgid, &p_mgrp->mcmember_rec.mgid,
> sizeof(ib_gid_t))) {
> +  cl_memcmp(&p_rcvd_rec->mgid, 
> +&p_mgrp->mcmember_rec.mgid, 
> +sizeof(ib_gid_t)))
> +  {
>  goto Exit;
>}
>  
>if ((IB_MCR_COMPMASK_MLID & comp_mask) &&
> -  cl_memcmp(&p_rcvd_rec->mlid, &p_mgrp->mcmember_rec.mlid,
> sizeof(uint16_t))) {
> +  cl_memcmp(&p_rcvd_rec->mlid, 
> +  &p_mgrp->mcmember_rec.mlid, 
> +  sizeof(uint16_t))) 
> +  {
>  goto Exit;
>}
>  
> -  /* if the requester physical port doesn't have the pkey that is
> defined for the
> - group - exit. */
> -  if (! osm_physp_has_pkey( p_rcv->p_log, p_mgrp->mcmember_rec.pkey,
> p_req_physp ))
> +  /* if the requester physical port doesn't have the pkey that is
> defined for
> + the group - exit. */
> +  if (! osm_physp_has_pkey( p_rcv->p_log, 
> +p_mgrp->mcmember_rec.pkey, 
> +p_req_physp ))
>  goto Exit;
>  
> -  /* so did we get the PortGUID mask */
> -  if (IB_MCR_COMPMASK_PORT_GID & comp_mask)
> -  {
> -/* try to find this port */
> -if (osm_mgrp_is_port_present(p_mgrp, portguid, &p_mcm_port))
> -{
> -  scope_state = p_mcm_port->scope_state;
> -}
> -else
> -{
> -  /* port not in group */
> -  goto Exit;
> -}
> -  }
> -  else
> -  {
> -/* point to the group information */
> -scope_state = p_mgrp->mcmember_rec.scope_state;
> -  }
>  
>/* now do the rest of the match */
>if ((IB_MCR_COMPMASK_QKEY & comp_mask) &&
> @@ -2004,17 +1997,15 @@ __osm_sa_mcm_by_comp_mask_cb(
>if (query_hop != mgrp_hop) goto Exit;
>}
>  
> +  if ((IB_MCR_COMPMASK_PROXY & comp_mask) &&
> +  (p_rcvd_rec->proxy_join != p_mgrp->mcmember_rec.proxy_join)) goto
> Exit;
> +
>if (IB_MCR_COMPMASK_SCOPE & comp_mask)
>  scope_state_mask = 0xF0;
>  
>if (IB_MCR_COMPMASK_JOIN_STATE & comp_mask)
>  scope_state_mask = scope_state_mask | 0x0F;
>  
> -  if ((scope_state_mask & p_rcvd_rec->scope_state) !=
> -  (scope_state_mask & scope_state)) goto Exit;
> -
> -  if ((IB_MCR_COMPMASK_PROXY & comp_mask) &&
> -  (p_rcvd_rec->proxy_join != p_mgrp->mcmember_rec.proxy_join)) goto
> Exit;
>  
>/* need to validate mtu, rate, and pkt_lifetime fields. */
>if (__validate

[openib-general] Re: RFC: e2e credits

2006-03-23 Thread Michael S. Tsirkin
Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> Subject: Re: RFC: e2e credits
> 
> On Thu, 2006-03-23 at 11:13, Michael S. Tsirkin wrote:
> > Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> > > > >Sean, just to wrap it up, the API at the verbs layer will look like the
> > > > >below, and then ULPs just put the value they want in the CM and CM will
> > > > >pass it in to low level.
> > > > 
> > > > I'm fine with this, but I do think that it's a minor spec
> > > > violation/enhancement, so I'd like to get agreement with Hal, Roland, 
> > > > and
> > > > other HCA vendors about this change.
> > > 
> > > I too think this is a (minor) spec change. I recommended checking with
> > > the CM authors on it.
> > 
> > Why are you saying its a spec change? Because now the same HCA might return
> > different values at different times?
> 
> I think that's a creative interpretation of what the spec says (or
> perhaps doesn't say) but that's just my $0.02 worth.

OK, could you check this with the relevant people please?

-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies


[openib-general] Re: RFC: e2e credits

2006-03-23 Thread Hal Rosenstock
On Thu, 2006-03-23 at 11:13, Michael S. Tsirkin wrote:
> Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> > > >Sean, just to wrap it up, the API at the verbs layer will look like the
> > > >below, and then ULPs just put the value they want in the CM and CM will
> > > >pass it in to low level.
> > > 
> > > I'm fine with this, but I do think that it's a minor spec
> > > violation/enhancement, so I'd like to get agreement with Hal, Roland, and
> > > other HCA vendors about this change.
> > 
> > I too think this is a (minor) spec change. I recommended checking with
> > the CM authors on it.
> 
> Why are you saying its a spec change? Because now the same HCA might return
> different values at different times?

I think that's a creative interpretation of what the spec says (or
perhaps doesn't say) but that's just my $0.02 worth.

-- Hal




[openib-general] Re: [PATCH] Disregard subn->min_ca_rate/mtu during MCGroupcreation.

2006-03-23 Thread Sasha Khapyorsky
On 10:24 Thu 23 Mar , Hal Rosenstock wrote:
> 
> On Thu, 2006-03-23 at 09:57, Eitan Zahavi wrote:
> > Now I get it. It's my bug.
> > What I meant was to check that the request is realizable:
> > if ((2 > rate_required) || (rate_required > p_rcv->p_subn->min_ca_rate))
> > Should catch the case where the rate required is too slow to be valid or
> > if it is faster than the MAX rate of the fabric. But the use of
> > min_ca_rate is incorrect; it should be a new variable (probably named:
> > max_ca_rate) that would hold the MAX rate of all the CA ports...
> 
> Yes, that's what I said in a separate email. 
> 
> if ((2 > rate_required) || (rate_required > p_rcv->p_subn->max_ca_rate))
> makes sense
> 
> But is that field set to the max rate/MTU ? (I didn't check the code for
> this). Is it just a name thing or more ?

This is more than just a name. Currently we only have min_ca_rate, which
stores the rate of the slowest port. What we will need is a similar
variable for the fastest port's rate, and then to check against it. I like
this idea, and it should be easy enough to do.
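
A rough sketch of that idea follows. The struct and function names are
hypothetical; only min_ca_rate exists today, and max_ca_rate is the
field proposed in this thread. The sketch also treats the encoded rate
values as ordered, as the comparisons quoted above do; whether that
holds for every rate encoding is not addressed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subnet summary holding both extremes. */
struct subn_rates {
    uint8_t min_ca_rate; /* slowest CA port */
    uint8_t max_ca_rate; /* fastest CA port (proposed) */
};

/* Scan the encoded rates of all CA ports and record the slowest and
 * the fastest. Requires at least one port. */
static struct subn_rates summarize_ca_rates(const uint8_t *port_rates, size_t n)
{
    struct subn_rates r;

    assert(port_rates != NULL && n > 0);
    r.min_ca_rate = r.max_ca_rate = port_rates[0];
    for (size_t i = 1; i < n; i++) {
        if (port_rates[i] < r.min_ca_rate)
            r.min_ca_rate = port_rates[i];
        if (port_rates[i] > r.max_ca_rate)
            r.max_ca_rate = port_rates[i];
    }
    return r;
}

/* "Exactly" selector check from the thread, with max_ca_rate in place
 * of min_ca_rate: reject values below 2 or above the fastest CA port. */
static int exact_rate_is_realizable(uint8_t rate_required,
                                    const struct subn_rates *r)
{
    return !((2 > rate_required) || (rate_required > r->max_ca_rate));
}
```

Since ports can come and go, the summary would have to be recomputed
whenever the subnet changes.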

Sasha.


[openib-general] Re: RFC: e2e credits

2006-03-23 Thread Michael S. Tsirkin
Quoting r. Hal Rosenstock <[EMAIL PROTECTED]>:
> > >Sean, just to wrap it up, the API at the verbs layer will look like the
> > >below, and then ULPs just put the value they want in the CM and CM will
> > >pass it in to low level.
> > 
> > I'm fine with this, but I do think that it's a minor spec
> > violation/enhancement, so I'd like to get agreement with Hal, Roland, and
> > other HCA vendors about this change.
> 
> I too think this is a (minor) spec change. I recommended checking with
> the CM authors on it.

Why are you saying it's a spec change? Because now the same HCA might return
different values at different times?

-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies


[openib-general] Re: RFC: e2e credits

2006-03-23 Thread Hal Rosenstock
On Wed, 2006-03-22 at 18:26, Sean Hefty wrote:
> >Sean, just to wrap it up, the API at the verbs layer will look like the 
> >below,
> >and then ULPs just put the value they want in the CM and CM will pass
> >it in to low level.
> 
> I'm fine with this, but I do think that it's a minor spec 
> violation/enhancement,
> so I'd like to get agreement with Hal, Roland, and other HCA vendors about 
> this
> change.

I too think this is a (minor) spec change. I recommended checking with
the CM authors on it.

-- Hal




[openib-general] Open IB stack for Switch

2006-03-23 Thread Suresh Shelvapille

Folks:

I am happy to announce that the Switch implementation of OpenIB stack is
working very well. I have tested this with Voltaire and Silver Storm SM.
Of course, this is a very limited implementation in that I am only using the
core-stack and nothing else. 

Also, I have changed mad.c and agent.c to make this work, and I will send it
to Hal as he made the initial changes. 

Unfortunately, this is all on 2.6.12, and I am not sure when we will move to
a newer kernel.  

Many thanks to this group!

Suri



FW: [openib-general] [PATCH] osm_sa_mcmember_record : MCMemberGet/GetTable Trusted mode

2006-03-23 Thread Ofer Gigi
Hi Hal,

The patch below fixes the retrieval of MCMember records according to
erratum MGTWG3280.

Quoting from MGTWG3280:

SA can be queried for multicast groups by sending a SubnAdmGet() or a
 SubnAdmGetTable() request to it using the SA query mechanism (see
15.4.4 Administration Query Subsystem on page 923).

What SA returns in response to a query of multicast groups depends
strongly on whether the request is or is not a trusted request; the
degree of trust affects both the data returned in each attribute and
the set of attributes that are returned. See .

o15-0.2.5 is made obsolete.

So we need to implement the following descriptive text:

SA can be queried for multicast groups by sending a SubnAdmGet() or a
SubnAdmGetTable() request to it using the SA query mechanism (see
15.4.4 Administration Query Subsystem on page 923). SA will return one
MCMemberRecord per multicast group matching the query, except in
cases where trust is specified as indicated in 15.4.1.2 Access
Restrictions For Other Attributes on page 922; in that case all the
MCMemberRecords associated with the multicast group are returned. The
MCMemberRecord will be returned with the PortGID, ProxyJoin, and the
JoinState components set to 0, except where trust is specified as
indicated above, in that case the actual contents for the above
components will be provided.
Thanks
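
As an illustration of the rule above (not the code from the patch), a
response record could be prepared along these lines. The struct is a
simplified, hypothetical stand-in for the real MCMemberRecord, which
has more components and packs scope and join state into one byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified record; not the OpenSM type. */
struct mcmember_rec_sketch {
    uint8_t mgid[16];
    uint8_t port_gid[16];
    uint8_t join_state;
    uint8_t proxy_join;
};

/* Prepare a record for the response: a trusted request sees the actual
 * contents, any other request gets PortGID, ProxyJoin and JoinState
 * set to 0. */
static void mcmr_prepare_response(struct mcmember_rec_sketch *rec,
                                  bool trusted_req)
{
    assert(rec != NULL);
    if (trusted_req)
        return;
    memset(rec->port_gid, 0, sizeof(rec->port_gid));
    rec->join_state = 0;
    rec->proxy_join = 0;
}
```

The other half of the rule, returning one record per group for
untrusted requests and one per member for trusted ones, is a matter of
how many records are put on the result list and is not shown here.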

Ofer G.

Signed-off-by:  Ofer Gigi <[EMAIL PROTECTED]>
Index: osm_sa_mcmember_record.c
===
--- osm_sa_mcmember_record.c(revision 5919)
+++ osm_sa_mcmember_record.c(working copy)
@@ -91,6 +91,7 @@ typedef  struct   osm_sa_mcmr_search_ctx
   cl_qlist_t  *p_list; /*  hold results */
   ib_net64_t  comp_mask;
   const osm_physp_t*p_req_physp;
+  boolean_t   trusted_req;
 } osm_sa_mcmr_search_ctxt_t;
 
 /**
@@ -1918,6 +1919,9 @@ __osm_sa_mcm_by_comp_mask_cb(
   /* will be used for group or port info */
   uint8_t scope_state; 
   uint8_t scope_state_mask = 0;
+  cl_map_item_t *p_item;
+  ib_gid_t port_gid;
+  boolean_t proxy_join;
 
   OSM_LOG_ENTER( p_rcv->p_log, __osm_sa_mcm_by_comp_mask_cb );
 
@@ -1938,39 +1942,28 @@ __osm_sa_mcm_by_comp_mask_cb(
 
   /* first try to eliminate the group by MGID, MLID, or P_Key */
   if ((IB_MCR_COMPMASK_MGID & comp_mask) &&
-  cl_memcmp(&p_rcvd_rec->mgid, &p_mgrp->mcmember_rec.mgid, sizeof(ib_gid_t))) {
+  cl_memcmp(&p_rcvd_rec->mgid, 
+&p_mgrp->mcmember_rec.mgid, 
+sizeof(ib_gid_t)))
+  {
 goto Exit;
   }
 
   if ((IB_MCR_COMPMASK_MLID & comp_mask) &&
-  cl_memcmp(&p_rcvd_rec->mlid, &p_mgrp->mcmember_rec.mlid, sizeof(uint16_t))) {
+  cl_memcmp(&p_rcvd_rec->mlid, 
+  &p_mgrp->mcmember_rec.mlid, 
+  sizeof(uint16_t))) 
+  {
 goto Exit;
   }
 
-  /* if the requester physical port doesn't have the pkey that is defined for the
- group - exit. */
-  if (! osm_physp_has_pkey( p_rcv->p_log, p_mgrp->mcmember_rec.pkey, p_req_physp ))
+  /* if the requester physical port doesn't have the pkey that is defined for
+ the group - exit. */
+  if (! osm_physp_has_pkey( p_rcv->p_log, 
+p_mgrp->mcmember_rec.pkey, 
+p_req_physp ))
 goto Exit;
 
-  /* so did we get the PortGUID mask */
-  if (IB_MCR_COMPMASK_PORT_GID & comp_mask)
-  {
-/* try to find this port */
-if (osm_mgrp_is_port_present(p_mgrp, portguid, &p_mcm_port))
-{
-  scope_state = p_mcm_port->scope_state;
-}
-else
-{
-  /* port not in group */
-  goto Exit;
-}
-  }
-  else
-  {
-/* point to the group information */
-scope_state = p_mgrp->mcmember_rec.scope_state;
-  }
 
   /* now do the rest of the match */
   if ((IB_MCR_COMPMASK_QKEY & comp_mask) &&
@@ -2004,17 +1997,15 @@ __osm_sa_mcm_by_comp_mask_cb(
   if (query_hop != mgrp_hop) goto Exit;
   }
 
+  if ((IB_MCR_COMPMASK_PROXY & comp_mask) &&
+  (p_rcvd_rec->proxy_join != p_mgrp->mcmember_rec.proxy_join)) goto Exit;
+
   if (IB_MCR_COMPMASK_SCOPE & comp_mask)
 scope_state_mask = 0xF0;
 
   if (IB_MCR_COMPMASK_JOIN_STATE & comp_mask)
 scope_state_mask = scope_state_mask | 0x0F;
 
-  if ((scope_state_mask & p_rcvd_rec->scope_state) !=
-  (scope_state_mask & scope_state)) goto Exit;
-
-  if ((IB_MCR_COMPMASK_PROXY & comp_mask) &&
-  (p_rcvd_rec->proxy_join != p_mgrp->mcmember_rec.proxy_join)) goto Exit;
 
   /* need to validate mtu, rate, and pkt_lifetime fields. */
   if (__validate_more_comp_fields( p_rcv->p_log,
@@ -2022,11 +2013,84 @@ __osm_sa_mcm_by_comp_mask_cb(
p_rcvd_rec,
comp_mask ) == FALSE) goto Exit;
 
+
+  /* Port specific fields */
+  /* so did we got the PortGUID mask */
+  if (IB_MCR_COMPMASK_PORT_GID & comp_mask)
+  {
+ /* try to find this port */
+ if (osm_mgrp_is_port_present(p_mgrp, por

[openib-general] RE: [PATCH] Disregard subn->min_ca_rate/mtu during MCGroupcreation.

2006-03-23 Thread Hal Rosenstock
Hi Eitan,

On Thu, 2006-03-23 at 09:57, Eitan Zahavi wrote:
> Now I get it. It's my bug.
> What I meant was to check that the request is realizable:
> if ((2 > rate_required) || (rate_required > p_rcv->p_subn->min_ca_rate))
> Should catch the case where the rate required is too slow to be valid or
> if it is faster than the MAX rate of the fabric. But the use of
> min_ca_rate is incorrect; it should be a new variable (probably named:
> max_ca_rate) that would hold the MAX rate of all the CA ports...

Yes, that's what I said in a separate email. 

if ((2 > rate_required) || (rate_required > p_rcv->p_subn->max_ca_rate))
makes sense

But is that field set to the max rate/MTU ? (I didn't check the code for
this). Is it just a name thing or more ?

> Similar criteria should be applied for the MTU and in the various other
> checks (< / > / =)
> And so on

Yes.

-- Hal

> Eitan Zahavi
> Design Technology Director
> Mellanox Technologies LTD
> Tel:+972-4-9097208
> Fax:+972-4-9593245
> P.O. Box 586 Yokneam 20692 ISRAEL
> 
> 
> > -Original Message-
> > From: Hal Rosenstock [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, March 23, 2006 3:34 PM
> > To: Eitan Zahavi
> > Cc: Sasha Khapyorsky; openib-general@openib.org; Yael Kalka; Ofer Gigi
> > Subject: RE: [PATCH] Disregard subn->min_ca_rate/mtu during
> MCGroupcreation.
> > 
> > Hi Eitan,
> > 
> > On Thu, 2006-03-23 at 08:03, Eitan Zahavi wrote:
> > > Hi Sasha
> > >
> > > The spec requires that the request be disregarded if not realizable:
> > > o15-0.1.8: If SA supports UD multicast, then if SA receives a
> request
> > > that
> > > would result in the creation of a multicast group with components
> > > specified
> > > that are unrealizable for its subnet, SA shall return an error
> status of
> > > ERR_REQ_INVALID in its response.
> > 
> > Right but what does this min CA rate/MTU have to do with that unless
> > those ports indeed do join.
> > 
> > -- Hal
> > 
> > > I hope the original code does that - but I am not sure.
> > >
> > > Eitan Zahavi
> > > Design Technology Director
> > > Mellanox Technologies LTD
> > > Tel:+972-4-9097208
> > > Fax:+972-4-9593245
> > > P.O. Box 586 Yokneam 20692 ISRAEL
> > >
> > >
> > > > -Original Message-
> > > > From: Hal Rosenstock [mailto:[EMAIL PROTECTED]
> > > > Sent: Wednesday, March 22, 2006 4:05 PM
> > > > To: Sasha Khapyorsky
> > > > Cc: openib-general@openib.org; Yael Kalka; Eitan Zahavi; Ofer Gigi
> > > > Subject: Re: [PATCH] Disregard subn->min_ca_rate/mtu during MC
> > > Groupcreation.
> > > >
> > > > On Wed, 2006-03-22 at 08:25, Sasha Khapyorsky wrote:
> > > > > Hello,
> > > > >
> > > > > Now at MC Group creation when exact rate or MTU values are
> requested
> > > and
> > > > > those values are greater than rate (or mtu) of slowest port on
> the
> > > subnet
> > > > > then MC group creation fails. It is likely not desired
> behaviour.
> > > >
> > > > Yes, if there were such a check it would be against
> max_ca_mtu/rate
> > > and
> > > > even that is subject to change post group creation as the subnet
> > > changes
> > > > so this doesn't seem like a good idea to me (to be checking it
> here).
> > > >
> > > > -- Hal
> > > >
> > > > > Sasha.
> > > > >
> > > > >
> > > > > Disregard subn->ca_min_mtu and subn->ca_min_rate when new MC
> group
> > > is
> > > > > created and exact MTU and/or rate values are specified.
> > > > >
> > > > >
> > > > > Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>
> > > > > ---
> > > > >
> > > > >  osm/opensm/osm_sa_mcmember_record.c |   12 ++--
> > > > >  1 files changed, 6 insertions(+), 6 deletions(-)
> > > > >
> > > > > diff --git a/osm/opensm/osm_sa_mcmember_record.c
> > > > b/osm/opensm/osm_sa_mcmember_record.c
> > > > > index ce1d036..826c4d3 100644
> > > > > --- a/osm/opensm/osm_sa_mcmember_record.c
> > > > > +++ b/osm/opensm/osm_sa_mcmember_record.c
> > > > > @@ -1121,12 +1121,12 @@ __mgrp_request_is_realizable(
> > > > >break;
> > > > >  case 2: /* Exactly MTU specified */
> > > > >/* make sure it is in the range */
> > > > > -  if ((1 > mtu_required) || (mtu_required >
> > > p_rcv->p_subn->min_ca_mtu))
> > > > > +  if ((1 > mtu_required))
> > > > >{
> > > > >  osm_log( p_log, OSM_LOG_DEBUG,
> > > > >   "__mgrp_request_is_realizable: "
> > > > > - "Requested MTU %x out of range: 1 .. %x\n",
> > > > > - mtu_required, p_rcv->p_subn->min_ca_mtu);
> > > > > + "Requested MTU %x is less than 1\n",
> > > > > + mtu_required);
> > > > >  return FALSE;
> > > > >}
> > > > >break;
> > > > > @@ -1198,12 +1198,12 @@ __mgrp_request_is_realizable(
> > > > >break;
> > > > >  case 2: /* Exactly RATE specified */
> > > > >/* make sure it is in the range */
> > > > > -  if ((2 > rate_required) || (rate_required >
> > > p_rcv->p_subn->min_ca_rate))
> > > > > +  if ((2 > rate_required))
> > > > >{
> > > > >   

[openib-general] Re: [PATCH] OpenSM - fix osmt_multicast.c

2006-03-23 Thread Hal Rosenstock
Hi Yael,

On Thu, 2006-03-23 at 04:50, Yael Kalka wrote:
> Hi Hal,
> 
> There was an error in osmt_multicast.c that matched the 1.1 spec and
> wasn't updated to the 1.2 version of the spec. When checking
> unrealistic rate, the MCMemberRecord was sent with rate 30GB/sec and
> rateSelector set to 0. But this is realistic according to the 1.2
> spec. The following patch changes this to rate 120GB/sec.
> 
> Thanks,
> Yael
> 
> Signed-off-by:  Yael Kalka <[EMAIL PROTECTED]>

Thanks. Applied to both trunk and 1.0 branch.

-- Hal

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] RE: [PATCH] Disregard subn->min_ca_rate/mtu during MC Group creation.

2006-03-23 Thread Eitan Zahavi
Now I get it. It's my bug.
What I meant was to check that the request is realizable:
if ((2 > rate_required) || (rate_required > p_rcv->p_subn->min_ca_rate))

This should catch the case where the required rate is too low to be valid or
is faster than the MAX rate of the fabric. But the use of
min_ca_rate is incorrect; it should be a new variable (probably named
max_ca_rate) that would hold the MAX rate of all the CA ports.

Similar criteria should be applied to the MTU and to the various other
checks (< / > / =), and so on.
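
The check described above could be sketched roughly as follows. This is only an illustration: max_ca_rate is the proposed variable, which does not exist in the code under discussion, and the helper name and MIN_VALID_RATE constant are made up for the example. It compares encoded rate values numerically, just as the quoted condition does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lowest encoded rate accepted by the existing check (the "2 >" test). */
#define MIN_VALID_RATE 2

/*
 * Hypothetical helper: returns true when an "exactly this rate" request
 * could be realized. max_ca_rate stands for the proposed field holding
 * the highest rate among all CA ports on the subnet.
 */
static bool exact_rate_is_realizable(uint8_t rate_required, uint8_t max_ca_rate)
{
    if (rate_required < MIN_VALID_RATE)
        return false;   /* below the lowest valid rate encoding */
    if (rate_required > max_ca_rate)
        return false;   /* faster than any CA port in the fabric */
    return true;
}
```

With max_ca_rate set to 4, a request for rate 3 passes, while requests for 1 or 5 are rejected. The same shape would apply to the MTU check with a lower bound of 1.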

Eitan Zahavi
Design Technology Director
Mellanox Technologies LTD
Tel:+972-4-9097208
Fax:+972-4-9593245
P.O. Box 586 Yokneam 20692 ISRAEL


> -Original Message-
> From: Hal Rosenstock [mailto:[EMAIL PROTECTED]
> Sent: Thursday, March 23, 2006 3:34 PM
> To: Eitan Zahavi
> Cc: Sasha Khapyorsky; openib-general@openib.org; Yael Kalka; Ofer Gigi
> Subject: RE: [PATCH] Disregard subn->min_ca_rate/mtu during
MCGroupcreation.
> 
> Hi Eitan,
> 
> On Thu, 2006-03-23 at 08:03, Eitan Zahavi wrote:
> > Hi Sasha
> >
> > The spec requires that the request be disregarded if not realizable:
> > o15-0.1.8: If SA supports UD multicast, then if SA receives a request
> > that would result in the creation of a multicast group with components
> > specified that are unrealizable for its subnet, SA shall return an error
> > status of ERR_REQ_INVALID in its response.
> 
> Right but what does this min CA rate/MTU have to do with that unless
> those ports indeed do join.
> 
> -- Hal
> 
> > I hope the original code does that - but I am not sure.
> >
> > Eitan Zahavi
> > Design Technology Director
> > Mellanox Technologies LTD
> > Tel:+972-4-9097208
> > Fax:+972-4-9593245
> > P.O. Box 586 Yokneam 20692 ISRAEL
> >
> >
> > > -Original Message-
> > > From: Hal Rosenstock [mailto:[EMAIL PROTECTED]
> > > Sent: Wednesday, March 22, 2006 4:05 PM
> > > To: Sasha Khapyorsky
> > > Cc: openib-general@openib.org; Yael Kalka; Eitan Zahavi; Ofer Gigi
> > > Subject: Re: [PATCH] Disregard subn->min_ca_rate/mtu during MC
> > Groupcreation.
> > >
> > > On Wed, 2006-03-22 at 08:25, Sasha Khapyorsky wrote:
> > > > Hello,
> > > >
> > > > Now at MC Group creation when exact rate or MTU values are
requested
> > and
> > > > those values are greater than rate (or mtu) of slowest port on
the
> > subnet
> > > > then MC group creation fails. It is likely not desired
behaviour.
> > >
> > > Yes, if there were such a check it would be against
max_ca_mtu/rate
> > and
> > > even that is subject to change post group creation as the subnet
> > changes
> > > so this doesn't seem like a good idea to me (to be checking it
here).
> > >
> > > -- Hal
> > >
> > > > Sasha.
> > > >
> > > >
> > > > Disregard subn->ca_min_mtu and subn->ca_min_rate when new MC
group
> > is
> > > > created and exact MTU and/or rate values are specified.
> > > >
> > > >
> > > > Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>
> > > > ---
> > > >
> > > >  osm/opensm/osm_sa_mcmember_record.c |   12 ++--
> > > >  1 files changed, 6 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/osm/opensm/osm_sa_mcmember_record.c
> > > b/osm/opensm/osm_sa_mcmember_record.c
> > > > index ce1d036..826c4d3 100644
> > > > --- a/osm/opensm/osm_sa_mcmember_record.c
> > > > +++ b/osm/opensm/osm_sa_mcmember_record.c
> > > > @@ -1121,12 +1121,12 @@ __mgrp_request_is_realizable(
> > > >break;
> > > >  case 2: /* Exactly MTU specified */
> > > >/* make sure it is in the range */
> > > > -  if ((1 > mtu_required) || (mtu_required >
> > p_rcv->p_subn->min_ca_mtu))
> > > > +  if ((1 > mtu_required))
> > > >{
> > > >  osm_log( p_log, OSM_LOG_DEBUG,
> > > >   "__mgrp_request_is_realizable: "
> > > > - "Requested MTU %x out of range: 1 .. %x\n",
> > > > - mtu_required, p_rcv->p_subn->min_ca_mtu);
> > > > + "Requested MTU %x is less than 1\n",
> > > > + mtu_required);
> > > >  return FALSE;
> > > >}
> > > >break;
> > > > @@ -1198,12 +1198,12 @@ __mgrp_request_is_realizable(
> > > >break;
> > > >  case 2: /* Exactly RATE specified */
> > > >/* make sure it is in the range */
> > > > -  if ((2 > rate_required) || (rate_required >
> > p_rcv->p_subn->min_ca_rate))
> > > > +  if ((2 > rate_required))
> > > >{
> > > >  osm_log( p_log, OSM_LOG_DEBUG,
> > > >   "__mgrp_request_is_realizable: "
> > > > - "Requested RATE %x out of range: 2 .. %x\n",
> > > > - rate_required, p_rcv->p_subn->min_ca_rate);
> > > > + "Requested RATE %x is less than 2\n",
> > > > + rate_required);
> > > >  return FALSE;
> > > >}
> > > >break;
> > >
> >

[openib-general] RE: [PATCH] Disregard subn->min_ca_rate/mtu during MC Group creation.

2006-03-23 Thread Hal Rosenstock
Hi Eitan,

On Thu, 2006-03-23 at 08:03, Eitan Zahavi wrote:
> Hi Sasha
> 
> The spec requires that the request be disregarded if not realizable:
> o15-0.1.8: If SA supports UD multicast, then if SA receives a request
> that would result in the creation of a multicast group with components
> specified that are unrealizable for its subnet, SA shall return an error
> status of ERR_REQ_INVALID in its response.

Right but what does this min CA rate/MTU have to do with that unless
those ports indeed do join.

-- Hal

> I hope the original code does that - but I am not sure.
> 
> Eitan Zahavi
> Design Technology Director
> Mellanox Technologies LTD
> Tel:+972-4-9097208
> Fax:+972-4-9593245
> P.O. Box 586 Yokneam 20692 ISRAEL
> 
> 
> > -Original Message-
> > From: Hal Rosenstock [mailto:[EMAIL PROTECTED]
> > Sent: Wednesday, March 22, 2006 4:05 PM
> > To: Sasha Khapyorsky
> > Cc: openib-general@openib.org; Yael Kalka; Eitan Zahavi; Ofer Gigi
> > Subject: Re: [PATCH] Disregard subn->min_ca_rate/mtu during MC
> Groupcreation.
> > 
> > On Wed, 2006-03-22 at 08:25, Sasha Khapyorsky wrote:
> > > Hello,
> > >
> > > Now at MC Group creation when exact rate or MTU values are requested
> and
> > > those values are greater than rate (or mtu) of slowest port on the
> subnet
> > > then MC group creation fails. It is likely not desired behaviour.
> > 
> > Yes, if there were such a check it would be against max_ca_mtu/rate
> and
> > even that is subject to change post group creation as the subnet
> changes
> > so this doesn't seem like a good idea to me (to be checking it here).
> > 
> > -- Hal
> > 
> > > Sasha.
> > >
> > >
> > > Disregard subn->ca_min_mtu and subn->ca_min_rate when new MC group
> is
> > > created and exact MTU and/or rate values are specified.
> > >
> > >
> > > Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>
> > > ---
> > >
> > >  osm/opensm/osm_sa_mcmember_record.c |   12 ++--
> > >  1 files changed, 6 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/osm/opensm/osm_sa_mcmember_record.c
> > b/osm/opensm/osm_sa_mcmember_record.c
> > > index ce1d036..826c4d3 100644
> > > --- a/osm/opensm/osm_sa_mcmember_record.c
> > > +++ b/osm/opensm/osm_sa_mcmember_record.c
> > > @@ -1121,12 +1121,12 @@ __mgrp_request_is_realizable(
> > >break;
> > >  case 2: /* Exactly MTU specified */
> > >/* make sure it is in the range */
> > > -  if ((1 > mtu_required) || (mtu_required >
> p_rcv->p_subn->min_ca_mtu))
> > > +  if ((1 > mtu_required))
> > >{
> > >  osm_log( p_log, OSM_LOG_DEBUG,
> > >   "__mgrp_request_is_realizable: "
> > > - "Requested MTU %x out of range: 1 .. %x\n",
> > > - mtu_required, p_rcv->p_subn->min_ca_mtu);
> > > + "Requested MTU %x is less than 1\n",
> > > + mtu_required);
> > >  return FALSE;
> > >}
> > >break;
> > > @@ -1198,12 +1198,12 @@ __mgrp_request_is_realizable(
> > >break;
> > >  case 2: /* Exactly RATE specified */
> > >/* make sure it is in the range */
> > > -  if ((2 > rate_required) || (rate_required >
> p_rcv->p_subn->min_ca_rate))
> > > +  if ((2 > rate_required))
> > >{
> > >  osm_log( p_log, OSM_LOG_DEBUG,
> > >   "__mgrp_request_is_realizable: "
> > > - "Requested RATE %x out of range: 2 .. %x\n",
> > > - rate_required, p_rcv->p_subn->min_ca_rate);
> > > + "Requested RATE %x is less than 2\n",
> > > + rate_required);
> > >  return FALSE;
> > >}
> > >break;
> > 
> 



[openib-general] Re: [PATCH] IPoIB network interface "RUNNING" status with the cable disconnected - fix

2006-03-23 Thread Michael S. Tsirkin
Quoting r. Leonid Arsh <[EMAIL PROTECTED]>:
> >Also, for the future, is there some way we can be smarter about link
> >down events?  There's no sense in starting to try and join multicast
> >groups, etc. if we know that the port is down.

...

> As to the event handling, you are right.
> I think we should review the event handling design in IPoIB and change
> it a bit, to make it smarter.

I guess we could add another task for this purpose, running out of
ipoib_workqueue.  But I'm not sure it's worth it - no actual harm is done,
and it sure is easy to introduce subtle races here, quite hard to debug.

-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies


[openib-general] RE: [PATCH] Disregard subn->min_ca_rate/mtu during MC Group creation.

2006-03-23 Thread Eitan Zahavi
Hi Sasha

The spec requires that the request be disregarded if not realizable:
o15-0.1.8: If SA supports UD multicast, then if SA receives a request that
would result in the creation of a multicast group with components specified
that are unrealizable for its subnet, SA shall return an error status of
ERR_REQ_INVALID in its response.

I hope the original code does that - but I am not sure.

Eitan Zahavi
Design Technology Director
Mellanox Technologies LTD
Tel:+972-4-9097208
Fax:+972-4-9593245
P.O. Box 586 Yokneam 20692 ISRAEL


> -Original Message-
> From: Hal Rosenstock [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, March 22, 2006 4:05 PM
> To: Sasha Khapyorsky
> Cc: openib-general@openib.org; Yael Kalka; Eitan Zahavi; Ofer Gigi
> Subject: Re: [PATCH] Disregard subn->min_ca_rate/mtu during MC Group creation.
> 
> On Wed, 2006-03-22 at 08:25, Sasha Khapyorsky wrote:
> > Hello,
> >
> > Now at MC Group creation when exact rate or MTU values are requested and
> > those values are greater than rate (or mtu) of slowest port on the subnet
> > then MC group creation fails. It is likely not desired behaviour.
> 
> Yes, if there were such a check it would be against max_ca_mtu/rate and
> even that is subject to change post group creation as the subnet changes
> so this doesn't seem like a good idea to me (to be checking it here).
> 
> -- Hal
> 
> > Sasha.
> >
> >
> > Disregard subn->ca_min_mtu and subn->ca_min_rate when new MC group is
> > created and exact MTU and/or rate values are specified.
> >
> >
> > Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>
> > ---
> >
> >  osm/opensm/osm_sa_mcmember_record.c |   12 ++--
> >  1 files changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/osm/opensm/osm_sa_mcmember_record.c b/osm/opensm/osm_sa_mcmember_record.c
> > index ce1d036..826c4d3 100644
> > --- a/osm/opensm/osm_sa_mcmember_record.c
> > +++ b/osm/opensm/osm_sa_mcmember_record.c
> > @@ -1121,12 +1121,12 @@ __mgrp_request_is_realizable(
> >break;
> >  case 2: /* Exactly MTU specified */
> >/* make sure it is in the range */
> > -  if ((1 > mtu_required) || (mtu_required > p_rcv->p_subn->min_ca_mtu))
> > +  if ((1 > mtu_required))
> >{
> >  osm_log( p_log, OSM_LOG_DEBUG,
> >   "__mgrp_request_is_realizable: "
> > - "Requested MTU %x out of range: 1 .. %x\n",
> > - mtu_required, p_rcv->p_subn->min_ca_mtu);
> > + "Requested MTU %x is less than 1\n",
> > + mtu_required);
> >  return FALSE;
> >}
> >break;
> > @@ -1198,12 +1198,12 @@ __mgrp_request_is_realizable(
> >break;
> >  case 2: /* Exactly RATE specified */
> >/* make sure it is in the range */
> > -  if ((2 > rate_required) || (rate_required > p_rcv->p_subn->min_ca_rate))
> > +  if ((2 > rate_required))
> >{
> >  osm_log( p_log, OSM_LOG_DEBUG,
> >   "__mgrp_request_is_realizable: "
> > - "Requested RATE %x out of range: 2 .. %x\n",
> > - rate_required, p_rcv->p_subn->min_ca_rate);
> > + "Requested RATE %x is less than 2\n",
> > + rate_required);
> >  return FALSE;
> >}
> >break;
> 



Re: [openib-general] [PATCH] IPoIB PKEY in/out event handling fix

2006-03-23 Thread Leonid Arsh

Roland,

I'll resend the patch later, after I check it on the latest version again.

Thank you,
Leonid

Roland Dreier wrote:

you seem to have generated this
against an old version of the driver: your patch has

>schedule_work(&priv->flush_task);

but the current source has:

queue_work(ipoib_workqueue, &priv->flush_task);


 > + if (ib_find_cached_pkey(priv->ca, priv->port, priv->pkey,
 > &pkey_index))

Now your patch is line-wrapped :(

Can you use a different mailer and resend?

  




Re: [openib-general] [PATCH] IPoIB network interface "RUNNING" status with the cable disconnected - fix

2006-03-23 Thread Leonid Arsh

Roland, thank you.
Sorry for the muddle, I'm going to use another mailer.

As to the event handling, you are right.
I think we should  review the event handling design in IPoIB and change
it a bit, to make it smarter.

Regards,
  Leonid


Roland Dreier wrote:

I applied this, although I had to fix the patch up by hand.  In
addition to your mailer doing some sort of quoted-printable mangling
(that turns "==" into "=3D=3D"), you seem to have generated this
against an old version of the driver: your patch has

 >   schedule_work(&priv->flush_task);

but the current source has:

queue_work(ipoib_workqueue, &priv->flush_task);


Also, for the future, is there some way we can be smarter about link
down events?  There's no sense in starting to try and join multicast
groups, etc. if we know that the port is down.

Thanks,
  Roland

  




[openib-general] Re: [PATCH 9 of 18] ipath - char devices for diagnostics and lightweight subnet management

2006-03-23 Thread Bryan O'Sullivan
On Thu, 2006-03-23 at 12:13 +0200, Michael S. Tsirkin wrote:

> OK. I gather that much. But why? I'm just trying to figure out the motivation.

I don't know for sure, but it's repeated at me in vehement terms when I
ask, so someone cares :-)  I'll try to find an actual compelling answer
when my coworkers are awake.



[openib-general] Re: [PATCH 9 of 18] ipath - char devices for diagnostics and lightweight subnet management

2006-03-23 Thread Michael S. Tsirkin
Quoting r. Bryan O'Sullivan <[EMAIL PROTECTED]>:
> > I understand they do, but they could just use the parts of IB stack and
> > never notice.
> 
> No, in some cases they want there to not be an IB stack present, which
> is not the same thing at all as not caring if it's there.

OK. I gather that much. But why? I'm just trying to figure out the motivation.
Might this be interesting for our drivers too?

-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies


[openib-general] [PATCH] osm_sa_mcmember_record : MCMember Get/GetTable Trusted mode

2006-03-23 Thread Ofer Gigi
Hi Hal,

The patch below fixes retrieval of the MCMember records according to
errata MGTWG3280.

Quoting from MGTWG3280:

SA can be queried for multicast groups by sending a SubnAdmGet() or a
 SubnAdmGetTable() request to it using the SA query mechanism (see
15.4.4 Administration Query Subsystem on page 923).

What SA returns in response to a query of multicast groups depends
strongly on whether the request is or is not a trusted request; the
degree of trust affects both the data returned in each attribute and
the set of attributes that are returned. See .

o15-0.2.5 is made obsolete.

So we need to implement the following descriptive text:

SA can be queried for multicast groups by sending a SubnAdmGet() or a
SubnAdmGetTable() request to it using the SA query mechanism (see
15.4.4 Administration Query Subsystem on page 923). SA will return one
MCMemberRecord per multicast group matching the query, except in
cases where trust is specified as indicated in 15.4.1.2 Access Restrictions
For Other Attributes on page 922; in that case all the MCMemberRecords
associated with the multicast group are returned. The MCMemberRecord
will be returned with the PortGID, ProxyJoin, and the JoinState components
set to 0, except where trust is specified as indicated above, in that
case the actual contents for the above components will be provided.
Thanks

Ofer G.

Signed-off-by:  Ofer Gigi <[EMAIL PROTECTED]>
Index: osm_sa_mcmember_record.c
===
--- osm_sa_mcmember_record.c(revision 5919)
+++ osm_sa_mcmember_record.c(working copy)
@@ -91,6 +91,7 @@ typedef  struct   osm_sa_mcmr_search_ctx
   cl_qlist_t  *p_list; /*  hold results */
   ib_net64_t  comp_mask;
   const osm_physp_t*p_req_physp;
+  boolean_t   trusted_req;
 } osm_sa_mcmr_search_ctxt_t;
 
 /**
@@ -1918,6 +1919,9 @@ __osm_sa_mcm_by_comp_mask_cb(
   /* will be used for group or port info */
   uint8_t scope_state; 
   uint8_t scope_state_mask = 0;
+  cl_map_item_t *p_item;
+  ib_gid_t port_gid;
+  boolean_t proxy_join;
 
   OSM_LOG_ENTER( p_rcv->p_log, __osm_sa_mcm_by_comp_mask_cb );
 
@@ -1938,39 +1942,28 @@ __osm_sa_mcm_by_comp_mask_cb(
 
   /* first try to eliminate the group by MGID, MLID, or P_Key */
   if ((IB_MCR_COMPMASK_MGID & comp_mask) &&
-  cl_memcmp(&p_rcvd_rec->mgid, &p_mgrp->mcmember_rec.mgid, sizeof(ib_gid_t))) {
+  cl_memcmp(&p_rcvd_rec->mgid, 
+&p_mgrp->mcmember_rec.mgid, 
+sizeof(ib_gid_t)))
+  {
 goto Exit;
   }
 
   if ((IB_MCR_COMPMASK_MLID & comp_mask) &&
-  cl_memcmp(&p_rcvd_rec->mlid, &p_mgrp->mcmember_rec.mlid, sizeof(uint16_t))) {
+  cl_memcmp(&p_rcvd_rec->mlid, 
+  &p_mgrp->mcmember_rec.mlid, 
+  sizeof(uint16_t))) 
+  {
 goto Exit;
   }
 
-  /* if the requester physical port doesn't have the pkey that is defined for the
- group - exit. */
-  if (! osm_physp_has_pkey( p_rcv->p_log, p_mgrp->mcmember_rec.pkey, p_req_physp ))
+  /* if the requester physical port doesn't have the pkey that is defined for
+ the group - exit. */
+  if (! osm_physp_has_pkey( p_rcv->p_log, 
+p_mgrp->mcmember_rec.pkey, 
+p_req_physp ))
 goto Exit;
 
-  /* so did we get the PortGUID mask */
-  if (IB_MCR_COMPMASK_PORT_GID & comp_mask)
-  {
-/* try to find this port */
-if (osm_mgrp_is_port_present(p_mgrp, portguid, &p_mcm_port))
-{
-  scope_state = p_mcm_port->scope_state;
-}
-else
-{
-  /* port not in group */
-  goto Exit;
-}
-  }
-  else
-  {
-/* point to the group information */
-scope_state = p_mgrp->mcmember_rec.scope_state;
-  }
 
   /* now do the rest of the match */
   if ((IB_MCR_COMPMASK_QKEY & comp_mask) &&
@@ -2004,17 +1997,15 @@ __osm_sa_mcm_by_comp_mask_cb(
   if (query_hop != mgrp_hop) goto Exit;
   }
 
+  if ((IB_MCR_COMPMASK_PROXY & comp_mask) &&
+  (p_rcvd_rec->proxy_join != p_mgrp->mcmember_rec.proxy_join)) goto Exit;
+
   if (IB_MCR_COMPMASK_SCOPE & comp_mask)
 scope_state_mask = 0xF0;
 
   if (IB_MCR_COMPMASK_JOIN_STATE & comp_mask)
 scope_state_mask = scope_state_mask | 0x0F;
 
-  if ((scope_state_mask & p_rcvd_rec->scope_state) !=
-  (scope_state_mask & scope_state)) goto Exit;
-
-  if ((IB_MCR_COMPMASK_PROXY & comp_mask) &&
-  (p_rcvd_rec->proxy_join != p_mgrp->mcmember_rec.proxy_join)) goto Exit;
 
   /* need to validate mtu, rate, and pkt_lifetime fields. */
   if (__validate_more_comp_fields( p_rcv->p_log,
@@ -2022,11 +2013,84 @@ __osm_sa_mcm_by_comp_mask_cb(
p_rcvd_rec,
comp_mask ) == FALSE) goto Exit;
 
+
+  /* Port specific fields */
+  /* so did we got the PortGUID mask */
+  if (IB_MCR_COMPMASK_PORT_GID & comp_mask)
+  {
+ /* try to find this port */
+ if (osm_mgrp_is_port_present(p_mgrp,

[openib-general] Re: [PATCH 9 of 18] ipath - char devices for diagnostics and lightweight subnet management

2006-03-23 Thread Bryan O'Sullivan
On Thu, 2006-03-23 at 11:37 +0200, Michael S. Tsirkin wrote:

> I understand they do, but they could just use the parts of IB stack and never
> notice.

No, in some cases they want there to not be an IB stack present, which
is not the same thing at all as not caring if it's there.

> I think IB stack is modest, as core modules go.

I don't understand why you persist on this point.  We have a need for an
SMA that is not tied to the IB stack.  The kernel code to support it is
about 500 lines long, about 2% of the driver.

> And I don't believe you can save much since as a solution you seem to have
> re-implemented the full IB stack in your low level driver:

No, we haven't.  The IB protocols are implemented in the ib_ipath
module, not the core driver.



[openib-general] [PATCH] OpenSM - fix osmt_multicast.c

2006-03-23 Thread Yael Kalka

Hi Hal,

There was an error in osmt_multicast.c that matched the 1.1 spec and
wasn't updated to the 1.2 version of the spec. When checking
unrealistic rate, the MCMemberRecord was sent with rate 30GB/sec and
rateSelector set to 0. But this is realistic according to the 1.2
spec. The following patch changes this to rate 120GB/sec.
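
For readers unfamiliar with the encoding: the rate component carries the selector in its top two bits and the rate code in the low six bits, which is what the "<< 6" in the patch relies on. A minimal sketch, with made-up helper names. Selector 0 (greater than) and 2 (exactly) are taken from this thread, while 1 (less than) is my assumption about the remaining value.

```c
#include <stdint.h>

/* 0 and 2 appear in the thread; 1 = less than is an assumption. */
enum selector {
    SEL_GREATER_THAN = 0,
    SEL_LESS_THAN    = 1,
    SEL_EXACTLY      = 2
};

/* Pack a selector (top 2 bits) and a rate code (low 6 bits) into one byte. */
static uint8_t pack_rate(enum selector sel, uint8_t rate_code)
{
    return (uint8_t)(((unsigned)sel << 6) | (rate_code & 0x3Fu));
}

/* Recover the selector from a packed byte. */
static enum selector unpack_selector(uint8_t packed)
{
    return (enum selector)(packed >> 6);
}

/* Recover the rate code from a packed byte. */
static uint8_t unpack_rate(uint8_t packed)
{
    return (uint8_t)(packed & 0x3Fu);
}
```

Because "<<" binds tighter than "|" in C, an expression of the form rate | selector << 6 packs the same way without extra parentheses. With a selector of 0 the packed byte is simply the rate code.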

Thanks,
Yael

Signed-off-by:  Yael Kalka <[EMAIL PROTECTED]>

Index: osmtest/osmt_multicast.c
===
--- osmtest/osmt_multicast.c(revision 5972)
+++ osmtest/osmt_multicast.c(working copy)
@@ -994,12 +994,12 @@ osmt_run_mcast_flow( IN osmtest_t * cons
   /* Check Valid value which is unreasonable now */
   osm_log( &p_osmt->log, OSM_LOG_INFO,
"osmt_run_mcast_flow: "
-   "Checking Join with unrealistic rate 30GB (o15.0.1.8)...\n"
+   "Checking Join with unrealistic rate 120GB (o15.0.1.8)...\n"
);
 
   /* impossible requested rate */
   mc_req_rec.rate =
-IB_PATH_RECORD_RATE_30_GBS |
+IB_PATH_RECORD_RATE_120_GBS |
 IB_PATH_SELECTOR_GREATER_THAN << 6;
 
   comp_mask =



[openib-general] Re: [PATCH 8 of 18] ipath - sysfs and ipathfs support for core driver

2006-03-23 Thread Michael S. Tsirkin
Quoting r. Bryan O'Sullivan <[EMAIL PROTECTED]>:
> > InfiniBand core already exposes these attributes to userspace, see
> > drivers/infiniband/core/sysfs.c
> 
> This is needed for cases where the Infiniband stack isn't present.

But re-implementing the same thing with a different kernel-user interface and
pushing it into a low-level driver does not strike me as a sane solution.

-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies


[openib-general] Re: [PATCH 9 of 18] ipath - char devices for diagnostics and lightweight subnet management

2006-03-23 Thread Michael S. Tsirkin
Quoting r. Bryan O'Sullivan <[EMAIL PROTECTED]>:
> > Could you please explain why is this useful? Users could not care less -
> > they never have to touch an SMA.
> 
> We have customers who use our driver who do not want a full IB stack
> present, for example in embedded environments.

I understand they do, but they could just use the parts of IB stack and never
notice.  In my experience, embedded systems are typically diskless - why is a
userspace SMA better than kernel-level one for them? The advantage would be
everyone using a single kernel/user interface, common utilities for
management, diagnostics ... I could go on.

So what's your point? Memory usage? Let's take a look:

ib_mad is the IB stack module that includes, among other things, the
kernel-level SMA (BTW, if necessary, you should be able to split it out so that
it is only loaded on demand).

I think IB stack is modest, as core modules go.
Here's what a loaded IB stack looks like on my system:

Module                  Size  Used by
ib_mad                 36260  2 ib_ipath,ib_mthca
ib_core                46080  3 ib_ipath,ib_mthca,ib_mad

So there is a *maximum* of 82K of code to save.  This is a 64-bit system; I think
embedded systems are usually 32-bit, so there'll be just 41K.

And I don't believe you can save much since as a solution you seem to have
re-implemented the full IB stack in your low level driver:

Module                  Size  Used by
ib_ipath               79256  0
ipath_core            159764  1 ib_ipath

By contrast, a low-level driver which doesn't reimplement the IB core is just a bit
above 100K.

-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies


Re: [openib-general] Re: getting many (4/5 of the time) RDMA_CM_EVENT_ROUTE_ERROR

2006-03-23 Thread Or Gerlitz

Sean Hefty wrote:

Does this use RMPP ?



If the local SA is involved, it will use RMPP.  If a path is not found by the
local SA, this should result in a standard path record query.  You can try
disabling the local SA by loading it with "cache_timeout=0".  This will force
path record queries, which doesn't use RMPP.


Yes, it's with the latest trunk, so the cma first checks with the local SA
if a path exists. I will give it a try (disabling the local SA).


Is it correct to say that the local SA would always return a path or
ENODATA, so if I am getting a timeout it means the cma issued the path query
directly via ib_sa?


Or



Re: [openib-general] question related to rdma_bind_addr

2006-03-23 Thread Or Gerlitz

Sean Hefty wrote:

If my understanding is correct, the current code of rdma_bind_addr
assumes you supply it one of three (all are **src** address)

+1 ANY (0.0.0.0) addr
+2 local loopback addr
+3 other local addr



Correct. - Note that currently a valid port number needs to be provided, but
this is a temporary restriction.


I am not sure I understand your comment on the port number; do you mean
the ((struct sockaddr_in *)addr)->sin_port field of addr?



So it is not possible to synchronously create and bind the cma id to
an IB device based on the destination address (which is the typical info
the active side has).



It is not possible to synchronously bind based on the destination address.
Rdma_bind_addr() binds synchronously to a local device based on a local address
only.  To bind based on a destination address, you use rdma_resolve_addr().
However, the lookup may involve issuing an ARP request in order to determine the
remote hardware address, which is needed in resolving the route.


rdma_resolve_addr resolves two things based on the dest address:

+1 the local IB device to use (plus its port number, pkey etc)
+2 the remote (dest) IB GID (or iWARP MAC)

Now, I was thinking that the first step of getting the local device
based on the dest address is done by ip_route_output_key() and friends,
so you synchronously get a network device (on which you later issue the
ARP) whose private/rdma pointer is an ipoib_device which has an IB device.


I could not confirm my assumptions from looking at the cma/addr code,
but if I am correct this opens the door for a future enhancement of
rdma_bind_addr() to work on non-local addresses.




I'm not sure that binding to a local device synchronously based on a remote
address is exactly impossible.  But it doesn't remove the need to resolve the
remote address to a hardware address, which is asynchronous.


Sure, I see that you more or less confirm my assumption that it's possible.

thanks,

Or.




[openib-general] Re: [PATCH 9 of 18] ipath - char devices for diagnostics and lightweight subnet management

2006-03-23 Thread Bryan O'Sullivan
On Thu, 2006-03-23 at 08:41 +0200, Michael S. Tsirkin wrote:

> Could you please explain why is this useful? Users could not care less - they
> never have to touch an SMA.

We have customers who use our driver who do not want a full IB stack
present, for example in embedded environments.



[openib-general] Re: [PATCH 8 of 18] ipath - sysfs and ipathfs support for core driver

2006-03-23 Thread Bryan O'Sullivan
On Thu, 2006-03-23 at 08:30 +0200, Michael S. Tsirkin wrote:

> InfiniBand core already exposes these attributes to userspace, see
> drivers/infiniband/core/sysfs.c

This is needed for cases where the Infiniband stack isn't present.



[openib-general] Re: [PATCH 8 of 18] ipath - sysfs and ipathfs support for core driver

2006-03-23 Thread Bryan O'Sullivan
On Wed, 2006-03-22 at 21:49 -0800, Greg KH wrote:

> Why are you testing kobj.dentry in these functions?

I think this is a holdover from an earlier version of the driver that
didn't clean up properly after driver registration failed.  In other
words, those tests are no longer needed.  Thanks for spotting this.

> Oh, and I like your new filesystem, but where do you propose that it be
> mounted?

I don't have any good candidates in mind.  In our development
environment, we're mounting it in /ipath, but that doesn't seem like a
good long-term name.  Do you have any suggestions?

> You leak a group if the second call to sysfs_create_group() fails for
> some reason.

Thanks.
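
The leak pointed out above is the usual partial-initialization case: if the second step fails, the first has to be undone before returning. A minimal userspace sketch of the unwind pattern follows. The fake_* functions are hypothetical stand-ins for sysfs_create_group() and sysfs_remove_group(), not the kernel API, and this is not the driver's actual code.

```c
#include <stdbool.h>

/* Stand-in for an attribute group; fail_on_create simulates an error. */
struct fake_group {
    bool created;
    bool fail_on_create;
};

/* Hypothetical stand-in for sysfs_create_group(): 0 on success. */
static int fake_create_group(struct fake_group *g)
{
    if (g->fail_on_create)
        return -1;
    g->created = true;
    return 0;
}

/* Hypothetical stand-in for sysfs_remove_group(). */
static void fake_remove_group(struct fake_group *g)
{
    g->created = false;
}

/* Create both groups, or neither: undo the first if the second fails. */
static int register_groups(struct fake_group *first, struct fake_group *second)
{
    int ret;

    ret = fake_create_group(first);
    if (ret)
        goto bail;

    ret = fake_create_group(second);
    if (ret)
        goto bail_first;

    return 0;

bail_first:
    fake_remove_group(first);   /* without this, the first group leaks */
bail:
    return ret;
}
```

After a failed call neither group is left registered, so the caller does not need to know how far registration got.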



[openib-general] Re: [PATCH 10 of 18] ipath - support for userspace apps using core driver

2006-03-23 Thread Bryan O'Sullivan
On Wed, 2006-03-22 at 19:06 -0800, Andrew Morton wrote:

> CPU topology is available in sysfs - it should be possible to push policy
> decisions like this up to userspace.

We covered this in an earlier round of submission, but I should have
updated the inline comments.

While we expose a way for userspace to open an explicit device based on
its knowledge of topology, I think we need a "take your best shot" minor
number, too, which is what this is.

If we *don't* provide one, userspace can end up with exactly the kinds
of messy race situations that used to be seen with ptys before /dev/ptmx
and /dev/pts came along.



Re: [openib-general][patch review] srp: fmr implementation,

2006-03-23 Thread Or Gerlitz

Roland Dreier wrote:
Or>> (F)MR IO VA (mod page_shift) === the SG first Bus Address (mod page_shift)



Yes, this is true.  But Vu's code guarantees that the first bus
address is page-aligned, so the IO VA can always be 0.


I see. I would say that Vu's code gives up on FMRing SG sections whose 
first bus address is not page-aligned...
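
The alignment condition discussed here can be written out as a small check. This is a sketch with made-up helper names, assuming page_shift is less than 64. It only restates the rule that the IO VA and the first bus address must have the same offset within a page, so a page-aligned first address allows an IO VA of 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset of an address within its page, for pages of size 1 << page_shift. */
static uint64_t offset_in_page(uint64_t addr, unsigned page_shift)
{
    return addr & ((UINT64_C(1) << page_shift) - 1);
}

/* True when the IO VA and the first SG bus address agree modulo page size. */
static bool iova_matches_first_sg(uint64_t io_va, uint64_t first_bus_addr,
                                  unsigned page_shift)
{
    return offset_in_page(io_va, page_shift) ==
           offset_in_page(first_bus_addr, page_shift);
}
```

With 4K pages (page_shift = 12), an IO VA of 0 matches a first bus address of 0x7000 but not 0x7010; the latter would need an IO VA whose in-page offset is 0x10.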


Or.
