Title: RE: [openib-general] ip over ib throughput
First, I love Hardware Reliability.
1QP per node, this might be fine for small clusters, but what
about larger clusters, where I have an all-to-all communications
pattern ? What about say *IF* something like IB was ever designed into
a BG/L
Title: RE: [openib-general] OpenSM died a horrible death
Hi Shahaf
The asserts are in:
osm_lid_mgr.c:968: CL_ASSERT( p_mgr->p_subn->sm_port_guid );
osm_lid_mgr.c:1011: CL_ASSERT( p_mgr->p_subn->sm_port_guid );
osm_mcast_mgr.c:1150: CL_ASSERT( port_guid );
osm_port.c:977: CL_ASSERT(
On Thu, 2005-01-06 at 10:11, Eitan Zahavi wrote:
The asserts are in:
osm_lid_mgr.c:968: CL_ASSERT( p_mgr->p_subn->sm_port_guid );
osm_lid_mgr.c:1011: CL_ASSERT( p_mgr->p_subn->sm_port_guid );
osm_mcast_mgr.c:1150: CL_ASSERT( port_guid );
osm_port.c:977: CL_ASSERT( port_guid );
Hello!
Quoting r. Hal Rosenstock ([EMAIL PROTECTED]) Re: [openib-general] Some
Missing Features from mthca/user MAD access:
On Thu, 2005-01-06 at 05:53, Michael S. Tsirkin wrote:
Hello, Roland!
Quoting r. Roland Dreier ([EMAIL PROTECTED]) Re: [openib-general] Some
Missing Features from
On Thu, 2005-01-06 at 10:32, Michael S. Tsirkin wrote:
Making sure the bit is cleared is indeed important. It is simple when SM
exits properly. Hopefully something can occur at process cleanup time to
ensure this does happen even in the case where the SM dies. Otherwise a
special utility
The simplest approach, I think, is to keep some file open; when it's closed
(which happens automatically when the process dies), clear the is_sm bit.
This does seem to be the simplest approach. However, there are two issues
I'm still trying to figure out:
- Where should the file be? Do we really
On Thu, 2005-01-06 at 10:56, Roland Dreier wrote:
I don't understand this. The (logical) port state shows everything
the LEDs show:
a state of INIT means one LED is on, ACTIVE means both are on.
What if it doesn't get to INIT ?
Also, the justification for doing this in the kernel is that
Hello!
Quoting r. Hal Rosenstock ([EMAIL PROTECTED]) Re: [openib-general] Some
Missing Features from mthca/user MAD access:
On Thu, 2005-01-06 at 11:01, Roland Dreier wrote:
The simplest approach, I think, is to keep some file open; when it's closed
(which happens automatically when the
I wonder if sysfs can be used for this somehow.
Not as we're discussing, because all the file operations are already set by
the sysfs code.
However, is it so bad to make the existing cap_mask sysfs file writable and
just say that userspace has to clean up if the SM exits uncleanly?
- R.
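The clean-up-on-close idea from this thread can be sketched in user space as follows. This is only an illustration, not the kernel or MAD-layer code: `struct port_state`, `sm_open`, `sm_release`, and the `CAP_IS_SM` bit value are hypothetical names chosen for the example, and counting openers is an assumption about how more than one holder of the file might be handled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit for the "is SM" capability flag; the value is illustrative. */
#define CAP_IS_SM (1u << 1)

/* Simulated per-port state; not a real kernel structure. */
struct port_state {
    uint32_t cap_mask;    /* simulated port capability mask */
    unsigned sm_openers;  /* number of open handles held by SM processes */
};

/* Models an SM process opening the control file: first opener sets the bit. */
static void sm_open(struct port_state *p)
{
    if (p->sm_openers++ == 0)
        p->cap_mask |= CAP_IS_SM;
}

/* Models the file being closed, including the implicit close when the
 * process dies: the last closer clears the bit. */
static void sm_release(struct port_state *p)
{
    assert(p->sm_openers > 0);
    if (--p->sm_openers == 0)
        p->cap_mask &= ~CAP_IS_SM;
}
```

Calling `sm_open` twice and `sm_release` once leaves the bit set; only the final `sm_release` clears it, so the bit cannot be left stale by a process that exits without cleaning up.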
While working on the CM, I came up with the following list of features
that could be useful. I don't have time currently to implement any of
these properly, but they're probably worth discussing.
* It would be nice to be able to take a received MAD and turn it around
as a send.
* When
At 04:43 AM 1/6/2005, Diego Crupnicoff wrote:
I feel like we are talking
about different things here:
The ***IP*** MTU is
relevant for IPoIB performance because it determines the number of times
that you are going to be hit by the per-packet overhead of the ***host***
networking stack. My
Hello!
Quoting r. Roland Dreier ([EMAIL PROTECTED]) Re: [openib-general] Some Missing
Features from mthca/user MAD access:
I wonder if sysfs can be used for this somehow.
Not as we're discussing, because all the file operations are already set by
the sysfs code.
I know, I was thinking
On Thu, 6 Jan 2005, Grant Grundler wrote:
That's a limitation of linux. Linux drivers assume physically
contiguous pages are available for anything that crosses
a page boundary. KISS when it works but not robust.
yeah, I know, freebsd never had this problem ...
FWIW, I had the impression
On Thu, 2005-01-06 at 12:38, Michael S. Tsirkin wrote:
After consideration, I think the proper way is add a reference count
and clean is_sm when it falls to 0.
Why is a reference count needed ? (Just want to understand).
This way running two opensms on the same HCA and two different
HCAs in
FYI. The specification can be found at:
http://www.opengroup.org/bookstore/catalog/c050.htm
Use of this new interface will enable Sockets based applications to fully
exploit the performance of RDMA interconnects through the SDP wire
protocol. This API also provides explicit memory
I am trying to say it could be transparent. you could be running
two sms and it could work more or less.
So for example opensm hangs. If I kill it, it will clear the
is_sm bit, *but* I don't want that.
The way to do it could be to start a new one, then kill
the old one.
How can two SMs on the
I know. But where does lspci get the domain number?
From sysfs -- lspci goes through the entries in /sys/bus/pci/devices.
If you strace lspci on a modern distro, you can see it doesn't open
anything in /proc.
Cool, how do *they* look in /proc/bus/pci/devices?
As you can see from show_device
They don't have to work, for example I can CTRL-Z one of them
and start another one.
Unfortunately this won't work with the current MAD layer. The first SM
will register an agent to receive SM class MADs, and the second SM will
fail because the agent is already registered.
- R.
Roland Dreier wrote:
They don't have to work, for example I can CTRL-Z one of them
and start another one.
Unfortunately this won't work with the current MAD layer. The first SM
will register an agent to receive SM class MADs, and the second SM will
fail because the agent is already registered.
Are there any code examples + docs available for the additions to the 2.6
kernel updates.
I am somewhat familiar with the VAPI interface from Mellanox, but I am
uncertain what changes or modifications may be necessary given the new
additions. Also, are you aware of any docs which help to
I've been coding the CM messages, and just setting the SRQ field in
them based on whether a QP has a SRQ. My guess is that this will work
fine, but my question is does anyone know why the CM or remote QP cares
about this at all? I want to make sure that I'm not missing something
here.
-
At 09:58 AM 12/18/2004, Hal Rosenstock wrote:
On Sat, 2004-12-18 at 12:55,
Roland Dreier wrote:
Surely link width and/or speed can't change without the port
state
changing, can they? As I understand it, the link layer
can't
renegotiate this sort of thing without bringing the link down.
In
Title: RE: [openib-general] SRQ field in CM messages
This bit was added to the CM protocol so that the remote side QP can distinguish between a SRQ and a TCA that does not generate e2e credits.
Thanks,
Diego
-----Original Message-----
From: Sean Hefty [mailto:[EMAIL PROTECTED]]
Sent:
Diego Crupnicoff wrote:
This bit was added to the CM protocol so that the remote side QP can
distinguish between a SRQ and a TCA that does not generate e2e credits.
Thanks,
Diego
thanks
On Wed, 2005-01-05 at 20:03 -0500, Hal Rosenstock wrote:
Do you know what was going on on the subnet at the time? Did an end node
SA client request a PathRecord with a SGID of 0 but turn the component
mask bit for SGID on ?
Running the subnet with opensm exposed a bug on the Solaris side that
On Thu, 6 Jan 2005, Michael S. Tsirkin wrote:
Well, I see regular 8100 there, where does lspci get another : ?
It's a mystery.
that's the pci domain stuff. Turns out on newer machines you can have
multiple pci configuration domains. Oh joy :-)
ron
On Thu, Jan 06, 2005 at 10:55:19AM -0800, Roland Dreier wrote:
I know. But where does lspci get the domain number?
From sysfs -- lspci goes through the entries in /sys/bus/pci/devices.
If you strace lspci on a modern distro, you can see it doesn't open
anything in /proc.
That's an
On Thu, 2005-01-06 at 20:05 +0200, shaharf wrote:
Hi Tom,
Are you able to reproduce this problem? If you are I would like you
to reproduce it with full verbosity (-V). If you can't or the scenario
is not consistent, please tell me too. It might direct us to some
other directions.
Well, I
Hello!
Quoting r. Roland Dreier ([EMAIL PROTECTED]) Re: [openib-general] Re: mstflint
failing on sparc64:
I know. But where does lspci get the domain number?
From sysfs -- lspci goes through the entries in /sys/bus/pci/devices.
If you strace lspci on a modern distro, you can see it doesn't
Hello!
Quoting r. Michael S. Tsirkin ([EMAIL PROTECTED]) [openib-general] Re:
mstflint failing on sparc64:
tat:~# ./mstflint -d 81:00.0 q
Bus error
Interesting. Maybe mmap does not work as it should?
Could you run it under gdb and do a backtrace? I also added
a sanity checks
Hello!
Quoting r. Ronald G. Minnich (rminnich@lanl.gov) Re: [openib-general] Re:
mstflint failing on sparc64:
On Thu, 6 Jan 2005, Michael S. Tsirkin wrote:
Well, I see regular 8100 there, where does lspci get another : ?
It's a mystery.
that's the pci domain stuff. Turns out on
On Thu, 2005-01-06 at 16:44, Michael S. Tsirkin wrote:
Well, I was thinking for things like failover it could be nice.
I would think failover is more reliable with SMs on different machines
but this is a conceivable scenario. I for one need to convince myself
that the SM state machine works fine
On Thu, 2005-01-06 at 20:28 +0200, Michael S. Tsirkin wrote:
Crashes on access to mapped memory.
Could you print mf->ptr and offset at that point?
(gdb) print mf->ptr
$1 = (void *) 0x70304000
(gdb) print offset
$2 = 984060
Generally, do you happen to know if mmapping /dev/mem
to userspace
Michael Use of this new interface will enable Sockets based
Michael applications to fully exploit the performance of RDMA
Michael interconnects through the SDP wire protocol. This API
Michael also provides explicit memory management taking some of
Michael the guesswork out of
On Thu, 2005-01-06 at 23:52 +0200, Michael S. Tsirkin wrote:
Tom, if you can try it before the weekend I'll be thankful,
I am working on Sundays, but I don't have a sparc.
I took the plunge and tried to flash the firmware, and it took!
tat:~# ./mstflint -d /proc/bus/pci/\:81/00.0 q
Image
On Thu, Jan 06, 2005 at 03:26:13PM -0800, Roland Dreier wrote:
Generally, do you happen to know if mmapping /dev/mem
to userspace works on this architecture?
I can't imagine that it would not. I will see if I can dig info up.
The one thing that is weird on sparc64 is that the pci bus
If certain fields do not exist on the node you are running ibstatus
script on, like when Roland adds a new one and you haven't upgraded yet,
have ibstatus behave better.
Signed-off-by: Tom Duffy [EMAIL PROTECTED]
Index: gen2/trunk/src/userspace/management/diags/host/scripts/ibstatus
On Tue, Jan 04, 2005 at 05:11:11PM -0800, Roland Dreier wrote:
Grant I'll see how hard it would be to try it with tg3.
My previous patch seems like it won't work (need MSGINT_MODE_ENABLE
set too, according to
http://www.ussg.iu.edu/hypermail/linux/kernel/0301.3/0123.html)
I've applied
On Thu, 2005-01-06 at 19:22, Tom Duffy wrote:
If certain fields do not exist on the node you are running ibstatus
script on, like when Roland adds a new one and you haven't upgraded yet,
have ibstatus behave better.
Signed-off-by: Tom Duffy [EMAIL PROTECTED]
Thanks. Applied. Patch is line