Re: [openib-general] Data structure size mismatch

2005-11-15 Thread Pradeep Satyanarayana
Roland Dreier <[EMAIL PROTECTED]> wrote on 11/14/2005 11:47:13 AM: > Pradeep> I am trying to use copy_from_user()/copy_to_user() on data structures that contain pointers. > > If you are defining a new interface, then the simplest thing is not to do that: always put pointers i
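The size mismatch discussed in this thread can be illustrated outside the kernel. The sketch below is not from the thread: it uses Python's `struct` module to model a hypothetical request (one buffer pointer plus a 32-bit length) as a 32-bit and a 64-bit process might lay it out, and then a layout that always carries the pointer in a 64-bit field. Whether that is what the truncated reply goes on to recommend is an assumption, and the layouts and helper names are invented for illustration.

```python
import struct

# Hypothetical request: a user buffer pointer followed by a 32-bit length.
# Explicit little-endian formats model each ABI; nothing here is kernel code.
ABI32_LAYOUT = "<II"     # 4-byte pointer, 4-byte length
ABI64_LAYOUT = "<QI4x"   # 8-byte pointer, 4-byte length, 4 bytes of tail padding
FIXED_LAYOUT = "<QI4x"   # pointer always stored in a 64-bit field, padding explicit


def sizes():
    """Return the byte size of each layout, keyed by a descriptive name."""
    return {
        "abi32": struct.calcsize(ABI32_LAYOUT),
        "abi64": struct.calcsize(ABI64_LAYOUT),
        "fixed": struct.calcsize(FIXED_LAYOUT),
    }


def pack_fixed(buf_addr, length):
    """Pack the request using the fixed-width layout."""
    return struct.pack(FIXED_LAYOUT, buf_addr, length)


def unpack_fixed(data):
    """Unpack a fixed-width request into (buf_addr, length)."""
    return struct.unpack(FIXED_LAYOUT, data)


if __name__ == "__main__":
    print(sizes())
    print(unpack_fixed(pack_fixed(0x1000, 64)))
```

In this model the two ABI layouts disagree on size, while the fixed-width layout is the same number of bytes no matter which side produced it, so a copy of that many bytes means the same thing to both.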

Re: [openib-general] another opensm crash

2005-11-15 Thread Troy Benjegerdes
On Mon, Nov 14, 2005 at 09:54:28PM +0200, Eitan Zahavi wrote: > Hi Troy > > Try to move aside your /lib/tls directory and see if you still get these > crashes. > We have issues with TLS pthread and glibc We still have issues with -maxsmps=8. And no, running with maxsmps=1 is not an option on thi

[openib-general] compile error -libibverbs (64-bit)

2005-11-15 Thread Pradeep Satyanarayana
I am trying to compile some of the userspace utilities as 64-bit apps on a ppc64 machine with a sles9sp2 distribution. I am getting the following compile errors (appended below). Yes, I am using some older bits, but I do not think that is the issue here. I have exported LDFLAGS=-L /lib64 -m64 a

[openib-general] Announce: preview RPMs for FC-4 and RHEL-4 available

2005-11-15 Thread Doug Ledford
I have initial RPM support for both of these releases available for use/testing. For Fedora Core 4, I didn't compile a new kernel since the current FC4 kernel is 2.6.14 based and includes the upstream Infiniband support. For RHEL4 I obviously compiled a new kernel, but it used the code pulled

Re: [openib-general] SRP device management client (and a few opensmglitches)

2005-11-15 Thread Roland Dreier
Eitan> Hi Roland, Now I get it ! In the port info record there is a field named LocalPortNumber. This field is NOT the port number the data is about. It is the port number the packet of the query came from. (see table 145 p823 l-38). When OpenSM ob

RE: [openib-general] [ANNOUNCE] Contribute RDS(ReliableDatagramSockets) to OpenIB

2005-11-15 Thread Caitlin Bestler
  In absence of any protocol level ack (and regardless of protocol level ack), it is the application which has to implement its own reliability. RDS becomes a passive channel passing packet back and forth including duplicate packets. The responsibility then shifts to the a

RE: [openib-general] SRP device management client (and a few opensmglitches)

2005-11-15 Thread Eitan Zahavi
Hi Roland, Now I get it ! In the port info record there is a field named LocalPortNumber. This field is NOT the port number the data is about. It is the port number the packet of the query came from. (see table 145 p823 l-38). When OpenSM obtains the PortInfo associated with that particular port

RE: [openib-general] RE: [dat-discussions] socket based connectionmodel for IB proposal - round 3

2005-11-15 Thread Caitlin Bestler
[EMAIL PROTECTED] wrote: > Kanevsky, Arkady wrote: >> Which entity is responsible to "use" the proposed protocol is an >> interesting one. I was assuming that this will be CM. After all the >> proposed protocol is CM extension protocol. But it can be another >> entity module between CM and ULP. >

RE: [openib-general] SRP device management client (and a few opensmglitches)

2005-11-15 Thread Hal Rosenstock
Are you referring to SCinet? It is definitely running off HCA port 1. -- Hal

Re: [openib-general] RE: [dat-discussions] socket based connectionmodel for IB proposal - round 3

2005-11-15 Thread Sean Hefty
Kanevsky, Arkady wrote: Which entity is responsible to "use" the proposed protocol is an interesting one. I was assuming that this will be CM. After all the proposed protocol is CM extension protocol. But it can be another entity module between CM and ULP. The use of a reserved bit in the CM me

Re: [openib-general] SRP device management client (and a few opensm glitches)

2005-11-15 Thread Roland Dreier
I just noticed that the host port that the SM is running on is connected to switch port 2. What seems to be happening is that all of the switch's ports (except port 0) are seen as having local port number 2 in the actual PortInfo attribute information, even though the PortNum field in the SA recor

Re: [openib-general] SRP device management client (and a few opensm glitches)

2005-11-15 Thread Roland Dreier
Eitan> Could you dump out the content of the PortRecords that you get as response?
They look like valid records for switch ports, except the local port number field doesn't match the port number field in the SA record identifier wrapper. - R.

Re: [openib-general] SRP device management client (and a few opensmglitches)

2005-11-15 Thread Roland Dreier
Eitan> You should not get more than one SA header. I assumed you are doing GetTable of PortInfoRecord. If this is correct you should only get one SA header in the resulting RMPP (reassembled MAD).
Yes, I only have one SA header. I just meant the SA wrapper of

RE: [openib-general] SRP device management client (and a few opensmglitches)

2005-11-15 Thread Eitan Zahavi
You should not get more than one SA header. I assumed you are doing GetTable of PortInfoRecord. If this is correct you should only get one SA header in the resulting RMPP (reassembled MAD). EZ

RE: [openib-general] SRP device management client (and a few opensm glitches)

2005-11-15 Thread Eitan Zahavi
Hi Roland, If you only have a single 24-port switch you should only see 1 record with base lid = 0 and port num = 2. But maybe we have a bug not comparing port num. On our test today we have seen only one record for port 2 from each switch (we had two switches so got 2 records). Could you dump out

Re: [openib-general] SRP device management client (and a few opensmglitches)

2005-11-15 Thread Roland Dreier
Hal> Are those the only component fields which were zeroed out ? No, for example the capability mask is all 0 as well. It seems to be something a little bit more complicated. In my test fabric, which has a single 24-port switch with hosts connected to both port 1 and port 2 (of the switch),

Re: [openib-general] SRP device management client (and a few opensm glitches)

2005-11-15 Thread Roland Dreier
Yael> Hello Roland, When turning on only the comp_mask for the local_port_num you will get all relevant PortInfo records from the switches. These records do have many fields zeroed out (e.g. subnet_prefix), but they are still valid records. Is this what y

[openib-general] Re: [PATCH] mthca: fix qp max_send/recv_sge calculation

2005-11-15 Thread Michael S. Tsirkin
Quoting r. Michael S. Tsirkin <[EMAIL PROTECTED]>: > Subject: [PATCH] mthca: fix qp max_send/recv_sge calculation > > Roland, I think I see a problem in mthca, where qp capability values we return aren't safe. > How does the following look (compile tested only)? This is tested now, please review

[openib-general] Fwd: Invitation to OpenIB BOF at SC05 Wednesday 11-12 Room 205 in Convention Center

2005-11-15 Thread Bill Boas
Date: Mon, 14 Nov 2005 15:16:51 -0800 To: [EMAIL PROTECTED], openib-general@openib.org, [EMAIL PROTECTED], Eric Lantz, [EMAIL PROTECTED] From: Bill Boas <[EMAIL PROTECTED]> Subject: Invitation to OpenIB BOF at SC05 Wednesday 11-12 Room 205 in Convention Center Cc: [EMAIL PROTECTED], [EMAIL PR

RE: [openib-general] RE: [dat-discussions] socket based connectionmodel for IB proposal - round 3

2005-11-15 Thread Kanevsky, Arkady
The goal of this proposal is to provide an underpinning for a common RDMA transport CM. Thus, the ULP API (both user space and kernel space) uses socket addressing. For ULP addressing this means a 5-tuple: protocol, src IP addr, src port, dst IP addr, and dst port. A port is a 16-bit entity. The proposal j

RE: [openib-general] SRP device management client (and a few opensmglitches)

2005-11-15 Thread Hal Rosenstock
Roland, Just to close the loop on this: Are those the only component fields which were zeroed out? Thanks. -- Hal

Re: [openib-general] [ANNOUNCE] Contribute RDS(ReliableDatagramSockets) to OpenIB

2005-11-15 Thread Michael Krause
At 12:49 PM 11/14/2005, Nitin Hande wrote: Michael Krause wrote: At 01:01 PM 11/11/2005, Nitin Hande wrote: Michael Krause wrote: At 10:28 AM 11/9/2005, Rick Frank wrote: Yes, the application is responsible for detecting lost msgs at the application level - the transport can not do this.   RDS do

Re: [openib-general] [ANNOUNCE] Contribute RDS(ReliableDatagramSockets) to OpenIB

2005-11-15 Thread Michael Krause
At 12:49 PM 11/14/2005, Nitin Hande wrote: Michael Krause wrote: At 01:02 PM 11/11/2005, Ranjit Pandit wrote: On 11/11/05, Michael Krause <[EMAIL PROTECTED]> wrote: > Please clarify the following which was in the document provided by Oracle. > > On page 3 of the RDS document, under the section "R

RE: [openib-general] SRP device management client (and a few opensmglitches)

2005-11-15 Thread Eitan Zahavi
I think Yael figured it out: Looking at Roland's code it seems like it will not filter out the PortRecords coming from switch physical ports. So actually he gets many records that all have base lid = 0 and gid = 0 from these ports... I assume this is the case. There is no trivial way to know from
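As a rough illustration of the filtering this message says is missing, the sketch below drops records whose base LID is 0. It is not taken from the client code under discussion: the record layout (plain dicts with `port_num` and `base_lid` keys) and the function name are hypothetical, and whether a zero base LID is a sufficient criterion in every fabric is an assumption, since the message itself notes there is no trivial way to tell.

```python
from typing import Dict, Iterable, List

# Each record is a plain dict with hypothetical keys; a real client would
# decode these fields from the SA response instead.
Record = Dict[str, int]


def drop_zero_lid_records(records: Iterable[Record]) -> List[Record]:
    """Keep only records whose base LID is nonzero."""
    return [rec for rec in records if rec["base_lid"] != 0]


if __name__ == "__main__":
    sample = [
        {"port_num": 1, "base_lid": 0},
        {"port_num": 2, "base_lid": 0},
        {"port_num": 1, "base_lid": 5},
    ]
    print(drop_zero_lid_records(sample))
```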

RE: [openib-general] SRP device management client (and a few opensmglitches)

2005-11-15 Thread Hal Rosenstock
Hi, It's not necessarily an RMPP bug. A lot of the port 2s on SCinet are not plugged in. -- Hal

[openib-general] Re: ipoib oops

2005-11-15 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Subject: Re: ipoib oops > > Sorry I haven't been able to look at this immediately, since I've been > busy with SC05-related stuff. > > I hope to sit down and think about this in detail tomorrow... > > - R. > OK. Meanwhile, there's another possib

RE: [openib-general] SRP device management client (and a few opensm glitches)

2005-11-15 Thread Yael Kalka
Hello Roland, When turning on only the comp_mask for the local_port_num you will get all relevant PortInfo records from the switches. These records do have many fields zeroed out (e.g. subnet_prefix), but they are still valid records. Is this what you are seeing? Thanks, Yael

RE: [openib-general] SRP device management client (and a few opensm glitches)

2005-11-15 Thread Eitan Zahavi
Thanks. We will try and reproduce it here.

[openib-general] [git patch review 2/3] [IB] srp: don't post receive if no send buf available

2005-11-15 Thread Roland Dreier
Have __srp_get_tx_iu() fail if the target port's request limit will not allow the initiator to post a send. This avoids continuing on and posting a receive, and then failing to post a corresponding send. If that happens, then the initiator will end up with an extra receive posted, and if this hap
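The ordering problem this patch describes can be modelled in a few lines. The sketch below is a toy Python model, not the driver code: the `Target` class, its counters, and the point at which the request limit is decremented are all assumptions made for illustration. It only shows the idea of checking the request limit before anything is posted, so that a receive is never posted without its matching send.

```python
class Target:
    """Toy model of an initiator's view of one target port (not kernel code)."""

    def __init__(self, req_lim):
        self.req_lim = req_lim      # sends the target currently allows
        self.posted_recvs = 0
        self.posted_sends = 0

    def get_tx_iu(self):
        """Return a send buffer token, or None if the request limit is exhausted."""
        if self.req_lim < 1:
            return None
        return object()

    def queue_command(self):
        """Post a receive and a send together, or neither."""
        iu = self.get_tx_iu()
        if iu is None:
            return False            # bail out before posting the receive
        self.posted_recvs += 1
        self.posted_sends += 1
        self.req_lim -= 1
        return True


if __name__ == "__main__":
    target = Target(req_lim=1)
    print(target.queue_command(), target.queue_command())
    print(target.posted_recvs, target.posted_sends)
```

Because the limit check comes first, the receive and send counters in this model can never drift apart.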

[openib-general] [git patch review 1/3] [IB] srp: increase max_luns

2005-11-15 Thread Roland Dreier
Increase SRP max_luns to 512 to match the kernel's default, since SRP storage targets can have lots of LUNs and the SRP initiator itself doesn't have any particular limit. Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> --- drivers/infiniband/ulp/srp/ib_srp.c |2 ++ drivers/infiniband/ulp/

[openib-general] [git patch review 3/3] [IB] mthca: don't disable RDMA writes if no responder resources

2005-11-15 Thread Roland Dreier
Responder resources are only required to handle RDMA reads and atomic operations, not RDMA writes. So the driver should allow RDMA writes even if responder resources are set to 0. This is especially important for the UC transport -- with the old code, it was impossible to enable RDMA writes for U
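The rule stated in this patch description can be sketched as a small function. This is an illustrative Python model only: the flag values and the function name are hypothetical and are not the kernel's constants or the mthca code. It shows RDMA writes being enabled whenever they are requested, while RDMA reads and atomics additionally require nonzero responder resources.

```python
# Hypothetical flag values for this sketch; they are not the kernel's constants.
REMOTE_WRITE = 1 << 0
REMOTE_READ = 1 << 1
REMOTE_ATOMIC = 1 << 2


def enabled_remote_ops(requested, responder_resources):
    """Return the subset of requested remote operations that should be enabled."""
    enabled = requested & REMOTE_WRITE          # writes never need responder resources
    if responder_resources > 0:
        enabled |= requested & (REMOTE_READ | REMOTE_ATOMIC)
    return enabled


if __name__ == "__main__":
    # With no responder resources, only the write bit survives.
    print(enabled_remote_ops(REMOTE_WRITE | REMOTE_READ, 0))
```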

[openib-general] Re: ipoib oops

2005-11-15 Thread Roland Dreier
Sorry I haven't been able to look at this immediately, since I've been busy with SC05-related stuff. I hope to sit down and think about this in detail tomorrow... - R.

Re: [openib-general] SRP device management client (and a few opensm glitches)

2005-11-15 Thread Roland Dreier
Roland> Quite easy in my setup -- it seems to happen every time on my fabric when I do a get table for PortInfoRecords with local port num 2.
And running ibsrpdm on the scinet fabric at SC'05 I see hundreds of PortInfoRecords with a base LID of 0... - R.

Re: [openib-general] SRP device management client (and a few opensm glitches)

2005-11-15 Thread Roland Dreier
Eitan> Yes this is correct we never got requested for that query. If you are only interested in obtaining the guid of the port you can simply use NodeInfoRecord and you get the guid in the NodeInfo. But you probably know that. Is there anything more

[openib-general] Re: OpenSM size

2005-11-15 Thread Michael S. Tsirkin
Hi! 1. Did you strip it?
    # ls -l /usr/local/bin/opensm
    -rwxr-xr-x 1 root root 2124734 Nov 15 10:07 /usr/local/bin/opensm
    # strip /usr/local/bin/opensm
    # ls -l /usr/local/bin/opensm
    -rwxr-xr-x 1 root root 333024 Nov 15 10:22 /usr/local/bin/opensm
2. Compile with -Os: edit Makefile in /usr/src/o

Re: [openib-general] SRP device management client (and a few opensm glitches)

2005-11-15 Thread Eitan Zahavi
Roland Dreier wrote: The opensm issues I saw were: - GUIDInfoRecord SA queries are not implemented (I think), so by default my code does a (non-compliant) SM class query to get ports' GUIDs. Yes this is correct we never got requested for that query. If you are only interested in obtaini