Re: [openib-general] uDAPL problem

2006-10-17 Thread Arlin Davis
Stephen Smaldone wrote:

>
>
> Arlin Davis wrote:
>
>> Steve Smaldone wrote:
>>
>>> Hi,
>>>
>>> Sorry for replying to myself, but I loaded rdma_ucm and the rdma_cm 
>>> device appears.  However, it now fails with the following:
>>>
>>> $ ./dapltest -T S -D IB1
>>> ...
>>> DAT Registry: dat_ia_openv (IB1,1:2,0) called
>>> DAT Registry: IA IB1, trying to load library /usr/local/lib/libdapl.so
>>> DAT Registry: dat_registry_add_provider (IB1,1:2,0)
>>> libibverbs: Warning: no userspace device-specific driver found for uverbs0
>>> driver search path: /usr/local/lib/infiniband
>>> libibverbs: Warning: no userspace device-specific driver found for uverbs0
>>> driver search path: /usr/local/lib/infiniband
>>> DT_cs_Server: Could not open IB1 (DAT_INVALID_ADDRESS )
>>> DT_cs_Server (IB1):  Exiting.
>>> DAT Registry: Stopped (dat_fini)
>>>
>>> The configuration remains the same otherwise.
>>>  
>>>
>>>> My dat.conf:
>>>> IB1 u1.2 nonthreadsafe default /usr/local/lib/libdapl.so mv_dapl.1.2 "hora-1-ib0 0" ""
   
>>>
>> Do you have an entry in your /etc/hosts for hora-1-ib0 and 10.2.2.135?
>>
>> there seems to be problems resolving "hora-1-ib0"
>>
>> -arlin
>
> Yes.  There is an entry as follows:
> 10.2.2.135  hora-1-ib0

Could you change the "hora-1-ib0 0" to just "ib0 0" in your dat.conf and
retry? There may be an issue parsing a hostname instead of a netdev name.
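
For readers following along, here is a minimal Python sketch of how the dat.conf entry from this thread splits into fields, so the quoted "name port" pair that Arlin asks about is easy to see. The field names in `DatEntry` and the helper names are illustrative labels chosen for this sketch, not identifiers from uDAPL, and treating the second token as a port number is an assumption.

```python
import shlex
from typing import NamedTuple, Tuple


class DatEntry(NamedTuple):
    """One dat.conf entry. Field names are illustrative labels only."""
    ia_name: str
    api_version: str
    thread_safety: str
    default: str
    lib_path: str
    provider_version: str
    ia_params: str
    platform_params: str


def parse_dat_line(line: str) -> DatEntry:
    # shlex keeps the quoted "name port" pair together as one field
    # and preserves the empty "" field at the end.
    fields = shlex.split(line, comments=True)
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    return DatEntry(*fields)


def split_ia_params(ia_params: str) -> Tuple[str, int]:
    # In this thread the first token is a hostname in the failing entry
    # and a netdev name in the suggested one; the second is assumed to
    # be a port number.
    name, port = ia_params.split()
    return name, int(port)


entry = parse_dat_line(
    'IB1 u1.2 nonthreadsafe default /usr/local/lib/libdapl.so '
    'mv_dapl.1.2 "hora-1-ib0 0" ""'
)
print(split_ia_params(entry.ia_params))
```

With the suggested change, the same call would return the netdev name "ib0" in place of the hostname.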

>
> Thanks,
>
> Steve
>

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] uDAPL problem

2006-10-17 Thread Stephen Smaldone


Steve Wise wrote:
> On Mon, 2006-10-16 at 18:01 -0400, Steve Smaldone wrote:
>   
>> Hi,
>>
>> Sorry for replying to myself, but I loaded rdma_ucm and the rdma_cm 
>> device appears.  However, it now fails with the following:
>>
>> $ ./dapltest -T S -D IB1
>> ...
>> DAT Registry: dat_ia_openv (IB1,1:2,0) called
>> DAT Registry: IA IB1, trying to load library /usr/local/lib/libdapl.so
>> DAT Registry: dat_registry_add_provider (IB1,1:2,0)
>> libibverbs: Warning: no userspace device-specific driver found for uverbs0
>> driver search path: /usr/local/lib/infiniband
>> libibverbs: Warning: no userspace device-specific driver found for uverbs0
>> driver search path: /usr/local/lib/infiniband
>> 
>
> Seems like it cannot find the provider library.
>
> Is there a libmthca.* in /usr/local/lib/infiniband?
>
>
>
>   
That was the problem.  There was no mthca.so.  Now it works.  Thanks for 
the help!
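
A small Python sketch of the check that resolved this: look in the driver search path reported by libibverbs for a file that looks like the mthca userspace driver. The glob pattern and the function names are assumptions made for illustration; the actual file name depends on how the driver was built and installed.

```python
import fnmatch
import os
from typing import Iterable, List


def match_driver_libs(names: Iterable[str], driver: str = "mthca") -> List[str]:
    """Return the names that look like a shared library for the given driver."""
    # The pattern is a guess that covers names such as "mthca.so" or "libmthca.so".
    return sorted(n for n in names if fnmatch.fnmatch(n, f"*{driver}*.so*"))


def find_driver_libs(search_dir: str, driver: str = "mthca") -> List[str]:
    """List matching libraries in search_dir; an unreadable directory yields []."""
    try:
        names = os.listdir(search_dir)
    except OSError:
        return []
    return match_driver_libs(names, driver)


if __name__ == "__main__":
    # Directory taken from the "driver search path" line in the log above.
    libs = find_driver_libs("/usr/local/lib/infiniband")
    print(libs or "no mthca driver library found in the search path")
```

An empty result corresponds to the "no userspace device-specific driver found" warning in the log.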

Steve Smaldone




Re: [openib-general] uDAPL problem

2006-10-17 Thread Steve Wise
On Mon, 2006-10-16 at 18:01 -0400, Steve Smaldone wrote:
> Hi,
> 
> Sorry for replying to myself, but I loaded rdma_ucm and the rdma_cm 
> device appears.  However, it now fails with the following:
> 
> $ ./dapltest -T S -D IB1
> ...
> DAT Registry: dat_ia_openv (IB1,1:2,0) called
> DAT Registry: IA IB1, trying to load library /usr/local/lib/libdapl.so
> DAT Registry: dat_registry_add_provider (IB1,1:2,0)
> libibverbs: Warning: no userspace device-specific driver found for uverbs0
> driver search path: /usr/local/lib/infiniband
> libibverbs: Warning: no userspace device-specific driver found for uverbs0
> driver search path: /usr/local/lib/infiniband

Seems like it cannot find the provider library.

Is there a libmthca.* in /usr/local/lib/infiniband?







Re: [openib-general] uDAPL problem

2006-10-16 Thread Stephen Smaldone


Arlin Davis wrote:
> Steve Smaldone wrote:
>
>> Hi,
>>
>> Sorry for replying to myself, but I loaded rdma_ucm and the rdma_cm 
>> device appears.  However, it now fails with the following:
>>
>> $ ./dapltest -T S -D IB1
>> ...
>> DAT Registry: dat_ia_openv (IB1,1:2,0) called
>> DAT Registry: IA IB1, trying to load library /usr/local/lib/libdapl.so
>> DAT Registry: dat_registry_add_provider (IB1,1:2,0)
>> libibverbs: Warning: no userspace device-specific driver found for uverbs0
>> driver search path: /usr/local/lib/infiniband
>> libibverbs: Warning: no userspace device-specific driver found for uverbs0
>> driver search path: /usr/local/lib/infiniband
>> DT_cs_Server: Could not open IB1 (DAT_INVALID_ADDRESS )
>> DT_cs_Server (IB1):  Exiting.
>> DAT Registry: Stopped (dat_fini)
>>
>> The configuration remains the same otherwise.
>>  
>>
>>> My dat.conf:
>>> IB1 u1.2 nonthreadsafe default /usr/local/lib/libdapl.so mv_dapl.1.2 
>>> "hora-1-ib0 0" ""
>>>   
> Do you have an entry in your /etc/hosts for hora-1-ib0 and 10.2.2.135?
>
> there seems to be problems resolving "hora-1-ib0"
>
> -arlin
Yes.  There is an entry as follows:
10.2.2.135  hora-1-ib0

Thanks,

Steve





Re: [openib-general] uDAPL problem

2006-10-16 Thread Arlin Davis
Steve Smaldone wrote:

>Hi,
>
>Sorry for replying to myself, but I loaded rdma_ucm and the rdma_cm 
>device appears.  However, it now fails with the following:
>
>$ ./dapltest -T S -D IB1
>...
>DAT Registry: dat_ia_openv (IB1,1:2,0) called
>DAT Registry: IA IB1, trying to load library /usr/local/lib/libdapl.so
>DAT Registry: dat_registry_add_provider (IB1,1:2,0)
>libibverbs: Warning: no userspace device-specific driver found for uverbs0
>driver search path: /usr/local/lib/infiniband
>libibverbs: Warning: no userspace device-specific driver found for uverbs0
>driver search path: /usr/local/lib/infiniband
>DT_cs_Server: Could not open IB1 (DAT_INVALID_ADDRESS )
>DT_cs_Server (IB1):  Exiting.
>DAT Registry: Stopped (dat_fini)
>
>The configuration remains the same otherwise.
>  
>
>>My dat.conf:
>>IB1 u1.2 nonthreadsafe default /usr/local/lib/libdapl.so mv_dapl.1.2 
>>"hora-1-ib0 0" ""
>>
>>
Do you have an entry in your /etc/hosts for hora-1-ib0 and 10.2.2.135?

There seems to be a problem resolving "hora-1-ib0".

-arlin




Re: [openib-general] uDAPL problem

2006-10-16 Thread Steve Smaldone
Hi,

Sorry for replying to myself, but I loaded rdma_ucm and the rdma_cm 
device appears.  However, it now fails with the following:

$ ./dapltest -T S -D IB1
...
DAT Registry: dat_ia_openv (IB1,1:2,0) called
DAT Registry: IA IB1, trying to load library /usr/local/lib/libdapl.so
DAT Registry: dat_registry_add_provider (IB1,1:2,0)
libibverbs: Warning: no userspace device-specific driver found for uverbs0
driver search path: /usr/local/lib/infiniband
libibverbs: Warning: no userspace device-specific driver found for uverbs0
driver search path: /usr/local/lib/infiniband
DT_cs_Server: Could not open IB1 (DAT_INVALID_ADDRESS )
DT_cs_Server (IB1):  Exiting.
DAT Registry: Stopped (dat_fini)

The configuration remains the same otherwise.

Thanks,

Steve Smaldone

Steve Smaldone wrote:
> Hi,
>
> I have been trying to run dapltest with trunk rev 9717 with linux 
> kernel 2.6.18 and I get an error.  The error and configuration is 
> shown below.
> Basically, the rdma_cm device is not created under /dev/infiniband.  I 
> am wondering if this is a known problem and how to solve it.
>
> Thanks,
> Steve Smaldone
>
> $ ./dapltest -T S -D IB1
> ...
> DAT Registry: dat_ia_openv (IB1,1:2,0) called
> DAT Registry: IA IB1, trying to load library /usr/local/lib/libdapl.so
> DAT Registry: dat_registry_add_provider (IB1,1:2,0)
> librdmacm: couldn't read ABI version.
> librdmacm: assuming: 2
> libibverbs: Warning: no userspace device-specific driver found for uverbs0
> driver search path: /usr/local/lib/infiniband
> CMA: unable to open /dev/infiniband/rdma_cm
> DT_cs_Server: Could not open IB1 (DAT_INTERNAL_ERROR )
> DT_cs_Server (IB1):  Exiting.
> DAT Registry: Stopped (dat_fini)
>
> My dat.conf:
> IB1 u1.2 nonthreadsafe default /usr/local/lib/libdapl.so mv_dapl.1.2 
> "hora-1-ib0 0" ""
>
> ifconfig:
> ib0       Link encap:UNSPEC  HWaddr 00-00-00-14-FE-80-00-00-00-00-00-00-00-00-00-00
>           inet addr:10.2.2.135  Bcast:10.255.255.255  Mask:255.0.0.0
>           inet6 addr: fe80::202:c901:81e:90e1/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:1 errors:0 dropped:5 overruns:0 carrier:0
>           collisions:0 txqueuelen:128
>           RX bytes:0 (0.0 b)  TX bytes:68 (68.0 b)
>
> lsmod:
> Module                  Size  Used by
> ib_ucm                 20612  0
> ib_uverbs              40232  1 ib_ucm
> rdma_cm                33572  0
> ib_cm                  39824  2 ib_ucm,rdma_cm
> ib_addr                11524  1 rdma_cm
> ib_local_sa            15752  1 rdma_cm
> findex                  8576  1 ib_local_sa
> ib_ipoib               51144  0
> ib_multicast           15940  2 rdma_cm,ib_ipoib
> ib_sa                  20004  5 rdma_cm,ib_cm,ib_local_sa,ib_ipoib,ib_multicast
> ib_umad                20016  0
> ib_mthca              131236  0
> ib_mad                 42272  5 ib_cm,ib_local_sa,ib_sa,ib_umad,ib_mthca
> ib_core                52992  11 ib_ucm,ib_uverbs,rdma_cm,ib_cm,ib_local_sa,ib_ipoib,ib_multicast,ib_sa,ib_umad,ib_mthca,ib_mad
>  
>
>
> udev rules:
> KERNEL="umad*", NAME="infiniband/%k"
> KERNEL="issm*", NAME="infiniband/%k"
> KERNEL="uverbs*", NAME="infiniband/%k", MODE="0666"
> KERNEL="ucm*", NAME="infiniband/%k", MODE="0666"
> KERNEL="rdma_cm", NAME="infiniband/%k", MODE="0666"
>
> /dev/infiniband:
> issm0  issm1  ucm0  umad0  umad1  uverbs0
>
>




Re: [openib-general] uDAPL problem

2006-10-16 Thread Robert Walsh
Steve Smaldone wrote:
> Hi,
> 
> I have been trying to run dapltest with trunk rev 9717 with linux kernel 
> 2.6.18 and I get an error.  The error and configuration is shown below.
> Basically, the rdma_cm device is not created under /dev/infiniband.  I 
> am wondering if this is a known problem and how to solve it.

You need to load the rdma_ucm module, too.

Regards,
  Robert.
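
As an illustration of Robert's point, the following Python sketch checks a module list for the ones this thread depends on. It parses text in the /proc/modules layout, where the module name is the first whitespace-separated field on each line. The list of required modules is an assumption drawn from this thread only, and the function names are made up for the sketch.

```python
from typing import Iterable, List, Set


def loaded_modules(proc_modules_text: str) -> Set[str]:
    """Collect module names from text in /proc/modules format."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def missing_modules(required: Iterable[str], proc_modules_text: str) -> List[str]:
    """Return the required modules that do not appear in the module list."""
    present = loaded_modules(proc_modules_text)
    return [name for name in required if name not in present]


# A shortened sample resembling the lsmod output posted in this thread.
sample = (
    "ib_uverbs 40232 1 ib_ucm, Live 0x0\n"
    "rdma_cm 33572 0 - Live 0x0\n"
)
print(missing_modules(["ib_uverbs", "rdma_cm", "rdma_ucm"], sample))
```

On a live system the text would come from reading /proc/modules. An rdma_ucm entry in the result matches the situation described here, where /dev/infiniband/rdma_cm was never created.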




[openib-general] uDAPL problem

2006-10-16 Thread Steve Smaldone
Hi,

I have been trying to run dapltest from trunk rev 9717 on Linux kernel
2.6.18, and I get an error.  The error and configuration are shown below.
Basically, the rdma_cm device is not created under /dev/infiniband.  I
am wondering if this is a known problem and how to solve it.

Thanks,
Steve Smaldone

$ ./dapltest -T S -D IB1
...
DAT Registry: dat_ia_openv (IB1,1:2,0) called
DAT Registry: IA IB1, trying to load library /usr/local/lib/libdapl.so
DAT Registry: dat_registry_add_provider (IB1,1:2,0)
librdmacm: couldn't read ABI version.
librdmacm: assuming: 2
libibverbs: Warning: no userspace device-specific driver found for uverbs0
driver search path: /usr/local/lib/infiniband
CMA: unable to open /dev/infiniband/rdma_cm
DT_cs_Server: Could not open IB1 (DAT_INTERNAL_ERROR )
DT_cs_Server (IB1):  Exiting.
DAT Registry: Stopped (dat_fini)

My dat.conf:
IB1 u1.2 nonthreadsafe default /usr/local/lib/libdapl.so mv_dapl.1.2 "hora-1-ib0 0" ""

ifconfig:
ib0       Link encap:UNSPEC  HWaddr 00-00-00-14-FE-80-00-00-00-00-00-00-00-00-00-00
          inet addr:10.2.2.135  Bcast:10.255.255.255  Mask:255.0.0.0
          inet6 addr: fe80::202:c901:81e:90e1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1 errors:0 dropped:5 overruns:0 carrier:0
          collisions:0 txqueuelen:128
          RX bytes:0 (0.0 b)  TX bytes:68 (68.0 b)

lsmod:
Module                  Size  Used by
ib_ucm                 20612  0
ib_uverbs              40232  1 ib_ucm
rdma_cm                33572  0
ib_cm                  39824  2 ib_ucm,rdma_cm
ib_addr                11524  1 rdma_cm
ib_local_sa            15752  1 rdma_cm
findex                  8576  1 ib_local_sa
ib_ipoib               51144  0
ib_multicast           15940  2 rdma_cm,ib_ipoib
ib_sa                  20004  5 rdma_cm,ib_cm,ib_local_sa,ib_ipoib,ib_multicast
ib_umad                20016  0
ib_mthca              131236  0
ib_mad                 42272  5 ib_cm,ib_local_sa,ib_sa,ib_umad,ib_mthca
ib_core                52992  11 ib_ucm,ib_uverbs,rdma_cm,ib_cm,ib_local_sa,ib_ipoib,ib_multicast,ib_sa,ib_umad,ib_mthca,ib_mad

udev rules:
KERNEL="umad*", NAME="infiniband/%k"
KERNEL="issm*", NAME="infiniband/%k"
KERNEL="uverbs*", NAME="infiniband/%k", MODE="0666"
KERNEL="ucm*", NAME="infiniband/%k", MODE="0666"
KERNEL="rdma_cm", NAME="infiniband/%k", MODE="0666"

/dev/infiniband:
issm0  issm1  ucm0  umad0  umad1  uverbs0





Re: [openib-general] uDAPL problem

2005-09-27 Thread James Lentini


On Tue, 27 Sep 2005, Todd Bowman wrote:

> Never mind, I found the post from Roland discussing this issue:
> 
> Roland wrote:
> I didn't try to fix uDAPL, because some thought probably needs to go 
> into how to use completion channels most efficiently.

I apologize. I forgot that the current tree has an ABI change. Arlin
is working on a fix for this.

The kernel and userspace code at revision 3547 is what you need. 
Please use

 svn co -r 3547 https://openib.org/svn/gen2/trunk/src/

to obtain these sources.


Re: [openib-general] uDAPL problem

2005-09-27 Thread Todd Bowman
On 9/27/05, Todd Bowman <[EMAIL PROTECTED]> wrote:
> On 9/27/05, James Lentini <[EMAIL PROTECTED]> wrote:
> > On Tue, 27 Sep 2005, Todd Bowman wrote:
> > > On 9/27/05, James Lentini <[EMAIL PROTECTED]> wrote:
> > > > What is the output of cat /sys/class/net/*/ifindex?
> > >
> > > cat /sys/class/net/*/ifindex
> > > 1 #eth0
> > > 10 #ib0
> > > 11 #ib1
> > > 2 #lo
> > > 3 #tunl0
> >
> > This looks like the problem I reported last week:
> >
> > http://openib.org/pipermail/openib-general/2005-September/011668.html
> >
> > If so, this is fixed in the current subversion sources. Could you
> > update your sources, specifically infiniband/core/at.c, and try again?
> >
> > james
>
> That worked.  Thanks.
>
> It is printing the addresses backwards.  But I assume this is also a bug in printing.
>
>  ips_by_gid: RET 0 at_rec 0x19e5f40 -> id 1
>  dapli_at_event_cb()
>  ip_comp_handler: rec 0x19e5f40 ->id 1 id 1 num 1 b0f0add
>  open_hca: mthca0, port 1, AF_INET  221.10.15.11 INLINE_MAX=128
>  query_hca: mthca0 AF_INET  221.10.15.11
>  query_hca: (0.30002) ep 64512 ep_q 65535 evd 65408 evd_q 65535
>  query_hca: msg 2147483648 rdma 2147483648 iov 28 lmr 131056 rmr 0
> Bus error
>
> The bus error is probably due to changes in ibv_context.  I didn't recompile libdapl.
>
> The changes in ibv_context that broke udapl are:
> num_comp   is now    num_comp_vectors
> I don't know what is used in place of cq_fd[1].  Any pointers?
>
> Todd

Never mind, I found the post from Roland discussing this issue:

Roland wrote:
> I didn't try to fix uDAPL, because some thought probably needs to go
> into how to use completion channels most efficiently.
 

Re: [openib-general] uDAPL problem

2005-09-27 Thread Todd Bowman
On 9/27/05, James Lentini <[EMAIL PROTECTED]> wrote:
> On Tue, 27 Sep 2005, Todd Bowman wrote:
> > On 9/27/05, James Lentini <[EMAIL PROTECTED]> wrote:
> > > What is the output of cat /sys/class/net/*/ifindex?
> >
> > cat /sys/class/net/*/ifindex
> > 1 #eth0
> > 10 #ib0
> > 11 #ib1
> > 2 #lo
> > 3 #tunl0
>
> This looks like the problem I reported last week:
>
> http://openib.org/pipermail/openib-general/2005-September/011668.html
>
> If so, this is fixed in the current subversion sources. Could you
> update your sources, specifically infiniband/core/at.c, and try again?
>
> james

That worked.  Thanks.

It is printing the addresses backwards.  But I assume this is also a bug in printing.

 ips_by_gid: RET 0 at_rec 0x19e5f40 -> id 1
 dapli_at_event_cb()
 ip_comp_handler: rec 0x19e5f40 ->id 1 id 1 num 1 b0f0add
 open_hca: mthca0, port 1, AF_INET  221.10.15.11 INLINE_MAX=128
 query_hca: mthca0 AF_INET  221.10.15.11
 query_hca: (0.30002) ep 64512 ep_q 65535 evd 65408 evd_q 65535
 query_hca: msg 2147483648 rdma 2147483648 iov 28 lmr 131056 rmr 0
Bus error
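
The reversed address in the log is consistent with a byte-order mix-up when the 32-bit value is formatted. The Python sketch below takes the hex value `b0f0add` from the `ip_comp_handler` line and renders it both ways. Reading that value as the interface address is an assumption; the sketch only shows that the same four bytes produce both strings, and `dotted_quad` is a name invented for this example.

```python
import socket
import struct


def dotted_quad(addr: int, byteorder: str) -> str:
    """Format a 32-bit value as an IPv4 string with the bytes laid out
    in the given order ("big" or "little")."""
    fmt = ">I" if byteorder == "big" else "<I"
    return socket.inet_ntoa(struct.pack(fmt, addr))


value = 0x0B0F0ADD  # "b0f0add" from the ip_comp_handler line in the log

print(dotted_quad(value, "little"))  # 221.10.15.11, the string seen in the log
print(dotted_quad(value, "big"))     # 11.15.10.221
```

This does not explain the bus error, which Todd attributes to the ibv_context change.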


The bus error is probably due to changes in ibv_context.  I didn't recompile libdapl.  

The changes in ibv_context that broke udapl are:
num_comp   is now    num_comp_vectors
I don't know what is used in place of cq_fd[1].  Any pointers?

Todd   




Re: [openib-general] uDAPL problem

2005-09-27 Thread James Lentini


On Tue, 27 Sep 2005, Todd Bowman wrote:

> > I am having a different problem in ips_by_gid:
> 
> open_hca: Found dev mthca0 f42202c90200
> open_hca: GID subnet 80fe id f52202c90200
> ips_by_gid: ERR ips_by_gid -1 No such device
> open_hca: ERR ib_at_ips_by_gid for mthca0
> dapls_ib_open_hca failed 4
> dapl_ia_open () returns 0x4
> DT_cs_Server: Could not open OpenIB-ib0 (DAT_INTERNAL_ERROR )

What is the output of ifconfig? Is the IPoIB interface configured?

What is the output of cat /sys/class/net/*/ifindex?
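
For anyone who wants the interface names next to the index values, this Python sketch reads the same sysfs files as the `cat` command above and labels each number. The output layout mimics the annotated listing posted later in this thread; the function names are illustrative.

```python
from pathlib import Path
from typing import Dict


def read_ifindex(sys_net: str = "/sys/class/net") -> Dict[str, int]:
    """Map each network interface name to the number in its ifindex file."""
    table = {}
    for entry in sorted(Path(sys_net).glob("*/ifindex")):
        table[entry.parent.name] = int(entry.read_text().strip())
    return table


def format_ifindex(table: Dict[str, int]) -> str:
    """Render one 'index #name' line per interface, sorted by name."""
    return "\n".join(f"{index} #{name}" for name, index in sorted(table.items()))


if __name__ == "__main__":
    print(format_ifindex(read_ifindex()))
```

Sorting by name matches the order in which the shell expands the `*` in the original command.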



Re: [openib-general] uDAPL problem

2005-09-27 Thread Todd Bowman
On 27 Sep 2005 09:55:02 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> On Tue, 2005-09-27 at 09:51, James Lentini wrote:
> > On Mon, 26 Sep 2005, Hal Rosenstock wrote:
> >
> > > On Mon, 2005-09-26 at 18:05, Todd Bowman wrote:
> > > > I am having a problem with uDAPL accessing
> > > > /dev/infiniband/{uat,ucm0}.  I am running 3549,  2.6.12 kernel with
> > > > backport.  Here is a snippet of the uDAPL debug messages running
> > > > dtest.  The dat.conf file seems to be correct, the correctly named
> > > > providers are being loaded.
> > > >
> > > > 26248 Running as server
> > > > DAT Registry: dat_ia_openv (OpenIB-ib0,1:2,0) called
> > > > DAT Registry: IA OpenIB-ib0, trying to load library
> > > > /usr/local/lib/libdapl.so
> > > > libuat: Error <-1:6> couldn't open IB at device
> > > > libibcm: error <-1:6> opening device
> >
> > This means that the /dev entries are not set up correctly.
>
> Correct. He set this up manually. Todd wrote:
> "I am not running udev but manually create uat and ucm."

The correct major/minor #s fixed that problem.

> > > > DAPL: NOT Setting Loopback
> > > >  dapl_ib_init:
> > > > DAT Registry: dat_registry_add_provider (OpenIB-ib0,1:2,0)
> > > > dapl_ia_open (OpenIB-ib0, 8, 0x10019d40, 0x10019cc0)
> > > >  open_hca: mthca0 - 0x1001fdb0
> > > >  open_hca: Found dev mthca0 f42202c90200
> > > >  open_hca: GID subnet 80fe id f52202c90200
> > >
> > > These look like they need to be endianized to me.
> >
> > This looks like a bug in the way we print these values out, but I
> > don't think it is the real problem.
>
> Right, it's just a cosmetic issue with the display.
>
> -- Hal
>
> > What architecture are you using?

Apple G5.

> > > >  ips_by_gid: ERR ips_by_gid -1 Bad file descriptor
> > > >  open_hca: ERR ib_at_ips_by_gid for mthca0
> > > > dapls_ib_open_hca failed 4
> > > > dapl_ia_open () returns 0x4
> > > > 26248: Error Adaptor open: DAT_INTERNAL_ERROR
> > > > DAT Registry: Stopped (dat_fini)
> > > > DAPL: Stopped (dapl_fini)
> > > >  dapl_ib_release:
> > > >
> > > > I am not running udev but manually create uat and ucm.  Here is the
> > > > list of /dev/infiniband:
> > > >
> > > > ls -l /dev/infiniband/
> > > > total 0
> > > > crw-rw-rw-  1 root root 231,  64 Sep 22 15:18 issm0
> > > > crw-rw-rw-  1 root root 231,  65 Sep 22 15:18 issm1
> > > > crw-rw-rw-  1 root root 231, 254 Sep 22 22:47 uat
> > >
> > > uat is at 231/191.
> > >
> > > > crw-rw-rw-  1 root root 231, 255 Sep 20 22:31 ucm
> > >
> > > I don't think you need this.
> > >
> > > > crw-rw-rw-  1 root root 231, 255 Sep 26 20:01 ucm0
> > >
> > > ucm devices start at 231/224.
> >
> > If these changes do not fix your problem, please let us know.
> >
> > > -- Hal
> > >
> > > > crw-rw-rw-  1 root root 231,   0 Sep 22 15:18 umad0
> > > > crw-rw-rw-  1 root root 231,   1 Sep 22 15:18 umad1
> > > > crw-rw-rw-  1 root root 231, 192 Sep 20 22:30 uverbs0
> > > > crw-rw-rw-  1 root root 231, 193 Sep 20 22:30 uverbs1
> > > >
> > > > And the loaded modules:
> > > >
> > > > kdapl_ib               82000  0
> > > > kdapl                  14888  1 kdapl_ib
> > > > ib_uverbs              52064  0
> > > > ib_ipoib               65480  0
> > > > ib_ucm                 32624  0
> > > > ib_cm                  51944  2 kdapl_ib,ib_ucm
> > > > ib_uat                 22168  0
> > > > ib_at                  34840  2 kdapl_ib,ib_uat
> > > > ib_sa                  25328  2 ib_ipoib,ib_at
> > > > ib_mthca              160376  0
> > > > ib_mad                 61108  3 ib_cm,ib_sa,ib_mthca
> > > > ib_core                73888  8 kdapl_ib,ib_uverbs,ib_ipoib,ib_ucm,ib_cm,ib_sa,ib_mthca,ib_mad
> > > >
> > > > I am sure that I am missing something simple.  Can someone point me in
> > > > the right direction.
> > > >
> > > > Thanks,
> > > > Todd

I am having a different problem in ips_by_gid:

open_hca: Found dev mthca0 f42202c90200
 open_hca: GID subnet 80fe id f52202c90200
 ips_by_gid: ERR ips_by_gid -1 No such device
 open_hca: ERR ib_at_ips_by_gid for mthca0
dapls_ib_open_hca failed 4
dapl_ia_open () returns 0x4
DT_cs_Server: Could not open OpenIB-ib0 (DAT_INTERNAL_ERROR )

Thanks,
Todd

Re: [openib-general] uDAPL problem

2005-09-27 Thread Hal Rosenstock
On Tue, 2005-09-27 at 09:51, James Lentini wrote:
> On Mon, 26 Sep 2005, Hal Rosenstock wrote:
> 
> > On Mon, 2005-09-26 at 18:05, Todd Bowman wrote:
> > > I am having a problem with uDAPL accessing
> > > /dev/infiniband/{uat,ucm0}.  I am running 3549,  2.6.12 kernel with
> > > backport.  Here is a snippet of the uDAPL debug messages running
> > > dtest.  The dat.conf file seems to be correct, the correclty named
> > > providers are being loaded.
> > > 
> > > 26248 Running as server
> > > DAT Registry: dat_ia_openv (OpenIB-ib0,1:2,0) called
> > > DAT Registry: IA OpenIB-ib0, trying to load library
> > > /usr/local/lib/libdapl.so
> > > libuat: Error <-1:6> couldn't open IB at device 
> > > libibcm: error <-1:6> opening device 
> 
> This means that the /dev entried are not setup correctly.

Correct. He set this up manually. Todd wrote:
"I am not running udev but manually create uat and ucm." 

> > > DAPL: NOT Setting Loopback
> > >  dapl_ib_init:
> > > DAT Registry: dat_registry_add_provider (OpenIB-ib0,1:2,0)
> > > dapl_ia_open (OpenIB-ib0, 8, 0x10019d40, 0x10019cc0)
> > >  open_hca: mthca0 - 0x1001fdb0
> > >  open_hca: Found dev mthca0 f42202c90200
> > >  open_hca: GID subnet 80fe id f52202c90200
> > 
> > These look like they need to be endianized to me.
> 
> This looks like a bug in the way we print these values out, but I 
> don't think it is the real problem. 

Right, it's just a cosmetic issue with the display.

-- Hal

> What architecture are you using?
> 
> > >  ips_by_gid: ERR ips_by_gid -1 Bad file descriptor
> > >  open_hca: ERR ib_at_ips_by_gid for mthca0
> > > dapls_ib_open_hca failed 4
> > > dapl_ia_open () returns 0x4
> > > 26248: Error Adaptor open: DAT_INTERNAL_ERROR
> > > DAT Registry: Stopped (dat_fini)
> > > DAPL: Stopped (dapl_fini)
> > >  dapl_ib_release:
> > > 
> > > 
> > > I am not running udev but manually create uat and ucm.  Here is the
> > > list of /dev/infiniband:
> > > 
> > > ls -l /dev/infiniband/
> > > total 0
> > > crw-rw-rw-  1 root root 231,  64 Sep 22 15:18 issm0
> > > crw-rw-rw-  1 root root 231,  65 Sep 22 15:18 issm1
> > > crw-rw-rw-  1 root root 231, 254 Sep 22 22:47 uat
> > 
> > uat is at 231/191.
> > 
> > > crw-rw-rw-  1 root root 231, 255 Sep 20 22:31 ucm
> > 
> > I don't think you need this.
> > 
> > > crw-rw-rw-  1 root root 231, 255 Sep 26 20:01 ucm0
> > 
> > ucm devices start at 231/224.
> 
> If these changes do not fix you problem, please let us know.
> 
> > -- Hal
> > 
> > > crw-rw-rw-  1 root root 231,   0 Sep 22 15:18 umad0
> > > crw-rw-rw-  1 root root 231,   1 Sep 22 15:18 umad1
> > > crw-rw-rw-  1 root root 231, 192 Sep 20 22:30 uverbs0
> > > crw-rw-rw-  1 root root 231, 193 Sep 20 22:30 uverbs1
> > > 
> > > 
> > > And the loaded modules:
> > > 
> > > kdapl_ib   82000  0
> > > kdapl  14888  1 kdapl_ib
> > > ib_uverbs  52064  0
> > > ib_ipoib   65480  0
> > > ib_ucm 32624  0
> > > ib_cm  51944  2 kdapl_ib,ib_ucm
> > > ib_uat 22168  0
> > > ib_at  34840  2 kdapl_ib,ib_uat
> > > ib_sa  25328  2 ib_ipoib,ib_at
> > > ib_mthca  160376  0
> > > ib_mad 61108  3 ib_cm,ib_sa,ib_mthca
> > > ib_core73888  8
> > > kdapl_ib,ib_uverbs,ib_ipoib,ib_ucm,ib_cm,ib_sa,ib_mthca,ib_mad
> > > 
> > > 
> > > I am sure that I am missing something simple.  Can someone point me in
> > > the right direction.
> > > 
> > > Thanks,
> > > Todd



Re: [openib-general] uDAPL problem

2005-09-27 Thread James Lentini


On Mon, 26 Sep 2005, Hal Rosenstock wrote:

> On Mon, 2005-09-26 at 18:05, Todd Bowman wrote:
> > I am having a problem with uDAPL accessing
> > /dev/infiniband/{uat,ucm0}.  I am running 3549,  2.6.12 kernel with
> > backport.  Here is a snippet of the uDAPL debug messages running
> > dtest.  The dat.conf file seems to be correct, the correclty named
> > providers are being loaded.
> > 
> > 26248 Running as server
> > DAT Registry: dat_ia_openv (OpenIB-ib0,1:2,0) called
> > DAT Registry: IA OpenIB-ib0, trying to load library
> > /usr/local/lib/libdapl.so
> > libuat: Error <-1:6> couldn't open IB at device 
> > libibcm: error <-1:6> opening device 

This means that the /dev entries are not set up correctly.

> > DAPL: NOT Setting Loopback
> >  dapl_ib_init:
> > DAT Registry: dat_registry_add_provider (OpenIB-ib0,1:2,0)
> > dapl_ia_open (OpenIB-ib0, 8, 0x10019d40, 0x10019cc0)
> >  open_hca: mthca0 - 0x1001fdb0
> >  open_hca: Found dev mthca0 f42202c90200
> >  open_hca: GID subnet 80fe id f52202c90200
> 
> These look like they need to be endianized to me.

This looks like a bug in the way we print these values out, but I 
don't think it is the real problem. What architecture are you using?

> >  ips_by_gid: ERR ips_by_gid -1 Bad file descriptor
> >  open_hca: ERR ib_at_ips_by_gid for mthca0
> > dapls_ib_open_hca failed 4
> > dapl_ia_open () returns 0x4
> > 26248: Error Adaptor open: DAT_INTERNAL_ERROR
> > DAT Registry: Stopped (dat_fini)
> > DAPL: Stopped (dapl_fini)
> >  dapl_ib_release:
> > 
> > 
> > I am not running udev but manually create uat and ucm.  Here is the
> > list of /dev/infiniband:
> > 
> > ls -l /dev/infiniband/
> > total 0
> > crw-rw-rw-  1 root root 231,  64 Sep 22 15:18 issm0
> > crw-rw-rw-  1 root root 231,  65 Sep 22 15:18 issm1
> > crw-rw-rw-  1 root root 231, 254 Sep 22 22:47 uat
> 
> uat is at 231/191.
> 
> > crw-rw-rw-  1 root root 231, 255 Sep 20 22:31 ucm
> 
> I don't think you need this.
> 
> > crw-rw-rw-  1 root root 231, 255 Sep 26 20:01 ucm0
> 
> ucm devices start at 231/224.

If these changes do not fix your problem, please let us know.

> -- Hal
> 
> > crw-rw-rw-  1 root root 231,   0 Sep 22 15:18 umad0
> > crw-rw-rw-  1 root root 231,   1 Sep 22 15:18 umad1
> > crw-rw-rw-  1 root root 231, 192 Sep 20 22:30 uverbs0
> > crw-rw-rw-  1 root root 231, 193 Sep 20 22:30 uverbs1
> > 
> > 
> > And the loaded modules:
> > 
> > kdapl_ib   82000  0
> > kdapl  14888  1 kdapl_ib
> > ib_uverbs  52064  0
> > ib_ipoib   65480  0
> > ib_ucm 32624  0
> > ib_cm  51944  2 kdapl_ib,ib_ucm
> > ib_uat 22168  0
> > ib_at  34840  2 kdapl_ib,ib_uat
> > ib_sa  25328  2 ib_ipoib,ib_at
> > ib_mthca  160376  0
> > ib_mad 61108  3 ib_cm,ib_sa,ib_mthca
> > ib_core73888  8
> > kdapl_ib,ib_uverbs,ib_ipoib,ib_ucm,ib_cm,ib_sa,ib_mthca,ib_mad
> > 
> > 
> > I am sure that I am missing something simple.  Can someone point me in
> > the right direction.
> > 
> > Thanks,
> > Todd


Re: [openib-general] uDAPL problem

2005-09-26 Thread Hal Rosenstock
On Mon, 2005-09-26 at 18:05, Todd Bowman wrote:
> I am having a problem with uDAPL accessing
> /dev/infiniband/{uat,ucm0}.  I am running 3549,  2.6.12 kernel with
> backport.  Here is a snippet of the uDAPL debug messages running
> dtest.  The dat.conf file seems to be correct, the correclty named
> providers are being loaded.
> 
> 26248 Running as server
> DAT Registry: dat_ia_openv (OpenIB-ib0,1:2,0) called
> DAT Registry: IA OpenIB-ib0, trying to load library
> /usr/local/lib/libdapl.so
> libuat: Error <-1:6> couldn't open IB at device 
> libibcm: error <-1:6> opening device 
> DAPL: NOT Setting Loopback
>  dapl_ib_init:
> DAT Registry: dat_registry_add_provider (OpenIB-ib0,1:2,0)
> dapl_ia_open (OpenIB-ib0, 8, 0x10019d40, 0x10019cc0)
>  open_hca: mthca0 - 0x1001fdb0
>  open_hca: Found dev mthca0 f42202c90200
>  open_hca: GID subnet 80fe id f52202c90200

These look like they need to be endianized to me.

>  ips_by_gid: ERR ips_by_gid -1 Bad file descriptor
>  open_hca: ERR ib_at_ips_by_gid for mthca0
> dapls_ib_open_hca failed 4
> dapl_ia_open () returns 0x4
> 26248: Error Adaptor open: DAT_INTERNAL_ERROR
> DAT Registry: Stopped (dat_fini)
> DAPL: Stopped (dapl_fini)
>  dapl_ib_release:
> 
> 
> I am not running udev but manually create uat and ucm.  Here is the
> list of /dev/infiniband:
> 
> ls -l /dev/infiniband/
> total 0
> crw-rw-rw-  1 root root 231,  64 Sep 22 15:18 issm0
> crw-rw-rw-  1 root root 231,  65 Sep 22 15:18 issm1
> crw-rw-rw-  1 root root 231, 254 Sep 22 22:47 uat

uat is at 231/191.

> crw-rw-rw-  1 root root 231, 255 Sep 20 22:31 ucm

I don't think you need this.

> crw-rw-rw-  1 root root 231, 255 Sep 26 20:01 ucm0

ucm devices start at 231/224.

-- Hal
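
To make the comparison above concrete, here is a Python sketch that checks hand-created device nodes against the major/minor numbers given in this message (uat at 231/191, ucm devices starting at 231/224). Only those two values come from the thread; the `EXPECTED` table, the function names, and the decision to ignore all other nodes are choices made for this sketch.

```python
import os
import stat
from typing import Dict, List, Tuple

# Only the numbers stated in this message; other nodes are not checked.
EXPECTED: Dict[str, Tuple[int, int]] = {
    "uat": (231, 191),
    "ucm0": (231, 224),
}


def device_numbers(path: str) -> Tuple[int, int]:
    """Return (major, minor) for a character device node."""
    st = os.stat(path)
    if not stat.S_ISCHR(st.st_mode):
        raise ValueError(f"{path} is not a character device")
    return os.major(st.st_rdev), os.minor(st.st_rdev)


def wrong_nodes(found: Dict[str, Tuple[int, int]]) -> List[str]:
    """Names whose (major, minor) differ from the expected pair."""
    return sorted(
        name
        for name, numbers in found.items()
        if name in EXPECTED and numbers != EXPECTED[name]
    )


# Numbers copied from the ls -l listing quoted in this message.
listing = {"uat": (231, 254), "ucm0": (231, 255), "umad0": (231, 0)}
print(wrong_nodes(listing))  # ['uat', 'ucm0']
```

On a live system, `device_numbers("/dev/infiniband/uat")` would supply the pair to compare. A node flagged here would need to be recreated with the right numbers, for example with mknod.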

> crw-rw-rw-  1 root root 231,   0 Sep 22 15:18 umad0
> crw-rw-rw-  1 root root 231,   1 Sep 22 15:18 umad1
> crw-rw-rw-  1 root root 231, 192 Sep 20 22:30 uverbs0
> crw-rw-rw-  1 root root 231, 193 Sep 20 22:30 uverbs1
> 
> 
> And the loaded modules:
> 
> kdapl_ib   82000  0
> kdapl  14888  1 kdapl_ib
> ib_uverbs  52064  0
> ib_ipoib   65480  0
> ib_ucm 32624  0
> ib_cm  51944  2 kdapl_ib,ib_ucm
> ib_uat 22168  0
> ib_at  34840  2 kdapl_ib,ib_uat
> ib_sa  25328  2 ib_ipoib,ib_at
> ib_mthca  160376  0
> ib_mad 61108  3 ib_cm,ib_sa,ib_mthca
> ib_core73888  8
> kdapl_ib,ib_uverbs,ib_ipoib,ib_ucm,ib_cm,ib_sa,ib_mthca,ib_mad
> 
> 
> I am sure that I am missing something simple.  Can someone point me in
> the right direction.
> 
> Thanks,
> Todd


[openib-general] uDAPL problem

2005-09-26 Thread Todd Bowman
I am having a problem with uDAPL accessing
/dev/infiniband/{uat,ucm0}.  I am running revision 3549 on a 2.6.12
kernel with the backport.  Here is a snippet of the uDAPL debug
messages from running dtest.  The dat.conf file seems to be correct;
the correctly named providers are being loaded.

26248 Running as server
DAT Registry: dat_ia_openv (OpenIB-ib0,1:2,0) called
DAT Registry: IA OpenIB-ib0, trying to load library /usr/local/lib/libdapl.so
libuat: Error <-1:6> couldn't open IB at device 
libibcm: error <-1:6> opening device 
DAPL: NOT Setting Loopback
 dapl_ib_init:
DAT Registry: dat_registry_add_provider (OpenIB-ib0,1:2,0)
dapl_ia_open (OpenIB-ib0, 8, 0x10019d40, 0x10019cc0)
 open_hca: mthca0 - 0x1001fdb0
 open_hca: Found dev mthca0 f42202c90200
 open_hca: GID subnet 80fe id f52202c90200
 ips_by_gid: ERR ips_by_gid -1 Bad file descriptor
 open_hca: ERR ib_at_ips_by_gid for mthca0
dapls_ib_open_hca failed 4
dapl_ia_open () returns 0x4
26248: Error Adaptor open: DAT_INTERNAL_ERROR
DAT Registry: Stopped (dat_fini)
DAPL: Stopped (dapl_fini)
 dapl_ib_release:


I am not running udev but manually create uat and ucm.  Here is the list of /dev/infiniband:

ls -l /dev/infiniband/
total 0
crw-rw-rw-  1 root root 231,  64 Sep 22 15:18 issm0
crw-rw-rw-  1 root root 231,  65 Sep 22 15:18 issm1
crw-rw-rw-  1 root root 231, 254 Sep 22 22:47 uat
crw-rw-rw-  1 root root 231, 255 Sep 20 22:31 ucm
crw-rw-rw-  1 root root 231, 255 Sep 26 20:01 ucm0
crw-rw-rw-  1 root root 231,   0 Sep 22 15:18 umad0
crw-rw-rw-  1 root root 231,   1 Sep 22 15:18 umad1
crw-rw-rw-  1 root root 231, 192 Sep 20 22:30 uverbs0
crw-rw-rw-  1 root root 231, 193 Sep 20 22:30 uverbs1


And the loaded modules:

kdapl_ib               82000  0
kdapl                  14888  1 kdapl_ib
ib_uverbs              52064  0
ib_ipoib               65480  0
ib_ucm                 32624  0
ib_cm                  51944  2 kdapl_ib,ib_ucm
ib_uat                 22168  0
ib_at                  34840  2 kdapl_ib,ib_uat
ib_sa                  25328  2 ib_ipoib,ib_at
ib_mthca              160376  0
ib_mad                 61108  3 ib_cm,ib_sa,ib_mthca
ib_core                73888  8 kdapl_ib,ib_uverbs,ib_ipoib,ib_ucm,ib_cm,ib_sa,ib_mthca,ib_mad


I am sure that I am missing something simple.  Can someone point me in the right direction?

Thanks,
Todd