date:20070207

Re: [openib-general] [libmthca] deadlock while trying to destroy QP

2007-02-07 Thread Guy German

Roland Dreier wrote:
> I guess my first reaction is "don't do that."  Trying to do something
> as complex as destroying a QP from a signal handler seems very fragile
> to me, and I wouldn't consider ibv_destroy_qp() safe to call from a
> signal handler.

Fair enough.

Thanks,
Guy
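
A minimal sketch of the usual alternative to doing teardown inside the
handler: the handler only sets a flag, and the main loop notices the flag
and performs the teardown from normal context. All example_* names are
hypothetical and are not part of libibverbs or libmthca; the comment marks
where a call such as ibv_destroy_qp() would go.

```c
#include <signal.h>
#include <stddef.h>

/* Set by the signal handler, polled from normal (non-signal) context. */
static volatile sig_atomic_t example_stop_requested = 0;

/* Async-signal-safe: it only stores to a sig_atomic_t flag. */
void example_on_signal(int signo)
{
    (void)signo;
    example_stop_requested = 1;
}

/* Install the handler for signo; returns 0 on success, -1 on failure. */
int example_install_handler(int signo)
{
    return signal(signo, example_on_signal) == SIG_ERR ? -1 : 0;
}

/*
 * Polled by the main loop. When this returns nonzero, the loop can
 * leave and do the real cleanup (for example ibv_destroy_qp()) outside
 * the handler.
 */
int example_should_tear_down(void)
{
    return example_stop_requested != 0;
}
```

The point of the design is that nothing that might take a lock runs in
signal context; the handler cannot deadlock against code it interrupted.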

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

[openib-general] ofa_1_2_kernel 20070207-0200 daily build status

2007-02-07 Thread vlad

This email was generated automatically, please do not reply


Common build parameters:  --with-ipoib-mod --with-sdp-mod --with-srp-mod 
--with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-core-mod 
--with-addr_trans-mod --with-cxgb3-mod 

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.18
Passed on powerpc with linux-2.6.19
Passed on x86_64 with linux-2.6.19
Passed on powerpc with linux-2.6.18
Passed on powerpc with linux-2.6.17
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.18
Passed on x86_64 with linux-2.6.12
Passed on ppc64 with linux-2.6.12
Passed on x86_64 with linux-2.6.17
Passed on x86_64 with linux-2.6.15
Passed on ppc64 with linux-2.6.18
Passed on x86_64 with linux-2.6.13
Passed on ppc64 with linux-2.6.19
Passed on x86_64 with linux-2.6.14
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.16
Passed on powerpc with linux-2.6.15
Passed on powerpc with linux-2.6.12
Passed on ia64 with linux-2.6.13
Passed on powerpc with linux-2.6.14
Passed on powerpc with linux-2.6.13
Passed on ppc64 with linux-2.6.13
Passed on powerpc with linux-2.6.16
Passed on ppc64 with linux-2.6.14
Passed on ppc64 with linux-2.6.17
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.16
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.15

Failed:
Build failed on ia64 with linux-2.6.16.21-0.8-default
Log:
/home/vlad/tmp/ofa_1_2_kernel-20070207-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:380:
 error: implicit declaration of function ‘register_netevent_notifier’
/home/vlad/tmp/ofa_1_2_kernel-20070207-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:
 In function ‘addr_cleanup’:
/home/vlad/tmp/ofa_1_2_kernel-20070207-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.c:386:
 error: implicit declaration of function ‘unregister_netevent_notifier’
make[4]: *** 
[/home/vlad/tmp/ofa_1_2_kernel-20070207-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core/addr.o]
 Error 1
make[3]: *** 
[/home/vlad/tmp/ofa_1_2_kernel-20070207-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband/core]
 Error 2
make[2]: *** 
[/home/vlad/tmp/ofa_1_2_kernel-20070207-0200_linux-2.6.16.21-0.8-default_ia64_check/drivers/infiniband]
 Error 2
make[1]: *** 
[_module_/home/vlad/tmp/ofa_1_2_kernel-20070207-0200_linux-2.6.16.21-0.8-default_ia64_check]
 Error 2
make[1]: Leaving directory 
`/home/vlad/kernel.org/ia64/linux-2.6.16.21-0.8-default'
make: *** [kernel] Error 2
--


[openib-general] Problem with SRP with 512 byte sector size with > 2 TB LUNs

2007-02-07 Thread Thomas Großmann

Hello,

We have a disk-array connected over a Mellanox MT25204 IB
card. We have configured LUNs with a size of over 2 TB with
512 byte sector size and are using OpenIB 1.1 and SUSE SLES 10 x86_64. 
I get the following output in /var/log/messages when adding a LUN:

Feb  2 09:59:57 data1 kernel:   Vendor: DDN   Model: S2A 9550  
Rev: 3.03
Feb  2 09:59:57 data1 kernel:   Type:   Direct-Access  
ANSI SCSI revision: 06
Feb  2 09:59:57 data1 kernel: sdc : very big device. try to use READ 
CAPACITY(16).
Feb  2 09:59:57 data1 kernel: sdc : READ CAPACITY(16) failed.
Feb  2 09:59:57 data1 kernel: sdc : status=0, message=00, host=5, driver=00
Feb  2 09:59:57 data1 kernel: sdc : use 0x as device size
Feb  2 09:59:57 data1 kernel: SCSI device sdc: 4294967296 512-byte hdwr 
sectors (2199023 MB)
Feb  2 09:59:57 data1 kernel: sdc: Write Protect is off
Feb  2 09:59:57 data1 kernel: sdc: Mode Sense: 97 00 10 08
Feb  2 09:59:57 data1 kernel: SCSI device sdc: drive cache: write back w/ FUA
Feb  2 09:59:57 data1 kernel: sdc : very big device. try to use READ 
CAPACITY(16).
Feb  2 09:59:57 data1 kernel: sdc : READ CAPACITY(16) failed.
Feb  2 09:59:57 data1 kernel: sdc : status=0, message=00, host=5, driver=00
Feb  2 09:59:57 data1 kernel: sdc : use 0x as device size
Feb  2 09:59:57 data1 kernel: SCSI device sdc: 4294967296 512-byte hdwr 
sectors (2199023 MB)
Feb  2 09:59:57 data1 kernel: sdc: Write Protect is off
Feb  2 09:59:57 data1 kernel: sdc: Mode Sense: 97 00 10 08
Feb  2 09:59:57 data1 kernel: SCSI device sdc: drive cache: write back w/ FUA
Feb  2 09:59:57 data1 kernel:  sdc: unknown partition table
Feb  2 09:59:57 data1 kernel: sd 8:0:0:0: Attached scsi disk sdc
Feb  2 09:59:57 data1 kernel: sd 8:0:0:0: Attached scsi generic sg2 type 0

I found in the Changelog of kernel 2.6.20 the following instruction:
target_host->max_cmd_len = sizeof ((struct srp_cmd *) (void *) 0L)->cdb;
(added to the function srp_create_target to achieve READ CAPACITY(16) )
and added it to the ib_srp module of OpenIB 1.1. 
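
To illustrate where the 2 TB limit comes from and what the quoted line
computes, here is a small user-space sketch. It assumes a simplified
stand-in struct (example_srp_cmd) with a 16-byte cdb field; the real
struct srp_cmd in the kernel has more fields, and all example_* names are
hypothetical. READ CAPACITY(10) returns a 32-bit last-LBA value, so it
cannot describe more than 2^32 sectors, and READ CAPACITY(16) needs a
16-byte CDB.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the SRP command layout.
 * Only the 16-byte cdb field matters for this sketch. */
struct example_srp_cmd {
    uint8_t opcode;
    uint8_t cdb[16];
};

/* Same idiom as the quoted line: take sizeof a member through a null
 * pointer. sizeof does not evaluate its operand, so nothing is
 * dereferenced at run time. */
size_t example_max_cmd_len(void)
{
    return sizeof(((struct example_srp_cmd *)0)->cdb);
}

/* READ CAPACITY(10) carries a 32-bit last-LBA field; a value of
 * 0xffffffff signals that READ CAPACITY(16) is needed. */
int example_needs_read_capacity_16(uint32_t last_lba_rc10)
{
    return last_lba_rc10 == 0xffffffffu;
}

/* Device size in bytes for a given sector count and sector size. */
uint64_t example_capacity_bytes(uint64_t sectors, uint32_t sector_size)
{
    return sectors * (uint64_t)sector_size;
}
```

With 4294967296 sectors of 512 bytes, example_capacity_bytes() gives
2199023255552 bytes, which matches the "2199023 MB" in the first log.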

The output was then:
Feb  5 17:53:07 data1 kernel:   Vendor: DDN   Model: S2A 9550  
Rev: 3.03
Feb  5 17:53:07 data1 kernel:   Type:   Direct-Access  
ANSI SCSI revision: 06
Feb  5 17:53:07 data1 kernel: sdc : very big device. try to use READ 
CAPACITY(16).
Feb  5 17:53:07 data1 kernel: sdc : sector size 0 reported, assuming 512.
Feb  5 17:53:07 data1 kernel: SCSI device sdc: 1 512-byte hdwr sectors (0 MB)
Feb  5 17:53:07 data1 kernel: sdc: Write Protect is off
Feb  5 17:53:07 data1 kernel: sdc: Mode Sense: 97 00 10 08
Feb  5 17:53:07 data1 kernel: SCSI device sdc: drive cache: write back w/ FUA
Feb  5 17:53:07 data1 kernel: sdc : very big device. try to use READ 
CAPACITY(16).
Feb  5 17:53:07 data1 kernel: sdc : sector size 0 reported, assuming 512.
Feb  5 17:53:07 data1 kernel: SCSI device sdc: 1 512-byte hdwr sectors (0 MB)
Feb  5 17:53:07 data1 kernel: sdc: Write Protect is off
Feb  5 17:53:07 data1 kernel: sdc: Mode Sense: 97 00 10 08
Feb  5 17:53:07 data1 kernel: SCSI device sdc: drive cache: write back w/ FUA
Feb  5 17:53:07 data1 kernel:  sdc: unknown partition table
Feb  5 17:53:07 data1 kernel: sd 9:0:0:0: Attached scsi disk sdc
Feb  5 17:53:07 data1 kernel: sd 9:0:0:0: Attached scsi generic sg2 type 0

The same output was shown when trying to add a LUN using kernel 2.6.20.

Is it possible to add LUNs with > 2 TB and 512 byte sectors ?
Why does the READ CAPACITY(16) command fail ?

Kind regards,
Thomas

-- 
 Thomas Großmann                
 High Performance Computing Center Stuttgart (HLRS)                             
         
  
 Allmandring 30                                                
 70550 Stuttgart, Germany   

 E-Mail: [EMAIL PROTECTED]                                                      
        
 Phone: ++49-711-685-65529
 Fax  : ++49-711-685-65832


Re: [openib-general] resolving sending mails from OFA new server

2007-02-07 Thread Michael S. Tsirkin

> Michael,
> 
> I put something together at [EMAIL PROTECTED]  I did not
> get a chance to try it out, so let me know if it's working out for you.
> Keywords used in the e-mail format come from the bugmail_help.html
> included w/ Bugzilla (it is posted at
> http://www.openfabrics.org/docs/bugmail_help.html).  
> 
> Michael

I just tried both and it worked flawlessly.
Thanks, very much!

Guys, you should try the email gateway, it is amazing,
especially for adding text to bugs: just put
[Bug XXX] in the mail subject.

Michael, one small request: could the messages that bugzilla
generates have From field as [EMAIL PROTECTED]
and not [EMAIL PROTECTED] as today?
This way I can add text to a bug just by replying to it.

Thanks,
MST
-- 
MST


Re: [openib-general] [PATCH] RDMA/iwcm: Bugs in cm_conn_req_handler()

2007-02-07 Thread Michael S. Tsirkin

-   set_bit(IWCM_F_CALLBACK_DESTROY, &cm_id_priv->flags);
-   destroy_cm_id(cm_id);
-   if (atomic_read(&cm_id_priv->refcount)==0)
-   kfree(cm_id);
+   BUG_ON(atomic_read(&cm_id_priv->refcount) != 1);
+   iw_cm_reject(cm_id, NULL, 0);
+   iw_destroy_cm_id(cm_id);

And BTW, lots of lines with atomic_read()==0 in them have broken whitespace
in iwcm.c. Does anyone care enough to fix them?

-- 
MST


Re: [openib-general] patches to 2.6.19.1 kernel for switch Operation

2007-02-07 Thread Hal Rosenstock

Suri,

On Mon, 2007-02-05 at 12:31, Suresh Shelvapille wrote:
> Hal:
> 
> We are upgrading to 2.6.19.1 kernel

Glad to hear this.

>  and I finally ported the changes
> required for Switch operation from my current kernel (2.6.12) version. 
> 
> I have tested these changes for a switch with different SM(s). But I need
> the community's help to test the changes on different HCAs to make sure I
> have not broken anything.
> 
> Please see if the changes look OK.

Have you tested these changes on end nodes (HCAs) ? If so, what tests
have you performed ?

It would be easier to comment if your changes were included inline
rather than as attachments.

Also, you should attach your S-O-B line.

Thanks.

-- Hal

> Thanks,
> Suri




Re: [openib-general] [PATCH] RDMA/iwcm: Bugs in cm_conn_req_handler()

2007-02-07 Thread Steve Wise

This looks good for 2.6.21 IMO.

Acked-by: Steve Wise <[EMAIL PROTECTED]>


On Wed, 2007-02-07 at 12:26 +0530, Krishna Kumar wrote:
> (I had submitted this once earlier but got no response)
> 
> cm_conn_req_handler() :
>   1. Calling destroy_cm_id leaks 3 work 'free' list entries.
>   2. cm_id is freed up wrongly and not cm_id_priv (though the
>  effect is the same since cm_id is the first element of
>  cm_id_priv, but still a bug if the top level cm_id changes).
>   3. Reject message has to be sent on failure. Tested this
>  without the fix and found the client hangs, waited for about
>  20 mins and then did Ctrl-C but the process is unkillable.
>   4. Setting IWCM_F_CALLBACK_DESTROY on cm_id (child handle)
>  doesn't achieve anything, since checking for
>  IWCM_F_CALLBACK_DESTROY in the parent's flag (in
>  cm_work_handler) means that this will never be true.
> 
> All 4 above cases were tested by injecting random error in
> iw_conn_req_handler() and running rdma_bw/krping, they were
> confirmed. I added the BUG_ON() to confirm the earlier check
> for id_priv->refcount==0 should always be true (and could be
> removed).
> 
> Patch against 2.6.20
> 
> Signed-off-by: Krishna Kumar <[EMAIL PROTECTED]>
> ---
> diff -ruNp org/drivers/infiniband/core/iwcm.c new/drivers/infiniband/core/iwcm.c
> --- org/drivers/infiniband/core/iwcm.c 2007-01-24 10:25:26.0 +0530
> +++ new/drivers/infiniband/core/iwcm.c 2007-01-24 10:25:31.0 +0530
> @@ -647,10 +647,9 @@ static void cm_conn_req_handler(struct i
>   /* Call the client CM handler */
>   ret = cm_id->cm_handler(cm_id, iw_event);
>   if (ret) {
> - set_bit(IWCM_F_CALLBACK_DESTROY, &cm_id_priv->flags);
> - destroy_cm_id(cm_id);
> - if (atomic_read(&cm_id_priv->refcount)==0)
> - kfree(cm_id);
> + BUG_ON(atomic_read(&cm_id_priv->refcount) != 1);
> + iw_cm_reject(cm_id, NULL, 0);
> + iw_destroy_cm_id(cm_id);
>   }
>  
>  out:
> 
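
Point 2 in the list above (freeing cm_id instead of cm_id_priv) can be
illustrated with a small user-space sketch. The structs and example_*
names are hypothetical stand-ins, not the real iwcm types. It shows why
freeing the embedded pointer happens to work only while the public struct
is the first member, and how converting back to the container first
removes that dependency.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical public handle, embedded in a private container. */
struct example_cm_id {
    int state;
};

struct example_cm_id_private {
    struct example_cm_id id;   /* first member, so its offset is 0 */
    int refcount;
};

/* Recover the container from a pointer to its embedded member. */
struct example_cm_id_private *example_to_priv(struct example_cm_id *id)
{
    return (struct example_cm_id_private *)
        ((char *)id - offsetof(struct example_cm_id_private, id));
}

/* Allocate the container, hand out only the embedded public part. */
struct example_cm_id *example_alloc_id(void)
{
    struct example_cm_id_private *priv = calloc(1, sizeof(*priv));

    return priv ? &priv->id : NULL;
}

/* Free through the container pointer, never through the member. This
 * stays correct even if 'id' stops being the first member. */
void example_free_id(struct example_cm_id *id)
{
    free(example_to_priv(id));
}
```

Here the two pointers compare equal only because id sits at offset 0;
example_free_id() does not rely on that.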



Re: [openib-general] Unknown SMP Recv

2007-02-07 Thread Michael Arndt

Hi,

> This sounds like the good response and appears to traverse your 3 nodes.

Yes, that's right

> This is the bogus extra response. Since your sender node is unmodified,
> it is unlikely an issue there. It seems like the intermediate node might
> be responding and forwarding the packet on although it should only do
> one of those two things. You did mention the SMI on the intermediate
> node was modified, right ? Also, note that the SMI is not validated and
> has some known issues for switches (e.g. intermediate hops).

The sender and the responder are unmodified (node1, node3). I have debugged 
the whole SMI, ib_mad_recv_done_handler and handle_outgoing_dr_smp functions 
and did not find the bogus extra response. According to my debugging, the 
responder sends one packet, which is correct, and the intermediate node does 
not receive a bogus extra packet. So the extra packet did not pass through 
the SMI, that's for sure. I use libibumad to implement the forwarding 
mechanism and the select function to catch any receive I should handle. Maybe 
something is wrong there.

Thanks Michael Arndt 



Re: [openib-general] [PATCH] RDMA/iwcm: Bugs in cm_conn_req_handler()

2007-02-07 Thread Tom Tucker

On Wed, 2007-02-07 at 08:24 -0600, Steve Wise wrote:
> This looks good for 2.6.21 IMO.
> 
> Acked-by: Steve Wise <[EMAIL PROTECTED]>
> 
> 
> On Wed, 2007-02-07 at 12:26 +0530, Krishna Kumar wrote:
> > (I had submitted this once earlier but got no response)


> > 
> > cm_conn_req_handler() :
> > 1. Calling destroy_cm_id leaks 3 work 'free' list entries.

When dealloc_work_entries was added to the iw_destroy_cm_id function, it
needed ALSO to be added everywhere destroy_cm_id was called. So you need
to call dealloc_work_entries everywhere you call destroy_cm_id or this
leak remains all over the place, e.g. cm_work_handler

> > 2. cm_id is freed up wrongly and not cm_id_priv (though the
> >effect is the same since cm_id is the first element of
> >cm_id_priv, but still a bug if the top level cm_id changes).
> > 3. Reject message has to be sent on failure. Tested this
> >without the fix and found the client hangs, waited for about
> >20 mins and then did Ctrl-C but the process is unkillable.

This should be added to the switch statement in destroy_cm_id (not here)
so that it doesn't need to be added everywhere the cm_id is destroyed
when it's in a state that requires a reject.

> > 4. Setting IWCM_F_CALLBACK_DESTROY on cm_id (child handle)
> >doesn't achieve anything, since checking for
> >IWCM_F_CALLBACK_DESTROY in the parent's flag (in
> >cm_work_handler) means that this will never be true.

destroy_cm_id exists to allow cm_id to be destroyed without waiting. If
you're changing it to iw_destroy_cm_id, that may be fine, but all the
setbit/getbit stuff is a side show.  You must be certain that
iw_destroy_cm_id can't wait. If it does, you'll shut down the entire
IWCM.
 
> > 
> > All 4 above cases were tested by injecting random error in
> > iw_conn_req_handler() and running rdma_bw/krping, they were
> > confirmed. I added the BUG_ON() to confirm the earlier check
> > for id_priv->refcount==0 should always be true (and could be
> > removed).
> > 
> > Patch against 2.6.20
> > 
> > Signed-off-by: Krishna Kumar <[EMAIL PROTECTED]>
> > ---
> > diff -ruNp org/drivers/infiniband/core/iwcm.c new/drivers/infiniband/core/iwcm.c
> > --- org/drivers/infiniband/core/iwcm.c  2007-01-24 10:25:26.0 +0530
> > +++ new/drivers/infiniband/core/iwcm.c  2007-01-24 10:25:31.0 +0530
> > @@ -647,10 +647,9 @@ static void cm_conn_req_handler(struct i
> > /* Call the client CM handler */
> > ret = cm_id->cm_handler(cm_id, iw_event);
> > if (ret) {
> > -   set_bit(IWCM_F_CALLBACK_DESTROY, &cm_id_priv->flags);
> > -   destroy_cm_id(cm_id);
> > -   if (atomic_read(&cm_id_priv->refcount)==0)
> > -   kfree(cm_id);
> > +   BUG_ON(atomic_read(&cm_id_priv->refcount) != 1);
> > +   iw_cm_reject(cm_id, NULL, 0);
> > +   iw_destroy_cm_id(cm_id);
> > }
> >  
> >  out:
> > 



[openib-general] issues with compilation of ofed 1.2

2007-02-07 Thread Yosef Etigin



**
1. When compiling without ibutils I get the following error:


RPM build errors:
user vladsk does not exist - using root
group vladsk does not exist - using root
user vladsk does not exist - using root
group vladsk does not exist - using root
File not found by glob: /var/tmp/OFED/usr/local/ofed/man/man1/ibv_*
File not found by glob: /var/tmp/OFED/usr/local/ofed/man/man8/opensm*
File not found by glob: /var/tmp/OFED/usr/local/ofed/man/man8/osmtest*
ERROR: Failed executing "rpmbuild --rebuild --define '_topdir /var/tmp/OFEDRPM' 
--define '_prefix /usr/local/ofed ' --define 'build_root /var/tmp/OFED ' 
--define 'configure_options --with-ipoibtools --with-libcxgb3 --with-libibcm 
--with-libibcommon --with-libibmad --with-libibumad --with-libibverbs 
--with-libmthca --with-opensm --with-librdmacm --with-libsdp --with-sdpnetstat 
--with-mstflint --with-perftest --mandir=/usr/local/ofed /man' --define 
'configure_options32 %{nil}' --define 'build_32bit 0' 
/tmp/regtest/OFED-1.2-20070205-1823/SRPMS/ofa_user-1.2-alpha1.src.rpm"

**
2. After adding ibutils, compilation passes on RH4 (U4 and U3).
However, when executing an application that uses libibverbs, I get this error:

libibverbs: Warning: couldn't open config directory 
'/usr/local/ofed/etc/libibverbs.d'.
libibverbs: Warning: no userspace device-specific driver found for 
/sys/class/infiniband_verbs/uverbs0
No IB devices found

Workaround: copy libibverbs.d from an installation of the ofed 1.2 daily build 
packages to /usr/local/ofed/etc/


**
3. Uninstall script does not always successfully remove libcxgb3 package



**
4. When compiling on SLES10 I get this error:

MTHOME directory /var/tmp/OFED/usr/local/ofed does not exist.
Exiting.
error: Bad exit status from /var/tmp/rpm-tmp.37387 (%build)


RPM build errors:
user rowland does not exist - using root
group mvapich does not exist - using root
user rowland does not exist - using root
group mvapich does not exist - using root
Bad exit status from /var/tmp/rpm-tmp.37387 (%build)
ERROR: Failed executing "rpmbuild --rebuild --define '_topdir /var/tmp/OFEDRPM' 
--define '_name mvapich2_gcc' --define '_prefix 
/usr/local/ofed/mpi/gcc/mvapich2-0.9.8-1' --define 'build_root /var/tmp/OFED' 
--define 'open_ib_home /usr/local/ofed' --define 'ofed_build_root 
/var/tmp/OFED' --define 'comp_env CC=gcc CXX=g++ F77=gfortran' --define 'iwarp 
0' --define 'romio 1' --define 'shared_libs 1' --define 'auto_req 1' 
/tmp/OFED-1.2-20070205-1823/SRPMS/mvapich2-0.9.8-1.src.rpm"

**
5. When compiling on SLES10 SP1 I get this error:

In file included from /usr/src/linux-2.6.16.37-0.9/include/linux/inetdevice.h:7,
 from 
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/addr.c:32:
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/kernel_addons/backport/2.6.16_sles10/include/linux/netdevice.h:7:
 error: redefinition of ‘netif_tx_lock’
/usr/src/linux-2.6.16.37-0.9/include/linux/netdevice.h:927: error: previous 
definition of ‘netif_tx_lock’ was here
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/kernel_addons/backport/2.6.16_sles10/include/linux/netdevice.h:
 In function ‘netif_tx_lock’:
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/kernel_addons/backport/2.6.16_sles10/include/linux/netdevice.h:8:
 error: ‘struct net_device’ has no member named ‘xmit_lock’
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/kernel_addons/backport/2.6.16_sles10/include/linux/netdevice.h:
 At top level:
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/kernel_addons/backport/2.6.16_sles10/include/linux/netdevice.h:13:
 error: redefinition of ‘netif_tx_unlock’
/usr/src/linux-2.6.16.37-0.9/include/linux/netdevice.h:947: error: previous 
definition of ‘netif_tx_unlock’ was here
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/kernel_addons/backport/2.6.16_sles10/include/linux/netdevice.h:
 In function ‘netif_tx_unlock’:
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/kernel_addons/backport/2.6.16_sles10/include/linux/netdevice.h:15:
 error: ‘struct net_device’ has no member named ‘xmit_lock’
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/addr.c: At top 
level:
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/addr.c:61: 
warning: initialization from incompatible pointer type
make[6]: *** 
[/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/addr.o] Error 1



**
6. On [PPC64/Sles10] I get this compilation error:

make[2]: Entering directory 
`/var/tmp/OFEDRPM/BUILD/ofa_user-1.2/src/userspace/librdmacm'
if /bin/sh ./libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I. 
-I./include  -I../libibverbs/in

[openib-general] dapltest?

2007-02-07 Thread Steve Wise

Hey Arlin,

Shouldn't dapl/test be shipped with OFED?  It appears not to be...

Steve.







Re: [openib-general] [PATCH] IB/ipoib get net_device from ipoib_neigh instead of linux neighbour

2007-02-07 Thread Moni Shoua


> Another concern: assume that one device goes away (e.g. hotplug).
> It seems that neighbours whose dev field point to another device, will not be 
> destroyed.
> Correct?
I agree.
> 
> Therefore in your design, it seems that to_ipoib_neigh()->dev
> will get us a pointer to device that has been removed already.
> 
I agree that this is a problem. I think it would be best to prevent an IPoIB 
device
from disappearing, or ib_ipoib from being unloaded, as long as the IPoIB
device is a slave. Unfortunately, I don't see how this can be done just
by fixing something in bonding or IPoIB. 
However, any slave knows he has a master (dev->master). 
What do you think about a solution where IPoIB first tries to clean up the
neighbours that belong to its master before deleting the IPoIB device?

>> Furthermore, bond_setup_by_slave is called only for non
>> Ethernet devices (we are considering changing the logic to "called only
>> for IPoIB devices", just for safety).
> 
> Why is this necessary, BTW?
> 
If we don't do that, we get a memory leak because the neigh destructor will
never be called for non IPoIB devices although they carry ipoib_neigh
with them.




[openib-general] Open MPI rpmbuild fails in OFED-1.2

2007-02-07 Thread Vladimir Sokolovsky

Hi Jeff,
Please remove %build macro from the RPM spec file.
On SuSE distros it removes RPM_BUILD_ROOT.

Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.23343
+ umask 022
+ cd /var/tmp/OFEDRPM/BUILD
+ /bin/rm -rf /var/tmp/OFED
++ dirname /var/tmp/OFED
+ /bin/mkdir -p /var/tmp
+ /bin/mkdir /var/tmp/OFED
+ cd openmpi-1.2b4ofedr13470
+ fortify_source=1
+ test '' '!=' ''
...

-- 
Vladimir Sokolovsky <[EMAIL PROTECTED]>
Mellanox Technologies Ltd.


Re: [openib-general] Open MPI rpmbuild fails in OFED-1.2

2007-02-07 Thread Jeff Squyres

The "%build" directive is not just a macro, it's also a section  
qualifier indicating the beginning of the build section.  From

http://fedora.redhat.com/docs/drafts/rpm-guide-en/ch08s02.html#id2966770

"The build section starts with a %build statement."

Is there something else that I should replace it with that will also  
start the build section?



On Feb 7, 2007, at 11:42 AM, Vladimir Sokolovsky wrote:

> Hi Jeff,
> Please remove %build macro from the RPM spec file.
> On SuSE distros it removes RPM_BUILD_ROOT.
>
> Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.23343
> + umask 022
> + cd /var/tmp/OFEDRPM/BUILD
> + /bin/rm -rf /var/tmp/OFED
> ++ dirname /var/tmp/OFED
> + /bin/mkdir -p /var/tmp
> + /bin/mkdir /var/tmp/OFED
> + cd openmpi-1.2b4ofedr13470
> + fortify_source=1
> + test '' '!=' ''
> ...
>
> -- 
> Vladimir Sokolovsky <[EMAIL PROTECTED]>
> Mellanox Technologies Ltd.


-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems



Re: [openib-general] Open MPI rpmbuild fails in OFED-1.2

2007-02-07 Thread Vladimir Sokolovsky

I propose to replace %build by %install.
Otherwise %build removes /var/tmp/OFED (on SuSE) which includes all
installed libraries.

Regards,
Vladimir

On Wed, 2007-02-07 at 11:52 -0500, Jeff Squyres wrote:
> The "%build" directive is not just a macro, it's also a section  
> qualifier indicating the beginning of the build section.  From
> 
> http://fedora.redhat.com/docs/drafts/rpm-guide-en/ch08s02.html#id2966770
> 
> "The build section starts with a %build statement."
> 
> Is there something else that I should replace it with that will also  
> start the build section?
> 
> 
> 
> On Feb 7, 2007, at 11:42 AM, Vladimir Sokolovsky wrote:
> 
> > Hi Jeff,
> > Please remove %build macro from the RPM spec file.
> > On SuSE distros it removes RPM_BUILD_ROOT.
> >
> > Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.23343
> > + umask 022
> > + cd /var/tmp/OFEDRPM/BUILD
> > + /bin/rm -rf /var/tmp/OFED
> > ++ dirname /var/tmp/OFED
> > + /bin/mkdir -p /var/tmp
> > + /bin/mkdir /var/tmp/OFED
> > + cd openmpi-1.2b4ofedr13470
> > + fortify_source=1
> > + test '' '!=' ''
> > ...
> >
> > -- 
> > Vladimir Sokolovsky <[EMAIL PROTECTED]>
> > Mellanox Technologies Ltd.
> 
> 


Re: [openib-general] Problem with SRP with 512 byte sector size with > 2 TB LUNs

2007-02-07 Thread Roland Dreier

 > Is it possible to add LUNs with > 2 TB and 512 byte sectors ?
 > Why does the READ CAPACITY(16) command fail ?

It seems that the DDN target is not reporting good information -- I
don't see anything obviously wrong in what the kernel is doing (now
that SRP sends a READ CAPACITY command).  Do you know if the same type
of config works over fibre channel?

 - R.


Re: [openib-general] Open MPI rpmbuild fails in OFED-1.2

2007-02-07 Thread Jeff Squyres

My $0.02: This is another in a growing list of issues suggesting that the  
whole "build everything in DESTDIR" approach is problematic.

I have distinct %build and %install sections in the Open MPI specfile  
-- they're really intended for two different things.  Specifically: I  
wouldn't call the SuSE %build behavior a bug -- it reflects how they  
want RPM designers to write RPMs.  It appears that we're trying to  
circumvent their intended approach.  Shouldn't that be a warning  
flag?  :-)

I've heard offhand comments that there were problems with trying to  
use chroot for building OFED.  The two that I'm aware of are:

1. need to be root to make a chroot.
My thought: who cares?
2. takes up lots of extra disk space.
My thought: does it matter?  Do we know of anyone who has small-disk  
servers who are building OFED? (and/or: can you hard-link files  
to make a chroot environment?  I don't know)

Are there other issues?  More specifically, which is going to be  
simpler: a) fixing the growing list of problems with the DESTDIR  
approach or b) switching to a chroot environment?

A simple search for "chroot" on freshmeat, for example, turns up a  
number of projects that can be used to help automate the creation of  
chroot environments.

Again -- this is all my $0.02.  Comments?

On Feb 7, 2007, at 12:00 PM, Vladimir Sokolovsky wrote:

> I propose to replace %build by %install.
> Otherwise %build removes /var/tmp/OFED (on SuSE) which includes all
> installed libraries.
>
> Regards,
> Vladimir
>
> On Wed, 2007-02-07 at 11:52 -0500, Jeff Squyres wrote:
>> The "%build" directive is not just a macro, it's also a section
>> qualifier indicating the beginning of the build section.  From
>>
>> http://fedora.redhat.com/docs/drafts/rpm-guide-en/ch08s02.html#id2966770
>>
>> "The build section starts with a %build statement."
>>
>> Is there something else that I should replace it with that will also
>> start the build section?
>>
>>
>>
>> On Feb 7, 2007, at 11:42 AM, Vladimir Sokolovsky wrote:
>>
>>> Hi Jeff,
>>> Please remove %build macro from the RPM spec file.
>>> On SuSE distros it removes RPM_BUILD_ROOT.
>>>
>>> Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.23343
>>> + umask 022
>>> + cd /var/tmp/OFEDRPM/BUILD
>>> + /bin/rm -rf /var/tmp/OFED
>>> ++ dirname /var/tmp/OFED
>>> + /bin/mkdir -p /var/tmp
>>> + /bin/mkdir /var/tmp/OFED
>>> + cd openmpi-1.2b4ofedr13470
>>> + fortify_source=1
>>> + test '' '!=' ''
>>> ...
>>>
>>> -- 
>>> Vladimir Sokolovsky <[EMAIL PROTECTED]>
>>> Mellanox Technologies Ltd.
>>
>>

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems


Re: [openib-general] [PATCH] IB/ipoib get net_device from ipoib_neigh instead of linux neighbour

2007-02-07 Thread Michael S. Tsirkin

> Quoting Moni Shoua <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] IB/ipoib get net_device from ipoib_neigh instead of 
> linux neighbour
> 
> 
> > Another concern: assume that one device goes away (e.g. hotplug).
> > It seems that neighbours whose dev field point to another device, will not 
> > be destroyed.
> > Correct?
>
> I agree.
>
> > Therefore in your design, it seems that to_ipoib_neigh()->dev
> > will get us a pointer to device that has been removed already.
> > 
> I agree that this is a problem.

I think we can solve this if we track all ipoib neighbours, like we do for old
kernels, and then flush ipoib neighbours on any hotplug event.
Roland, does this sound too awful?
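
One possible shape for that bookkeeping, as a userspace toy in C rather than
kernel code: every neighbour entry goes on one global list when it is created,
and a hotplug event walks the list and frees everything. The names toy_neigh,
toy_neigh_track and toy_neigh_flush_all are made up for this sketch; real code
would need locking and would presumably use the kernel's list helpers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ipoib_neigh; not the driver's type. */
struct toy_neigh {
	int dev_id;              /* which device the entry was created for */
	struct toy_neigh *next;  /* link in the global tracking list */
};

static struct toy_neigh *toy_neigh_list;

/* Record a new neighbour entry; returns 0 on success, -1 if malloc fails. */
int toy_neigh_track(int dev_id)
{
	struct toy_neigh *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->dev_id = dev_id;
	n->next = toy_neigh_list;
	toy_neigh_list = n;
	return 0;
}

/*
 * Hotplug handler: drop every tracked entry, whatever device it points at.
 * Returns the number of entries freed.
 */
int toy_neigh_flush_all(void)
{
	int freed = 0;

	while (toy_neigh_list) {
		struct toy_neigh *n = toy_neigh_list;

		toy_neigh_list = n->next;
		free(n);
		freed++;
	}
	assert(toy_neigh_list == NULL);
	return freed;
}
```

Flushing everything is blunt, but it means no entry can outlive the device it
points at, which is the failure discussed above.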

> I think it would be best to prevent an IPoIB device
> from disappearing, or ib_ipoib from being unloaded, as long as the IPoIB
> device is a slave. Unfortunately, I don't see how this can be done just
> by fixing something in bonding or IPoIB. 

So hotplug is blocked potentially forever?
This does not sound good.

> However, any slave knows it has a master (dev->master). 
> What do you think about a solution where IPoIB first tries to clean up the
> neighbours that belong to its master before deleting the IPoIB device?

How?

> >> Furthermore, bond_setup_by_slave is called only for
> >> non-Ethernet devices (we are considering changing the logic to "called only
> >> for IPoIB devices", just for safety).
> > 
> > Why is this necessary, BTW?
> > 
> If we don't do that, we get a memory leak because the neigh destructor will
> never be called for non IPoIB devices although they carry ipoib_neigh
> with them.

How can this happen? If it does, I think we are back to where we started:
to_ipoib_neigh is broken for non-IPoIB device.
I thought you said only devices of the same type can be paired?


-- 
MST


Re: [openib-general] [PATCHv6 RFC] IPoIB CM Experimental support

2007-02-07 Thread Roland Dreier

 > Well, randomness is a resource after all, and since we don't have the
 > additional security provided by PSNs in IPoIB UD, it seemed we do not need
 > it for IPoIB CM either. So maybe the right thing is just to remove the
 > FIXME comment.

random32() doesn't use up any entropy. Random PSNs help avoid problems
with stale connections, so I think we should do it.
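
A minimal sketch of what picking a random starting PSN amounts to, written as
plain userspace C. The parameter r stands in for whatever random32() returns in
the kernel; psn_from_random and PSN_MASK are names invented for this example.
The only hard fact used is that an IB packet sequence number is 24 bits wide.

```c
#include <assert.h>
#include <stdint.h>

#define PSN_MASK 0xffffffu  /* IB packet sequence numbers are 24 bits wide */

/* r stands in for the value the kernel's random32() would return. */
uint32_t psn_from_random(uint32_t r)
{
	uint32_t psn = r & PSN_MASK;

	assert(psn <= PSN_MASK);
	return psn;
}
```

With a random start, a packet left over from an old connection is unlikely to
carry a PSN the new connection accepts.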

I noticed some funny code in ipoib_cm_skb_reap():

	__be32 mtu = cpu_to_be32(priv->mcast_mtu);

	// htonl(__be32)??
	icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
	// no htonl() here -- is this correct?
	icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu, dev);

what is the right thing?

 - R.


Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Sean Hefty

> Oops, I'll fix these style things and send a new patch.

Jason, what's the status of this patch?  (I ask because I'm starting to look at 
router support in the stack.)

- Sean


Re: [openib-general] [PATCHv6 RFC] IPoIB CM Experimental support

2007-02-07 Thread Michael S. Tsirkin

> Quoting Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCHv6 RFC] IPoIB CM Experimental support
> 
>  > Well, randomness is a resource after all, and since we don't have the
>  > additional security provided by PSNs in IPoIB UD, it seemed we do not
>  > need it for IPoIB CM either. So maybe the right thing is just to remove
>  > the FIXME comment.
> 
> random32() doesn't use up any entropy. Random PSNs help avoid problems
> with stale connections, so I think we should do it.

Well, stale connections don't pose any real problems for IPoIB CM - worst case a
connection is torn down and recreated.  But I don't have a strong opinion
anyway - that's why I put the FIXME there. So I'm OK with random32, too.

> I noticed some funny code in ipoib_cm_skb_reap():
> 
>   __be32 mtu = cpu_to_be32(priv->mcast_mtu);
> 
>   // htonl(__be32)??
>   icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
>   // no htonl() here -- is this correct?
>   icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu, dev);
> 
> what is the right thing?

Both are right I think.
These two functions seem to accept parameters in different format:

include/net/icmp.h:extern void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info);


include/linux/icmpv6.h:extern void icmpv6_send(struct sk_buff *skb,
include/linux/icmpv6.h-                        int type, int code,
include/linux/icmpv6.h-                        __u32 info,
include/linux/icmpv6.h-                        struct net_device *dev);

BTW, I just looked at ip_gre.c and it has the same code.

-- 
MST


Re: [openib-general] dapltest?

2007-02-07 Thread Arlin Davis

Steve Wise wrote:

>Hey Arlin,
>
>Shouldn't dapl/test be shipped with OFED?  It appears not to be...
>  
>

Yes,  I will try to get to this by next week at the latest. Can you add 
a bugzilla report to track against?

-arlin


Re: [openib-general] [PATCHv6 RFC] IPoIB CM Experimental support

2007-02-07 Thread Roland Dreier

 > > I noticed some funny code in ipoib_cm_skb_reap():
 > > 
 > >    __be32 mtu = cpu_to_be32(priv->mcast_mtu);
 > > 
 > >    // htonl(__be32)??
 > >    icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
 > >    // no htonl() here -- is this correct?
 > >    icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu, dev);
 > > 
 > > what is the right thing?
 > 
 > Both are right I think.

You're right -- the mistake is making mtu __be32 and preswapping it.
I'll fix it up in my tree.
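
To spell out the byte-order point, here is a small userspace sketch (not the
actual fix in the tree). It keeps the MTU in host order and converts only at
the call that wants network order, going by the two prototypes quoted in this
thread: icmp_send() takes a __be32 info, icmpv6_send() a plain __u32. The
helper names are invented for illustration, and htonl()/ntohl() from
<arpa/inet.h> stand in for the kernel's conversion helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>

/* icmp_send() declares info as __be32: convert host order to network order. */
uint32_t icmp_info_from_mtu(uint32_t mtu_host)
{
	return htonl(mtu_host);
}

/* icmpv6_send() declares info as __u32: pass the host-order value through. */
uint32_t icmpv6_info_from_mtu(uint32_t mtu_host)
{
	return mtu_host;
}

/* What the questioned code did for IPv4: swap, then swap again. */
uint32_t double_swapped(uint32_t mtu_host)
{
	return htonl(htonl(mtu_host));
}

void check_mtu_encoding(uint32_t mtu_host)
{
	/* Two swaps cancel, so the value ends up back in host order. */
	assert(double_swapped(mtu_host) == mtu_host);
	/* A single swap round-trips through ntohl() as expected. */
	assert(ntohl(icmp_info_from_mtu(mtu_host)) == mtu_host);
}
```

So with a preswapped __be32 mtu, the IPv4 call ends up with a host-order value
and the IPv6 call with a network-order one, the opposite of what each expects.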

 > These two functions seem to accept parameters in different format:
 > 
 > include/net/icmp.h:extern void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info);
 > 
 > 
 > include/linux/icmpv6.h:extern void icmpv6_send(struct sk_buff *skb,
 > include/linux/icmpv6.h-                        int type, int code,
 > include/linux/icmpv6.h-                        __u32 info,
 > include/linux/icmpv6.h-                        struct net_device *dev);
 > 
 > BTW, I just looked at ip_gre.c and it has the same code.

no, it leaves mtu as an int rather than swapping it.

 - R.


Re: [openib-general] dapltest?

2007-02-07 Thread Scott Weitzenkamp (sweitzen)

I opened bug 350, I would like dapltest (and any other useful dapl test
programs) too.

Scott 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Arlin Davis
> Sent: Wednesday, February 07, 2007 11:03 AM
> To: Steve Wise
> Cc: openib-general; Arlin Davis
> Subject: Re: [openib-general] dapltest?
> 
> Steve Wise wrote:
> 
> >Hey Arlin,
> >
> >Shouldn't dapl/test be shipped with OFED?  It appears not to be...
> >  
> >
> 
> Yes,  I will try to get to this by next week at the latest. 
> Can you add 
> a bugzilla report to track against?
> 
> -arlin
> 


Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Roland Dreier

 > I was going to resend it after Roland's earlier patch to clean up the 
 > ib_init_ah_from_path was accepted..

Sorry, I started having second thoughts about the part about changing
it to return void (it seems more sensible to check it in the other places
it's called).  But I'll look at that again soon.

 - R.


Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Jason Gunthorpe

On Wed, Feb 07, 2007 at 10:55:00AM -0800, Sean Hefty wrote:
> >Oops, I'll fix these style things and send a new patch.
> 
> Jason, what's the status of this patch?  (I ask because I'm starting to 
> look at router support in the stack.)

I was going to resend it after Roland's earlier patch to clean up
ib_init_ah_from_path was accepted.

I didn't get too far on getting CMA to work. Beyond the bad HopLimit
field I was seeing, Hal pointed out a number of problems in IBA that
would prevent it from working as is :<

Jason


Re: [openib-general] Immediate data question

2007-02-07 Thread Tang, Changqing


Roland:
This is a followup question. Suppose one process uses
IBV_WR_SEND_WITH_IMM and IBV_SEND_INLINE to send 8 bytes, but the
receiver process does not post the corresponding receive to the QP.
Instead, the receiver process and other processes are doing heavy
RDMA_WRITE/READ traffic with each other.

Does this pending SEND_WITH_IMM message affect the performance
of the receiver process? Is this message buffered in the receiver's
HCA, or does the sender retry and get an RNR ack until the receiver posts
a receive?

Thanks.

--CQ


> -Original Message-
> From: Roland Dreier [mailto:[EMAIL PROTECTED] 
> Sent: Monday, February 05, 2007 5:03 PM
> To: Tang, Changqing
> Cc: Michael S. Tsirkin; openib-general@openib.org
> Subject: Re: Immediate data question
> 
> Changqing> Thank you. Other than using immediate data to send
> Changqing> notification from one end to the other of a QP, is
> Changqing> there any other way to do this ? For example, can I
> Changqing> modify QP state from RTS to other state on one end, and
> Changqing> then the other end gets some notification when I query
> Changqing> the QP ?
> 
> Not that I know of.  You would need to do something that 
> triggers something to be sent on the wire, and I don't know 
> of any way to do that other than posting a work request.
> 
>  - R.
> 


Re: [openib-general] Open MPI rpmbuild fails in OFED-1.2

2007-02-07 Thread Michael S. Tsirkin

> Quoting Jeff Squyres <[EMAIL PROTECTED]>:
> Subject: Re: Open MPI rpmbuild fails in OFED-1.2
> 
> My $0.02: This is another in a growing list of issues showing that the
> whole "build everything in DESTDIR" approach is problematic.

I don't know much about RPM, and I am not exactly sure why
our source RPMs are so complicated.

However, with plain configure/make we are able to
build all openfabrics components within the build directory,
without any chroot tricks.

So let's not give up yet; IMO it is very nice to be able to build in
a standard environment, without being root.

Note that what is biting us here is mostly the large number of modules:
simple single-module packages don't have this problem - and this
is really a design decision we took.

-- 
MST


Re: [openib-general] [PATCHv6 RFC] IPoIB CM Experimental support

2007-02-07 Thread Michael S. Tsirkin

> Quoting Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCHv6 RFC] IPoIB CM Experimental support
> 
>  > > I noticed some funny code in ipoib_cm_skb_reap():
>  > > 
>  > >  __be32 mtu = cpu_to_be32(priv->mcast_mtu);
>  > > 
>  > > // htonl(__be32)??
>  > >  icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
>  > > // no htonl() here -- is this correct?
>  > >  icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu, dev);
>  > > 
>  > > what is the right thing?
>  > 
>  > Both are right I think.
> 
> You're right -- the mistake is making mtu __be32 and preswapping it.
> I'll fix it up in my tree.

Let me know when you push it out, I'll start testing it.

>  > These two functions seem to accept parameters in different format:
>  > 
>  > include/net/icmp.h:extern void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info);
>  > 
>  > 
>  > include/linux/icmpv6.h:extern void icmpv6_send(struct sk_buff *skb,
>  > include/linux/icmpv6.h-                        int type, int code,
>  > include/linux/icmpv6.h-                        __u32 info,
>  > include/linux/icmpv6.h-                        struct net_device *dev);
>  > 
>  > BTW, I just looked at ip_gre.c and it has the same code.
> 
> no, it leaves mtu as an int rather than swapping it.

You are right of course. sparse would have found it.

-- 
MST


[openib-general] dapl broken for iWARP

2007-02-07 Thread Steve Wise

Arlin,

The OFED dapl code is assuming the responder_resources and
initiator_depth passed up on a connection request event are from the
remote peer.  This doesn't happen for iWARP.  In the current iWARP
specifications, it's up to the application to exchange this information
somehow. So these are defaulting to 0 on the server side of any dapl
connection over iWARP.  

This is a fairly recent change, I think.  We need to come up with some
way to deal with this for OFED 1.2 IMO.



Steve.



Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Sean Hefty

> I didn't get too far on getting CMA to work. Beyond the bad HopLimit
> field I was seeing, Hal pointed out a number of problems in IBA that
> would prevent it from working as is :<

I've started thinking about what it would take to get the rdma cm to work
across a router.  I think the rdma cm may need to treat IPv6 addresses as a GID
for this to work across subnets, versus trying to map an ipoib IP address to a
GID based on ARP.

- Sean


[openib-general] RFC ofed 1 2 kernel file structure

2007-02-07 Thread Michael S. Tsirkin

Repost. Could everyone please look at
git://git.openfabrics.org/~mst/newofed.git
and tell me whether this looks acceptable?

Thanks,
MST

Quoting r. Michael S. Tsirkin <[EMAIL PROTECTED]>:
Subject: Re: idea for ofed 1 2 kernel file structure

> Quoting  Michael S. Tsirkin <[EMAIL PROTECTED]>:
> It would be easy to split OFED-specific files into a separate directory and have OFED
> 
> All out of tree modules we distribute would go there too.
> 
> What do others think about this?

OK, I didn't quite get whether the majority likes this or not,
so I created such a repository, extracted the ofed specific history
and imported it there.

Take a look here:
git://git.openfabrics.org/~mst/newofed.git

Build scripts will have to be adjusted to add
necessary kernel components that we use.

Another nice thing about this layout, is that users (if they so wish)
will be able to use just linux kernel source tarball instead of full linux
kernel git.

OFED maintainers, you are the primary users of the OFED git.
Please comment which layout is better for you.

-- 
MST


-- 
MST


Re: [openib-general] RFC ofed 1 2 kernel file structure

2007-02-07 Thread Sean Hefty

Michael S. Tsirkin wrote:
> Repost. Could everyone please look at
> git://git.openfabrics.org/~mst/newofed.git
> and tell me whether this looks acceptable?

I don't see anything listed for this off of the web site, and cloning it 
produces an empty tree.

- Sean


Re: [openib-general] Immediate data question

2007-02-07 Thread Tang, Changqing

> Changqing>Does this pending SEND_WITH_IMM message 
> affect the
> Changqing> performance of the receiver process ? Is this message
> Changqing> buffered in the receiver's HCA, or the sender retry and
> Changqing> get RNR ack until receiver posts a receive ?
> 
> If no receive is pending, then the responder sends an RNR NAK 
> and the sender will wait for the RNR timeout and retry, etc.

What I mean is: is there any penalty on the receiver's overall
performance if RNR happens continuously on one of the QPs?

--CQ


> 
>  - R.
> 


Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Sean Hefty

> Basically, if IB routers are used, and the IPoIB feature of *not*
> spanning a subnet is used (for scalability?) then you need an
> alternate way to specify addresses to rdma cm.

This was the case I was thinking of.  Without global IB name service resolution,
how do you get the GID of the remote system?

> I agree that special casing some IPv6 addresses is a bad idea. It
> needs to be integrated correctly with NET and the routing table/etc

I haven't given this more than a few minutes of thought, but I was thinking more
along the lines of a port having an assigned GID that's the same as an assigned
IPv6 address.  (Is there some reason this wouldn't work?)  IP name service
resolution would map the name to the IPv6 address.  The mapping from the IPv6
address to a GID would then be straightforward, as opposed to using a mapping
using ARP.

If name service resolution gives me an IPv6 address that's off of the local
subnet, but the ARP response gives me an address that's on the local subnet,
then I think we can assume that ARP was unsuccessful in resolving the address to
the remote GID.  (I.e. the GID should be for a router.)  If this is true, then
we need some other way to acquire the DGID.
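
As a rough illustration of the two ideas above, in userspace C: if a port's GID
is the same 128 bits as its IPv6 address, the mapping is a 16-byte copy, and
the "is this on my subnet" test compares the upper 64 bits (a GID being a
64-bit subnet prefix followed by a 64-bit GUID). struct toy_gid and both
function names are hypothetical; nothing here is rdma cm code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical 128-bit GID: 64-bit subnet prefix, then a 64-bit GUID. */
struct toy_gid {
	uint8_t raw[16];
};

/* The direct mapping: the IPv6 address bytes are used as the GID bytes. */
struct toy_gid gid_from_ipv6(const struct in6_addr *addr)
{
	struct toy_gid gid;

	assert(sizeof(gid.raw) == sizeof(addr->s6_addr));
	memcpy(gid.raw, addr->s6_addr, sizeof(gid.raw));
	return gid;
}

/* Two GIDs are on the same IB subnet when their upper 64 bits match. */
int gid_same_subnet(const struct toy_gid *a, const struct toy_gid *b)
{
	return memcmp(a->raw, b->raw, 8) == 0;
}
```

If the resolved IPv6 address maps to a GID that fails gid_same_subnet() against
the local port while ARP returned one that passes, that would be the mismatch
described above.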

- Sean


[openib-general] sharing qp between user and kernel

2007-02-07 Thread Pete Wyckoff

We're writing a kernel module that is an IB verbs consumer.  The
plan was to connect up the QP in userspace and do some preliminary
communication, then hand the QP to the kernel and let it use the QP
directly to do some more communication.  This works fine on ammasso,
but fails on mthca.

In particular, this code in mthca_alloc_wqe_buf():

	/*
	 * If this is a userspace QP, we don't actually have to
	 * allocate anything.  All we need is to calculate the WQE
	 * sizes and the send_wqe_offset, so we're done now.
	 */
	if (pd->ibpd.uobject)
		return 0;

prevents the allocation of space for WQEs required by
kernel-initiated posts.  Just commenting out this section led to
failures elsewhere (local prot error on a userspace cq poll for a
receive).

Before I dig into this anymore, do you expect this to work?  Are
there fundamental problems with QP sharing between user and kernel?
It would sure be nice not to have to stick the connection management
aspects into the kernel.

-- Pete


Re: [openib-general] dapl broken for iWARP

2007-02-07 Thread Steve Wise


On Wed, 2007-02-07 at 14:02 -0600, Steve Wise wrote:
> Arlin,
> 
> The OFED dapl code is assuming the responder_resources and
> initiator_depth passed up on a connection request event are from the
> remote peer.  This doesn't happen for iWARP.  In the current iWARP
> specifications, it's up to the application to exchange this information
> somehow. So these are defaulting to 0 on the server side of any dapl
> connection over iWARP.  
> 
> This is a fairly recent change, I think.  We need to come up with some
> way to deal with this for OFED 1.2 IMO.
> 

The IWCM could set these to the device max values for instance.

Steve.



Re: [openib-general] RFC ofed 1 2 kernel file structure

2007-02-07 Thread Bryan O'Sullivan

Michael S. Tsirkin wrote:

> All, pls try now.

This is similar in layout to the sort of tree we've used internally all 
along, so it's fine by me.  One small problem: I don't like the 
combination of lower and upper case names of makefile and Makefile in 
the top-level directory.

Also, it's no longer obvious to me how to tell what kernel version the 
sources are pulled from.  I used to be able to check the top-level 
Makefile or git history, but I no longer know what to look at.


Re: [openib-general] RFC ofed 1 2 kernel file structure

2007-02-07 Thread Hoang-Nam Nguyen

> I could clone it:
Should be "I could not clone it"



Re: [openib-general] RFC ofed 1 2 kernel file structure

2007-02-07 Thread Michael S. Tsirkin

> Quoting Sean Hefty <[EMAIL PROTECTED]>:
> Subject: Re: [openib-general] RFC ofed 1 2 kernel file structure
> 
> Michael S. Tsirkin wrote:
> > Repost. Could everyone please look at
> > git://git.openfabrics.org/~mst/newofed.git
> > and tell me whether this looks acceptable?
> 
> I don't see anything listed for this off of the web site, and cloning it 
> produces an empty tree.

Pls try again now.

-- 
MST


Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Jason Gunthorpe

On Wed, Feb 07, 2007 at 01:35:25PM -0800, Roland Dreier wrote:
> Hmm, why is that?  Shouldn't IPoIB work through a router, and
> correctly get the GID of the final destination via ARP just fine?

Basically, if IB routers are used, and the IPoIB feature of *not*
spanning a subnet is used (for scalability?) then you need an
alternate way to specify addresses to rdma cm.

I agree that special casing some IPv6 addresses is a bad idea. It
needs to be integrated correctly with NET and the routing table/etc

Jason


Re: [openib-general] RFC ofed 1 2 kernel file structure

2007-02-07 Thread Hoang-Nam Nguyen

Hi Michael,
> Repost. Could everyone please look at
> git://git.openfabrics.org/~mst/newofed.git
> and tell me whether this looks acceptable?
I could clone it:
$git clone git://git.openfabrics.org/~mst/newofed.git
fatal: Unable to look up git.openfabrics.org (Temporary failure in name
resolution)
fetch-pack from 'git://git.openfabrics.org/~mst/newofed.git' failed.
$git clone git://git.openfabrics.org/~mst/newofed.git
fatal: Unable to look up git.openfabrics.org (Temporary failure in name
resolution)
fetch-pack from 'git://git.openfabrics.org/~mst/newofed.git' failed.

I tried to use web git pointing to
http://www.openfabrics.org/git/?p=~mst/newofed.git;a=tree
and got this:
403 Forbidden - Reading tree failed

Is there something else I need to pay attention to?

Thanks
Nam



Re: [openib-general] Immediate data question

2007-02-07 Thread Roland Dreier

Changqing>  Does this pending SEND_WITH_IMM message affect the
Changqing> performance of the receiver process ? Is this message
Changqing> buffered in the receiver's HCA, or the sender retry and
Changqing> get RNR ack until receiver posts a receive ?

If no receive is pending, then the responder sends an RNR NAK and the
sender will wait for the RNR timeout and retry, etc.
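
A toy model of that retry loop, in plain C with no libibverbs involved, may
make the behaviour concrete. The function and its parameters are invented for
this sketch. The one thing borrowed from IB verbs is the convention that an
rnr_retry value of 7 means "retry forever"; the RNR timer wait is only noted
in a comment, not simulated.

```c
#include <assert.h>

#define RNR_RETRY_INFINITE 7  /* in IB verbs, rnr_retry == 7 means no limit */

/*
 * naks_before_recv: how many transmissions arrive before a receive is posted.
 * rnr_retry: retries the sender may make after RNR NAKs (0..7).
 * Returns the total number of transmissions, or -1 if the sender gives up.
 */
int toy_send_attempts(int naks_before_recv, int rnr_retry)
{
	int attempts = 0;

	assert(rnr_retry >= 0 && rnr_retry <= RNR_RETRY_INFINITE);
	for (;;) {
		attempts++;
		if (attempts > naks_before_recv)
			return attempts;  /* a receive was posted: delivered */
		/* RNR NAK: the sender waits for the RNR timer, then retries. */
		if (rnr_retry != RNR_RETRY_INFINITE && attempts - 1 >= rnr_retry)
			return -1;        /* retry budget exhausted */
	}
}
```

In this model nothing is buffered on the responder side; each NAKed attempt
costs the sender a timer wait and a retransmission.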

 - R.


Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Roland Dreier

 > I've started thinking about what it would take to get the rdma cm to
 > work across a router.  I think the rdma cm may need to treat IPv6
 > addresses as a GID for this to work across subnets, versus trying to
 > map an ipoib IP address to a GID based on ARP.

Hmm, why is that?  Shouldn't IPoIB work through a router, and
correctly get the GID of the final destination via ARP just fine?

If the RDMA CM treats IPv6 addresses as GIDs, then this breaks things
on a normal subnet with IPoIB interfaces configured with IPv6 addresses.

 - R.


Re: [openib-general] RFC ofed 1 2 kernel file structure

2007-02-07 Thread Michael S. Tsirkin

> Quoting r. Hoang-Nam Nguyen <[EMAIL PROTECTED]>:
> Subject: Re: [openib-general] RFC ofed 1 2 kernel file structure
> 
> Hi Michael,
> > Repost. Could everyone please look at
> > git://git.openfabrics.org/~mst/newofed.git
> > and tell me whether this looks acceptable?
> I could clone it:
> $git clone git://git.openfabrics.org/~mst/newofed.git
> fatal: Unable to look up git.openfabrics.org (Temporary failure in name
> resolution)
> fetch-pack from 'git://git.openfabrics.org/~mst/newofed.git' failed.
> $git clone git://git.openfabrics.org/~mst/newofed.git
> fatal: Unable to look up git.openfabrics.org (Temporary failure in name
> resolution)
> fetch-pack from 'git://git.openfabrics.org/~mst/newofed.git' failed.
> 
> I tried to use web git pointing to
> http://www.openfabrics.org/git/?p=~mst/newofed.git;a=tree
> and got this:
> 403 Forbidden - Reading tree failed
> 
> Is there something else I need to pay attention to?

Pls try again.


-- 
MST


Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Jason Gunthorpe

On Wed, Feb 07, 2007 at 12:24:08PM -0800, Sean Hefty wrote:
> >I didn't get too far on getting CMA to work. Beyond the bad HopLimit
> >field I was seeing, Hal pointed out a number of problems in IBA that
> >would prevent it from working as is :<
> 
> I've started thinking about what it would take to get the rdma cm to work 
> across a router.  I think the rdma cm may need to treat IPv6 addresses as a 
> GID for this to work across subnets, versus trying to map an ipoib IP 
> address to a GID based on ARP.

I don't think that is the main problem - though clearly the way things
are now (for better or worse) rdma cm requires the IPoIB subnet to
span all of the IB subnets.. The main problem with the protocol is in
the LID selection for routed paths on the passive side. It can't rely
on the active side to identify the lids if a router is involved.

One feature I've thought has been underused in IBA is the raw IPv6
packet feature. It would be nice to have a linux netdev interface to
be able to do IPv6 traffic using GID addressing. That would seem to me
to be the natural way to bolt native GID addressing into rdma
cm..

Jason


Re: [openib-general] sharing qp between user and kernel

2007-02-07 Thread Steve Wise

On Wed, 2007-02-07 at 17:31 -0500, Pete Wyckoff wrote:
> We're writing a kernel module that is an IB verbs consumer.  The
> plan was to connect up the QP in userspace and do some preliminary
> communication, then hand the QP to the kernel and let it use the QP
> directly to do some more communication.  This works fine on ammasso,
> but fails on mthca. 

I think the only reason it works on ammasso is because ammasso doesn't
do any kernel bypass.  

For devices that _do_ kernel bypass, I'm not sure it will work.  

It will _not_ work for the Chelsio iWARP device as it's implemented
today.  Once the decision is made to do kernel bypass, the kernel loses
track of the state of the resources shared by HW and library.

Steve.


Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Sean Hefty

> I don't think that is the main problem - though clearly the way things
> are now (for better or worse) rdma cm requires the IPoIB subnet to
> span all of the IB subnets.. The main problem with the protocol is in
> the LID selection for routed paths on the passive side. It can't rely
> on the active side to identify the lids if a router is involved.

Are you referring to the SLID in the CM REQ?  If so, I've been looking at this 
issue as well.  I simply cannot think of any way to come up with this LID, and 
my current solution is to punt this problem over to the passive side, which 
could use the SLID of the router that the CM REQ is received from.  If not, 
well, then I just rambled more than usual.

- Sean


Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Jason Gunthorpe

On Wed, Feb 07, 2007 at 02:40:51PM -0800, Sean Hefty wrote:
> Are you referring to the SLID in the CM REQ?  If so, I've been looking at 
> this issue as well.  I simply cannot think of any way to come up with this 
> LID, and my current solution is to punt this problem over to the passive 
> side, which could use the SLID of the router that the CM REQ is received 
> from.  If not, well, then I just rambled more than usual.

Yes, this is the problem.

The active side clearly cannot learn what the SLID of the passive
side's router should be.

We don't want to have the routers snoop and alter CM GMPs.

The passive side cannot use information from the LRH to get the router
LID since the LRH may not be reversible.

The only option seems to be to have the passive side do a path record
query on a SGID in the CM REQ...

This is a spec problem unfortunately.

Jason


Re: [openib-general] dapl broken for iWARP

2007-02-07 Thread Arlin Davis

Steve Wise wrote:

>On Wed, 2007-02-07 at 14:02 -0600, Steve Wise wrote:
>  
>
>>Arlin,
>>
>>The OFED dapl code is assuming the responder_resources and
>>initiator_depth passed up on a connection request event are from the
>>remote peer.  This doesn't happen for iWARP.  In the current iWARP
>>specifications, it's up to the application to exchange this information
>>somehow. So these are defaulting to 0 on the server side of any dapl
>>connection over iWARP.  
>>
>>This is a fairly recent change, I think.  We need to come up with some
>>way to deal with this for OFED 1.2 IMO.
>>
>>
Yes, this was changed recently to sync up with the rdma_cm changes that 
exposed the values.

>>
>>
>
>The IWCM could set these to the device max values for instance.
>  
>
That would work fine as long as you know the remote settings will be 
equal or better. The provider just sets the min of local device max 
values and the remote values provided with the request.
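
To make the clamping rule above concrete, here is a minimal sketch. It is not
the DAPL or rdma_cm code: struct rdma_depths and clamp_to_device() are
hypothetical names invented for illustration. It only shows "take the min of
the local device max and the value that arrived with the request".

```c
#include <stdint.h>

/* Hypothetical container for the two values; not a DAPL or rdma_cm type. */
struct rdma_depths {
    uint8_t responder_resources;
    uint8_t initiator_depth;
};

static uint8_t min_u8(uint8_t a, uint8_t b)
{
    return a < b ? a : b;
}

/* Clamp the values that came with the request to the local device maximums. */
struct rdma_depths clamp_to_device(struct rdma_depths remote,
                                   struct rdma_depths device_max)
{
    struct rdma_depths out;

    out.responder_resources = min_u8(remote.responder_resources,
                                     device_max.responder_resources);
    out.initiator_depth = min_u8(remote.initiator_depth,
                                 device_max.initiator_depth);
    return out;
}
```

Note that if the request carries zeros, as described for iWARP in this thread,
the min is zero no matter what the device supports, which is why the values on
the server side end up as 0.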

-arlin


Re: [openib-general] dapl broken for iWARP

2007-02-07 Thread Steve Wise

On Wed, 2007-02-07 at 15:05 -0800, Arlin Davis wrote:
> Steve Wise wrote:
> 
> >On Wed, 2007-02-07 at 14:02 -0600, Steve Wise wrote:
> >  
> >
> >>Arlin,
> >>
> >>The OFED dapl code is assuming the responder_resources and
> >>initiator_depth passed up on a connection request event are from the
> >>remote peer.  This doesn't happen for iWARP.  In the current iWARP
> >>specifications, it's up to the application to exchange this information
> >>somehow. So these are defaulting to 0 on the server side of any dapl
> >>connection over iWARP.  
> >>
> >>This is a fairly recent change, I think.  We need to come up with some
> >>way to deal with this for OFED 1.2 IMO.
> >>
> >>
> Yes, this was changed recently to sync up with the rdma_cm changes that 
> exposed the values.
> 
> >>
> >>
> >
> >The IWCM could set these to the device max values for instance.
> >  
> >
> That would work fine as long as you know the remote settings will be 
> equal or better. The provider just sets the min of local device max 
> values and the remote values provided with the request.
> 

I know Krishna Kumar is working on a solution for exchanging this info
in private data so the IWCM can "do the right thing".  Stay tuned for a
patch series to review for this.  But this functionality is definitely
post OFED-1.2.  


So for OFED-1.2, I will set these to the device max in the IWCM.
Assuming the other side is OFED 1.2 DAPL, then it will work fine.

Steve.




Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Jason Gunthorpe

On Wed, Feb 07, 2007 at 02:31:10PM -0800, Sean Hefty wrote:

> >I agree that special casing some IPv6 addresses is a bad idea. It
> >needs to be integrated correctly with NET and the routing table/etc

> I haven't given this more than a few minutes of thought, but I was thinking 
> more along the lines of a port having an assigned GID that's the same as an
> assigned IPv6 address.  (Is there some reason this wouldn't work?)  IP name 
> service resolution would map the name to the IPv6 address.  The mapping 
> from the IPv6 address to a GID would then be straightforward, as opposed to 
> using a mapping using ARP.

Right, I also like the idea of using DNS as a global GID name
service.

> If name service resolution gives me an IPv6 address that's off of the local
> subnet, but the ARP response gives me an address that's on the local 
> subnet, then I think we can assume that ARP was unsuccessful in resolving 
> the address to the remote GID.  (I.e. the GID should be for a router.)  If 
> this is true, then we need some other way to acquire the DGID.

This is where I think you have problems... Why would you ARP for an
off-subnet address? Why would the router answer?  You push the address
through the route table and ARP the router address that results.

All of that is why I think another netdevice is a tidy
solution. ping6/tcp/etc using this device would generate packets that
follow the same path as RDMA connections would. No special rules about
broadcast groups are required. The route table is used to instruct the
kernel what IPv6 prefixes are IB GIDs and which are not by associating
the output of the route with the ib0 device. The admins can use any
means to set that up. Something that looks like:

$ ip addr
1: ib0:  mtu 2048 qdisc pfifo_fast qlen 1000
    link/ib [my GID..]
    inet6 fe80::c2/64 scope link dynamic <<-- My LL GID
    inet6 2000::c2/64 scope global dynamic  <<-- My GID

Both are maintained by the kernel.

$ ip -6 route
fe80::/64 dev ib0
2000::/64 dev ib0 src 2000::c2
2001::/64 dev ib0 src 2000::c2  <<-- Tells the kernel that 2001::/64
 is a GID and to use path records
 to do lookups at the SM
2002::/64 via fe80::a0 ib0 src 2000::c2 <<--- 2002::/64 is a GID
  but don't query the SM and
  direct things to IB
  router fe80::a0
$ ping6 -I ib0 2001::b1
 ^--- Generate packet structured as: LRH,GRH,ICMP6,PING_DATA
  Set the GRH.SGID to 2000::c2, DGID to 2001::b1 as per the route
  table
  Do a SM Path Record query for 2001::b1 and use that to set the LRH
$ ping6 -I ib0 2002::b1
 ^--- Generate packet structured as: LRH,GRH,ICMP6,PING_DATA
  Set the GRH.SGID to 2000::c2, DGID to 2002::b1 as per the route
  table
  Do a SM Path Record query for fe80::a0 and use that to set the
  LRH
$ traceroute6 -I ib0 2001::b1
 ^--- Same as the ping, except the IB router can capture the packet when
  the hop limit runs out and produce an ICMP error.

Note: In all three cases the LRH.LNH would be set to 1 (non-IBA raw
IPv6). RDMA CM would use the usual value of 3.
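
The lookup rule in the examples above can be sketched in a few lines of C.
This is only an illustration under simplifying assumptions: struct gid_route
and path_query_target() are made-up names, every route is a /64, and the
first matching entry wins. The destination always goes into GRH.DGID
unchanged; the function only picks which GID the path record query is done
for, i.e. the destination itself for an on-link route or the router for a
"via" route.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified route entry: a /64 prefix plus an optional router. */
struct gid_route {
    unsigned char prefix[8];   /* upper 64 bits of the destination GID */
    int has_gateway;           /* nonzero for "via <router>" routes */
    unsigned char gateway[16]; /* router GID, valid only if has_gateway */
};

/*
 * Return the GID that the path record query should be issued for,
 * or NULL if no route covers dgid.
 */
const unsigned char *path_query_target(const struct gid_route *tbl, size_t n,
                                       const unsigned char dgid[16])
{
    for (size_t i = 0; i < n; i++) {
        if (memcmp(tbl[i].prefix, dgid, 8) != 0)
            continue;
        /* Routed prefix: resolve the router. On-link prefix: resolve dgid. */
        return tbl[i].has_gateway ? tbl[i].gateway : dgid;
    }
    return NULL;
}
```

With a table holding 2001::/64 on-link and 2002::/64 via fe80::a0, a lookup
for 2001::b1 returns 2001::b1 itself and a lookup for 2002::b1 returns
fe80::a0, matching the two ping6 cases.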

This also provides at least a mechanism, if not a full solution, to
the MTU problem. Linux already allows route entries to specify a MTU
and with closer integration of the raw IPv6 stuff it becomes possible
for routers to send ICMP6 errors as raw IPv6 packets and for Linux to
capture them and update the route. The ICMP6 errors are crucial to
having path MTU type functions converge quickly.

RDMA CM would use the same rules for addressing CM packets.

A further refinement would be to layer the entire path record query
mechanism in the kernel over this so that the admin has local control
over the IB routing table (if desired). A 2nd refinement would be to
use the ND cache of such an ib0 device as a local path record query
cache (again lets the admin see what is going on and override/discard
SA queries using the usual 'ip neigh' command). There might even be
good potential for sa replication using the already existing userspace
arpd stuff.

Overall I would just view something like this as further integrating
the IB stack with the existing rich services provided by NET rather
than trying to duplicate a small portion of them with separate
interfaces. [For instance with something like this netlink could be
used instead of the sysfs probing for many cases]

But yes, it is a bit outside what the current framework envisions..

Jason


Re: [openib-general] Immediate data question

2007-02-07 Thread Roland Dreier

Changqing> What I mean is that, is there any performance penalty
Changqing> for receiver's overall performance if RNR happens
Changqing> continuously on one of the QP ?

Not for the receiver, but the sender will be severely slowed down by
having to wait for the RNR timeouts.
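
As a rough back-of-the-envelope illustration of the sender-side cost (a
sketch only; rnr_stall_usec() is a made-up helper and the numbers are
hypothetical inputs, not measurements): each RNR NAK makes the sender sit
idle for at least the RNR timer before it retries, so the idle time grows
linearly with the number of NAKs.

```c
#include <stdint.h>

/*
 * Lower bound on the time the sender spends waiting because of RNR NAKs.
 * Both arguments are illustrative inputs supplied by the caller.
 */
uint64_t rnr_stall_usec(uint64_t rnr_naks, uint64_t rnr_timer_usec)
{
    return rnr_naks * rnr_timer_usec;
}
```

For example, 1000 NAKs with a 640 usec timer cost the sender at least
640000 usec (0.64 s) of waiting on that QP.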


Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Roland Dreier

Jason> Basically, if IB routers are used, and the IPoIB feature of
Jason> *not* spanning a subnet is used (for scalability?) then
Jason> you need an alternate way to specify addresses to rdma cm.

You mean if the IB router is also an IP router for IPoIB?

Then I think there are some serious semantic problems to solve for the
RDMA CM -- because you are using an IP address to define a
destination, but since that address is on the other side of an IP
router, there's no way to know it even belongs to an IB port.

 - R.


Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Sean Hefty

> We don't want to have the routers snoop and alter CM GMPs.

agreed

> The passive side cannot use information from the LRH to get the router
> LID since the LRH may not be reversible.

argh... I was interpreting symmetric paths at the network layer (SGID to DGID)
and applying it at the link layer as well.  (See the last couple of sentences
on page 222 of the spec.)

> The only option seems to be to have the passive side do a path record
> query on a SGID in the CM REQ...

I've thought of that as well, and this is what Yaron mentioned in his OFA
DevCon slides as well.  I'd just like to avoid adding even more complexity to
the ib_cm state management if at all possible.

> This is a spec problem unfortunately.

aye...

- Sean


Re: [openib-general] sharing qp between user and kernel

2007-02-07 Thread Roland Dreier

Pete> Before I dig into this anymore, do you expect this to work?
Pete> Are there fundamental problems with QP sharing between user
Pete> and kernel?  It would sure be nice not to have to stick the
Pete> connection management aspects into the kernel.

No, I wouldn't expect this to work.  At first glance at least, yes,
there are fundamental problems.  Sharing a QP between user and
kernelspace, where userspace is doing full kernel bypass (as eg mthca
does -- there are NO system calls when doing post work request, poll
CQ and request CQ notification operations), seems like a huge
problem.  I don't see any way that the kernel can keep a consistent
view of the QP state unless userspace has to call into the kernel for
every operation, which would kill performance.

 - R.


Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Hal Rosenstock

On Wed, 2007-02-07 at 15:24, Sean Hefty wrote:
> > I didn't get too far on getting CMA to work. Beyond the bad HopLimit
> > field I was seeing Hal pointed out a number of problems in IBA that
> > would prevent it from working as is :<
> 
> I've started thinking about what it would take to get the rdma cm to work
> across a router.  I think the rdma cm may need to treat IPv6 addresses as a
> GID for this to work across subnets, versus trying to map an ipoib IP address
> to a GID based on ARP.

An IB GID is IPv6-like but not an IPv6 address, so I don't think this is
a good idea and don't see how you get around mapping IP addresses to
GIDs in an IB routed network given the way things are spec'd. I think
that the RDMA CM assumes a single IPoIB subnet. Does it work when the
destination is on another subnet ? I think there are some unaddressed
gateway issues here to make that work and these may have been punted
(during spec time). Arkady might be a good person to comment on this.

-- Hal

> - Sean
> 


Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Hal Rosenstock

On Wed, 2007-02-07 at 16:31, Jason Gunthorpe wrote:
> On Wed, Feb 07, 2007 at 12:24:08PM -0800, Sean Hefty wrote:
> > >I didn't get too far on getting CMA to work. Beyond the bad HopLimit
> > >field I was seeing Hal pointed out a number of problems in IBA that
> > >would prevent it from working as is :<
> > 
> > I've started thinking about what it would take to get the rdma cm to work 
> > across a router.  I think the rdma cm may need to treat IPv6 addresses as a 
> > GID for this to work across subnets, versus trying to map an ipoib IP 
> > address to a GID based on ARP.
> 
> I don't think that is the main problem - though clearly the way things
> are now (for better or worse) rdma cm requires the IPoIB subnet to
> span all of the IB subnets.. The main problem with the protocol is in
> the LID selection for routed paths on the passive side. It can't rely
> on the active side to identify the lids if a router is involved.
> 
> One feature I've thought has been underused in IBA is the raw IPv6
> packet feature.

I thought raw support (including IPv6 header) although still in the spec
was largely deprecated as the CRC protection was deemed too weak.

-- Hal

>  It would be nice to have a linux netdev interface to
> be able to do IPv6 traffic using GID addressing. That would seem to me
> to be the natural way to bolt native GID addressing into rdma
> cm..
> 
> Jason
> 



Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Hal Rosenstock

On Wed, 2007-02-07 at 17:49, Jason Gunthorpe wrote:
> On Wed, Feb 07, 2007 at 02:40:51PM -0800, Sean Hefty wrote:
> > Are you referring to the SLID in the CM REQ?  If so, I've been looking at 
> > this issue as well.  I simply cannot think of any way to come up with this 
> > LID, and my current solution is to punt this problem over to the passive 
> > side, which could use the SLID of the router that the CM REQ is received 
> > from.  If not, well, then I just rambled more than usual.
> 
> Yes, this is the problem.
> 
> The active side clearly cannot learn what the SLID of the passive
> side's router should be.
> 
> We don't want to have the routers snoop and alter CM GMPs.
> 
> The passive side cannot use information from the LRH to get the router
> LID since the LRH may not be reversible.
> 
> The only option seems to be to have the passive side do a path record
> query on a SGID in the CM REQ...
> 
> This is a spec problem unfortunately.

Yes and I would expect that this would be changed.

-- Hal

> 
> Jason
> 



Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Jason Gunthorpe

On Wed, Feb 07, 2007 at 07:23:47PM -0500, Hal Rosenstock wrote:
> > One feature I've thought has been underused in IBA is the raw IPv6
> > packet feature.
> 
> I thought raw support (including IPv6 header) although still in the spec
> was largely deprecated as the CRC protection was deemed too weak.

I would envision using the raw support primarily for ICMP6, i.e.
diagnostics (ping/traceroute) and router messages (Packet Too Big, ICMP
Redirect, etc). Not to offset IPoIB as a high performance solution. In
this role the reduced MTU that you get because of CRC-16's limited
protection shouldn't be a big problem.

Jason


Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-07 Thread Sean Hefty

>> If name service resolution gives me an IPv6 address that's off of the local
>> subnet, but the ARP response gives me an address that's on the local
>> subnet, then I think we can assume that ARP was unsuccessful in resolving
>> the address to the remote GID.  (I.e. the GID should be for a router.)  If
>> this is true, then we need some other way to acquire the DGID.
>
>This is where I think you have problems... Why would you ARP for an
>off-subnet address? Why would the router answer?  You push the address
>through the route table and ARP the router address that results.

I'm confusing myself.  I was considering different IB subnets, and trying to
determine whether they shared the same IP subnet.  The GIDs may have different
subnet prefixes, but the IP addresses may not, and I'm not sure how to relate
this back to using DNS.

>All of that is why I think another netdevice is a tidy
>solution. ping6/tcp/etc using this device would generate packets that
>follow the same path as RDMA connections would. No special rules about
>broadcast groups are required. The route table is used to instruct the
>kernel what IPv6 prefixes are IB GIDs and which are not by associating
>the output of the route with the ib0 device. The admins can use any
>means to set that up. Something that looks like:

At first glance, this seems like a decent approach to explore.

>But yes, it is a bit outside what the current framework envisions..

I'm fine with that.  My short-term objective is to enable basic router support
within the host stack, and I think I have an idea of what that takes.  I'd just
also like to have an idea of how an application could transfer data between
routed IB subnets, including providing a way for the application to locate a
given remote node.

- Sean


Re: [openib-general] RFC ofed 1 2 kernel file structure

2007-02-07 Thread Michael S. Tsirkin

> Quoting Bryan O'Sullivan <[EMAIL PROTECTED]>:
> Subject: Re: RFC ofed 1 2 kernel file structure
> 
> Michael S. Tsirkin wrote:
> 
> > All, pls try now.
> 
> This is similar in layout to the sort of tree we've used internally all 
> along, so it's fine by me.  One small problem: I don't like the 
> combination of lower and upper case names of makefile and Makefile in 
> the top-level directory.

ofed_1_2 has the same.

> Also, it's no longer obvious to me to tell what kernel version the 
> sources are pulled from.  I used to be able to check the top-level 
> Makefile or git history, but I no longer know what to look at.

This will be part of BUILD_ID.

-- 
MST


[openib-general] more comments on cxgb3

2007-02-07 Thread Michael S. Tsirkin

OK, so I looked at cxgb3 some more.
To summarise my previous comments, I think the cxio hal layer needs to go to
make the code readable - if I understand correctly it is there for historical
reasons only.

I started looking at userspace/kernel interaction, and then
went over to other code under cxgb3 (but not core/).

- Consider a user that does e.g. create QP, but never calls mmap.
  Is there some code that will clean out the unclaimed mmap object?
  I couldn't find it, and iwch_dealloc_ucontext does not seem to
  do anything with it.

- Passing physical address to userspace and back looks suspicious.
  Especially this:
        uresp.physaddr = virt_to_phys(chp->cq.queue);
  Could you elaborate on the design here? What are these phy addresses
  and how come userspace needs to know the phy address?
  You are not doing DMA by this address, by any chance?

- It seems that by passing in huge resource sizes, userspace will be able to
  drink up unlimited amounts of kernel memory.
  mthca handles this by using the mlock rlimit, should something be done here
  as well?
 
A couple of comments on PDBG macro.
- I'd like to suggest following the practice of prefixing macro names with the
  module name (same goes for functions like get_mhp really) - unless they are
  local to file.

- You are using __FUNCTION__ a lot - it might be better just to kill it:
  messages are unique so you'll be able to locate the msg source anyway,
  you'd save some kernel text and logs will be shorter. In any case I think
  __func__ is the recommended gcc way to get the name currently.

- comment near pr_debug definition in include/linux/kernel.h says:
        /* If you are writing a driver, please use dev_dbg instead */
  so it might be a good idea for PDBG to follow this rule.

- log messages do not look very informative to me.
  I also think there are a bit too many of them currently.
  For example, I do not think it is a good idea to print
  the kernel pointers out.

  For example, in code like the following:
        mhp = get_mhp(rhp, (sg_list[i].lkey) >> 8);
        if (!mhp) {
                PDBG("%s %d\n", __FUNCTION__, __LINE__);
                return -EIO;
        }

  might be better to say 
  "MR key XXX does not exist. Exiting.".
  -EIO also looks like a strange error code to return here, does it not?
  Maybe something like EINVAL would be more appropriate?

- I wonder about the names like get_mhp - what does "mhp" mean?
        static inline struct iwch_mr *get_mhp(struct iwch_dev *rhp, u32 mmid)
        {
                return idr_find(&rhp->mmidr, mmid);
        }

Looks like it looks up an mr. Is that right? Maybe the name should be changed
to convey this meaning.

- In the following code, what does "missing pdid check" mean?
/*
 * TBD: this is going to be moved to firmware. Missing pdid/qpid check for now.
 */
This sounds interesting.
Does this mean the code does not validate the PD currently?

I have the same question for:
/* TBD: check perms */
in iwch_bind_mw.

BTW, does TBD stand for "To Be Done" here?
google says:
>Definitions of TBD on the Web:

* To Be Determined, Defined, Decided.
  www.csr.com/ptot.htm

* to be determined
  www.liberalsagainstterrorism.com/wiki/index.php/Counterinsurgency_Operations/Glossary

* Treasury Board (Secretariat)
  www.psc-cfp.gc.ca/centres/definitions_and_notes_e.htm

* The three letter abbreviation TBD may be/mean, depending on context: * an
  acronym for "To Be Determined" ("...at a later point in time.", typically) *
  the Douglas Devastator, a US Navy torpedo bomber of World War II
  en.wikipedia.org/wiki/TBD

What is to be determined here?
Do you mean TODO really?

- iwch_sgl2pbl_map is used in several places, and seems a bit too big to be
  inline
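
To illustrate the earlier comment about the get_mhp failure path, here is a
userspace mock of what a more informative message and an -EINVAL return could
look like. It is a sketch only, not the cxgb3 code: struct mr, lookup_mr()
and check_lkey() are stand-ins invented for this example (the driver itself
uses idr_find() and kernel logging), and the single known MR id is arbitrary.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-in for the driver's MR object; illustration only. */
struct mr {
    uint32_t id;
};

static struct mr known_mr = { 0x12 };

/* Stand-in for the idr lookup: succeeds for exactly one registered id. */
static struct mr *lookup_mr(uint32_t mmid)
{
    return mmid == known_mr.id ? &known_mr : NULL;
}

/* Validate an lkey; report the offending key and return -EINVAL if unknown. */
int check_lkey(uint32_t lkey)
{
    struct mr *mr = lookup_mr(lkey >> 8);

    if (!mr) {
        fprintf(stderr, "MR key 0x%x does not exist\n", (unsigned int)lkey);
        return -EINVAL;
    }
    return 0;
}
```

The message names the key that failed instead of a function and line number,
and no kernel pointers are printed.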

Well, it's time to go do my day job now :)

Hope this helps,

-- 
MST


Re: [openib-general] issues with compilation of ofed 1.2

2007-02-07 Thread Moni Levy

Doug,
On 2/7/07, Yosef Etigin <[EMAIL PROTECTED]> wrote:
> 7. On RHAS5 beta 2, the setup requires sysfsutils-devel RPM which is not
> included in this distro.

Can you please help us with that?

-- Moni

>
> --
> Yosef Etigin
> Alex Tabachnik
>

