[openib-general] Wrong minor number for /dev/uat in README file

2005-10-11 Thread Heiko J Schick
Hello,

I think the minor number for /dev/uat in /src/userspace/libibat/README is 
wrong.

mknod /dev/infiniband/uat c 231 254 
should be replaced by
mknod /dev/infiniband/uat c 231 191

At least, the file /src/linux-kernel/infiniband/core/uat.c has the 
following content:

enum {
IB_UAT_MAJOR = 231,
IB_UAT_MINOR = 191
};

Many thanks in advance!
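
The mismatch can be checked mechanically. The following is a minimal sketch, assuming Linux with glibc (`<sys/sysmacros.h>`); the enum values are the ones quoted from uat.c above, while the helper names `uat_devnum_matches` and `format_mknod` are made up for this illustration and are not part of libibat.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/sysmacros.h>	/* makedev(), major(), minor() on Linux/glibc */

/* Values quoted from uat.c in the message above. */
enum {
	IB_UAT_MAJOR = 231,
	IB_UAT_MINOR = 191
};

/* Hypothetical helper: returns 1 if a device number matches
 * the major/minor pair the kernel module uses. */
static int uat_devnum_matches(dev_t dev)
{
	return major(dev) == (unsigned int)IB_UAT_MAJOR &&
	       minor(dev) == (unsigned int)IB_UAT_MINOR;
}

/* Hypothetical helper: formats the mknod command line for a given pair. */
static int format_mknod(char *buf, size_t len, unsigned int maj, unsigned int min)
{
	return snprintf(buf, len, "mknod /dev/infiniband/uat c %u %u", maj, min);
}
```

With these helpers, `uat_devnum_matches(makedev(231, 254))` is false (the README value), while `makedev(231, 191)` matches.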

Mit freundlichen Gruessen / Kind Regards
Heiko Joerg Schick

IBM Deutschland Entwicklung GmbH
I/Ox Microcode Development
Linux Infiniband Device Drivers

Schoenaicher Str. 220
71032 Boeblingen
E-Mail: [EMAIL PROTECTED]
External: 49-7031-16-0 x4219,   t/l: 120-4219

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] [PATCH] Opensm - handling immediate error in vendor_send new

2005-10-11 Thread Yael Kalka
Hi Hal,

Attached is a new patch with several fixes for this issue.
I decided to remove the check for zero around the atomic_dec after all
since, as I mentioned before, clearing the counter is not a fix, and we
will still see its value in other messages in the log file.
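
The counter fix-up that the patch below performs on a failed send can be sketched roughly as follows. This is a simplified user-space model using C11 atomics, not the OpenSM code: `struct qp0_stats`, `account_before_send` and `undo_failed_send` are names invented for the illustration, and the model assumes all three counters are incremented in one place, which may not match where OpenSM increments them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the QP0 counters in the statistics block. */
struct qp0_stats {
	_Atomic uint32_t mads_sent;
	_Atomic uint32_t mads_outstanding;
	_Atomic uint32_t mads_outstanding_on_wire;
};

/* Counters are bumped before the send is attempted (model assumption). */
static void account_before_send(struct qp0_stats *s, bool resp_expected)
{
	atomic_fetch_add(&s->mads_sent, 1);
	atomic_fetch_add(&s->mads_outstanding, 1);
	if (resp_expected)
		atomic_fetch_add(&s->mads_outstanding_on_wire, 1);
}

/*
 * Undo the accounting for a MAD that was never sent.
 * Returns true when no MADs remain outstanding, i.e. when the caller
 * should post the "no pending transaction" signal to wake the state manager.
 */
static bool undo_failed_send(struct qp0_stats *s, bool resp_expected)
{
	atomic_fetch_sub(&s->mads_sent, 1);
	if (resp_expected)
		atomic_fetch_sub(&s->mads_outstanding_on_wire, 1);
	/* atomic_fetch_sub returns the value held before the subtraction. */
	return atomic_fetch_sub(&s->mads_outstanding, 1) == 1;
}
```

Returning the "reached zero" condition from the decrement itself avoids re-reading the counter after another thread may have changed it.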

Thanks,
Yael

Signed-off-by:  Yael Kalka [EMAIL PROTECTED]

Index: include/opensm/osm_vl15intf.h
===
--- include/opensm/osm_vl15intf.h   (revision 3704)
+++ include/opensm/osm_vl15intf.h   (working copy)
@@ -55,11 +55,13 @@
 #include <complib/cl_event.h>
 #include <complib/cl_thread.h>
 #include <complib/cl_qlist.h>
+#include <complib/cl_passivelock.h>
 #include <opensm/osm_stats.h>
 #include <opensm/osm_log.h>
 #include <opensm/osm_madw.h>
 #include <opensm/osm_mad_pool.h>
 #include <vendor/osm_vendor.h>
+#include <opensm/osm_subnet.h>
 
 #ifdef __cplusplus
 #  define BEGIN_C_DECLS extern "C" {
@@ -137,6 +139,9 @@ typedef struct _osm_vl15
    osm_vendor_t            *p_vend;
    osm_log_t               *p_log;
    osm_stats_t             *p_stats;
+   osm_subn_t              *p_subn;
+   cl_disp_reg_handle_t    h_disp;
+   cl_plock_t              *p_lock;
 
 } osm_vl15_t;
 /*
@@ -176,6 +181,15 @@ typedef struct _osm_vl15
 *  p_stats
 *  Pointer to the OpenSM statistics block.
 *
+*  p_subn
+* Pointer to the Subnet object for this subnet.
+*
+*  h_disp
+*Handle returned from dispatcher registration.
+*
+*  p_lock
+*  Pointer to the serializing lock.
+*
 * SEE ALSO
 *  VL15 object
 */
@@ -265,7 +279,10 @@ osm_vl15_init(
IN osm_vendor_t* const p_vend,
IN osm_log_t* const p_log,
IN osm_stats_t* const p_stats,
-   IN const int32_t max_wire_smps );
+   IN const int32_t max_wire_smps,
+   IN osm_subn_t* const p_subn,
+   IN cl_dispatcher_t* const p_disp,
+   IN cl_plock_t* const p_lock );
 /*
 * PARAMETERS
 *  p_vl15
@@ -283,6 +300,15 @@ osm_vl15_init(
 *  max_wire_smps
 *  [in] Maximum number of MADs allowed on the wire at one time.
 *
+*  p_subn
+* [in] Pointer to the subnet object.
+*
+*  p_disp
+* [in] Pointer to the dispatcher object.
+*
+*  p_lock
+*  [in] Pointer to the OpenSM serializing lock.
+*
 * RETURN VALUES
 *  IB_SUCCESS if the VL15 object was initialized successfully.
 *
Index: opensm/osm_opensm.c
===
--- opensm/osm_opensm.c (revision 3704)
+++ opensm/osm_opensm.c (working copy)
@@ -257,7 +257,8 @@ osm_opensm_init(
 
    status = osm_vl15_init( &p_osm->vl15,
                            p_osm->p_vendor,
-                           &p_osm->log, &p_osm->stats, p_opt->max_wire_smps );
+                           &p_osm->log, &p_osm->stats, p_opt->max_wire_smps,
+                           &p_osm->subn, &p_osm->disp, &p_osm->lock );
if( status != IB_SUCCESS )
   goto Exit;
 
Index: opensm/osm_vl15intf.c
===
--- opensm/osm_vl15intf.c   (revision 3704)
+++ opensm/osm_vl15intf.c   (working copy)
@@ -157,6 +157,8 @@ __osm_vl15_poller(
 
   if( status != IB_SUCCESS )
   {
+uint32_t outstanding;
+cl_status_t cl_status;
         osm_log( p_vl->p_log, OSM_LOG_ERROR,
                  "__osm_vl15_poller: ERR 3E03: "
                  "MAD send failed (%s).\n",
@@ -166,7 +168,69 @@ __osm_vl15_poller(
   The MAD was never successfully sent, so
   fix up the pre-incremented count values.
 */
+/* Decrement qp0_mads_sent and qp0_mads_outstanding_on_wire
+   that was incremented in the code above. */
         mads_sent = cl_atomic_dec( &p_vl->p_stats->qp0_mads_sent );
+        if( p_madw->resp_expected == TRUE )
+          cl_atomic_dec( &p_vl->p_stats->qp0_mads_outstanding_on_wire );
+
+/*
+   The following code is similar to the one in 
+   __osm_sm_mad_ctrl_retire_trans_mad. We need to decrement the 
+   qp0_mads_outstanding counter, and if we reached 0 - need to call
+   the cl_disp_post with OSM_SIGNAL_NO_PENDING_TRANSACTION (in order
+   to wake up the state mgr).
+*/
+        cl_atomic_dec( &p_vl->p_stats->qp0_mads_outstanding );
+
+        osm_log( p_vl->p_log, OSM_LOG_DEBUG,
+                 "__osm_vl15_poller: "
+                 "%u QP0 MADs outstanding.\n",
+                 p_vl->p_stats->qp0_mads_outstanding );
+
+/*
+  Acquire the lock non-exclusively.
+  Other modules that send MADs grab this lock exclusively.
+  These modules that are in the process of sending MADs
+  will hold the lock until they finish posting all the MADs
+  they plan to send.  While the other module is sending MADs
+ 

[openib-general] SRP Infiniband

2005-10-11 Thread Mohit Katiyar, Noida








Hi all,

I am a newcomer investigating InfiniBand and I have a question.

I am not clear about the functionality of the user-level HCA driver.
Is there a specification for it, or is it entirely vendor specific?

It is also said to be used for speed path operations. Does anyone
know how it accomplishes that?

If I have SCSI storage devices in a SAN, can I use the SRP module to
send some requests and the user-mode HCA library for some speed path
operations? Basically, I want to know whether the user-mode HCA library
can be used for speed path operations on SCSI devices and, if so, how
(theoretical details only; the rest I will try myself).

Thanks in advance for any help.





Mohit Katiyar







Re: [openib-general] Wrong minor number for /dev/uat in README file

2005-10-11 Thread Hal Rosenstock
On Tue, 2005-10-11 at 02:07, Heiko J Schick wrote:
> Hello,
>
> I think the minor number for /dev/uat in /src/userspace/libibat/README is
> wrong.
>
> mknod /dev/infiniband/uat c 231 254
> should be replaced by
> mknod /dev/infiniband/uat c 231 191
>
> At least, the file /src/linux-kernel/infiniband/core/uat.c has the
> following content:
>
> enum {
> IB_UAT_MAJOR = 231,
> IB_UAT_MINOR = 191
> };
>
> Many thanks in advance!

Thanks. The README wasn't updated when this change occurred (on 9/15).

-- Hal



[openib-general] [PATCH] Opensm - enabling erase of log file flag

2005-10-11 Thread Yael Kalka
Hi Hal,

Currently the osm log file is accumulative (each run appends to it).
I've added an option to erase the log file before starting to write to it.
By default, the log is still accumulative.
Attached is a patch for that.
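
The behaviour described above boils down to the choice of fopen() mode, as the patch below shows. Here is a minimal standalone sketch of that choice; `open_log`, `log_one_run` and `count_lines` are illustrative helpers, not OpenSM functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* "a+" keeps existing content and appends; "w+" truncates the file first. */
static FILE *open_log(const char *path, bool accum)
{
	return fopen(path, accum ? "a+" : "w+");
}

/* Hypothetical helper modelling one run: open, write one line, close.
 * Returns 0 on success. */
static int log_one_run(const char *path, bool accum, const char *msg)
{
	FILE *f = open_log(path, accum);

	if (!f)
		return -1;
	fprintf(f, "%s\n", msg);
	return fclose(f);
}

/* Hypothetical helper: counts newline-terminated lines, -1 on open failure. */
static int count_lines(const char *path)
{
	FILE *f = fopen(path, "r");
	int c, n = 0;

	if (!f)
		return -1;
	while ((c = fgetc(f)) != EOF)
		if (c == '\n')
			n++;
	fclose(f);
	return n;
}
```

Two runs with `accum` set leave two lines in the file; a following run with `accum` cleared leaves only its own line.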

Thanks,
Yael

Signed-off-by:  Yael Kalka [EMAIL PROTECTED]

Index: include/opensm/osm_subnet.h
===
--- include/opensm/osm_subnet.h (revision 3704)
+++ include/opensm/osm_subnet.h (working copy)
@@ -220,6 +220,7 @@ typedef struct _osm_subn_opt
   uint8_t  log_flags;
   char *   dump_files_dir;
   char *   log_file;
+  boolean_t accum_log_file;
   cl_map_t port_pro_ignore_guids;
   boolean_t port_profile_switch_nodes;
   uint32_t max_port_profile;
@@ -319,6 +320,10 @@ typedef struct _osm_subn_opt
 *  log_file
 * Name of the log file (or NULL) for stdout.
 *
+*  accum_log_file
+* If TRUE (default) - the log file will be accumulated.
+* If FALSE - the log file will be erased before starting current opensm run.
+*
 *  port_pro_ignore_guids
 * A map of guids to be ignored by port profiling.
 *
Index: include/opensm/osm_log.h
===
--- include/opensm/osm_log.h(revision 3704)
+++ include/opensm/osm_log.h(working copy)
@@ -218,7 +218,8 @@ osm_log_init(
   IN osm_log_t* const p_log,
   IN const boolean_t flush,
   IN const uint8_t log_flags,
-  IN const char *log_file)
+  IN const char *log_file,
+  IN const boolean_t accum_log_file )
 {
   p_log->level = log_flags;
   p_log->flush = flush;
@@ -229,10 +230,18 @@ osm_log_init(
   }
   else
   {
+    if (accum_log_file)
     p_log->out_port = fopen(log_file,"a+");
+    else
+      p_log->out_port = fopen(log_file,"w+");
+
     if (!p_log->out_port)
     {
+      if (accum_log_file)
       printf("Cannot open %s for appending. Permission denied\n", log_file);
+      else
+        printf("Cannot open %s for writing. Permission denied\n", log_file);
+
       return(IB_UNKNOWN_ERROR);
     }
   }
Index: complib/cl_event_wheel.c
===
--- complib/cl_event_wheel.c(revision 3704)
+++ complib/cl_event_wheel.c(working copy)
@@ -597,7 +597,7 @@ main ()
   cl_event_wheel_construct( event_wheel );
 
   /* init */
-  osm_log_init( log, TRUE, 0xff, NULL);
+  osm_log_init( log, TRUE, 0xff, NULL, FALSE);
   cl_event_wheel_init( event_wheel, log );
 
   /* Start Playing */
Index: osmtest/osmtest.c
===
--- osmtest/osmtest.c   (revision 3704)
+++ osmtest/osmtest.c   (working copy)
@@ -507,7 +507,7 @@ osmtest_init( IN osmtest_t * const p_osm
   osmtest_construct( p_osmt );
 
   status = osm_log_init( p_osmt-log, p_opt-force_log_flush,
- 0x0001, p_opt-log_file );
+ 0x0001, p_opt-log_file, TRUE );
   if( status != IB_SUCCESS )
 return ( status );
   /* but we do not want any extra staff here */
Index: opensm/osm_subnet.c
===
--- opensm/osm_subnet.c (revision 3704)
+++ opensm/osm_subnet.c (working copy)
@@ -427,6 +427,7 @@ osm_subn_set_default_opt(
 p_opt-dump_files_dir = OSM_DEFAULT_TMP_DIR;
 
   p_opt-log_file = OSM_DEFAULT_LOG_FILE;
+  p_opt-accum_log_file = TRUE;
   p_opt-port_profile_switch_nodes = FALSE;
   p_opt-max_port_profile = 0x;
   p_opt-pfn_ui_pre_lid_assign = NULL;
@@ -754,6 +755,10 @@ osm_subn_parse_conf_file(
   __osm_subn_opts_unpack_charp(
 log_file , p_key, p_val, p_opts-log_file);
 
+  __osm_subn_opts_unpack_boolean(
+accum_log_file,
+p_key, p_val, p_opts-accum_log_file);
+
   __osm_subn_opts_unpack_charp(
 dump_files_dir ,
 p_key, p_val, p_opts-dump_files_dir);
@@ -920,6 +925,7 @@ osm_subn_write_conf_file(
     "force_log_flush %s\n\n"
     "# Log file to be used\n"
     "log_file %s\n\n"
+    "accum_log_file %s\n\n"
     "# The directory to hold the file OpenSM dumps\n"
     "dump_files_dir %s\n\n"
     "# If TRUE if OpenSM should disable multicast support\n"
@@ -929,6 +935,7 @@ osm_subn_write_conf_file(
     p_opts->log_flags,
     p_opts->force_log_flush ? "TRUE" : "FALSE",
     p_opts->log_file,
+    p_opts->accum_log_file,
     p_opts->dump_files_dir,
     p_opts->no_multicast_option ? "TRUE" : "FALSE",
     p_opts->disable_multicast ? "TRUE" : "FALSE"
Index: opensm/osm_db_files.c
===
--- opensm/osm_db_files.c   (revision 3704)
+++ opensm/osm_db_files.c   (working copy)
@@ -673,7 +673,7 @@ main(int argc, char **argv)
   cl_list_construct( keys );
   cl_list_init( keys, 10 );
 
-  osm_log_init( log, TRUE, 0xff, /tmp/test_osm_db.log);
+  osm_log_init( log, TRUE, 0xff, /tmp/test_osm_db.log, FALSE);
 
   osm_db_construct(db);
   if (osm_db_init(db, log))
Index: 

Re: Re: Re: [openib-general] IBM eHCA testing..

2005-10-11 Thread Heiko J Schick
Hello Troy,

this morning I looked in detail into the problem you reported on Oct 10
via the OpenIB mailing list [1]. It seems that the kernel panic is an
IPoIB issue.

[1]:  http://openib.org/pipermail/openib-general/2005-October/012353.html

The following happens:

1.  modprobe hcad_mod ehca_nr_ports=1
    The eHCA InfiniBand Device Driver is loaded.

2.  modprobe ib_mad
    The ib_mad stack creates an AQP1. This starts the port activation
    process. By my count it takes more than 110 / 120 seconds to
    activate a port. Our device driver gets a timeout, which means that
    the port is NOT active and ib_modify_qp will not work (for any QP,
    no matter whether it was created in the ib_mad stack or in the
    ib_ipoib stack).

3.  modprobe ib_ipoib
    All resources for IPoIB are allocated (CQ, QPs, MR, etc.)

4.  A user runs ifconfig ib0 xxx.xxx.xxx.xxx, which executes the
    following functions:
    ipoib_open -> ipoib_ib_dev_open -> ipoib_qp_create.
    The user should see the following error message:

    l2:/home/schickhj/ibt/linstack/ehca2/ehca2 # ifconfig ib0 192.168.8.8
    SIOCSIFFLAGS: Invalid argument

5.  The function ipoib_qp_create modifies the QP from Reset to Init to
    RTR to RTS. If one of these three ib_modify_qp calls fails, the
    IPoIB QP (priv->qp) is destroyed (by the ipoib_qp_create error
    path / out_fail) and priv->qp will be NULL.

    --> see /src/linux-kernel/infiniband/ulp/ipoib/ipoib_verbs.c,
    function ipoib_qp_create

6.  A user runs (again) ifconfig ib0 xxx.xxx.xxx, which executes
    (again) the following functions:
    ipoib_open -> ipoib_ib_dev_open -> ipoib_qp_create

7.  ipoib_qp_create wants to modify the IPoIB QP (priv->qp), which is
    NULL because the QP was destroyed earlier by the error handling
    routine in ipoib_qp_create (see 5.)

I think this error could also show up on Mellanox based IB cards when
ib_modify_qp fails in ipoib_qp_create.
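
The sequence in steps 4 to 7 can be modelled in a few lines of user-space C. This is only a sketch of the failure pattern, not the IPoIB code: all `model_*` names are invented, and the NULL guard shown is just one possible defensive measure, not a proposed kernel patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct model_qp { int state; };			/* stand-in for the IPoIB QP */
struct model_priv { struct model_qp *qp; };	/* stand-in for the device private data */

/* Models a modify-QP call that fails while the port is not active. */
static int model_modify_qp(struct model_qp *qp, bool port_active)
{
	if (!port_active)
		return -EINVAL;
	qp->state++;
	return 0;
}

/* Models the three transitions (Reset to Init, Init to RTR, RTR to RTS). */
static int model_qp_setup(struct model_priv *priv, bool port_active)
{
	int i, ret;

	/* Guard (assumption): without it, a second call after a failed
	 * first call would dereference the NULL pointer left behind. */
	if (!priv->qp)
		return -ENODEV;

	for (i = 0; i < 3; i++) {
		ret = model_modify_qp(priv->qp, port_active);
		if (ret) {
			/* Error path: the QP is destroyed and the pointer cleared. */
			free(priv->qp);
			priv->qp = NULL;
			return ret;
		}
	}
	return 0;
}
```

In this model the first call returns -EINVAL (-22, as in the dmesg output) and the second call fails cleanly instead of touching a NULL QP.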

In dmesg you should see:

(see 1.)
eHCA Infiniband Device Driver (Rel.: )
xics_enable_irq: irq=9029: ibm_int_on returned fffd
eHCA Infiniband Device Driver (Rel.: )

(see 2.)
PU 000b0078:ehca_define_sqp HCAD_ERROR  Port 1 is not active.
PU 000b0387:ehca_create_qp HCAD_ERROR  ehca_define_sqp() failed 
rc=
PU 000b03ae:ehca_create_qp  failed ret=ffea
ib_mad: Couldn't create ib_mad QP1
ib_mad: Couldn't open ehca0 port 1
PU0001 00060103:ehca_parse_ec  EHCA port 1 is available.
PU 000b00bd:plpar_hcall_7arg_7ret HCAD_ERROR  HCALL77_IN r3=168 
r4=100100050304 r5=2001002c r6=8a40 3ed48000 r8=0 
r9=0 r10=0
PU 000b00c4:plpar_hcall_7arg_7ret HCAD_ERROR  HCALL77_OUT 
r3=ffd3 r4=0 r5=0 r6=0 r7=4 r8=0 r9=8005aa18 r10=0 

(see 4.)
PU 000b0564:internal_modify_qp HCAD_ERROR  hipz_h_modify_qp() failed 
rc=ffd3 ehca_qp=c3ba4e00 qp_num=2c
ib0: failed to modify QP to init, ret = -22
ib0: ipoib_qp_create returned -22

Mit freundlichen Gruessen / Kind Regards
Heiko Joerg Schick

IBM Deutschland Entwicklung GmbH
I/Ox Microcode Development
Linux Infiniband Device Drivers

Schoenaicher Str. 220
71032 Boeblingen
E-Mail: [EMAIL PROTECTED]
External: 49-7031-16-0 x4219,   t/l: 120-4219



[openib-general] Re: [PATCH] Opensm - handling immediate error in vendor_send new

2005-10-11 Thread Hal Rosenstock
Hi Yael,

On Tue, 2005-10-11 at 04:28, Yael Kalka wrote:
> Attached is a new patch with several fixes for this issue.

Thanks. Applied.

There were still extra whitespace issues which I fixed by hand. Please
try to eliminate these so I don't have to do hand touch ups.

> I decided to remove the checking for zero in the atomic_dec after all,
> since as I mentioned before - clearing it is not a fix, and we will
> see the value in other infos in the log file.

But there is a danger if these counters wrap, right?

Also, looking further at the code, the same issue does not appear to
occur for QP1 handling, right?
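
To make the wrap concern concrete: decrementing an unsigned 32-bit counter that is already zero yields UINT32_MAX. The sketch below, using C11 atomics rather than the complib cl_atomic_dec() call, shows one possible guarded decrement; `dec_if_nonzero` is a made-up helper, and whether such a guard is desirable in OpenSM is exactly the open question in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decrement *ctr only if it is non-zero.
 * Returns true if the decrement happened, false if the counter was
 * already zero (which would indicate an accounting bug elsewhere).
 */
static bool dec_if_nonzero(_Atomic uint32_t *ctr)
{
	uint32_t old = atomic_load(ctr);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ctr, &old, old - 1))
			return true;
	}
	return false;
}
```

An unguarded `atomic_fetch_sub(&ctr, 1)` on a zero counter leaves it at UINT32_MAX, whereas `dec_if_nonzero` leaves it at zero and reports the anomaly to the caller.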

-- Hal



Re: Re: Re: [openib-general] IBM eHCA testing..

2005-10-11 Thread Hal Rosenstock
Hi Heiko,

On Tue, 2005-10-11 at 08:43, Heiko J Schick wrote:
> this morning I've looked in detail into the problem you've reported on Oct
> 10 via the OpenIB mailing-list [1]. It seems that the kernel panic is an
> IPoIB issues.
>
> [1]:  http://openib.org/pipermail/openib-general/2005-October/012353.html
>
> The following things appens:
>
> 1.  modprobe hcad_mod ehca_nr_ports=1
>     The eHCA InfiniBand Device Driver is loaded.
>
> 2.  modprobe ib_mad
>     The ib_mad stack creates an AQP1. This will start the port
>     activation process.
>     By my count it will take more than 110 / 120 seconds to activate a
>     port.
>     Our device driver gets a timeout, which means that the port is NOT
>     active. and
>     ib_modify_qp will not work (for any QP, doesn't matter if it was
>     created in the ib_mad
>     stack or in the ib_ipoib stack).

Where does this time to activate a port come from ? Is there some
maximum time in which the eHCA firmware requires this to be completed ?

-- Hal



Re: Re: Re: [openib-general] IBM eHCA testing..

2005-10-11 Thread Heiko J Schick

Hello Hal,

normally the timeout is set to 30 seconds.
If you need more information about the activation please
see [1].

[1]: http://openib.org/pipermail/openib-general/2005-October/012355.html


Mit freundlichen Gruessen / Kind Regards
Heiko Joerg Schick

IBM Deutschland Entwicklung GmbH
I/Ox Microcode Development
Linux Infiniband Device Drivers

Schoenaicher Str. 220
71032 Boeblingen
E-Mail: [EMAIL PROTECTED]
External: 49-7031-16-0 x4219,  t/l: 120-4219






From: Hal Rosenstock [EMAIL PROTECTED]
Date: 11.10.2005 14:48
To: Heiko J Schick/Germany/[EMAIL PROTECTED]
cc: openib-general@openib.org, Christoph Raisch/Germany/[EMAIL PROTECTED]
Subject: Re: Re: Re: [openib-general] IBM eHCA testing..

Hi Heiko,

On Tue, 2005-10-11 at 08:43, Heiko J Schick wrote:
> this morning I've looked in detail into the problem you've reported on Oct
> 10 via the OpenIB mailing-list [1]. It seems that the kernel panic is an
> IPoIB issues.
>
> [1]: http://openib.org/pipermail/openib-general/2005-October/012353.html
>
> The following things appens:
>
> 1.  modprobe hcad_mod ehca_nr_ports=1
>     The eHCA InfiniBand Device Driver is loaded.
>
> 2.  modprobe ib_mad
>     The ib_mad stack creates an AQP1. This will start the port
>     activation process.
>     By my count it will take more than 110 / 120 seconds to activate a
>     port.
>     Our device driver gets a timeout, which means that the port is NOT
>     active. and
>     ib_modify_qp will not work (for any QP, doesn't matter if it was
>     created in the ib_mad
>     stack or in the ib_ipoib stack).

Where does this time to activate a port come from ? Is there some
maximum time in which the eHCA firmware requires this to be completed ?

-- Hal



[openib-general] Re: [PATCH] Opensm - enabling erase of log file flag

2005-10-11 Thread Hal Rosenstock
Hi Yael,

On Tue, 2005-10-11 at 08:24, Yael Kalka wrote:
> Currently the osm log file is accumulative. I've added an option to
> erase the log file before starting to write it.
> By default, still, the log is still accumulative.
> Attached is a patch for that.

One minor comment on this...

> Thanks,
> Yael
>
> Signed-off-by:  Yael Kalka [EMAIL PROTECTED]

> Index: opensm/osm_subnet.c
> ===
> --- opensm/osm_subnet.c   (revision 3704)
> +++ opensm/osm_subnet.c   (working copy)

> @@ -920,6 +925,7 @@ osm_subn_write_conf_file(
>      "force_log_flush %s\n\n"
>      "# Log file to be used\n"
>      "log_file %s\n\n"
> +    "accum_log_file %s\n\n"
>      "# The directory to hold the file OpenSM dumps\n"
>      "dump_files_dir %s\n\n"
>      "# If TRUE if OpenSM should disable multicast support\n"
> @@ -929,6 +935,7 @@ osm_subn_write_conf_file(
>      p_opts->log_flags,
>      p_opts->force_log_flush ? "TRUE" : "FALSE",
>      p_opts->log_file,
> +    p_opts->accum_log_file,

Shouldn't this line be:
p_opts->accum_log_file ? "TRUE" : "FALSE",
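
The reason this matters: the format string uses %s for the option, so the argument has to be a string, and passing a boolean directly is undefined behaviour. A minimal sketch of the corrected pattern follows; `bool_str` and `write_accum_opt` are illustrative helpers, not OpenSM functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Map a boolean to the string form used in the configuration file. */
static const char *bool_str(bool v)
{
	return v ? "TRUE" : "FALSE";
}

/* Hypothetical helper: formats the one option line the way the
 * conf-file writer does, with a string argument for %s. */
static int write_accum_opt(char *buf, size_t len, bool accum_log_file)
{
	return snprintf(buf, len, "accum_log_file %s\n\n",
			bool_str(accum_log_file));
}
```

Calling `write_accum_opt(buf, sizeof buf, true)` fills `buf` with the line `accum_log_file TRUE` followed by a blank line.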

-- Hal



Re: [openib-general] Lustre Network Driver - KDAPL or verbs?

2005-10-11 Thread James Lentini


On Sun, 9 Oct 2005, Peter J. Braam wrote:

> Cluster File Systems, Inc and its customers have been wondering if the
> Lustre Network Driver (LND) for OpenIb gen2, which we will begin to
> develop during the coming months, should be based on kdapl or verbs.
>
> The driver we plan to develop should strive to address several goals:
>  - high reliability and performance
>  - allow interoperability between user and kernel level
>  - allow interoperability, or better, portability among different
> operating systems (Linux, OS X, Windows, Solaris)
>  - be suitable for inclusion in the Linux kernel
>
> We are keen to hear some opinions!
>
> Thanks
>
> Peter Braam

Hi Peter,

I am the maintainer of the kDAPL reference implementation.

If you are interested in portability, I would recommend kDAPL. 

Earlier this year, there was an effort to modify the kDAPL API to make 
it acceptable for inclusion in the Linux kernel. After making these 
modifications, the OpenIB community still felt that the kDAPL API was 
not ready for merging into the upstream kernel. As a result, a new 
project was begun to develop an API capable of supporting both IB and 
iWARP and suitable for kernel inclusion. 

At the present time, neither the kDAPL API nor the new RDMA API (verbs
+ CMA) has been sent upstream. The current thinking is that the RDMA
API has a better chance than kDAPL.

james


Re: Re: Re: [openib-general] IBM eHCA testing..

2005-10-11 Thread Hal Rosenstock
Hi again Heiko,

On Tue, 2005-10-11 at 09:21, Heiko J Schick wrote:
> Hello Hal,
>
> normally the timeout is set to 30 seconds.

Why does there need to be a timeout for this ? There is no time defined
in the IB spec for activating a port. The SM may or may not be up and it
is implementation specific when it activates any particular port.

> If you need more information about the activation please see [1].
>
> [1]:
> http://openib.org/pipermail/openib-general/2005-October/012355.html

Yes, I saw that post yesterday.

-- Hal



[openib-general] Compiling an application that calls ib_cm_* functions

2005-10-11 Thread Steven Wooding
Hi,

I wonder if someone could help me with compiling my IB application? The problem is that when I link my program, all of the ib_cm_* function calls come up as "undefined reference", as do dlist_start and _dlist_mark_move (dlist_next in the code).

Here is my linking command:
icpc -o ib_comms_test1 ib_comms_test1.o ib_queue_pair.o ib_comms_manager.o -L/usr/local/lib -libcm -libat -libverbs -libumad -lsysfs -ldl

I get the same result when using g++.
The cmpost.c example compiles fine. I've tried to see what it is doing. It seems to link in the libibcm.la file, but when I try this with icpc or g++, they say they cannot recognise the file type.

Maybe someone can spot the simple mistake I'm making.

Cheers,

Steve. 

Re: Re: Re: [openib-general] IBM eHCA testing..

2005-10-11 Thread Hal Rosenstock
Hi again Heiko,

On Tue, 2005-10-11 at 09:21, Heiko J Schick wrote:
> Hello Hal,
>
> normally the timeout is set to 30 seconds.

One more thing:

How can the timeout be adjusted? Is it a module parameter?

-- Hal



Re: [openib-general] Lustre Network Driver - KDAPL or verbs?

2005-10-11 Thread Dan Bar Dov
Hi Peter,

I can testify from first hand experience - we first developed ISER over KDAPL.
It simplified our work since kDAPL was pretty stable at the time.

We are now porting ISER to run over openIB-verbs + CMA.
Although CMA is not there yet, the port does simplify the code compared
to the kDAPL implementation.

Dan


On 10/9/05, Peter J. Braam [EMAIL PROTECTED] wrote:

> Cluster File Systems, Inc and its customers have been wondering if the
> Lustre Network Driver (LND) for OpenIb gen2, which we will begin to develop
> during the coming months, should be based on kdapl or verbs.
>
> The driver we plan to develop should strive to address several goals:
>  - high reliability and performance
>  - allow interoperability between user and kernel level
>  - allow interoperability, or better, portability among different operating
> systems (Linux, OS X, Windows, Solaris)
>  - be suitable for inclusion in the Linux kernel
>
> We are keen to hear some opinions!
>
> Thanks
>
> Peter Braam





[openib-general] Re: [PATCH] SDP: In sdp_link.c::do_link_path_lookup, handle interface table numbering holes

2005-10-11 Thread Michael S. Tsirkin
Quoting r. Hal Rosenstock [EMAIL PROTECTED]:
> Subject: [PATCH] SDP: In sdp_link.c::do_link_path_lookup, handle interface
> table numbering holes
>
> SDP: In sdp_link.c::do_link_path_lookup, handle interface table
> numbering holes
> (similar to James Lentini's patch to at.c)
>
> (this is untested)
>
> Signed-off-by: Hal Rosenstock [EMAIL PROTECTED]
>
> Index: sdp_link.c
> ===
> --- sdp_link.c  (revision 3623)
> +++ sdp_link.c  (working copy)
> @@ -354,7 +354,6 @@ static void do_link_path_lookup(struct s
>         struct ipoib_dev_priv *priv;
>         struct net_device *dev = NULL;
>         struct rtable *rt;
> -       int counter = 0;
>         int result = 0;
>         struct flowi fl = {
>                 .oif = info->dif, /* oif */
> @@ -435,7 +434,7 @@ static void do_link_path_lookup(struct s
>
>         if (dev->flags & IFF_LOOPBACK) {
>                 dev_put(dev);
> -               while ((dev = dev_get_by_index(++counter))) {
> +               for (dev = dev_base; dev; dev = dev->next) {
>                         if (dev->type == ARPHRD_INFINIBAND &&
>                             (dev->flags & IFF_UP))
>                                 break;
>

I think this list scan needs some kind of protection.
The following is what I checked in. Does this need to be updated
in other places as well?

Handle net interface table numbering holes
(similar to James Lentini's patch to at.c)

Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]
Signed-off-by: Hal Rosenstock [EMAIL PROTECTED]

Index: linux-kernel/drivers/infiniband/ulp/sdp/sdp_link.c
===
--- linux-kernel.orig/drivers/infiniband/ulp/sdp/sdp_link.c 2005-10-11 
13:48:30.0 +0200
+++ linux-kernel/drivers/infiniband/ulp/sdp/sdp_link.c  2005-10-11 
13:55:15.0 +0200
@@ -433,13 +433,15 @@ static void do_link_path_lookup(struct s
 
        if (dev->flags & IFF_LOOPBACK) {
                dev_put(dev);
-               while ((dev = dev_get_by_index(++counter))) {
+               read_lock(&dev_base_lock);
+               for (dev = dev_base; dev; dev = dev->next) {
                        if (dev->type == ARPHRD_INFINIBAND &&
-                           (dev->flags & IFF_UP))
+                           (dev->flags & IFF_UP)) {
+                               dev_hold(dev);
                                break;
-                       else
-                               dev_put(dev);
+                       }
                }
+               read_unlock(&dev_base_lock);
}
 
if (!dev) {


-- 
MST


[openib-general] [PATCH] reduce the number of included files in cma.c

2005-10-11 Thread Michael S. Tsirkin
Remove unnecessary includes from cma.c

Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]

Index: linux-2.6.13/drivers/infiniband/core/cma.c
===
--- linux-2.6.13/drivers/infiniband/core/cma.c  (revision 3720)
+++ linux-2.6.13/drivers/infiniband/core/cma.c  (working copy)
@@ -30,10 +30,6 @@
  */
 #include <linux/in.h>
 #include <linux/in6.h>
-#include <linux/inetdevice.h>
-#include <net/arp.h>
-#include <net/neighbour.h>
-#include <net/route.h>
 #include <rdma/rdma_cm.h>
 #include <rdma/ib_cache.h>
 #include <rdma/ib_cm.h>

-- 
MST


RE: [openib-general] segmentation fault in ibv_modify_srq

2005-10-11 Thread Tziporet Koren





The SRQ limit event will also be supported on cards with memory (both InfiniHost and InfiniHost III).
If someone needs it now, we can provide a drop of the FW that supports it.
It will be officially released in Q4.


Tziporet


-Original Message-
From: Roland Dreier [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, October 05, 2005 9:42 PM
To: Sayantan Sur
Cc: openib-general@openib.org
Subject: Re: [openib-general] segmentation fault in ibv_modify_srq



    Sayantan> Hello, This is in regard to the use of `ibv_modify_srq'
    Sayantan> call. When I use this call, I get a segmentation
    Sayantan> fault.


This is because the modify SRQ operation is not implemented at all in
libmthca. Do you just want to set the SRQ limit? That's not so hard
for me to implement. However, you should be aware that as far as I
know, only mem-free HCAs generate the SRQ limit reached event.


- R.

Re: Re: Re: [openib-general] IBM eHCA testing..

2005-10-11 Thread Shirley Ma

The IB stack doesn't handle errors during client initialization. This
problem is easy to reproduce by inducing errors (resource allocation
failure or query failure) in mad_client or sa_client registration. I am
working on a patch, but I am in class the whole week and don't have time
to verify it. I hope the patch will be available early next week to fix
the panic.

Thanks
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638

RE: [openib-general] some bugs that can be found using the gen2_basic in the contrib/mellanox folder

2005-10-11 Thread Amit Krig





Hi Roland,


Dotan is on vacation until the end of the month (Ami will send an update).
Regarding the max QP number, the main reason for the test is to see that we are in the ballpark.


Your point is taken: we will focus on some heavy data movement, and from there we will continue to some error flows.


-Original Message-
From: Roland Dreier [mailto:[EMAIL PROTECTED]] 
Sent: Monday, October 03, 2005 7:30 PM
To: Dotan Barak
Cc: openib-general@openib.org
Subject: Re: [openib-general] some bugs that can be found using the gen2_basic in the contrib/mellanox folder


I finally got a chance to try your tests. A few comments:


- Several of the tests are buggy. See the patch below at least.


- It would be much more useful if the COMPARE() macro printed the
 expected and actual value on failure.


- Similarly, other macros should probably also print more context.
 For example, in something like:


 CHECK_PTR(ibv_create_qp, qp[i], goto cleanup);


 I would probably want to know the value of i on failure.


- I don't believe some of the tests are really valid. For example,
 the max number of QPs doesn't have to be precisely correct -- no
 valid app is going to depend on being able to create exactly that
 number of QPs and no more.


- In any case, I'm not convinced that this sort of negative testing
 is the most valuable thing to focus on right now. I think it would
 be better to have regression tests of basic functionality (sends,
 receives, RDMA, CQ polling, etc) and stress tests before testing
 whether a buggy app will get the right error value when passing
 invalid parameters.
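
Two of the points above (COMPARE() printing the expected and actual value, and check macros reporting context such as the loop index) could look roughly like the following sketch. COMPARE_CTX and check_all() are names invented for illustration; they are not the macros used in the gen2_basic tests, whose definitions are not shown in this thread.

```c
#include <stdio.h>

/*
 * Hypothetical variant of COMPARE(): on mismatch, print the expression,
 * the expected and actual values, and one caller-supplied context value,
 * then run the caller's action (for example "goto cleanup").
 */
#define COMPARE_CTX(actual, expected, ctx_fmt, ctx, action)                 \
	do {                                                                \
		long long a_ = (long long)(actual);                         \
		long long e_ = (long long)(expected);                       \
		if (a_ != e_) {                                             \
			fprintf(stderr,                                     \
				"%s:%d: %s: expected %lld, got %lld ("      \
				ctx_fmt ")\n",                              \
				__FILE__, __LINE__, #actual, e_, a_, ctx);  \
			action;                                             \
		}                                                           \
	} while (0)

/* Returns 0 if every element equals expected, -1 at the first mismatch. */
int check_all(const int *vals, int n, int expected)
{
	int rc = -1;
	int i;

	for (i = 0; i < n; ++i)
		COMPARE_CTX(vals[i], expected, "i=%d", i, goto cleanup);

	rc = 0;
cleanup:
	return rc;
}
```

With this shape, a failure inside a loop reports which iteration failed and what value was seen, which is the information asked for above.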


- R.


Index: test_cq.c
===
--- test_cq.c (revision 3639)
+++ test_cq.c (working copy)
@@ -106,6 +106,7 @@ int cq_2(
{
 struct ibv_context *ib_cont = NULL;
 struct ibv_pd  *pd = NULL;
+ struct ibv_comp_channel *channel = NULL;
 struct ibv_cq *cq = NULL;
 struct ibv_cq *event_cq = NULL;
 struct ibv_qp  *qp = NULL;
@@ -132,8 +133,11 @@ int cq_2(
 pd = ibv_alloc_pd(ib_cont);
 CHECK_PTR(ibv_alloc_pd, pd, goto cleanup);

+ channel = ibv_create_comp_channel(ib_cont);
+ CHECK_PTR(ibv_create_comp_channel, channel, goto cleanup);
+
 cq_size = VL_range(rand_gen, 1, device_attr.max_cqe);
- cq = ibv_create_cq(ib_cont, cq_size, (void *)count, NULL, 0);
+ cq = ibv_create_cq(ib_cont, cq_size, (void *)count, channel, 0);
 CHECK_PTR(ibv_create_cq, cq, goto cleanup);

 mr_size = VL_range(rand_gen, 1, 1024);
@@ -211,6 +215,7 @@ int cq_2(
 CHECK_MALLOC(event_count, goto cleanup);

 *event_count = 0;
+ rc = ibv_get_cq_event(channel, (void *)event_cq, (void *)event_count);
 rc = ibv_get_cq_event(NULL, (void *)event_cq, (void *)event_count);
 CHECK_VALUE(ibv_get_cq_event, rc, 0, goto cleanup);

Index: test_hca.c
===
--- test_hca.c (revision 3639)
+++ test_hca.c (working copy)
@@ -230,7 +230,7 @@ int hca_5(
  j = port_attr.gid_tbl_len + VL_random(rand_gen, 0x - port_attr.gid_tbl_len);
 
  rc = ibv_query_gid(ib_cont, i, j, gid);
-  CHECK_VALUE(ibv_query_gid, rc, 0, goto cleanup);
+  CHECK_VALUE(ibv_query_gid, rc, -1, goto cleanup);
 }
 PASSED;

@@ -239,7 +239,7 @@ int hca_5(
 i = VL_range(rand_gen, device_attr.phys_port_cnt + 1, 0xFF);
 
 rc = ibv_query_gid(ib_cont, i, j, gid);
- CHECK_VALUE(ibv_query_gid, rc, 0, goto cleanup);
+ CHECK_VALUE(ibv_query_gid, rc, -1, goto cleanup);
 PASSED;

 test_result = 0;

[openib-general] [CMA] blocking in rdma_listen()

2005-10-11 Thread Sean Hefty

Does anyone have any objection to rdma_listen() blocking?

I'm working on adding support for listening across any device, but need to 
synchronize with device addition/removal.


- Sean


Re: [openib-general] SRP Infiniband

2005-10-11 Thread Roland Dreier
    Mohit> I am not clear about the functionalities of the user level
    Mohit> HCA driver?  Are there any specifications for it or it is
    Mohit> totally vendor based?

The userspace interface is based on the verbs described in chapter
11 of the IB spec, but there is no formal API spec.

    Mohit> It is also said it is used in speed path operations? Does
    Mohit> anyone has any ideas how does it do accomplishes it?

The kernel sets up a mapping of HCA registers into userspace, and then
userspace can talk directly to the IB hardware without going through
the kernel.

    Mohit> If I have SCSI storage devices in a SAN then can I use SRP
    Mohit> module to send some request and User mode HCA library for
    Mohit> some speed path operation? Basically I wanted to know that
    Mohit> for SCSI devices can User mode HCA library be used for
    Mohit> speed path operations . If yes the how they can be
    Mohit> used(Only theoretical details rest I wil try)

It would be theoretically possible to implement a userspace process
that connects to an SRP target and implement SRP in userspace.
However, I don't think this would be any better than using a kernel
SRP driver along with direct IO from userspace.

 - R.


Re: [openib-general] IBM eHCA testing..

2005-10-11 Thread Roland Dreier
    Heiko> 7.  ipoib_qp_create wants to modify the IPoIB QP (priv->qp)
    Heiko> which is NULL, because the QP was destroyed earlier in time
    Heiko> by the error handling routine in ipoib_qp_create (see 5.)

    Heiko> I think this error could also show up on Mellanox based IB
    Heiko> cards when ib_modify_qp fails in ipoib_qp_create.

Yes, this is a bug.  I think something like the patch below is needed
-- ipoib_qp_create() should not destroy the QP on failure, since it no
longer creates the QP.  In fact we should fix the name as well, since
creation of the QP has moved elsewhere.

I'll check this in and queue it for 2.6.15.

Thanks,
  Roland

--- infiniband/ulp/ipoib/ipoib_verbs.c  (revision 3707)
+++ infiniband/ulp/ipoib/ipoib_verbs.c  (working copy)
@@ -92,7 +92,7 @@ int ipoib_mcast_detach(struct net_device
return ret;
 }
 
-int ipoib_qp_create(struct net_device *dev)
+int ipoib_init_qp(struct net_device *dev)
 {
struct ipoib_dev_priv *priv = netdev_priv(dev);
int ret;
@@ -149,10 +149,11 @@ int ipoib_qp_create(struct net_device *d
return 0;
 
 out_fail:
-   ib_destroy_qp(priv->qp);
-   priv->qp = NULL;
+   qp_attr.qp_state = IB_QPS_RESET;
+   if (ib_modify_qp(priv->qp, &qp_attr, IB_QP_STATE))
+   ipoib_warn(priv, "Failed to modify QP to RESET state\n");
 
-   return -EINVAL;
+   return ret;
 }
 
 int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca)
--- infiniband/ulp/ipoib/ipoib.h(revision 3707)
+++ infiniband/ulp/ipoib/ipoib.h(working copy)
@@ -277,7 +277,7 @@ int ipoib_mcast_attach(struct net_device
 int ipoib_mcast_detach(struct net_device *dev, u16 mlid,
   union ib_gid *mgid);
 
-int ipoib_qp_create(struct net_device *dev);
+int ipoib_init_qp(struct net_device *dev);
 int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca);
 void ipoib_transport_dev_cleanup(struct net_device *dev);
 
--- infiniband/ulp/ipoib/ipoib_ib.c (revision 3707)
+++ infiniband/ulp/ipoib/ipoib_ib.c (working copy)
@@ -387,9 +387,9 @@ int ipoib_ib_dev_open(struct net_device 
struct ipoib_dev_priv *priv = netdev_priv(dev);
int ret;
 
-   ret = ipoib_qp_create(dev);
+   ret = ipoib_init_qp(dev);
if (ret) {
-   ipoib_warn(priv, "ipoib_qp_create returned %d\n", ret);
+   ipoib_warn(priv, "ipoib_init_qp returned %d\n", ret);
return -1;
}
 


[openib-general] Re: [PATCH] SRP: don't use TX IU after freeing it

2005-10-11 Thread Vu Pham

Roland,
   Thanks for reviewing it.
   In response to your feedback, I have prepared a new patch (attached).

> Why put a pointer to struct list_head here instead of just a struct
> list_head?  If you just used the struct, then you wouldn't need this:

Done. Using struct list_head instead of a pointer.

>  +  u16 in_use;
>   };
>
> I can't find anywhere that the in_use flag is used.

Removed.

>  +static int srp_map_fmr(struct srp_target_port *target, struct scatterlist *scat,
>  + int sg_cnt, struct srp_request *req)
>
> [...]
>
>  +  return -ENOMEM;
>
>  +  } else if (fmr_cnt <= 0) {
>
> fmr_cnt is unsigned so I think this is going to get you in trouble.
> Might as well make fmr_cnt a plain int to make things simpler.

In the previous patch, fmr_cnt was already declared as int.

> Also, it might be good to try and add some more comments explaining
> srp_map_fmr() -- it would definitely help me review.

I added some comments. I hope they help your review (instead
of confusing you more :))

Signed-off-by: Vu Pham [EMAIL PROTECTED]




Index: ulp/srp/ib_srp.c
===
--- ulp/srp/ib_srp.c(revision 3615)
+++ ulp/srp/ib_srp.c(working copy)
@@ -522,10 +522,120 @@ err:
return ret;
 }
 
+static int srp_map_fmr(struct srp_target_port *target, struct scatterlist *scat,
+  int sg_cnt, struct srp_request *req)
+
+{
+   dma_addr_t  dma_addr;
+   u32 dma_len;
+   u32 cur_len;
+   u32 tmp_len;
+   int i;
+   u64 *dma_pages;
+   u32 page_cnt;
+   struct srp_fmr *srp_fmr;
+   u32 unaligned;
+
+   dma_pages = kmalloc(sizeof(u64) * sg_cnt * (SRP_MAX_INDIRECT + 2), GFP_ATOMIC);
+   if (!dma_pages)
+       goto err_dpages;
+
+   req->fmr_arr = kmalloc(sizeof(struct srp_fmr) * sg_cnt, GFP_ATOMIC);
+   if (!req->fmr_arr)
+       goto err_farr;
+
+   srp_fmr = req->fmr_arr;
+   req->fmr_cnt = 0;
+   page_cnt = 0;
+   cur_len = 0;
+   unaligned = 0;
+
+   for (i = 0; i < sg_cnt; ++i) {
+       dma_len = sg_dma_len(scat);
+       dma_addr = sg_dma_address(scat);
+
+       /*
+        * Checking for unaligned sg_element and assign the io_addr
+        * for this fmr segment
+        * If there is no such unaligned sg_element in the sg list
+        * io_address will be the address of last sg_element
+        */
+       if (scat->offset ||
+           ((i == (sg_cnt - 1)) && !unaligned)) {
+           srp_fmr->io_addr = dma_addr & PAGE_MASK;
+           ++unaligned;
+       }
+
+       /*
+        * Every FMR segment has atmost 1 unaligned element
+        * prepare the page boundary bus addresses for this fmr segment
+        */
+       if (unaligned <= 1) {
+           cur_len += dma_len;
+           for (tmp_len = 0; tmp_len < dma_len;
+                tmp_len += PAGE_SIZE, dma_addr += PAGE_SIZE)
+               dma_pages[page_cnt++] = dma_addr & PAGE_MASK;
+       }
+
+       /*
+        * Register this FMR segment if this is the last sg_element or
+        * unaligned > 1
+        * Restart the new fmr segment with the new unaligned sg_element
+        */
+       if ((unaligned > 1) || (i == (sg_cnt - 1))) {
+           srp_fmr->mem =
+               ib_fmr_pool_map_phys(target->srp_host->fmr_pool,
+                                    dma_pages, page_cnt,
+                                    srp_fmr->io_addr);
+           if (IS_ERR(srp_fmr->mem)) {
+               srp_fmr->mem = NULL;
+               goto err_map_phys;
+           }
+
+           srp_fmr->len = cur_len;
+           ++req->fmr_cnt;
+
+           /* restart the new fmr segment */
+           if (unaligned > 1) {
+               srp_fmr++;
+               page_cnt = 0;
+               --unaligned;
+               cur_len = dma_len;
+               srp_fmr->io_addr = dma_addr & PAGE_MASK;
+               for (tmp_len = 0; tmp_len < dma_len;
+                    tmp_len += PAGE_SIZE, dma_addr += PAGE_SIZE)
+                   dma_pages[page_cnt++] = dma_addr & PAGE_MASK;
+           }
+       }
+       scat++;
+   }
+
+   kfree(dma_pages);
+
+   return req->fmr_cnt;
+

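The inner loops of srp_map_fmr() above collect one page-aligned bus address per page of a scatterlist element. As a standalone userspace sketch of that step only (this is not the patch's code): PAGE_SIZE_SK, PAGE_MASK_SK and collect_pages() are hypothetical names, and a 4096-byte page is assumed. This version steps from the page-aligned start of the range, so it differs from the quoted loop when an element begins mid-page and extends into a following page.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for the sketch; the kernel uses PAGE_SIZE/PAGE_MASK. */
#define PAGE_SIZE_SK 4096ULL
#define PAGE_MASK_SK (~(PAGE_SIZE_SK - 1))

/*
 * Store the page-aligned address of every page touched by the byte range
 * [addr, addr + len) into pages[], writing at most max entries.
 * Returns the number of pages the range touches, which may exceed max.
 */
size_t collect_pages(uint64_t addr, uint64_t len, uint64_t *pages, size_t max)
{
	uint64_t end = addr + len;
	uint64_t page;
	size_t n = 0;

	if (len == 0)
		return 0;

	for (page = addr & PAGE_MASK_SK; page < end; page += PAGE_SIZE_SK) {
		if (n < max)
			pages[n] = page;
		n++;
	}
	return n;
}
```

For example, a 0x1000-byte element starting at 0x1800 touches two pages (0x1000 and 0x2000) even though its length is a single page.
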
Re: [openib-general] IRQ sharing on PCIe bus

2005-10-11 Thread Michael Krause


At 02:05 PM 10/10/2005, Roland Dreier wrote:
>    Roland> BTW, for INTx emulation on PCI Express, there are no
>    Roland> physical interrupt lines -- interrupts are asserted and
>    Roland> deasserted with messages. So PCI Express interrupts are
>    Roland> unshared.
>
>    Michael> They are messages upstream that any device.

^ sent
Sorry. Insert "sent" above.

> That doesn't parse for me. Was what I said wrong?

No. Just clarifying that they are not unique per device. INTx being a
message does not change the fundamental semantics of a wire being
asserted. Hence, if the wire was shared before, then there is no reason
why this would not be the same with PCIe sans. It really is an OS issue
as to how INTx interrupts are assigned to different processors and to
what extent they end up being shared. The host bridge can play some
tricks as well, as you noted. Again, the goal within the PCI-SIG is to
move people to MSI-X and to eliminate INTx long-term. In fact, one area
under development is asking the SIG's members whether INTx can be
eliminated entirely, which would go a long way toward simplifying
designs in both hardware and software.

Mike


Re: [openib-general] InfiniPath driver announcement

2005-10-11 Thread Roland Dreier
I started working on some final cleanups of the uverbs interface
before merging stuff onto the trunk.  The patch below mixes some
simple cleanups with a slight change to the work request posting
interface.  I changed the ABI so that the work requests are passed as
part of the same write as the command, and modified the implementation
to copy the work requests one by one instead of in one giant chunk.

I also added a WQE size field to allow for future device-specific
extensions to work request posting.

The following is compile-tested only, and I haven't modified the
userspace library to match, but I wanted to give you some idea of what
I was doing in case you had some comments or started working on it too.

What do you think?

- R.
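
The header comment in the patch below asks that any structure larger than 4 bytes be padded to a multiple of 8 bytes. A small sketch of why, using a made-up pair of structs modeled on ib_uverbs_recv_wr (wr_unpadded, wr_padded and the two size helpers are illustrative names, not part of the patch): with the explicit reserved field there is no implicit tail padding left for the compiler to decide, so the size does not depend on how strictly the ABI aligns 64-bit integers.

```c
#include <stddef.h>
#include <stdint.h>

/* No explicit padding: the compiler may or may not add 4 tail bytes. */
struct wr_unpadded {
	uint64_t wr_id;
	uint32_t num_sge;
};

/* Explicitly padded to a multiple of 8 bytes, as the comment requires. */
struct wr_padded {
	uint64_t wr_id;
	uint32_t num_sge;
	uint32_t reserved;
};

/* The padded layout is the same wherever uint64_t needs at most 8-byte alignment. */
_Static_assert(sizeof(struct wr_padded) == 16, "padded struct must be 16 bytes");
_Static_assert(offsetof(struct wr_padded, reserved) == 12, "reserved follows num_sge");

size_t wr_padded_size(void)
{
	return sizeof(struct wr_padded);
}

/* Typically 12 on 32-bit x86 (4-byte alignment of 64-bit ints), 16 on x86-64. */
size_t wr_unpadded_size(void)
{
	return sizeof(struct wr_unpadded);
}
```

A 32-bit userspace process and a 64-bit kernel would disagree about the size of the unpadded struct on such ABIs, which is the incompatibility the comment warns about.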

--- infiniband/include/rdma/ib_user_verbs.h (revision 3725)
+++ infiniband/include/rdma/ib_user_verbs.h (working copy)
@@ -89,8 +89,11 @@ enum {
  * Make sure that all structs defined in this file remain laid out so
  * that they pack the same way on 32-bit and 64-bit architectures (to
  * avoid incompatibility between 32-bit userspace and 64-bit kernels).
- * In particular do not use pointer types -- pass pointers in __u64
- * instead.
+ * Specifically:
+ *  - Do not use pointer types -- pass pointers in __u64 instead.
+ *  - Make sure that any structure larger than 4 bytes is padded to a
+ *multiple of 8 bytes.  Otherwise the structure size will be
+ *different between 32-bit and 64-bit architectures.
  */
 
 struct ib_uverbs_async_event_desc {
@@ -284,12 +287,12 @@ struct ib_uverbs_wc {
__u8 sl;
__u8 dlid_path_bits;
__u8 port_num;
-   __u8 reserved; /* Align struct to 8 bytes */
+   __u8 reserved;
 };
 
 struct ib_uverbs_poll_cq_resp {
__u32 count;
-   __u32 reserved; /* Align struct to 8 bytes */
+   __u32 reserved;
struct ib_uverbs_wc wc[];
 };
 
@@ -417,20 +420,20 @@ struct ib_uverbs_send_wr {
struct {
__u64 remote_addr;
__u32 rkey;
-   __u32 reserved; /* Align struct to 8 bytes */
+   __u32 reserved;
} rdma;
struct {
__u64 remote_addr;
__u64 compare_add;
__u64 swap;
__u32 rkey;
-   __u32 reserved; /* Align struct to 8 bytes */
+   __u32 reserved;
} atomic;
struct {
__u32 ah;
__u32 remote_qpn;
__u32 remote_qkey;
-   __u32 reserved; /* Align struct to 8 bytes */
+   __u32 reserved;
} ud;
} wr;
 };
@@ -440,8 +443,7 @@ struct ib_uverbs_post_send {
__u32 qp_handle;
__u32 wr_count;
__u32 sge_count;
-   __u32 reserved; /* Align struct to 8 bytes */
-   __u64 wr;
+   __u32 wqe_size;
 };
 
 struct ib_uverbs_post_send_resp {
@@ -451,7 +453,7 @@ struct ib_uverbs_post_send_resp {
 struct ib_uverbs_recv_wr {
__u64 wr_id;
__u32 num_sge;
-   __u32 reserved; /* Align struct to 8 bytes */
+   __u32 reserved;
 };
 
 struct ib_uverbs_post_recv {
@@ -459,8 +461,7 @@ struct ib_uverbs_post_recv {
__u32 qp_handle;
__u32 wr_count;
__u32 sge_count;
-   __u32 reserved; /* Align struct to 8 bytes */
-   __u64 wr;
+   __u32 wqe_size;
 };
 
 struct ib_uverbs_post_recv_resp {
@@ -472,47 +473,38 @@ struct ib_uverbs_post_srq_recv {
__u32 srq_handle;
__u32 wr_count;
__u32 sge_count;
-   __u32 reserved; /* Align struct to 8 bytes */
-   __u64 wr;
+   __u32 wqe_size;
 };
 
 struct ib_uverbs_post_srq_recv_resp {
__u32 bad_wr;
 };
 
-union ib_uverbs_gid {
-   __u8 raw[16]; 
-   struct {
-   __u64 subnet_prefix;
-   __u64 interface_id;
-   } global;
-};
-
-struct ibv_m_global_route {
-   union ib_uverbs_gid dgid;
+struct ib_uverbs_global_route {
+   __u8  dgid[16];
__u32 flow_label;
__u8  sgid_index;
__u8  hop_limit;
__u8  traffic_class;
-   __u8  reserved; /* Align struct to 8 bytes */
+   __u8  reserved;
 };
 
 struct ib_uverbs_ah_attr {
-   struct ibv_m_global_route grh;
+   struct ib_uverbs_global_route grh;
__u16 dlid;
__u8  sl;
__u8  src_path_bits;
__u8  static_rate;
__u8  is_global;
__u8  port_num;
-   __u8  reserved; /* Align struct to 8 bytes */
+   __u8  reserved;
 };
 
 struct ib_uverbs_create_ah {
__u64 response;
__u64 user_handle;
__u32 pd_handle;
-   __u32 reserved; /* Align struct to 8 bytes */
+   __u32 reserved;
struct ib_uverbs_ah_attr attr;
 };
 
--- infiniband/core/uverbs_cmd.c(revision 3725)
+++ infiniband/core/uverbs_cmd.c(working copy)
@@ 

Re: [openib-general] InfiniPath driver announcement

2005-10-11 Thread Robert Walsh
On Tue, 2005-10-11 at 12:46 -0700, Roland Dreier wrote:
 I started working on some final cleanups of the uverbs interface
 before merging stuff onto the trunk.  The patch below mixes some
 simple cleanups with a slight change to the work request posting
 interface.  I changed the ABI so that the work requests are passed as
 part of the same write as the command, and modified the implementation
 to copy the work requests one by one instead of in one giant chunk.
 
 I also added a WQE size field to allow for future device-specific
 extensions to work request posting.
 
 The following is compile-tested only, and I haven't modified the
 userspace library to match, but I wanted to give you some idea of what
 I was doing in case you had some comments or started working on it too.
 
 What do you think?

I'll spend some time today or tomorrow looking at this, getting it
integrated and finishing the userland stuff.  Thanks for doing this!

Regards,
 Robert.

-- 
Robert Walsh Email: [EMAIL PROTECTED]
PathScale, Inc.  Phone: +1 650 934 8117
2071 Stierlin Court, Suite 200 Fax: +1 650 428 1969
Mountain View, CA 94043



Re: [openib-general] Re: [PATCH] [CMA] RDMA CM abstraction module

2005-10-11 Thread Michael Krause


At 01:09 PM 10/10/2005, Christoph Hellwig wrote:
> On Mon, Oct 10, 2005 at 12:53:29PM -0700, Michael Krause wrote:
> > standards. There are also the new standard Sockets extension API available
> > today that might be extended sometime in the future to include explicit
>
> which is never going to get into linux. one more of these braindead
> standards people masturbating in a dark room and coming up with a
> frankenstein bastard cases.

Everyone is free to have an opinion. Sockets extensions are not
braindead nor created using whatever methods you envision. The
extensions were created by Sockets engineers with 20+ years
experience. But, hey, why put any faith into people who develop and
implement Sockets for a living? One day perhaps you'll learn a bit
of professionalism and perhaps open your mind that there are people out
in the world besides yourself who don't take a NIH approach to the world
and are actually qualified engineers who have a clue. All you get
with these constant unprofessional diatribes is a continual loss in
credibility. But, hey, that is just an opinion.

BTW, do you feel the same way about the people who created IB? How
about iWARP? How about PCIe? Are all of the engineers who
work on trying to accelerate technology, its performance, etc. who take
into account and try to find a balanced approach to problem solving
simply all in dark little rooms? All of these specs are created by
companies. Those same companies who fund open source efforts and many of
the people working here.

One last thing, I'm not the only person who feels this way about your
unprofessional behavior. There are many others who simply
don't want to bother writing or have simply written you off as
whatever. Sad state to be in and I suspect you don't care since you
view them all as in dark little rooms anyway. Just something you
might want to keep in mind. There is a much larger world out there
where people value other people's professional opinions and ideas.
They don't simply discount what they produce because it was not done in
whatever form you prefer. It is called reality. Get used to
it.

Mike


Re: [openib-general] [CMA] blocking in rdma_listen()

2005-10-11 Thread James Lentini


On Tue, 11 Oct 2005, Sean Hefty wrote:

 Does anyone have any objection to rdma_listen() blocking?
 
 I'm working on adding support for listening across any device, but need to
 synchronize with device addition/removal.

I have a strong objection to making it block. 

Our goal is to provide an interface with semantics similar to the 
sockets interface. A socket's listen function does not block (e.g. 
inet_listen). 

Since not blocking is what ULPs expect, kDAPL's listen function does 
not block. The same should be true of the CMA function.

james


[openib-general] Re: 2.6.14 heads up: ip_dev_find() not exported

2005-10-11 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]:
 Subject: 2.6.14 heads up: ip_dev_find() not exported
 
 I noticed while compiling against an up-to-date kernel tree that SDP
 and IBAT both use the function ip_dev_find().  The EXPORT_SYMBOL for
 this function was removed during the 2.6.14 devel cycle.
 
 I haven't looked yet at what this function does, how SDP and IBAT use
 it or what it could be replaced by.  But now would be a good time to
 figure out whether we need to ask for it to be re-exported, or if
 there's a better alternative to do whatever it does for us.
 
  - R.

Guys, did anyone figure out yet how we can find a device by its address
without ip_dev_find?

To remind you all, we use it to handle cases where the address is
local and so ip_route_output_key gets us a loopback device.

If not, is it too late to ask for it to be re-exported to modules?


-- 
MST


Re: [openib-general] [CMA] blocking in rdma_listen()

2005-10-11 Thread Sean Hefty

James Lentini wrote:
Our goal is to provide an interface with semantics similar to the 
sockets interface. A socket's listen function does not block (e.g. 
inet_listen). 

Since not blocking is what ULPs expect, kDAPL's listen function does 
not block. The same should be true of the CMA function.


From what I can see, kDAPL connect and listen calls can block, as does 
inet_listen.  I'm referring to the thread blocking within the call, specifically 
on a semaphore and memory allocation using GFP_KERNEL.  I am not referring to 
listen blocking until a connection request is received.


- Sean


Re: [openib-general] Re: 2.6.14 heads up: ip_dev_find() not exported

2005-10-11 Thread Sean Hefty

Michael S. Tsirkin wrote:

Guys, did anyone figure out yet how we can find a device by its address
without ip_dev_find?


I wrote ib_addr to call ip_dev_find().  I didn't see a cleaner way to do this.


If not, is it too late to ask for it to be re-exported to modules?


Hal already tried to re-export it.  The response was that exporting it will only 
be accepted once code is submitted for inclusion that calls it.


- Sean


[openib-general] Re: [CMA] blocking in rdma_listen()

2005-10-11 Thread Michael S. Tsirkin
Quoting Sean Hefty [EMAIL PROTECTED]:
 Subject: [CMA] blocking in rdma_listen()
 
 Does anyone have any objection to rdma_listen() blocking?
 
 I'm working on adding support for listening across any device, but need to 
 synchronize with device addition/removal.
 
 - Sean

Sean, when you say blocking, do you mean might sleep?
If so, I don't have any objections.


-- 
MST


[openib-general] Re: [CMA] blocking in rdma_listen()

2005-10-11 Thread Sean Hefty

Michael S. Tsirkin wrote:

Sean, when you say blocking, do you mean might sleep?
If so, I don't have any objections.


Yes - I mean might sleep.  I wasn't very clear on that.

- Sean


Re: [openib-general] [CMA] blocking in rdma_listen()

2005-10-11 Thread James Lentini


On Tue, 11 Oct 2005, Sean Hefty wrote:

 James Lentini wrote:
  Our goal is to provide an interface with semantics similar to the sockets
  interface. A socket's listen function does not block (e.g. inet_listen). 
  Since not blocking is what ULPs expect, kDAPL's listen function does not
  block. The same should be true of the CMA function.
 
 From what I can see, kDAPL connect and listen calls can block, as does
 inet_listen.  I'm referring to the thread blocking within the call,
 specifically on a semaphore and memory allocation using GFP_KERNEL.  I am not
 referring to listen blocking until a connection request is received.

I thought you meant blocking for a connection request to arrive. 

You're right, the kDAPL and inet_listen functions can block for the 
reasons you list. 

I'm ok with rdma_listen() also blocking for these reasons.

james


Re: [openib-general] Latest build test results

2005-10-11 Thread Nishanth Aravamudan
On 07.10.2005 [09:48:56 -0400], Hal Rosenstock wrote:
 On Thu, 2005-10-06 at 15:26, Hal Rosenstock wrote:
  On Thu, 2005-10-06 at 15:20, Nishanth Aravamudan wrote:
   On 06.10.2005 [13:25:35 -0400], Hal Rosenstock wrote:
On Thu, 2005-10-06 at 13:11, Nishanth Aravamudan wrote:
 On 06.10.2005 [19:40:40 +0300], Dan Bar Dov wrote:
  I've fixed the 2.6.14-rc3 compilation warnings with iSER on x86 in 
  version 3682.
 
 Great! Thanks.
 
 I'm re-running the tests (due to a subtle flaw in my PATH, my cronjobs
 weren't running) now and will post the latest results.

You might also want to apply 
https://openib.org/svn/gen2/trunk/src/linux-kernel/patches/linux-2.6.14-rc3-fib-frontend.diff
to get rid of the AT and SDP warnings.
   
   This patch does remove the warning regarding undefined symbols during
   modpost, but does not remove the warnings
   
   drivers/infiniband/core/at.c:1547: warning: initialization from 
   incompatible pointer type
   
   drivers/infiniband/ulp/sdp/sdp_link.c:752: warning: initialization from 
   incompatible pointer type
  
  Right. Roland reported a change to struct packet_type in 2.6.14. I'll
  work on a patch for this too. Thanks.
 
 Can you try this patch for the above 2 warnings ? If it works, I check
 it into the patches directory. Thanks.
 
 -- Hal
 
 Update arp_recv functions to latest 2.6.14 netdevice.h API for struct
 packet_type

Sorry for the delay, I haven't yet had time to test the patches :/ I'll
try to get to it tonight or tomorrow.

Is there any way you can send me patches against the kernel tree as
opposed to the svn repo? It makes my side of things *a lot* easier, as
right now I have to take your patch against svn and either hand-edit or
patch my checkout and then diff against the current kernel tree.

Thanks,
Nish


Re: [openib-general] IBM eHCA testing..

2005-10-11 Thread Troy Benjegerdes
On Tue, Oct 11, 2005 at 09:13:20AM -0700, Shirley Ma wrote:
 The IB stack doesn't handle errors during client initialization. This 
 problem is easy to reproduce by inducing errors (resource allocation 
 failure or query failure) in mad_client or sa_client registration. I am 
 working on a patch, but I am in class the whole week, don't have time to 
 verify the patch. I hope the patch will be available early next week to 
 fix the panic. 

I'd be happy to verify the patch, but I need to get the latest version
of the ehca driver, ideally already integrated into the subversion tree.
Otherwise a tar.gz I can extract and drop in drivers/infiniband/hw/ehca
would work just fine.

I'm still not sure I got an answer why the ehca is so sensitive to which 
port is plugged in.


[openib-general] Re: Re: 2.6.14 heads up: ip_dev_find() not exported

2005-10-11 Thread Michael S. Tsirkin
Quoting r. Sean Hefty [EMAIL PROTECTED]:
  Guys, did anyone figure out yet how we can find a device by its address
  without ip_dev_find?
 
 I wrote ib_addr to call ip_dev_find().  I didn't see a cleaner way to do this.
 
  If not, is it too late to ask for it to be re-exported to modules?
 
 Hal already tried to re-export it.  The response was that exporting it will only
 be accepted once code is submitted for inclusion that calls it.
 
 - Sean
 

Hmm, maybe posting addr.c on lkml will help?

-- 
MST


[openib-general] Re: 2.6.14 heads up: ip_dev_find() not exported

2005-10-11 Thread Sean Hefty

Michael S. Tsirkin wrote:

Hmm, maybe posting addr.c on lkml will help?


It probably needs to be reviewed and tested a little more first.  Plus, the only 
user of it at the moment is the CMA.  We may find that to add addr.c, we need a 
user, which requires the cma, which requires yet another user...


- Sean


[openib-general] Re: Latest build test results

2005-10-11 Thread Michael S. Tsirkin
Quoting Nishanth Aravamudan [EMAIL PROTECTED]:
 Is there any way you can send me patches against the kernel tree as
 opposed to the svn repo? It makes my side of things *a lot* easier, as
 right now I have to take your patch against svn and either hand-edit or
 patch my checkout and then diff against the current kernel tree.

In case this is useful to others, I am using the following trick
with softlinks to create -p1 patches suitable to applying to kernel
(requires svn client revision 1.2.3 and up):

cd trunk/src/linux-kernel/
ln -s . drivers
cd ../
svn diff --diff-cmd /usr/bin/diff -x -up linux-kernel/drivers/infiniband

I've put this information in the FAQ.

-- 
MST


[openib-general] Re: 2.6.14 heads up: ip_dev_find() not exported

2005-10-11 Thread Michael S. Tsirkin
Quoting r. Sean Hefty [EMAIL PROTECTED]:
 Subject: Re: 2.6.14 heads up: ip_dev_find() not exported
 
 Michael S. Tsirkin wrote:
  Hmm, maybe posting addr.c on lkml will help?
 
 It probably needs to be reviewed and tested a little more first.

Certainly, but maybe that's a good way to get more review.

 Plus, the only user of it at the moment is the CMA.  We may find that to add
 addr.c, we need a user, which requires the cma, which requires yet another user...
 
 - Sean

Hmm.
BTW, we need to add something for userspace?
Userspace can already get at GIDs, I think, but how does it get the
IPoIB pkey?

-- 
MST


Re: [openib-general] InfiniPath driver announcement

2005-10-11 Thread Robert Walsh
Some comments on the patch:

> +	/* Don't let userspace make us allocate a huge buffer */
> +	if (cmd.ne > 256)
> +		return -ENOMEM;
> +

Is this necessary?  Won't the following fail with ENOMEM anyway if
cmd.ne is too big:

>  	wc = kmalloc(cmd.ne * sizeof *wc, GFP_KERNEL);
>  	if (!wc)
>  		return -ENOMEM;

Same here:

> +	/* Don't let userspace make us allocate a huge buffer */
> +	if (cmd.wqe_size > 4096)
> +		return -ENOMEM;
> +
> +	user_wr = kmalloc(cmd.wqe_size, GFP_KERNEL);
> +	if (!user_wr)
> +		return -ENOMEM;
> +

What meaning do those numbers have, exactly?  i.e. the 4096 number
above?
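
One possible reason for an explicit bound, offered here as an assumption and
not as the patch author's stated rationale: when an element count comes from
userspace, the product count * size can wrap if it is computed in a 32-bit
size_t, so the allocation may succeed with fewer bytes than the code later
uses. A userspace sketch of an overflow-safe size computation follows; the
helper name `checked_array_size` and the limits passed to it are illustrative:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute count * elem_size, refusing both counts
 * above a caller-chosen limit and products that would wrap size_t.
 * Returns 0 on success and stores the byte count in *bytes,
 * or -1 if the request must be rejected.
 */
static int checked_array_size(size_t count, size_t elem_size,
			      size_t max_count, size_t *bytes)
{
	assert(bytes != NULL && elem_size > 0);

	if (count > max_count)
		return -1;		/* explicit policy limit */
	if (count > SIZE_MAX / elem_size)
		return -1;		/* count * elem_size would wrap */

	*bytes = count * elem_size;
	return 0;
}
```

The policy limit and the wrap check are independent: the first caps how much
memory one request may pin, the second keeps the arithmetic honest.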

> +	if (ret) {
> +		for (next = wr; next; next = next->next) {
> +			if (next == bad_wr)
> +				break;
> +			++resp.bad_wr;
> +		}
> +	}

Will this work?  If bad_wr is the first wr, then resp.bad_wr will be
zero.  The current user code (which has to change anyway) assumes 0 ==
no bad wr and 1 == the first wr is the bad wr, etc.
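
To make the two conventions concrete, here is a userspace sketch of the
counting loop from the quoted patch, using a stand-in `struct wr` instead of
the kernel's work request types (the names are illustrative). It returns the
zero-based position of the failing request, so a failure on the first request
yields 0:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a work request; only the list linkage matters here. */
struct wr {
	struct wr *next;
};

/*
 * Count how many requests precede bad_wr in the list, mirroring the
 * loop in the quoted patch. The result is a zero-based index.
 */
static unsigned int bad_wr_index(const struct wr *head,
				 const struct wr *bad_wr)
{
	unsigned int idx = 0;
	const struct wr *next;

	assert(head != NULL);

	for (next = head; next; next = next->next) {
		if (next == bad_wr)
			break;
		++idx;
	}
	return idx;
}
```

A caller that treats 0 as "no bad wr" would need idx + 1, or a separate
status value, to tell the two cases apart.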
 
> +	while (wr) {
> +		next = wr->next;
> +		kfree(wr);
> +		wr = next;
> +	}
> +
> +	kfree(user_wr);

One reason why I originally allocated one big wr area instead of a bunch
of smaller ones was to keep the cost of this down.  Is it a good idea to
be doing this with a bunch of kmallocs?
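
A userspace sketch of the single-allocation approach, assuming all requests
have the same size. `alloc_wr_chain` and `struct wr` are illustrative names,
and calloc() stands in for kmalloc():

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for a work request. */
struct wr {
	struct wr *next;
	unsigned long long wr_id;
};

/*
 * Allocate `count` requests as one contiguous block and link them into
 * a list. The whole chain is released with a single free(head).
 */
static struct wr *alloc_wr_chain(size_t count)
{
	struct wr *head;
	size_t i;

	assert(count > 0);

	head = calloc(count, sizeof(*head));
	if (!head)
		return NULL;

	for (i = 0; i + 1 < count; i++)
		head[i].next = &head[i + 1];
	head[count - 1].next = NULL;

	return head;
}
```

The trade-off is that requests of differing sizes (for example, different
SGE counts) need extra bookkeeping to carve up the block.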


This is all I've had a chance to look at for the moment.  More later.

Regards,
 Robert.

-- 
Robert Walsh Email: [EMAIL PROTECTED]
PathScale, Inc.  Phone: +1 650 934 8117
2071 Stierlin Court, Suite 200 Fax: +1 650 428 1969
Mountain View, CA 94043



[openib-general] Re: 2.6.14 heads up: ip_dev_find() not exported

2005-10-11 Thread Sean Hefty

Michael S. Tsirkin wrote:

> Hmm.
> BTW, do we need to add something for userspace?
> Userspace can already get at GIDs, I think, but how does it get the
> IPoIB pkey?


Something needs to be done for userspace, but I'm not entirely sure what yet. 
I've given it some thought, but was deferring doing too much until I had a 
couple of missing areas completed in the kernel CMA first.


I think that the pkey is exported by ipoib through /sys/class/net/ib0.
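
A small userspace sketch of reading that value. It assumes the attribute is a
file named `pkey` under the interface directory and that it holds a hex string
such as 0xffff; both the file name and the format are assumptions, not
something confirmed in this thread:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse a pkey from text such as "0xffff\n".
 * Returns 0 on success, -1 if the text is not a 16-bit number.
 */
static int parse_pkey(const char *text, unsigned int *pkey)
{
	char *end;
	unsigned long val;

	assert(text != NULL && pkey != NULL);

	val = strtoul(text, &end, 0);	/* base 0 accepts the 0x prefix */
	if (end == text || val > 0xffff)
		return -1;

	*pkey = (unsigned int)val;
	return 0;
}

/* Read and parse the pkey from an assumed sysfs path. */
static int read_pkey(const char *path, unsigned int *pkey)
{
	char buf[32];
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = parse_pkey(buf, pkey);
	fclose(f);
	return ret;
}
```

A caller would pass something like "/sys/class/net/ib0/pkey" (assumed path)
and treat a -1 return as "not available".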

- Sean


Re: [openib-general] Latest build test results

2005-10-11 Thread Hal Rosenstock
Hi Nish,

On Tue, 2005-10-11 at 17:45, Nishanth Aravamudan wrote:
> On 07.10.2005 [09:48:56 -0400], Hal Rosenstock wrote:
>> On Thu, 2005-10-06 at 15:26, Hal Rosenstock wrote:
>>> On Thu, 2005-10-06 at 15:20, Nishanth Aravamudan wrote:
>>>> On 06.10.2005 [13:25:35 -0400], Hal Rosenstock wrote:
>>>>> On Thu, 2005-10-06 at 13:11, Nishanth Aravamudan wrote:
>>>>>> On 06.10.2005 [19:40:40 +0300], Dan Bar Dov wrote:
>>>>>>> I've fixed the 2.6.14-rc3 compilation warnings with iSER on x86
>>>>>>> in version 3682.
>>>>>>
>>>>>> Great! Thanks.
>>>>>>
>>>>>> I'm re-running the tests (due to a subtle flaw in my PATH, my
>>>>>> cronjobs weren't running) now and will post the latest results.
>>>>>
>>>>> You might also want to apply
>>>>> https://openib.org/svn/gen2/trunk/src/linux-kernel/patches/linux-2.6.14-rc3-fib-frontend.diff
>>>>> to get rid of the AT and SDP warnings.
>>>>
>>>> This patch does remove the warning regarding undefined symbols during
>>>> modpost, but does not remove the warnings
>>>>
>>>> drivers/infiniband/core/at.c:1547: warning: initialization from incompatible pointer type
>>>>
>>>> drivers/infiniband/ulp/sdp/sdp_link.c:752: warning: initialization from incompatible pointer type
>>>
>>> Right. Roland reported a change to struct packet_type in 2.6.14. I'll
>>> work on a patch for this too. Thanks.
>>
>> Can you try this patch for the above 2 warnings ? If it works, I'll check
>> it into the patches directory. Thanks.
>>
>> -- Hal
>>
>> Update arp_recv functions to latest 2.6.14 netdevice.h API for struct
>> packet_type
>
> Sorry for the delay, I haven't yet had time to test the patches :/ I'll
> try to get to it tonight or tomorrow.
>
> Is there any way you can send me patches against the kernel tree as
> opposed to the svn repo? It makes my side of things *a lot* easier, as
> right now I have to take your patch against svn and either hand-edit or
> patch my checkout and then diff against the current kernel tree.

Since you were reporting iSER, AT, and SDP compile warnings/errors,
aren't you using the latest OpenIB svn tree with 2.6.14-rc3 ?

Which patches are you referring to ? Was it the fib_frontend.c one ? Not
sure why they would need any manual fixup. At least that one was pretty
straightforward.

2.6.14-rc4 is out now. 

-- Hal



Re: [openib-general] Latest build test results

2005-10-11 Thread Nishanth Aravamudan
On 11.10.2005 [21:27:15 -0400], Hal Rosenstock wrote:
> Hi Nish,
>
> On Tue, 2005-10-11 at 17:45, Nishanth Aravamudan wrote:
>> On 07.10.2005 [09:48:56 -0400], Hal Rosenstock wrote:
>>> On Thu, 2005-10-06 at 15:26, Hal Rosenstock wrote:
>>>> On Thu, 2005-10-06 at 15:20, Nishanth Aravamudan wrote:
>>>>> On 06.10.2005 [13:25:35 -0400], Hal Rosenstock wrote:
>>>>>> On Thu, 2005-10-06 at 13:11, Nishanth Aravamudan wrote:
>>>>>>> On 06.10.2005 [19:40:40 +0300], Dan Bar Dov wrote:
>>>>>>>> I've fixed the 2.6.14-rc3 compilation warnings with iSER on x86
>>>>>>>> in version 3682.
>>>>>>>
>>>>>>> Great! Thanks.
>>>>>>>
>>>>>>> I'm re-running the tests (due to a subtle flaw in my PATH, my
>>>>>>> cronjobs weren't running) now and will post the latest results.
>>>>>>
>>>>>> You might also want to apply
>>>>>> https://openib.org/svn/gen2/trunk/src/linux-kernel/patches/linux-2.6.14-rc3-fib-frontend.diff
>>>>>> to get rid of the AT and SDP warnings.
>>>>>
>>>>> This patch does remove the warning regarding undefined symbols during
>>>>> modpost, but does not remove the warnings
>>>>>
>>>>> drivers/infiniband/core/at.c:1547: warning: initialization from incompatible pointer type
>>>>>
>>>>> drivers/infiniband/ulp/sdp/sdp_link.c:752: warning: initialization from incompatible pointer type
>>>>
>>>> Right. Roland reported a change to struct packet_type in 2.6.14. I'll
>>>> work on a patch for this too. Thanks.
>>>
>>> Can you try this patch for the above 2 warnings ? If it works, I'll check
>>> it into the patches directory. Thanks.
>>>
>>> -- Hal
>>>
>>> Update arp_recv functions to latest 2.6.14 netdevice.h API for struct
>>> packet_type
>>
>> Sorry for the delay, I haven't yet had time to test the patches :/ I'll
>> try to get to it tonight or tomorrow.
>>
>> Is there any way you can send me patches against the kernel tree as
>> opposed to the svn repo? It makes my side of things *a lot* easier, as
>> right now I have to take your patch against svn and either hand-edit or
>> patch my checkout and then diff against the current kernel tree.
>
> Since you were reporting iSER, AT, and SDP compile warnings/errors,
> aren't you using the latest OpenIB svn tree with 2.6.14-rc3 ?

Yes; but you have to understand that the automated build system I have
access to 1) does not have external internet access (i.e., to the svn
tree) and 2) only builds kernels unless I manually send commands to the
terminal.

So, the way I'm doing things is:

Send in 4 jobs for mainline (x86 and ppc64 with =y and =m) and then
generate a patch of the latest svn tree against the current -git release
(a patch to the kernel) and send it in as a parameter to my builds to
test the latest svn tree. This leads to another 4 jobs (x86 and ppc64
with =y and =m).

I'm *only* doing kernel build testing right now.

> Which patches are you referring to ? Was it the fib_frontend.c one ?
> Not sure why they would need any manual fixup. At least that one was
> pretty straightforward.

In the sense that I have to edit them to kernel relative paths, not in
the content of the patch. To test any patch in the system I have access
to, it needs to be a normal kernel patch (-p1 applicable to the base
tree).

Going through and manually applying patches to the svn tree and then
regenerating the diff completely defeats the purpose of automated
compilation testing.

> 2.6.14-rc4 is out now.

Yes, I know.

Thanks,
Nish


Re: [openib-general] Latest build test results

2005-10-11 Thread Hal Rosenstock
Hi again Nish,

On Tue, 2005-10-11 at 21:39, Nishanth Aravamudan wrote:
>>>> Update arp_recv functions to latest 2.6.14 netdevice.h API for struct
>>>> packet_type
>>>
>>> Sorry for the delay, I haven't yet had time to test the patches :/ I'll
>>> try to get to it tonight or tomorrow.
>>>
>>> Is there any way you can send me patches against the kernel tree as
>>> opposed to the svn repo? It makes my side of things *a lot* easier, as
>>> right now I have to take your patch against svn and either hand-edit or
>>> patch my checkout and then diff against the current kernel tree.
>>
>> Since you were reporting iSER, AT, and SDP compile warnings/errors,
>> aren't you using the latest OpenIB svn tree with 2.6.14-rc3 ?
>
> Yes; but you have to understand that the automated build system I have
> access to 1) does not have external internet access (i.e., to the svn
> tree) and 2) only builds kernels unless I manually send commands to the
> terminal.
>
> So, the way I'm doing things is:
>
> Send in 4 jobs for mainline (x86 and ppc64 with =y and =m) and then
> generate a patch of the latest svn tree against the current -git release
> (a patch to the kernel) and send it in as a parameter to my builds to
> test the latest svn tree. This leads to another 4 jobs (x86 and ppc64
> with =y and =m).
>
> I'm *only* doing kernel build testing right now.
>
>> Which patches are you referring to ? Was it the fib_frontend.c one ?
>> Not sure why they would need any manual fixup. At least that one was
>> pretty straightforward.
>
> In the sense that I have to edit them to kernel relative paths, not in
> the content of the patch. To test any patch in the system I have access
> to, it needs to be a normal kernel patch (-p1 applicable to the base
> tree).
>
> Going through and manually applying patches to the svn tree and then
> regenerating the diff completely defeats the purpose of automated
> compilation testing.

OK. Do you need any patches regenerated or is this more for the future ?

-- Hal




[openib-general] DMA mapping abuses in MAD layer

2005-10-11 Thread Roland Dreier
I recently got a chance to play with an eval board for the PowerPC
440SPe -- an embedded system with PCI Express support where the PCI
bus is not cache coherent with the CPU.  Of course I plugged an HCA in
and tried out our current drivers.

It turns out that everything works pretty well, except the HCA's ports
never make it past INIT.  I did some debugging, and the reason for
this is that the MAD layer doesn't quite use the DMA mapping API
properly.  Once we call dma_map_single() on a buffer, the CPU may not
touch that buffer until after the corresponding dma_unmap_single().

On mainstream architectures, it turns out that we can get away with
violating this rule.  However, on non-cache-coherent architectures
like PowerPC 4xx, dma_map_single(..., DMA_TO_DEVICE) does a cache
flush, which makes sure that the contents of the CPU's cache are
really written to memory.  If a driver then changes the contents of
the buffer after the call to dma_map_single(), then it's quite likely
that the change will be made only in the CPU's cache and the device
will end up DMA-ing the old data.

The problem I hit is in ib_post_send_mad(), specifically:

	smp = (struct ib_smp *)send_wr->wr.ud.mad_hdr;
	if (smp->mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE) {
		ret = handle_outgoing_dr_smp(mad_agent_priv, smp,
					     send_wr);

Basically, when the MAD layer goes to send a directed route reply, it
changes the MAD buffer after the DMA mapping is done.  The HCA
doesn't see the change, the wrong packet gets sent and the SM never
sees replies to its queries.
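
The failure mode can be modelled in userspace with two copies of a buffer: one
standing for the CPU cache and one for main memory, which is all the device
can see. This is a toy model that assumes a write-back cache with no
coherency; `toy_map_to_device` is a made-up stand-in for
dma_map_single(..., DMA_TO_DEVICE) and does not use the real API:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_LEN 8

/* Toy model: the CPU writes to `cache`; the device reads `memory`. */
struct toy_buf {
	unsigned char cache[BUF_LEN];
	unsigned char memory[BUF_LEN];
};

/* CPU store: lands in the cache only. */
static void toy_cpu_write(struct toy_buf *b, size_t off, unsigned char val)
{
	assert(off < BUF_LEN);
	b->cache[off] = val;
}

/* Stand-in for mapping to the device: write the cache back to memory. */
static void toy_map_to_device(struct toy_buf *b)
{
	memcpy(b->memory, b->cache, BUF_LEN);
}

/* Device DMA read: sees main memory, never the CPU cache. */
static unsigned char toy_device_read(const struct toy_buf *b, size_t off)
{
	assert(off < BUF_LEN);
	return b->memory[off];
}

/*
 * Reproduce the bug pattern: modify the buffer after mapping it.
 * Returns the byte the device ends up sending.
 */
static unsigned char toy_send_modified_after_map(void)
{
	struct toy_buf b = { {0}, {0} };

	toy_cpu_write(&b, 0, 0x11);	/* build the MAD */
	toy_map_to_device(&b);		/* cache written back here */
	toy_cpu_write(&b, 0, 0x22);	/* late fix-up, stays in cache */

	return toy_device_read(&b, 0);	/* device still sees 0x11 */
}
```

On a cache-coherent machine the late write would be visible to the device
anyway, which is why the pattern goes unnoticed there.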

Adding a PPC-specific cache flush call after the call to
handle_outgoing_dr_smp() fixes things to the point that the port can
be brought to ACTIVE, and in fact IPoIB works as well.  However, this
is just a kludge -- the real fix will need to be more invasive.  It
seems that the whole interface to the MAD layer may need to be
reorganized to avoid doing this.

It looks like there is a similar problem with ib_create_send_mad(): it
does DMA mapping on a buffer that is then returned for the caller to modify.

Finally, some of the MAD structures like struct ib_mad_private look
risky to me, since kernel data might potentially share a cache line
with DMA buffers.  See http://lwn.net/Articles/2265/ for a nice
writeup of the class of bug that might be lurking.
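
One common way to rule out that class of bug is to give the DMA buffer its own
cache lines. A C11 userspace sketch follows, assuming a 64-byte cache line
(the real value is architecture-specific) and using made-up structure names,
not the actual struct ib_mad_private layout:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define TOY_CACHE_LINE 64	/* assumed line size, not universal */
#define TOY_MAD_SIZE   256

struct toy_mad_private {
	/* Bookkeeping touched by the CPU at any time. */
	void *list_next;
	unsigned int flags;

	/* DMA target: starts on a line boundary and fills whole lines. */
	alignas(TOY_CACHE_LINE) unsigned char mad[TOY_MAD_SIZE];

	/* Forced onto the next line so it cannot share one with mad[]. */
	alignas(TOY_CACHE_LINE) unsigned int after_dma;
};

/* Compile-time checks that the DMA region owns its cache lines. */
static_assert(offsetof(struct toy_mad_private, mad) % TOY_CACHE_LINE == 0,
	      "mad[] must start on a cache line boundary");
static_assert(offsetof(struct toy_mad_private, after_dma) >=
	      offsetof(struct toy_mad_private, mad) + TOY_MAD_SIZE,
	      "fields after mad[] must not overlap it");
static_assert(offsetof(struct toy_mad_private, after_dma) % TOY_CACHE_LINE == 0,
	      "fields after mad[] must start on a new line");
```

In the kernel the same effect is usually expressed with the
____cacheline_aligned annotation from linux/cache.h.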

Sorry for missing all of this when the MAD layer was first being
developed and reviewed.

 - R.


Re: [openib-general] InfiniPath driver announcement

2005-10-11 Thread Roland Dreier
Here is an updated kernel patch and a matching libibverbs patch (both
against your branch).  Still compile tested only.  I'll get my
Pathscale system set up to do some testing of this code tomorrow
morning and see if this stuff actually works.

 - R.

Index: infiniband/include/rdma/ib_user_verbs.h
===
--- infiniband/include/rdma/ib_user_verbs.h	(revision 3742)
+++ infiniband/include/rdma/ib_user_verbs.h	(working copy)
@@ -89,8 +89,11 @@ enum {
  * Make sure that all structs defined in this file remain laid out so
  * that they pack the same way on 32-bit and 64-bit architectures (to
  * avoid incompatibility between 32-bit userspace and 64-bit kernels).
- * In particular do not use pointer types -- pass pointers in __u64
- * instead.
+ * Specifically:
+ *  - Do not use pointer types -- pass pointers in __u64 instead.
+ *  - Make sure that any structure larger than 4 bytes is padded to a
+ *    multiple of 8 bytes.  Otherwise the structure size will be
+ *    different between 32-bit and 64-bit architectures.
  */
 
 struct ib_uverbs_async_event_desc {
@@ -284,12 +287,12 @@ struct ib_uverbs_wc {
 	__u8 sl;
 	__u8 dlid_path_bits;
 	__u8 port_num;
-	__u8 reserved; /* Align struct to 8 bytes */
+	__u8 reserved;
 };
 
 struct ib_uverbs_poll_cq_resp {
 	__u32 count;
-	__u32 reserved; /* Align struct to 8 bytes */
+	__u32 reserved;
 	struct ib_uverbs_wc wc[];
 };
 
@@ -417,20 +420,20 @@ struct ib_uverbs_send_wr {
 		struct {
 			__u64 remote_addr;
 			__u32 rkey;
-			__u32 reserved; /* Align struct to 8 bytes */
+			__u32 reserved;
 		} rdma;
 		struct {
 			__u64 remote_addr;
 			__u64 compare_add;
 			__u64 swap;
 			__u32 rkey;
-			__u32 reserved; /* Align struct to 8 bytes */
+			__u32 reserved;
 		} atomic;
 		struct {
 			__u32 ah;
 			__u32 remote_qpn;
 			__u32 remote_qkey;
-			__u32 reserved; /* Align struct to 8 bytes */
+			__u32 reserved;
 		} ud;
 	} wr;
 };
@@ -440,8 +443,7 @@ struct ib_uverbs_post_send {
 	__u32 qp_handle;
 	__u32 wr_count;
 	__u32 sge_count;
-	__u32 reserved; /* Align struct to 8 bytes */
-	__u64 wr;
+	__u32 wqe_size;
 };
 
 struct ib_uverbs_post_send_resp {
@@ -451,7 +453,7 @@ struct ib_uverbs_post_send_resp {
 struct ib_uverbs_recv_wr {
 	__u64 wr_id;
 	__u32 num_sge;
-	__u32 reserved; /* Align struct to 8 bytes */
+	__u32 reserved;
 };
 
 struct ib_uverbs_post_recv {
@@ -459,8 +461,7 @@ struct ib_uverbs_post_recv {
 	__u32 qp_handle;
 	__u32 wr_count;
 	__u32 sge_count;
-	__u32 reserved; /* Align struct to 8 bytes */
-	__u64 wr;
+	__u32 wqe_size;
 };
 
 struct ib_uverbs_post_recv_resp {
@@ -472,47 +473,38 @@ struct ib_uverbs_post_srq_recv {
 	__u32 srq_handle;
 	__u32 wr_count;
 	__u32 sge_count;
-	__u32 reserved; /* Align struct to 8 bytes */
-	__u64 wr;
+	__u32 wqe_size;
 };
 
 struct ib_uverbs_post_srq_recv_resp {
 	__u32 bad_wr;
 };
 
-union ib_uverbs_gid {
-	__u8 raw[16]; 
-	struct {
-		__u64 subnet_prefix;
-		__u64 interface_id;
-	} global;
-};
-
-struct ibv_m_global_route {
-	union ib_uverbs_gid dgid;
+struct ib_uverbs_global_route {
+	__u8  dgid[16];
 	__u32 flow_label;
 	__u8  sgid_index;
 	__u8  hop_limit;
 	__u8  traffic_class;
-	__u8  reserved; /* Align struct to 8 bytes */
+	__u8  reserved;
 };
 
 struct ib_uverbs_ah_attr {
-	struct ibv_m_global_route grh;
+	struct ib_uverbs_global_route grh;
 	__u16 dlid;
 	__u8  sl;
 	__u8  src_path_bits;
 	__u8  static_rate;
 	__u8  is_global;
 	__u8  port_num;
-	__u8  reserved; /* Align struct to 8 bytes */
+	__u8  reserved;
 };
 
 struct ib_uverbs_create_ah {
 	__u64 response;
 	__u64 user_handle;
 	__u32 pd_handle;
-	__u32 reserved; /* Align struct to 8 bytes */
+	__u32 reserved;
 	struct ib_uverbs_ah_attr attr;
 };
 
Index: infiniband/core/uverbs_cmd.c
===
--- infiniband/core/uverbs_cmd.c	(revision 3742)
+++ infiniband/core/uverbs_cmd.c	(working copy)
@@ -699,24 +699,25 @@ ssize_t ib_uverbs_poll_cq(struct ib_uver
 	}
 
 	resp->count = ib_poll_cq(cq, cmd.ne, wc);
-	for(i = 0; i < cmd.ne; i++) {
-		resp->wc[i].wr_id = wc[i].wr_id;
-		resp->wc[i].status = wc[i].status;
-		resp->wc[i].opcode = wc[i].opcode;
-		resp->wc[i].vendor_err = wc[i].vendor_err;
-		resp->wc[i].byte_len = wc[i].byte_len;
-		resp->wc[i].imm_data = wc[i].imm_data;
-		resp->wc[i].qp_num = wc[i].qp_num;
-		resp->wc[i].src_qp = wc[i].src_qp;
-		resp->wc[i].wc_flags = wc[i].wc_flags;
-		resp->wc[i].pkey_index = wc[i].pkey_index;
-		resp->wc[i].slid = wc[i].slid;
-		resp->wc[i].sl = wc[i].sl;
+	for (i = 0; i < resp->count; i++) {
+		resp->wc[i].wr_id 	   = wc[i].wr_id;
+		resp->wc[i].status 	   = wc[i].status;
+		resp->wc[i].opcode 	   = wc[i].opcode;
+		resp->wc[i].vendor_err 	   = wc[i].vendor_err;
+		resp->wc[i].byte_len 	   = wc[i].byte_len;
+		resp->wc[i].imm_data 	   = wc[i].imm_data;
+		resp->wc[i].qp_num 	   = wc[i].qp_num;
+		resp->wc[i].src_qp 	   = wc[i].src_qp;
+		resp->wc[i].wc_flags 	   = wc[i].wc_flags;
+		resp->wc[i].pkey_index 	   = 
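
The layout rule in the header comment of this patch can be checked
mechanically. A C11 userspace sketch with made-up structures follows, with
uint64_t and uint32_t standing in for __u64 and __u32. Without the explicit
reserved word, the first structure is 12 bytes on 32-bit x86, where 64-bit
integers only need 4-byte alignment inside a struct, but 16 bytes on x86-64;
with it, both ABIs agree:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size depends on the ABI's alignment of 64-bit integers. */
struct toy_cmd_unpadded {
	uint64_t response;
	uint32_t handle;
};

/* Explicit padding: every byte is accounted for, nothing is ABI-chosen. */
struct toy_cmd_padded {
	uint64_t response;
	uint32_t handle;
	uint32_t reserved;
};

/* These hold on both 32-bit and 64-bit ABIs with 8-bit bytes. */
static_assert(sizeof(struct toy_cmd_padded) == 16,
	      "padded struct must be 16 bytes everywhere");
static_assert(sizeof(struct toy_cmd_padded) % 8 == 0,
	      "size must be a multiple of 8 bytes");
static_assert(offsetof(struct toy_cmd_padded, reserved) == 12,
	      "reserved word must fill the tail");
```

Placing such assertions next to each ABI structure turns a layout mistake into
a build failure instead of a 32-bit userspace bug.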

Re: [openib-general] Latest build test results

2005-10-11 Thread Nishanth Aravamudan
On 11.10.2005 [23:15:27 -0400], Hal Rosenstock wrote:
> Hi again Nish,
>
> On Tue, 2005-10-11 at 21:39, Nishanth Aravamudan wrote:
>>>>> Update arp_recv functions to latest 2.6.14 netdevice.h API for struct
>>>>> packet_type
>>>>
>>>> Sorry for the delay, I haven't yet had time to test the patches :/ I'll
>>>> try to get to it tonight or tomorrow.
>>>>
>>>> Is there any way you can send me patches against the kernel tree as
>>>> opposed to the svn repo? It makes my side of things *a lot* easier, as
>>>> right now I have to take your patch against svn and either hand-edit or
>>>> patch my checkout and then diff against the current kernel tree.
>>>
>>> Since you were reporting iSER, AT, and SDP compile warnings/errors,
>>> aren't you using the latest OpenIB svn tree with 2.6.14-rc3 ?
>>
>> Yes; but you have to understand that the automated build system I have
>> access to 1) does not have external internet access (i.e., to the svn
>> tree) and 2) only builds kernels unless I manually send commands to the
>> terminal.
>>
>> So, the way I'm doing things is:
>>
>> Send in 4 jobs for mainline (x86 and ppc64 with =y and =m) and then
>> generate a patch of the latest svn tree against the current -git release
>> (a patch to the kernel) and send it in as a parameter to my builds to
>> test the latest svn tree. This leads to another 4 jobs (x86 and ppc64
>> with =y and =m).
>>
>> I'm *only* doing kernel build testing right now.
>>
>>> Which patches are you referring to ? Was it the fib_frontend.c one ?
>>> Not sure why they would need any manual fixup. At least that one was
>>> pretty straightforward.
>>
>> In the sense that I have to edit them to kernel relative paths, not in
>> the content of the patch. To test any patch in the system I have access
>> to, it needs to be a normal kernel patch (-p1 applicable to the base
>> tree).
>>
>> Going through and manually applying patches to the svn tree and then
>> regenerating the diff completely defeats the purpose of automated
>> compilation testing.
>
> OK. Do you need any patches regenerated or is this more for the future ?

If you could regen the patches, that would definitely speed things up
for me, but I can handle these few, it's not a big deal. Definitely, in
the future, it makes it an almost instantaneous build test if I have the
kernel-relative patch.

Thanks,
Nish


RE: [openib-general] DMA mapping abuses in MAD layer

2005-10-11 Thread Sean Hefty
> properly.  Once we call dma_map_single() on a buffer, the CPU may not
> touch that buffer until after the corresponding dma_unmap_single().

It sounds like we need to change how the mapping is done.  Can we let the MAD
layer always control the mapping?  Considering how RMPP works, I'm not sure what
else we could do.

> is just a kludge -- the real fix will need to be more invasive.  It
> seems that the whole interface to the MAD layer may need to be
> reorganized to avoid doing this.

We really just need to change the post_send_mad routine, don't we?

The original intent around that API was to permit posting the WR directly onto
the QP.  Since this isn't the case, what about changing post send to take as
input an ib_mad_send_buf, with the work request and SGE fields removed?  This
could permit some additional optimization, such as avoiding additional
allocations within the post send call.  (Taking it a step further, we could
create a new structure to permit using a received MAD as input to a send.)

> It looks like there is a similar problem with ib_create_send_mad(): it
> does DMA mapping on a buffer that is then returned for the caller to modify.

If we pass the send_buf into post_send_mad, then the mapping could be deferred.
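
A toy userspace model of that ordering. It assumes a non-coherent system in
which the CPU writes one copy of the buffer (the cache) and the device reads
another (memory). `toy_post_send` is a hypothetical stand-in for a post
routine that owns the mapping; it is not the proposed kernel interface:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_LEN 8

/* Toy send buffer: the CPU writes `cache`, the device reads `memory`. */
struct toy_send_buf {
	unsigned char cache[BUF_LEN];
	unsigned char memory[BUF_LEN];
	int mapped;
};

/* Optional last-minute fix-up, e.g. directed-route processing. */
typedef void (*toy_fixup_fn)(struct toy_send_buf *buf);

/* Example fix-up: rewrite one header byte in the CPU's copy. */
static void toy_set_hop(struct toy_send_buf *buf)
{
	buf->cache[1] = 0x22;
}

/*
 * Post routine that owns the mapping: run every CPU-side modification
 * first, then write back and hand the buffer to the device.
 * Returns byte 1 as the device would read it.
 */
static unsigned char toy_post_send(struct toy_send_buf *buf,
				   toy_fixup_fn fixup)
{
	assert(buf != NULL && !buf->mapped);

	if (fixup)
		fixup(buf);			/* CPU still owns the buffer */

	memcpy(buf->memory, buf->cache, BUF_LEN);	/* "map" last */
	buf->mapped = 1;

	return buf->memory[1];
}
```

Because the write-back happens after the fix-up, the device sees the modified
byte; the caller never holds a mapped buffer it is allowed to touch.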

> Finally, some of the MAD structures like struct ib_mad_private look
> risky to me, since kernel data might potentially share a cache line
> with DMA buffers.  See http://lwn.net/Articles/2265/ for a nice
> writeup of the class of bug that might be lurking.

This sounds like a separate issue, is that the case?

- Sean



Re: [openib-general] DMA mapping abuses in MAD layer

2005-10-11 Thread Roland Dreier
Sean> It sounds like we need to change how the mapping is done.
Sean> Can we let the MAD layer always control the mapping?

I guess so.  Another alternative would be for the consumer to provide
some sort of callback interface to handle the mapping, but that
doesn't seem feasible.

Sean> We really just need to change the post_send_mad routine,
Sean> don't we?

I guess so -- and remove the DMA mapping call from ib_create_send_mad().

Sean> The original intent around that API was to permit posting
Sean> the WR directly onto the QP.  Since this isn't the case,
Sean> what about changing post send to take as input an
Sean> ib_mad_send_buf, with the work request and SGE fields
Sean> removed?

We probably still want to handle gather lists for posting sends I
think.  Another (rather unrelated) issue that I just noticed the other
day is that something like sending a response to a GetTable request
for PortInfo for every port in a large fabric is going to end up
sending a very large RMPP message, probably too large to fit in a
single kmalloc()ed buffer.  So I don't think we should require that
all send requests have a single gather entry.
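
A userspace sketch of describing one large payload as several gather entries
instead of one contiguous buffer. The `toy_sge` structure and the per-segment
limit are illustrative assumptions; real gather entries carry DMA addresses
and lkeys, not plain offsets:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified gather entry: offset and length within the payload. */
struct toy_sge {
	size_t offset;
	size_t length;
};

/*
 * Split `total` bytes into entries of at most `seg_max` bytes each.
 * Returns the number of entries written. Returns 0 if `total` is 0 or
 * if `max_sge` entries are not enough to cover the payload.
 */
static size_t toy_build_gather_list(size_t total, size_t seg_max,
				    struct toy_sge *sge, size_t max_sge)
{
	size_t n = 0, off = 0;

	assert(sge != NULL && seg_max > 0);

	while (off < total) {
		size_t len = total - off;

		if (len > seg_max)
			len = seg_max;
		if (n == max_sge)
			return 0;	/* list too short for payload */

		sge[n].offset = off;
		sge[n].length = len;
		off += len;
		n++;
	}
	return n;
}
```

Each entry can then be backed by its own allocation, so no single buffer has
to hold the whole message.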

Roland> Finally, some of the MAD structures like struct
Roland> ib_mad_private look risky to me, since kernel data might
Roland> potentially share a cache line with DMA buffers.  See
Roland> http://lwn.net/Articles/2265/ for a nice writeup of the
Roland> class of bug that might be lurking.

Sean> This sounds like a separate issue, is that the case?

Yes.  In fact I'm not sure there's really a bug there.  It's just
something questionable that I saw while trying to find the real
problem on 440SPe.

 - R.


RE: [openib-general] DMA mapping abuses in MAD layer

2005-10-11 Thread Fab Tillier
> From: Roland Dreier [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, October 11, 2005 8:39 PM
>
> On mainstream architectures, it turns out that we can get away with
> violating this rule.  However, on non-cache-coherent architectures
> like PowerPC 4xx, dma_map_single(..., DMA_TO_DEVICE) does a cache
> flush, which makes sure that the contents of the CPU's cache are
> really written to memory.  If a driver then changes the contents of
> the buffer after the call to dma_map_single(), then it's quite likely
> that the change will be made only in the CPU's cache and the device
> will end up DMA-ing the old data.
>
> The problem I hit is in ib_post_send_mad(), specifically:
>
> 	smp = (struct ib_smp *)send_wr->wr.ud.mad_hdr;
> 	if (smp->mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE) {
> 		ret = handle_outgoing_dr_smp(mad_agent_priv, smp,
> 					     send_wr);
>
> basically, when the MAD layer goes to send a directed route reply, it
> changes the MAD buffer after the DMA mapping is done.  The HCA
> doesn't see the change, the wrong packet gets sent and the SM never
> sees replies to its queries.
>
> Adding a PPC-specific cache flush call after the call to
> handle_outgoing_dr_smp() fixes things to the point that the port can
> be brought to ACTIVE, and in fact IPoIB works as well.  However, this
> is just a kludge -- the real fix will need to be more invasive.  It
> seems that the whole interface to the MAD layer may need to be
> reorganized to avoid doing this.

Why not just use inline sends for the special QPs and remove the need to perform
any DMA mappings on the send side altogether?

- Fab
