[ewg] Re: [PATCH] call skb_orphan() after sending an SKB

2008-02-06 Thread Or Gerlitz
 commit f17ebf3e2099257da244587f1ee33f51745f7cdb
 Author: Eli Cohen [EMAIL PROTECTED]
 Date:   Tue Feb 5 11:15:46 2008 +0200

 Call skb_orphan() after sending an SKB

 This will call the destructor of the SKB (but not free the
 memory). It appears that some applications (ttcpv for example)
 are sensitive to delaying the time the SKB is freed. This commit
 fixes this problem.

Can you explain the difference, from the socket send buffer accounting
point of view, between freeing the SKB and freeing the memory? What was the
problem with ttcpv, did it hang? Have you tested the unsig_udqp.patch with
different socket buffer sizes to make sure there's no live-lock etc.? What
was the app you were using?

Also, I see that you have added a call to netif_stop_queue(), is this to
solve another problem?

Or.


 Signed-off-by: Eli Cohen [EMAIL PROTECTED]

 diff --git a/kernel_patches/fixes/ipoib_0190_unsig_udqp.patch b/kernel_patches/fixes/ipoib_0190_unsig_udqp.patch
 index b76cdab..3fbeda3 100644
 --- a/kernel_patches/fixes/ipoib_0190_unsig_udqp.patch
 +++ b/kernel_patches/fixes/ipoib_0190_unsig_udqp.patch
 @@ -10,10 +10,10 @@ UDP messages, went up from 380 mbps to 508 mbps.

  Signed-off-by: Eli Cohen [EMAIL PROTECTED]
  ---
 -Index: ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib.h
 +Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/ipoib/ipoib.h
  ===
  ofed_kernel.orig/drivers/infiniband/ulp/ipoib/ipoib.h
 -+++ ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib.h
 +--- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/ipoib/ipoib.h 
 2008-02-05 11:04:35.0 +0200
  ofa_1_3_dev_kernel/drivers/infiniband/ulp/ipoib/ipoib.h  2008-02-05 
 11:05:07.0 +0200
  @@ -373,6 +373,7 @@ struct ipoib_dev_priv {

   struct ib_wc ibwc[IPOIB_NUM_WC];
 @@ -39,10 +39,10 @@ Index: ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib.h

   struct ipoib_ah *ipoib_create_ah(struct net_device *dev,
struct ib_pd *pd, struct ib_ah_attr *attr);
 -Index: ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c
 +Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c
  ===
  ofed_kernel.orig/drivers/infiniband/ulp/ipoib/ipoib_ib.c
 -+++ ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c
 +--- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/ipoib/ipoib_ib.c  
 2008-02-05 11:04:35.0 +0200
  ofa_1_3_dev_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c   
 2008-02-05 11:05:44.0 +0200
  @@ -254,12 +254,10 @@ repost:
  for buf %d\n, wr_id);
   }
 @@ -128,7 +128,7 @@ Index: ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c
   }

   int ipoib_poll(struct napi_struct *napi, int budget)
 -@@ -361,11 +372,65 @@ void ipoib_ib_rx_completion(struct ib_cq
 +@@ -361,11 +372,68 @@ void ipoib_ib_rx_completion(struct ib_cq
   netif_rx_schedule(dev, &priv->napi);
   }

 @@ -168,8 +168,11 @@ Index: ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c
  +ipoib_warn(priv, "failed to post zlen send\n");
  +else {
  +++priv->tx_head;
 -+++priv->tx_outstanding;
  +ipoib_dbg(priv, "%s-%d: head = %d\n", __func__, __LINE__, priv->tx_head);
 ++if (++priv->tx_outstanding == ipoib_sendq_size) {
 ++ipoib_dbg(priv, "TX ring full, stopping kernel net queue\n");
 ++netif_stop_queue(dev);
 ++}
  +}
  +}
  +poll_tx(priv);
 @@ -197,7 +200,7 @@ Index: ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c
   }

   static inline int post_send(struct ipoib_dev_priv *priv,
 -@@ -405,6 +470,11 @@ static inline int post_send(struct ipoib
 +@@ -405,6 +473,11 @@ static inline int post_send(struct ipoib
   } else
   priv->tx_wr.opcode  = IB_WR_SEND;

 @@ -209,16 +212,18 @@ Index: ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c
   return ib_post_send(priv->qp, &priv->tx_wr, &bad_wr);
   }

 -@@ -489,7 +559,7 @@ void ipoib_send(struct net_device *dev,
 +@@ -489,7 +562,9 @@ void ipoib_send(struct net_device *dev,
   }

   if (unlikely(priv->tx_outstanding > MAX_SEND_CQE + 1))
  -poll_tx(priv, 0);
  +poll_tx(priv);
 ++
 ++skb_orphan(skb);

   return;

 -@@ -530,6 +600,32 @@ void ipoib_reap_ah(struct work_struct *w
 +@@ -530,6 +605,32 @@ void ipoib_reap_ah(struct work_struct *w
  round_jiffies_relative(HZ));
   }

 @@ -251,7 +256,7 @@ Index: ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c
   int ipoib_ib_dev_open(struct net_device *dev)
   {
   struct ipoib_dev_priv *priv = netdev_priv(dev);
 -@@ -542,9 +638,17 @@ int ipoib_ib_dev_open(struct net_device
 +@@ -542,9 +643,17 @@ int ipoib_ib_dev_open(struct net_device
   }
  

[ewg] some comments/cleanups for the openibd service script

2008-02-06 Thread Or Gerlitz
Vlad,

I just realized that there are some old and misleading sections here, for
example bringing up/down GEN1 drivers, the mlx4_enet driver which is not
part of this release AFAIK, kdapl which was removed, starting/stopping
the ipoib HA tools which were removed, etc.

I can send a patch to clean them up, but I thought you might prefer to do
it yourself. Please let me know; this has to get in for 1.3, since I don't
want to start handling support cases with questions on non-existent features.

This service script goes into commercial distributions, correct?

Please see below and let me know your thinking,

thanks,

Or.

On Wed, 6 Feb 2008, Or Gerlitz wrote:

 --- /dev/null 2008-02-05 10:18:44.755516936 +0200
 +++ ofed_scripts/openibd  2008-02-06 13:46:50.0 +0200
 @@ -0,0 +1,1375 @@
 +#!/bin/bash
 +
 +#
 +# Copyright (c) 2006 Mellanox Technologies. All rights reserved.
 +#
 +# This Software is licensed under one of the following licenses:
 +#
 +# 1) under the terms of the Common Public License 1.0 a copy of which is
 +#available from the Open Source Initiative, see
 +#http://www.opensource.org/licenses/cpl.php.
 +#
 +# 2) under the terms of the The BSD License a copy of which is
 +#available from the Open Source Initiative, see
 +#http://www.opensource.org/licenses/bsd-license.php.
 +#
 +# 3) under the terms of the GNU General Public License (GPL) Version 2 a
 +#copy of which is available from the Open Source Initiative, see
 +#http://www.opensource.org/licenses/gpl-license.php.
 +#
 +# Licensee has the right to choose one of the above licenses.
 +#
 +# Redistributions of source code must retain the above copyright
 +# notice and one of the license notices.
 +#
 +# Redistributions in binary form must reproduce both the above copyright
 +# notice, one of the license notices in the documentation
 +# and/or other materials provided with the distribution.
 +#
 +#
 +#  $Id: openibd 9139 2006-08-29 14:03:38Z vlad $
 +#
 +
 +# config: /etc/infiniband/openib.conf
 +CONFIG=/etc/infiniband/openib.conf
 +
 +if [ ! -f $CONFIG ]; then
 +echo No InfiniBand configuration found
 +exit 0
 +fi
 +
 +. $CONFIG
 +
 +CWD=`pwd`
 +cd /etc/infiniband
 +WD=`pwd`
 +
 +PATH=$PATH:/sbin:/usr/bin
 +if [ -e /etc/profile.d/ofed.sh ]; then
 +. /etc/profile.d/ofed.sh
 +fi
 +
 +# Only use ONBOOT option if called by a runlevel directory.
 +# Therefore determine the base, follow a runlevel link name ...
 +base=${0##*/}
 +link=${base#*[SK][0-9][0-9]}
 +# ... and compare them
 +if [ $link == $base ] ; then
 +RUNMODE=manual
 +ONBOOT=yes
 +else
 +RUNMODE=auto
 +fi
 +
 +ACTION=$1
 +shift
 +RESTART=0
 +max_ports_num_in_hca=0
 +
 +# Check if OpenIB configured to start automatically
 +if [ X${ONBOOT} != Xyes ]; then
 +exit 0
 +fi
 +
 +if ( grep -i 'SuSE Linux' /etc/issue >/dev/null 2>&1 ); then
 +if [ -n "$INIT_VERSION" ] ; then
 +# MODE=onboot
 +if LANG=C egrep -L "^ONBOOT=['\"]?[Nn][Oo]['\"]?" ${CONFIG} > /dev/null ; then
 +exit 0
 +fi
 +fi
 +fi
 +
 +#
 +# Get a sane screen width
 +[ -z "${COLUMNS:-}" ] && COLUMNS=80
 +
 +[ -z "${CONSOLETYPE:-}" ] && [ -x /sbin/consoletype ] && CONSOLETYPE="`/sbin/consoletype`"
 +
 +if [ -f /etc/sysconfig/i18n -a -z "${NOLOCALE:-}" ] ; then
 +  . /etc/sysconfig/i18n
 +  if [ "$CONSOLETYPE" != "pty" ]; then
 +case "${LANG:-}" in
 +ja_JP*|ko_KR*|zh_CN*|zh_TW*)
 +export LC_MESSAGES=en_US
 +;;
 +*)
 +export LANG
 +;;
 +esac
 +  else
 +export LANG
 +  fi
 +fi
 +
 +# Read in our configuration
 +if [ -z "${BOOTUP:-}" ]; then
 +  if [ -f /etc/sysconfig/init ]; then
 +  . /etc/sysconfig/init
 +  else
 +# This all seem confusing? Look in /etc/sysconfig/init,
 +# or in /usr/doc/initscripts-*/sysconfig.txt
 +BOOTUP=color
 +RES_COL=60
 +MOVE_TO_COL="echo -en \\033[${RES_COL}G"
 +SETCOLOR_SUCCESS="echo -en \\033[1;32m"
 +SETCOLOR_FAILURE="echo -en \\033[1;31m"
 +SETCOLOR_WARNING="echo -en \\033[1;33m"
 +SETCOLOR_NORMAL="echo -en \\033[0;39m"
 +LOGLEVEL=1
 +  fi
 +  if [ "$CONSOLETYPE" = "serial" ]; then
 +  BOOTUP=serial
 +  MOVE_TO_COL=
 +  SETCOLOR_SUCCESS=
 +  SETCOLOR_FAILURE=
 +  SETCOLOR_WARNING=
 +  SETCOLOR_NORMAL=
 +  fi
 +fi
 +
 +if [ "${BOOTUP:-}" != "verbose" ]; then
 +   INITLOG_ARGS="-q"
 +else
 +   INITLOG_ARGS=
 +fi
 +
 +echo_success() {
 +  echo -n $@
 +  [ "$BOOTUP" = "color" ] && $MOVE_TO_COL
 +  echo -n "[  "
 +  [ "$BOOTUP" = "color" ] && $SETCOLOR_SUCCESS
 +  echo -n $"OK"
 +  [ "$BOOTUP" = "color" ] && $SETCOLOR_NORMAL
 +  echo -n "  ]"
 +  echo -e "\r"
 +  return 0
 +}
 +
 +echo_done() {
 +  echo -n $@
 +  [ "$BOOTUP" = "color" ] && $MOVE_TO_COL
 +  echo -n "[  "
 +  [ "$BOOTUP" = "color" ] && $SETCOLOR_NORMAL
 +  echo -n $"done"
 +  [ "$BOOTUP" = "color" ] && $SETCOLOR_NORMAL
 +  echo -n "  ]"

[ewg] Re: [PATCH] call skb_orphan() after sending an SKB

2008-02-06 Thread Eli Cohen

On Wed, 2008-02-06 at 10:17 +0200, Or Gerlitz wrote:
  commit f17ebf3e2099257da244587f1ee33f51745f7cdb
  Author: Eli Cohen [EMAIL PROTECTED]
  Date:   Tue Feb 5 11:15:46 2008 +0200
 
  Call skb_orphan() after sending an SKB
 
  This will call the destructor of the SKB (but not free the
  memory). It appears that some applications (ttcpv for example)
  are sensitive to delaying the time the SKB is freed. This commit
  fixes this problem.
 
 Can you explain the difference, from the socket send buffer accounting
 point of view, between freeing the SKB and freeing the memory?
When you call skb_orphan(), the destructor of the SKB is called; in this
case it is a function installed by the socket. So from the socket's point
of view the packet has been sent. The memory is still not freed, since it
is needed by the HW. Once we get a completion for the send operation, the
SKB is freed.
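
To illustrate the accounting difference described above, here is a small toy model in Python (not kernel code). The Socket, Skb and send_burst names are hypothetical. The only behaviour taken from the kernel is that skb_orphan() runs the SKB destructor and detaches the SKB from its socket, while the data stays allocated until the SKB is freed on send completion.

```python
# Toy model of socket send-buffer accounting. All names are hypothetical.

class Socket:
    def __init__(self, sndbuf):
        self.sndbuf = sndbuf      # send buffer limit in bytes
        self.wmem_alloc = 0       # bytes currently charged to this socket

    def can_send(self, size):
        return self.wmem_alloc + size <= self.sndbuf


class Skb:
    def __init__(self, sock, size):
        self.sock = sock
        self.size = size
        self.freed = False
        sock.wmem_alloc += size   # the socket is charged when the SKB is created

    def orphan(self):
        """Run the destructor now: uncharge the socket, keep the memory."""
        if self.sock is not None:
            self.sock.wmem_alloc -= self.size
            self.sock = None

    def free(self):
        """Called on send completion; runs the destructor if still owned."""
        self.orphan()
        self.freed = True


def send_burst(sock, count, size, orphan_after_send):
    """Post up to `count` packets while no completions arrive.

    Returns the number of packets posted before the socket would block.
    """
    posted = 0
    for _ in range(count):
        if not sock.can_send(size):
            break                 # the application would block here
        skb = Skb(sock, size)
        if orphan_after_send:
            skb.orphan()
        posted += 1
    return posted
```

With a 4096-byte send buffer and 1024-byte packets, send_burst(Socket(4096), 10, 1024, False) stops after 4 packets, while the orphaning variant posts all 10. That mirrors the situation where completions (and therefore frees) are delayed.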


  What was the
 problem with ttcpv, did it hang?
The problem with ttcpv was that it stopped sending packets because it was
waiting for the memory to be freed. The system did not hang; just the
application (ttcpv) stopped sending. Other applications could continue
working over the ipoib interface.

  have you tested the unsig_udqp.patch with
 different socket buffer sizes to make sure there's no live-lock etc?
Yes, our regression system does that with different applications and
benchmarks.

  what was the app you were using?
ttcpv


 
 Also, I see that you have added a call to netif_stop_queue(), is this to
 solve another problem?

This was just a hole that I found in code review: when I post a zero
length packet, I still want it to affect the net queue control.
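
A minimal sketch of that hole, as a Python toy model rather than driver code. The TxRing class, the stop_on_zlen switch and the wake-up threshold of half the ring are assumptions made for illustration; only the "stop the queue when tx_outstanding reaches the send queue size" rule comes from the patch.

```python
class TxRing:
    """Toy model of TX ring accounting; not the ipoib driver code."""

    def __init__(self, sendq_size, stop_on_zlen):
        self.sendq_size = sendq_size
        self.stop_on_zlen = stop_on_zlen  # True models the behaviour after the fix
        self.tx_head = 0
        self.tx_tail = 0
        self.tx_outstanding = 0
        self.queue_stopped = False

    def post(self, zero_length=False):
        """Post one send; stop the net queue when the ring becomes full."""
        if self.tx_outstanding >= self.sendq_size:
            raise RuntimeError("TX ring full")
        self.tx_head += 1
        self.tx_outstanding += 1
        ring_full = self.tx_outstanding == self.sendq_size
        if ring_full and (not zero_length or self.stop_on_zlen):
            self.queue_stopped = True  # stands in for netif_stop_queue(dev)

    def complete(self, count=1):
        """Reap completions; the wake-up threshold (half the ring) is an assumption."""
        count = min(count, self.tx_outstanding)
        self.tx_tail += count
        self.tx_outstanding -= count
        if self.queue_stopped and self.tx_outstanding <= self.sendq_size // 2:
            self.queue_stopped = False  # stands in for netif_wake_queue(dev)


def fill_with_final_zlen(stop_on_zlen, sendq_size=4):
    """Fill the ring so that the last slot is taken by a zero-length send."""
    ring = TxRing(sendq_size, stop_on_zlen)
    for _ in range(sendq_size - 1):
        ring.post()
    ring.post(zero_length=True)
    return ring
```

Without the check, fill_with_final_zlen(False) leaves a full ring with the queue still running, so the stack may hand the driver another packet it cannot post. With the check, the queue is stopped as soon as the zero-length send takes the last slot.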
 
 Or.
 



___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] Re: with the ipoib patches, debug prints spam the system log

2008-02-06 Thread Eli Cohen
They are only visible when activating ipoib debug. I know it fills the
dmesg ring with messages. Do you think I should remove them?

On Wed, 2008-02-06 at 10:38 +0200, Or Gerlitz wrote:
 Eli,
 
 You have left too many debug prints in the last patches, please clean
 this up. See for example how the system log looks after less than a
 minute with ipoib debug prints enabled: it has one original print
 (ib0: Send unicast ARP to 0023) and all the rest are yours.
 
 Or
 
 Feb  6 14:39:23  kernel: ib0: posting zlen send, wrid = 4: head = 2756, tail 
 = 2752
 Feb  6 14:39:23  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2757
 Feb  6 14:39:25  kernel: ib0: posting zlen send, wrid = 39: head = 2919, tail 
 = 2912
 Feb  6 14:39:25  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2920
 Feb  6 14:39:25  kernel: ib0: posting zlen send, wrid = 15: head = 2959, tail 
 = 2944
 Feb  6 14:39:25  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2960
 Feb  6 14:39:27  kernel: ib0: posting zlen send, wrid = 8: head = 3080, tail 
 = 3072
 Feb  6 14:39:27  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3081
 Feb  6 14:39:34  kernel: ib0: posting zlen send, wrid = 51: head = 3699, tail 
 = 3696
 Feb  6 14:39:34  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3700
 Feb  6 14:39:35  kernel: ib0: posting zlen send, wrid = 25: head = 3737, tail 
 = 3728
 Feb  6 14:39:35  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3738
 Feb  6 14:39:35  kernel: ib0: posting zlen send, wrid = 3: head = 3779, tail 
 = 3776
 Feb  6 14:39:35  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3780
 Feb  6 14:39:36  kernel: ib0: posting zlen send, wrid = 48: head = 3824, tail 
 = 3808
 Feb  6 14:39:36  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3825
 Feb  6 14:39:38  kernel: ib0: posting zlen send, wrid = 24: head = 3992, tail 
 = 3984
 Feb  6 14:39:38  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3993
 Feb  6 14:39:38  kernel: ib0: posting zlen send, wrid = 4: head = 4036, tail 
 = 4032
 Feb  6 14:39:38  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 4037
 Feb  6 14:39:46  kernel: ib0: Send unicast ARP to 0023
 Feb  6 14:39:46  kernel: ib0: posting zlen send, wrid = 11: head = 4683, tail 
 = 4672
 Feb  6 14:39:46  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 4684
 Feb  6 14:39:58  kernel: ib0: posting zlen send, wrid = 58: head = 5626, tail 
 = 5616
 Feb  6 14:39:58  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5627
 Feb  6 14:39:59  kernel: ib0: posting zlen send, wrid = 56: head = 5752, tail 
 = 5744
 Feb  6 14:39:59  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5753
 Feb  6 14:40:01  kernel: ib0: posting zlen send, wrid = 54: head = 5878, tail 
 = 5872
 Feb  6 14:40:01  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5879
 Feb  6 14:40:01  kernel: ib0: posting zlen send, wrid = 30: head = 5918, tail 
 = 5904
 Feb  6 14:40:01  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5919
 Feb  6 14:40:10  kernel: ib0: posting zlen send, wrid = 33: head = 6689, tail 
 = 6672
 Feb  6 14:40:10  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 6690
 Feb  6 14:40:13  kernel: ib0: posting zlen send, wrid = 48: head = 6896, tail 
 = 6880
 Feb  6 14:40:13  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 6897
 Feb  6 14:40:13  kernel: ib0: posting zlen send, wrid = 26: head = 6938, tail 
 = 6928
 Feb  6 14:40:13  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 6939
 Feb  6 14:40:15  kernel: ib0: posting zlen send, wrid = 61: head = 7101, tail 
 = 7088
 Feb  6 14:40:15  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 7102



[ewg][PATCH][0/2] SRP multipath failover within 60 seconds,

2008-02-06 Thread Vu Pham
The following patches help SRP/dm-multipath fail over within 60
seconds (bugzilla #577) without data corruption or read/write errors.


1. srp_disconnect_without_wait.patch - srp sends the disconnect request
without waiting for the CM timewait exit event, since srp currently does
not reuse the cm_id and qp/cq of a connection (the patch
srp_1_recreate_at_reconnect.patch, already in kernel_patches/fixes,
recreates the cm_id and qp/cq for a connection at reconnect)
2. srp_qp_in_err_timer_reconnect_target.patch - when detecting a
post_send/post_receive error, srp sets qp_in_error, sets a timer to
reconnect to the target, returns SCSI_MLQUEUE_HOST_BUSY to lock the queue,
and returns DID_NO_CONNECT when the target state is DEAD or REMOVED
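
The decision made for each incoming SCSI command under the second patch can be sketched as a small Python model. This is a sketch under assumptions, not the driver code: queuecommand_outcome and the "SEND" result are hypothetical names, the LIVE state is assumed to be the normal connected state, and the check order and the 10 second reconnect delay follow the diff in patch 2/2.

```python
from enum import Enum


class TargetState(Enum):
    LIVE = 1          # assumed name for the normal, connected state
    CONNECTING = 2
    DEAD = 3
    REMOVED = 4


# Delay before the reconnect work is scheduled ("10 * HZ + jiffies" in the patch).
RECONNECT_DELAY_SECONDS = 10


def queuecommand_outcome(state, qp_in_error):
    """Return what the SCSI mid-layer is told for one command."""
    # Checked first: while reconnecting or after a QP error, ask the
    # mid-layer to hold the command and retry later.
    if state is TargetState.CONNECTING or qp_in_error:
        return "SCSI_MLQUEUE_HOST_BUSY"
    # The target is gone: complete the command with an error so that
    # dm-multipath can fail the path.
    if state in (TargetState.DEAD, TargetState.REMOVED):
        return "DID_NO_CONNECT"
    return "SEND"  # hypothetical label for the normal submission path
```

Note that, because of the check order, a command arriving while qp_in_error is set is held back with SCSI_MLQUEUE_HOST_BUSY even if the target state is already DEAD.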


Here is my multipath.conf
defaults {
   udev_dir                /dev
   polling_interval        5
   selector                "round-robin 0"
   path_grouping_policy    multibus
   getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
   prio_callout            /bin/true
   path_checker            readsector0
   rr_min_io               100
   rr_weight               priorities
   failback                immediate
   no_path_retry           5
   user_friendly_names     no
}
I also set srp_daemon.sh to rescan fabric every 60 seconds (instead of 
300 secs as default setting)


I ran a data integrity test against /dev/mapper/devices while running
{disable path 1, sleep 90, enable path 1, sleep 60, disable path 2,
sleep 90, enable path 2, sleep 60} in a loop.
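
The timing of that loop can be written down as a short Python sketch. It only computes the schedule; how a path is actually disabled or enabled is not stated in this mail, so the action names are placeholders and the function names are hypothetical.

```python
from itertools import cycle, islice

# (action, path, seconds to sleep afterwards), as listed in the test description.
CYCLE = [
    ("disable", 1, 90),
    ("enable", 1, 60),
    ("disable", 2, 90),
    ("enable", 2, 60),
]


def schedule(n_steps):
    """Yield (start_time_in_seconds, action, path) for the first n_steps."""
    t = 0
    for action, path, sleep_s in islice(cycle(CYCLE), n_steps):
        yield t, action, path
        t += sleep_s


def cycle_length():
    """Total duration of one pass through the loop, in seconds."""
    return sum(sleep_s for _, _, sleep_s in CYCLE)
```

One pass takes 300 seconds. Each path stays down for 90 seconds, longer than the 60 second failover target, and gets 60 seconds after being re-enabled, which matches the 60 second srp_daemon rescan interval mentioned above.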


RHEL 5 and 5.1 work very well (no data corruption or read/write failures
reported). For SLES 10 SP1, it works well as long as I run *multipath*
every 60 secs. I think that I misconfigured multipathd somehow (here is
how I set it up: using the same multipath.conf as above, chkconfig
boot.multipath on and chkconfig multipathd on).


  -vu







[ewg][PATCH][1/2] SRP multipath failover within 60 seconds,

2008-02-06 Thread Vu Pham
srp_disconnect_without_wait.patch - srp sends the disconnect
request without waiting for the CM timewait exit event, since
srp currently does not reuse the cm_id and qp/cq of a connection
(the patch srp_1_recreate_at_reconnect.patch, already in
kernel_patches/fixes, recreates the cm_id and qp/cq for a
connection at reconnect)


Signed-off-by: Vu Pham [EMAIL PROTECTED]

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 950228f..45a2533 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -400,7 +400,6 @@
 		printk(KERN_DEBUG PFX "Sending CM DREQ failed\n");
 		return;
 	}
-	wait_for_completion(&target->done);
 }
 
 static void srp_remove_work(struct work_struct *work)
@@ -1266,7 +1294,6 @@
 	case IB_CM_TIMEWAIT_EXIT:
 		printk(KERN_ERR PFX "connection closed\n");
 
-		comp = 1;
 		target->status = 0;
 		break;
 

[ewg][PATCH][2/2] SRP multipath failover within 60 seconds,

2008-02-06 Thread Vu Pham
srp_qp_in_err_timer_reconnect_target.patch - when detecting
a post_send/post_receive error, srp sets qp_in_error, sets a
timer to reconnect to the target, returns SCSI_MLQUEUE_HOST_BUSY
to lock the queue, and returns DID_NO_CONNECT when the target
state is DEAD or REMOVED


Signed-off-by: Vu Pham [EMAIL PROTECTED]

--- ofa_kernel-1.3.configured/drivers/infiniband/ulp/srp/ib_srp.c	2008-02-05 11:18:16.0 -0800
+++ ofa_kernel-1.3/drivers/infiniband/ulp/srp/ib_srp.c	2008-02-05 15:18:33.0 -0800
@@ -885,6 +884,26 @@
   DMA_FROM_DEVICE);
 }
 
+static void srp_reconnect_work(struct work_struct *work)
+{
+	struct srp_target_port *target =
+		container_of(work, struct srp_target_port, work);
+
+	srp_reconnect_target(target);
+}
+
+static void srp_qp_in_err_timer(unsigned long data)
+{
+	struct srp_target_port *target = (struct srp_target_port *)data;
+
+	spin_lock_irq(target->scsi_host->host_lock);
+	INIT_WORK(&target->work, srp_reconnect_work);
+	schedule_work(&target->work);
+	spin_unlock_irq(target->scsi_host->host_lock);
+
+	del_timer(&target->qp_err_timer);
+}
+
 static void srp_completion(struct ib_cq *cq, void *target_ptr)
 {
 	struct srp_target_port *target = target_ptr;
@@ -896,7 +915,16 @@
 			printk(KERN_ERR PFX "failed %s status %d\n",
 			   wc.wr_id & SRP_OP_RECV ? "receive" : "send",
 			   wc.status);
-			target->qp_in_error = 1;
+			if (!target->qp_in_error) {
+				target->qp_in_error = 1;
+				if (!timer_pending(&target->qp_err_timer)) {
+					setup_timer(&target->qp_err_timer,
+						srp_qp_in_err_timer,
+						(unsigned long)target);
+					target->qp_err_timer.expires = 10 * HZ + jiffies;
+					add_timer(&target->qp_err_timer);
+				}
+			}
 			break;
 		}
 
@@ -1004,12 +1032,13 @@
 	struct ib_device *dev;
 	int len;
 
-	if (target->state == SRP_TARGET_CONNECTING)
+	if (target->state == SRP_TARGET_CONNECTING ||
+	target->qp_in_error)
 		goto err;
 
 	if (target->state == SRP_TARGET_DEAD ||
 	target->state == SRP_TARGET_REMOVED) {
-		scmnd->result = DID_BAD_TARGET << 16;
+		scmnd->result = DID_NO_CONNECT << 16;
 		done(scmnd);
 		return 0;
 	}
--- ofa_kernel-1.3.configured/drivers/infiniband/ulp/srp/ib_srp.h	2008-02-05 11:18:16.0 -0800
+++ ofa_kernel-1.3/drivers/infiniband/ulp/srp/ib_srp.h	2008-02-05 11:20:49.0 -0800
@@ -160,6 +160,7 @@
 	int			status;
 	enum srp_target_state	state;
 	int			qp_in_error;
+	struct timer_list	qp_err_timer;
 };
 
 struct srp_iu {

[ewg] traffic jittery, send queue full reports from mthca driver

2008-02-06 Thread Or Gerlitz
I just opened case #897 on the below; it happens with last night's snapshot.

Or

client MT25204 FW 1.2.0 two CPUs, four cores each
server MT25418 FW 2.3.0 two CPUs, four cores each

client : iperf -c $server -P 4 -d -t 3600 -i 1
server : iperf -s -i 1

[  5] 39.0-40.0 sec  29.4 MBytes   246 Mbits/sec
[  4] 39.0-40.0 sec  25.5 MBytes   214 Mbits/sec
[  3] 34.0-35.0 sec  88.0 KBytes   721 Kbits/sec
[  3] 35.0-36.0 sec  0.00 Bytes   0.00 bits/sec
[  3] 36.0-37.0 sec  0.00 Bytes   0.00 bits/sec
[  3] 37.0-38.0 sec  0.00 Bytes   0.00 bits/sec
[  3] 38.0-39.0 sec  0.00 Bytes   0.00 bits/sec
[  3] 39.0-40.0 sec  0.00 Bytes   0.00 bits/sec
[  5] 40.0-41.0 sec  38.5 MBytes   323 Mbits/sec
[  8] 40.0-41.0 sec  36.2 MBytes   304 Mbits/sec
[  9] 40.0-41.0 sec  54.3 MBytes   456 Mbits/sec
[ 10] 40.0-41.0 sec  32.1 MBytes   270 Mbits/sec
[ 11] 40.0-41.0 sec  29.4 MBytes   247 Mbits/sec
[SUM] 40.0-41.0 sec   152 MBytes  1.28 Gbits/sec

ib_mthca :03:00.0: SQ 000404 full (756910656 head, 756910592 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (756915376 head, 756915312 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757146224 head, 757146160 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757146336 head, 757146272 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757317104 head, 757317040 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757361808 head, 757361744 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757361920 head, 757361856 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757515760 head, 757515696 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757515872 head, 757515808 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757515984 head, 757515920 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757516112 head, 757516048 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757516224 head, 757516160 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757516352 head, 757516288 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757516448 head, 757516384 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757516576 head, 757516512 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757523168 head, 757523104 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757531472 head, 757531408 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757531568 head, 757531504 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757548064 head, 757548000 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757582992 head, 757582928 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758082528 head, 758082464 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758162208 head, 758162144 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758232720 head, 758232656 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758232848 head, 758232784 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758232960 head, 758232896 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758233088 head, 758233024 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758303696 head, 758303632 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758303776 head, 758303712 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758307744 head, 758307680 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758307872 head, 758307808 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758334928 head, 758334864 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758335056 head, 758334992 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758341744 head, 758341680 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758341856 head, 758341792 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758396784 

[ewg] Re: traffic jittery, send queue full reports from mthca driver

2008-02-06 Thread Or Gerlitz
 ib_mthca :03:00.0: SQ 000404 full (756910656 head, 756910592 tail, 64 
 max, 0 nreq)
 ib0: failed to post zlen send

Eli,

Can this be a bug in the send ring accounting wrt the zlen packet you use in
the unsig-ud-qp patch?

Or.
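
The numbers in the quoted mthca message can be checked with a few lines of Python. This is only a sketch for reading the log: parse_sq_full is a hypothetical helper, and the sample line is the first message from the report with its mail line wrap joined and the device prefix left out.

```python
import re

# First "SQ full" message from the report (mail line wrap joined).
LINE = "SQ 000404 full (756910656 head, 756910592 tail, 64 max, 0 nreq)"

PATTERN = re.compile(
    r"\((\d+)\s+head,\s+(\d+)\s+tail,\s+(\d+)\s+max,\s+(\d+)\s+nreq\)"
)


def parse_sq_full(line):
    """Extract the counters and the number of work requests still in flight."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not an 'SQ full' line")
    head, tail, max_wr, nreq = (int(g) for g in match.groups())
    return {"in_flight": head - tail, "max": max_wr, "nreq": nreq}
```

For the sample line, head - tail is 756910656 - 756910592 = 64, which equals max. In other words the hardware send queue holds 64 uncompleted work requests at the moment the zero-length send is attempted. The log alone does not show where the accounting goes wrong.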


[ewg] Re: [PATCH] call skb_orphan() after sending an SKB

2008-02-06 Thread Eli Cohen

On Wed, 2008-02-06 at 15:11 +0200, Or Gerlitz wrote:
 Eli Cohen wrote:
  On Wed, 2008-02-06 at 10:17 +0200, Or Gerlitz wrote:
 
  The problem with ttcpv was that it stopped sending packets because it was
  waiting for the memory to be freed. The system did not hang; just the
  application (ttcpv) stopped sending. Other applications could continue
  working over the ipoib interface.
 
 What's ttcpv? Doing a web search I only find ttcp, so I would be happy to
 get a pointer, plus the params you were using to see the problem.
It's a variant of ttcp we're using here in our regression. Dotan, can you
send a pointer?
 
  Also, I see that you have added a call to netif_stop_queue(), is this to
  solve another problem?
 
  This was just a hole that I found in code review: when I post a zero
  length packet, I still want it to affect the net queue control.
 
 Why is posting a zero-length packet related to the net queue control logic?
 I thought it had to do with releasing unsignaled SKBs.

Yes, but if I have no more room in the tx ring I would like to stop the
queue even here.
 
 Or
 



[ewg] RE: traffic jittery, send queue full reports from mthca driver

2008-02-06 Thread Tziporet Koren
OK - Eli found the problem; it will be fixed soon.

Tziporet

-Original Message-
From: Or Gerlitz [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, February 06, 2008 2:54 PM
To: Tziporet Koren
Cc: ewg@lists.openfabrics.org
Subject: traffic jittery, send queue full reports from mthca driver

I just opened case #897 on the below; it happens with last night's
snapshot.

Or

client MT25204 FW 1.2.0 two CPUs, four cores each
server MT25418 FW 2.3.0 two CPUs, four cores each

client : iperf -c $server -P 4 -d -t 3600 -i 1
server : iperf -s -i 1

[  5] 39.0-40.0 sec  29.4 MBytes   246 Mbits/sec
[  4] 39.0-40.0 sec  25.5 MBytes   214 Mbits/sec
[  3] 34.0-35.0 sec  88.0 KBytes   721 Kbits/sec
[  3] 35.0-36.0 sec  0.00 Bytes   0.00 bits/sec
[  3] 36.0-37.0 sec  0.00 Bytes   0.00 bits/sec
[  3] 37.0-38.0 sec  0.00 Bytes   0.00 bits/sec
[  3] 38.0-39.0 sec  0.00 Bytes   0.00 bits/sec
[  3] 39.0-40.0 sec  0.00 Bytes   0.00 bits/sec
[  5] 40.0-41.0 sec  38.5 MBytes   323 Mbits/sec
[  8] 40.0-41.0 sec  36.2 MBytes   304 Mbits/sec
[  9] 40.0-41.0 sec  54.3 MBytes   456 Mbits/sec
[ 10] 40.0-41.0 sec  32.1 MBytes   270 Mbits/sec
[ 11] 40.0-41.0 sec  29.4 MBytes   247 Mbits/sec
[SUM] 40.0-41.0 sec   152 MBytes  1.28 Gbits/sec


[ewg] RE: traffic jittery, send queue full reports from mthca driver

2008-02-06 Thread Eli Cohen
I will check this. 

-Original Message-
From: Or Gerlitz [mailto:[EMAIL PROTECTED] 
Sent: Wed 06 February 2008 14:57
To: Eli Cohen
Cc: ewg@lists.openfabrics.org
Subject: Re: traffic jittery, send queue full reports from mthca driver

 ib_mthca :03:00.0: SQ 000404 full (756910656 head, 756910592 tail, 
 64 max, 0 nreq)
 ib0: failed to post zlen send

Eli,

Can this be a bug in the send ring accounting with respect to the zlen packet 
you use in the unsig-ud-qp patch?

Or.
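
A note on the numbers in these reports: in the "SQ full" lines quoted above
(for example 756910656 head, 756910592 tail), head minus tail equals the
64-entry maximum. That is what one would expect if the ring is treated as full
once head - tail + nreq reaches max. The following is a minimal Python sketch
under that assumption, not the driver's code; the regular expression and the
names parse_sq_full and ring_is_full are made up for illustration.

```python
import re

# Matches one "SQ ... full" report; \s* tolerates the line wrap after "tail,".
SQ_FULL = re.compile(
    r"SQ (?P<qpn>[0-9a-fA-F]+) full \((?P<head>\d+) head, (?P<tail>\d+) tail,\s*"
    r"(?P<max>\d+) max, (?P<nreq>\d+) nreq\)"
)


def parse_sq_full(log_text):
    """Return one dict per 'SQ full' report found in log_text."""
    entries = []
    for m in SQ_FULL.finditer(log_text):
        head = int(m.group("head"))
        tail = int(m.group("tail"))
        entries.append({
            "qpn": m.group("qpn"),
            "head": head,
            "tail": tail,
            "max": int(m.group("max")),
            "nreq": int(m.group("nreq")),
            # Work requests posted but not yet reclaimed.
            "outstanding": head - tail,
        })
    return entries


def ring_is_full(head, tail, max_entries, nreq=0):
    """Assumed fullness rule: no room left once outstanding + nreq reaches max."""
    return (head - tail) + nreq >= max_entries
```

If the reports are consistent with this rule, every parsed entry has
outstanding equal to max, which points at entries not being reclaimed (tail not
advancing) rather than at a miscounted head.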
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg][PATCH][0/2] SRP multipath failover within 60 seconds,

2008-02-06 Thread Tziporet Koren

Vu Pham wrote:
The following patches help SRP/dm-multipath fail over within 60 
seconds (bugzilla #577) without data corruption or read/write errors.


1. srp_disconnect_without_wait.patch - srp sends the disconnect request 
without waiting for the CM timewait exit event, since srp currently does 
not re-use the cm_id and qp/cq of a connection (the patch 
srp_1_recreate_at_reconnect.patch, already in kernel_patches/fixes, 
recreates the cm_id and qp/cq for a connection at reconnect).
2. srp_qp_in_err_timer_reconnect_target.patch - on detecting a 
post_send/post_receive error, srp sets qp_in_error, sets a timer to 
reconnect to the target, returns SCSI_MLQUEUE_HOST_BUSY to lock the queue, 
and returns DID_NO_CONNECT when the target state is DEAD or REMOVED.


Here is my multipath.conf:
defaults {
   udev_dir                /dev
   polling_interval        5
   selector                "round-robin 0"
   path_grouping_policy    multibus
   getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
   prio_callout            /bin/true
   path_checker            readsector0
   rr_min_io               100
   rr_weight               priorities
   failback                immediate
   no_path_retry           5
   user_friendly_names     no
}
I also set srp_daemon.sh to rescan fabric every 60 seconds (instead of 
300 secs as default setting)


I ran a data integrity test against /dev/mapper/devices while repeating 
{disable path 1, sleep 90, enable path 1, sleep 60, disable path 2, 
sleep 90, enable path 2, sleep 60} in a loop.
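
The test loop described above can be sketched as follows. This is a hedged
Python outline of the schedule only: set_path is a hypothetical callback
standing in for whatever actually disables or enables a path (the message does
not say how that was done), and the function names are illustrative.

```python
import time


def failover_schedule(paths=(1, 2), down_secs=90, up_secs=60):
    """Yield (action, path, seconds_to_sleep) for one pass over all paths."""
    for path in paths:
        yield ("disable", path, down_secs)
        yield ("enable", path, up_secs)


def run_failover_loop(set_path, iterations, sleep=time.sleep):
    """Drive the schedule; set_path(path, enabled) is supplied by the caller."""
    for _ in range(iterations):
        for action, path, wait in failover_schedule():
            set_path(path, action == "enable")
            sleep(wait)
```

One pass takes 2 * (90 + 60) = 300 seconds, and at any moment at most one path
is down, so the data integrity test always has a surviving path to fail over to.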


RHEL5 and 5.1 work very well (no data corruption or read/write failures reported).
For SLES 10 SP1, it works well as long as I run *multipath* every 60 
secs. I think that I misconfigured multipathd somehow (here is 
how I set it up: using the same multipath.conf above, chkconfig 
boot.multipath on and chkconfig multipathd on).


  -vu



This fixes issue 577 (https://bugs.openfabrics.org/show_bug.cgi?id=577), 
which was found in OFED 1.2.

Vlad - please take this

Tziporet



[ewg] OFED teleconf 18 Feb

2008-02-06 Thread Jeff Squyres
Some US companies mark 18 Feb as a holiday (President's Day), so per  
request, I'm moving the OFED teleconference from 18 Feb to 19 Feb  
(same time slot).


You'll receive an Outlook meeting update shortly.

--
Jeff Squyres
Cisco Systems



[ewg] [GIT PULL] ~sashak/management.git

2008-02-06 Thread Sasha Khapyorsky
Hi Vlad,

Please pull recent ofed_1_3 branch of ~sashak/management.git.

The changes are:

Ira K. Weiny (2):
  Move opensm.8 man page in prep for making config file changes.
  Update man page for configurable partition and prefix-routes file

Ira Weiny (1):
  Add node name map, partition config, and QOS policy config files to the 
FILES section of man page.

Sasha Khapyorsky (1):
  opensm: scripts/opensmd - fix opensm path.


Thanks,
Sasha


Re: [ewg] traffic jittery, send queue full reports from mthca driver

2008-02-06 Thread Shirley Ma
Hello Or,

I found that if you increase send_queue_size and recv_queue_size,
to 1K for example, this problem goes away.

Thanks
Shirley



[ewg] Re: [GIT PULL] ~sashak/management.git

2008-02-06 Thread Vladimir Sokolovsky

Sasha Khapyorsky wrote:

Hi Vlad,

Please pull recent ofed_1_3 branch of ~sashak/management.git.

The changes are:

Ira K. Weiny (2):
  Move opensm.8 man page in prep for making config file changes.
  Update man page for configurable partition and prefix-routes file

Ira Weiny (1):
  Add node name map, partition config, and QOS policy config files to the 
FILES section of man page.

Sasha Khapyorsky (1):
  opensm: scripts/opensmd - fix opensm path.



Done,

Regards,
Vladimir


[ewg] RE: Updated: OFED weekly teleconference

2008-02-06 Thread Rupert Dance
Jeff,

Just so you know, this conflicts with the OFA IWG meeting, which has always
been held from 11:30 - 1:00 PM EST on Tuesdays. Since this is a one-time
occurrence, I would not change anything, but I just thought you should know.

Rupert

 _
 From: Jeff Squyres (jsquyres) [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, February 06, 2008 10:03 AM
 To:   ewg@lists.openfabrics.org
 Cc:   Scott Bahling; John Russo; Ryan, Jim; Ken L Johnson;
 [EMAIL PROTECTED]; [EMAIL PROTECTED]; Head Bubba; Van Houten,
 Betty; Patrick Mullaney
 Subject:  Updated: OFED weekly teleconference
 When: Tuesday, February 19, 2008 12:00 PM-1:00 PM (GMT-05:00) Eastern Time
 (US & Canada).
 Where:ID: 210020028
 
 
 
 __
  
 Jeffrey Squyres has invited you to a Cisco Unified MeetingPlace Conference
 
 
 Date/Time:   FEB 19, 2008 at 12:00PM America/New_York 
 Length:  60 
 Frequency:   3 
 Meeting ID:  210020028 
 Meeting Password:
 
 Global Access Numbers: 
 http://cisco.com/en/US/about/doing_business/conferencing/index.html 
 
 US/Canada:  +1.866.432.9903United Kingdom:   +44.20.8824.0117 
 India:  +91.80.4103.3979   Germany:  +49.619.6773.9002 
 Japan:  +81.3.5763.9394China:+86.10.8515.5666 
 
 TO ATTEND A WEB AND VOICE CONFERENCE: 
 
 CISCO INTRANET ATTENDEES 
 Join the Web & Voice Conference* 
 1. Go to http://meetingplaceinternal.cisco.com/join.asp?210020028 
 2. Enter your CEC User ID & Password then click OK 
 - Accept any security warnings you receive and wait for the Meeting Room
 to initialize 
 3. Click on CONNECT from the Meeting Room to join the Voice Conference
 portion of the meeting 
 
 EXTERNAL ATTENDEES - Outside the Cisco Intranet** 
 Join the Web & Voice Conference* 
 1. Go to http://meetingplace.cisco.com/join.asp?210020028 
 2. Fill in the My Name is field then click Attend Meeting 
 - If you have a CEC User ID, click on the Cisco icon 
 - Accept any security warnings you receive and wait for the Meeting Room
 to initialize 
 3. Click on CONNECT from the Meeting Room to join the Voice Conference
 portion of the meeting 
 - Note: Guest users will see a link to the Global Access Numbers. 
 
 *If this is your first time attending a Web Conference, disable any pop-up
 blockers and visit
 http://meetingplace.cisco.com/mpweb/scripts/browsertestupper.asp to test
 your web browser for compatibility with the Web Conference.
 
 **Not all meetings are scheduled to allow external attendees into the Web
 Conference portion of the meeting, if the URL does not work, please follow
 the Voice only Conference instructions below to attend.
 
 TO ATTEND A VOICE ONLY CONFERENCE 
 1. Dial into Cisco Unified MeetingPlace (view the Access Numbers and link
 above) 
 2. Press 1 to attend the meeting 
 3. Follow the prompts to enter the Meeting ID 210020028 and join the
 meeting 
 
 SUPPORT 
 Information about this Conference: Contact Jeffrey Squyres, 85250971 
 Cisco IT Support Center: Attend the Voice Conference and then press #0 on
 your phone keypad 
 
 GLOBAL ACCESS NUMBERS 
 
 COUNTRY        LOCATION          LOCAL NUMBER         TOLL FREE-FREEFONE

 Algeria        Algiers           +213.21.98.9047
 Argentina      Buenos Aires      +54.11.4341.0101
 Australia      Canberra          +61.2.6216.0643
                Melbourne         +61.3.9659.4173
                North Sydney      +61.2.8446.5260
 Austria        Vienna            +43.12.4030.6022
 Azerbaijan     Baku              +994.12.437.4829
 Belgium        Brussels          +32.2.704.5072
 Bosnia &
 Herzegovina    Sarajevo          +387.33.56.2898
 Brazil         Brasilia          +55.613.424.0220
                Rio de Janeiro    +55.21.2483.6302
                Sao Paulo         +55.11.5508.6311
 Bulgaria       Sofia             +359.2.937.5938
 Canada         Calgary           +1.403.514.2435
                Edmonton          +1.780.441.3715
                Halifax           +1.902.474.0214
                Kanata            +1.613.254.0005
                Markham           +1.905.470.4810
                Montreal          +1.514.847.6875
                Ottawa            +1.613.788.7250
                Quebec            +1.418.634.5645
                Regina            +1.306.566.6410
                Toronto           +1.416.306.7230
                Vancouver         +1.604.647.2350
                Winnipeg          +1.204.336.6610
 Chile          Santiago          +56.2.431.4936
 China          Beijing           +86.10.8515.5666
                Chengdu           +86.28.8696.1333
                Guangzhou         +86.20.8519.
  

[ewg] Re: Updated: OFED weekly teleconference

2008-02-06 Thread Jeff Squyres
Note that I'm not the one who schedules the EWG teleconferences; I'm  
just the guy who provides the phone bridge.


Tziporet is the OFED release manager and coordinates the EWG  
teleconferences.



On Feb 6, 2008, at 10:12 AM, Rupert Dance wrote:


Jeff,

Just so you know, this conflicts with the OFA IWG meeting, which has  
always been held from 11:30 - 1:00 PM EST on Tuesdays. Since this  
is a one-time occurrence, I would not change anything, but I just  
thought you should know.


Rupert

_
From:    Jeff Squyres (jsquyres) [mailto:[EMAIL PROTECTED]
Sent:    Wednesday, February 06, 2008 10:03 AM
To:      ewg@lists.openfabrics.org
Cc:      Scott Bahling; John Russo; Ryan, Jim; Ken L Johnson; [EMAIL PROTECTED];
         [EMAIL PROTECTED]; Head Bubba; Van Houten, Betty; Patrick Mullaney
Subject: Updated: OFED weekly teleconference
When:    Tuesday, February 19, 2008 12:00 PM-1:00 PM (GMT-05:00)
         Eastern Time (US & Canada).
Where:   ID: 210020028





Re: [ewg] OFED 1.3 rc4 update

2008-02-06 Thread Shirley Ma
On Wed, 2008-02-06 at 18:25 +0200, Tziporet Koren wrote:
 Hi,
 
 We will have OFED 1.3-rc4 tomorrow after one more night of regression
 
 It will include:
 
1. IPoIB: Non-SRQ for CM mode
2. IPOIB: 4K MTU
3. IPoIB - Small messages improvements
 
 Note that today's latest build will include these features too, if 
 someone wants to test it today
 
 Tziporet

Thanks Tziporet. We will test it right after it's out.

Thanks
Shirley



Re: [ewg] Re: with the ipoib patches, debug prints spam the system log

2008-02-06 Thread Or Gerlitz
On 2/6/08, Eli Cohen [EMAIL PROTECTED] wrote:
 They are only visible when activating ipoib debug. I know it fills the
 dmesg ring with messages. Do you think I should remove them?

Yes, you should remove them.

The ipoib debug prints are very useful for debugging and analysis in the
field. However, your addition of 3 prints per second makes them useless,
at least for me, and I use them a lot when working to debug and help
others, so please do.

Or



 On Wed, 2008-02-06 at 10:38 +0200, Or Gerlitz wrote:
  Eli,
 
  You have somehow left too many debug prints in the last patches,
  please clean this up. See for example the system log after less
  than a minute with ipoib debug prints enabled: it has one original
  print (ib0: Send unicast ARP to 0023) and all the rest are yours.
 
  Or
 
  Feb  6 14:39:23  kernel: ib0: posting zlen send, wrid = 4: head = 2756, 
  tail = 2752
  Feb  6 14:39:23  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2757
  Feb  6 14:39:25  kernel: ib0: posting zlen send, wrid = 39: head = 2919, 
  tail = 2912
  Feb  6 14:39:25  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2920
  Feb  6 14:39:25  kernel: ib0: posting zlen send, wrid = 15: head = 2959, 
  tail = 2944
  Feb  6 14:39:25  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2960
  Feb  6 14:39:27  kernel: ib0: posting zlen send, wrid = 8: head = 3080, 
  tail = 3072
  Feb  6 14:39:27  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3081
  Feb  6 14:39:34  kernel: ib0: posting zlen send, wrid = 51: head = 3699, 
  tail = 3696
  Feb  6 14:39:34  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3700
  Feb  6 14:39:35  kernel: ib0: posting zlen send, wrid = 25: head = 3737, 
  tail = 3728
  Feb  6 14:39:35  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3738
  Feb  6 14:39:35  kernel: ib0: posting zlen send, wrid = 3: head = 3779, 
  tail = 3776
  Feb  6 14:39:35  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3780
  Feb  6 14:39:36  kernel: ib0: posting zlen send, wrid = 48: head = 3824, 
  tail = 3808
  Feb  6 14:39:36  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3825
  Feb  6 14:39:38  kernel: ib0: posting zlen send, wrid = 24: head = 3992, 
  tail = 3984
  Feb  6 14:39:38  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3993
  Feb  6 14:39:38  kernel: ib0: posting zlen send, wrid = 4: head = 4036, 
  tail = 4032
  Feb  6 14:39:38  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 4037
  Feb  6 14:39:46  kernel: ib0: Send unicast ARP to 0023
  Feb  6 14:39:46  kernel: ib0: posting zlen send, wrid = 11: head = 4683, 
  tail = 4672
  Feb  6 14:39:46  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 4684
  Feb  6 14:39:58  kernel: ib0: posting zlen send, wrid = 58: head = 5626, 
  tail = 5616
  Feb  6 14:39:58  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5627
  Feb  6 14:39:59  kernel: ib0: posting zlen send, wrid = 56: head = 5752, 
  tail = 5744
  Feb  6 14:39:59  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5753
  Feb  6 14:40:01  kernel: ib0: posting zlen send, wrid = 54: head = 5878, 
  tail = 5872
  Feb  6 14:40:01  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5879
  Feb  6 14:40:01  kernel: ib0: posting zlen send, wrid = 30: head = 5918, 
  tail = 5904
  Feb  6 14:40:01  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5919
  Feb  6 14:40:10  kernel: ib0: posting zlen send, wrid = 33: head = 6689, 
  tail = 6672
  Feb  6 14:40:10  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 6690
  Feb  6 14:40:13  kernel: ib0: posting zlen send, wrid = 48: head = 6896, 
  tail = 6880
  Feb  6 14:40:13  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 6897
  Feb  6 14:40:13  kernel: ib0: posting zlen send, wrid = 26: head = 6938, 
  tail = 6928
  Feb  6 14:40:13  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 6939
  Feb  6 14:40:15  kernel: ib0: posting zlen send, wrid = 61: head = 7101, 
  tail = 7088
  Feb  6 14:40:15  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 7102
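
To quantify how much of such a log comes from the new prints, the lines can be
tallied by message type. A minimal Python sketch, assuming the syslog layout
shown above; the category names and count_debug_prints are hypothetical
helpers, not part of any ipoib tooling.

```python
from collections import Counter
import re

# Only lines carrying a "kernel: <dev>: <message>" part are counted, so the
# wrapped "tail = ..." continuation lines are skipped.
LINE = re.compile(r"kernel: (?P<dev>\w+): (?P<msg>.*)")


def classify(msg):
    """Map a message to a coarse category based on its leading text."""
    if msg.startswith("posting zlen send"):
        return "zlen send"
    if msg.startswith("ipoib_ib_tx_timer_func"):
        return "tx timer"
    return "other"


def count_debug_prints(log_text):
    """Count log lines per category."""
    counts = Counter()
    for line in log_text.splitlines():
        m = LINE.search(line)
        if m:
            counts[classify(m.group("msg"))] += 1
    return counts
```

Everything that lands in "other" is a pre-existing print such as the unicast
ARP message; the two named categories are the ones added by the patches.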



Re: [ewg] OFED 1.3 rc4 update

2008-02-06 Thread Tziporet Koren

Shirley Ma wrote:


Thanks Tziporet. We will test it right after it's out.

  

You can start using the latest build: 
http://www.openfabrics.org/builds/ofed-1.3/OFED-1.3-20080206-0751.tgz

Tziporet






Re: [ewg] Re: with the ipoib patches, debug prints spam the system log

2008-02-06 Thread Eli Cohen

On Wed, 2008-02-06 at 18:42 +0200, Or Gerlitz wrote:
 On 2/6/08, Eli Cohen [EMAIL PROTECTED] wrote:
  They are only visible when activating ipoib debug. I know it fills the
  dmesg ring with messages. Do you think I should remove them?
 
 Yes, you should remove them.
 
 The ipoib debug prints are very useful for debugging and analysis in the
 field. However, your addition of 3 prints per second makes them useless,
 at least for me, and I use them a lot when working to debug and help
 others, so please do.
 
 Or
 
 

OK



Re: [ewg][PATCH][1/2] SRP multipath failover within 60 seconds,

2008-02-06 Thread Roland Dreier
  diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
  index 950228f..45a2533 100644
  --- a/drivers/infiniband/ulp/srp/ib_srp.c
  +++ b/drivers/infiniband/ulp/srp/ib_srp.c
  @@ -400,7 +400,6 @@
   		printk(KERN_DEBUG PFX "Sending CM DREQ failed\n");
   		return;
   	}
  -	wait_for_completion(&target->done);
   }
   
   static void srp_remove_work(struct work_struct *work)
  @@ -1266,7 +1294,6 @@
   	case IB_CM_TIMEWAIT_EXIT:
   		printk(KERN_ERR PFX "connection closed\n");
   
  -		comp = 1;
   		target->status = 0;
   		break;

Seems like this would leak the cm_id?


Re: [ewg][PATCH][1/2] SRP multipath failover within 60 seconds,

2008-02-06 Thread Vu Pham

Roland Dreier wrote:

  diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
  index 950228f..45a2533 100644
  --- a/drivers/infiniband/ulp/srp/ib_srp.c
  +++ b/drivers/infiniband/ulp/srp/ib_srp.c
  @@ -400,7 +400,6 @@
   		printk(KERN_DEBUG PFX "Sending CM DREQ failed\n");
   		return;
   	}
  -	wait_for_completion(&target->done);
   }
   
   static void srp_remove_work(struct work_struct *work)
  @@ -1266,7 +1294,6 @@
   	case IB_CM_TIMEWAIT_EXIT:
   		printk(KERN_ERR PFX "connection closed\n");
   
  -		comp = 1;
   		target->status = 0;
   		break;

Seems like this would leak the cm_id?


As I said in my [0/2] email, this patch should be applied on 
top of srp_1_recreate_at_reconnect.patch, which is already in 
the ofed_1_3.git tree under the kernel_patches/fixes/ directory.


I have attached it here:


Hello, Roland!
Please consider the following for 2.6.19.

---

From: Ishai Rabinovitz [EMAIL PROTECTED]

For some reason (could be a firmware problem) I got a CQ overrun in SRP.
Because of that there was a QP FATAL. Since we do not destroy the QP in
srp_reconnect_target, the QP FATAL persists after the reconnect.
To be able to recover from such a situation, I suggest we
destroy the CQ and the QP on every reconnect.

This also corrects a minor spec non-compliance: when srp_reconnect_target
is called, srp destroys the CM ID and resets the QP, so the new connection
is retried with the same QPN, which could theoretically lead to
stale packets (for strict spec compliance I think the QPN should not be reused
until all stale packets are flushed out of the network).

---

IB/srp: destroy/re-create QP and CQ on each reconnect.
This makes SRP more robust in presence of hardware errors
and is closer to behaviour suggested by IB spec,
reducing chance of stale packets.

Signed-off-by: Ishai Rabinovitz [EMAIL PROTECTED]
Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c	2006-08-31 12:23:52.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c	2006-08-31 12:30:48.0 +0300
@@ -495,10 +495,10 @@
 static int srp_reconnect_target(struct srp_target_port *target)
 {
 	struct ib_cm_id *new_cm_id;
-	struct ib_qp_attr qp_attr;
 	struct srp_request *req, *tmp;
-	struct ib_wc wc;
 	int ret;
+	struct ib_cq *old_cq;
+	struct ib_qp *old_qp;
 
 	spin_lock_irq(target->scsi_host->host_lock);
 	if (target->state != SRP_TARGET_LIVE) {
@@ -522,17 +522,17 @@
 	ib_destroy_cm_id(target->cm_id);
 	target->cm_id = new_cm_id;
 
-	qp_attr.qp_state = IB_QPS_RESET;
-	ret = ib_modify_qp(target->qp, &qp_attr, IB_QP_STATE);
-	if (ret)
-		goto err;
-
-	ret = srp_init_qp(target, target->qp);
-	if (ret)
+	old_qp = target->qp;
+	old_cq = target->cq;
+	ret = srp_create_target_ib(target);
+	if (ret) {
+		target->qp = old_qp;
+		target->cq = old_cq;
 		goto err;
+	}
 
-	while (ib_poll_cq(target->cq, 1, &wc) > 0)
-		; /* nothing */
+	ib_destroy_qp(old_qp);
+	ib_destroy_cq(old_cq);
 
 	spin_lock_irq(target->scsi_host->host_lock);
 	list_for_each_entry_safe(req, tmp, &target->req_queue, list)

-- 
MST
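
The reconnect logic in the patch above follows a create-new-then-destroy-old
pattern with rollback on failure. A rough Python model of that control flow,
for illustration only: this is not kernel code, and Target, create_ib and
destroy are hypothetical stand-ins for srp_target_port, srp_create_target_ib
and ib_destroy_qp/ib_destroy_cq.

```python
class Target:
    """Stand-in for the two srp_target_port fields the patch touches."""

    def __init__(self, qp, cq):
        self.qp = qp
        self.cq = cq


def recreate_resources(target, create_ib, destroy):
    """Replace target.qp/target.cq, keeping the old pair if creation fails."""
    old_qp, old_cq = target.qp, target.cq
    try:
        # create_ib is expected to store the new QP and CQ on the target.
        create_ib(target)
    except Exception:
        # Roll back so the caller still sees the previous, valid handles.
        target.qp, target.cq = old_qp, old_cq
        raise
    # Only after the new pair exists are the old resources released.
    destroy(old_qp)
    destroy(old_cq)
```

The ordering matters: because the old QP is destroyed only after a new one has
been created, a failed reconnect never leaves the target without a QP and CQ,
and a successful one always comes up on a fresh QP.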


Re: [ofa-general] Re: [ewg] OFED 1.3 rc4 update

2008-02-06 Thread Pradeep Satyanarayana
Tziporet Koren wrote:
 Shirley Ma wrote:

 Thanks Tziporet. We will test it right after it's out.

   
 You can start using the latest build:
 http://www.openfabrics.org/builds/ofed-1.3/OFED-1.3-20080206-0751.tgz
 
 Tziporet
 

I have downloaded today's build mentioned above. I am still seeing the issue
of ib_destroy_cq() failing for the rcq that I mentioned yesterday.

Here are the steps that I follow:

1. On a freshly booted system, configure ib0
2. Switch to connected mode (on an HCA that supports SRQ)
3. ping the remote interface
4. modprobe -r ib_ehca
5. I see the failures about ib_destroy_cq() failing and the
cascading failures following that (srq and pd cannot be destroyed)
6. If I try a modprobe ib_ehca I get the error "Cannot allocate memory".
This also means someone is chewing up tons of memory. I realize that the
qp and associated pd were not freed, so some memory is lost. However,
this system has 8 GB of memory.

Pradeep




[ewg] Re: [ofa-general] [ANNOUNCE] open iSCSI over iSER target RPM is available

2008-02-06 Thread Joe Landman

Hi Erez

Erez Zilber wrote:

stgt (SCSI target) is an open-source framework for storage target
drivers. It supports iSCSI over iSER among other storage target drivers.

Voltaire added a git tree for stgt that will be added to OFED 1.4:
http://www2.openfabrics.org/git/?p=~dorons/tgt.git;a=summary

Until OFED 1.4 gets released, it is possible to install the stgt RPM on
top of OFED 1.3. For more details about how to install and use stgt,
please refer to https://wiki.openfabrics.org/tiki-index.php?page=ISER-target

Some performance numbers that were measured by OSC (using SDR cards):


Is there a 2TB limit on this target? It turns our 6TB partition into a 
2TB lun.



* READ: 920 MB/sec
* WRITE: 850 MB/sec


Not getting anything even remotely close to this.  Are there more 
details on configuration somewhere?  I followed the web page as indicated.


Joe



We hope to have DDR measurement numbers soon.




--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
   http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615


[ewg] Re: [Stgt-devel] [ofa-general] [ANNOUNCE] open iSCSI over iSER target RPM is available

2008-02-06 Thread FUJITA Tomonori
On Wed, 06 Feb 2008 16:38:11 -0500
Joe Landman [EMAIL PROTECTED] wrote:

 Hi Erez
 
 Erez Zilber wrote:
  stgt (SCSI target) is an open-source framework for storage target
  drivers. It supports iSCSI over iSER among other storage target drivers.
  
  Voltaire added a git tree for stgt that will be added to OFED 1.4:
  http://www2.openfabrics.org/git/?p=~dorons/tgt.git;a=summary
  
  Until OFED 1.4 gets released, it is possible to install the stgt RPM on
  top of OFED 1.3. For more details about how to install and use stgt,
  please refer to https://wiki.openfabrics.org/tiki-index.php?page=ISER-target
  
  Some performance numbers that were measured by OSC (using SDR cards):
 
 Is there a 2TB limit on this target? It turns our 6TB partition into a 
 2TB lun.

No, there isn't.


Re: [ofa-general] Re: [ewg] OFED 1.3 rc4 update

2008-02-06 Thread Pradeep Satyanarayana
Pradeep Satyanarayana wrote:
 Tziporet Koren wrote:
 Shirley Ma wrote:
 Thanks Tziporet. We will test it right after it's out.

   
 You can start using the latest build:
 http://www.openfabrics.org/builds/ofed-1.3/OFED-1.3-20080206-0751.tgz

 Tziporet

 
 I have downloaded today's build mentioned above. I am still seeing the 
 issue of ib_destroy_cq() failing for the rcq that I mentioned yesterday.
 
 Here are the steps that I follow:
 
 1. On a freshly booted system configure ib0
 2. Switch to connected mode ( on HCA that supports SRQ)
 3. ping remote interface
 4. modprobe -r ib_ehca
 5. I see the failures about ib_destroy_cq() failing and the
 cascading failures following that (srq and pd cannot be destroyed)

ib_destroy_qp() fails because refcnt is not zero. On my
system it was set to 2.

Pradeep



Re: [ewg] with the ipoib patches, debug prints spam the system log

2008-02-06 Thread Or Gerlitz

Or Gerlitz wrote:

You have somehow left too many debug prints in the last patches,
please clean this up. See for example the system log after less
than a minute with ipoib debug prints enabled: it has one original
print (ib0: Send unicast ARP to 0023) and all the rest are yours.



Feb  6 14:39:23  kernel: ib0: posting zlen send, wrid = 4: head = 2756, tail = 
2752
Feb  6 14:39:23  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2757


Hi Eli,

Just a reminder to remove this for RC4. With last night's snapshot I 
still see it.


Or.


