Re: [ewg] with the ipoib patches, debug prints spam the system log

2008-02-06 Thread Or Gerlitz

Or Gerlitz wrote:

You have left far too many debug prints in the last patches;
please clean this up. See, for example, how the system log looks after less
than a minute with ipoib debug prints enabled: it has one original
print ("ib0: Send unicast ARP to 0023") and all the rest are yours.



Feb  6 14:39:23  kernel: ib0: posting zlen send, wrid = 4: head = 2756, tail = 2752
Feb  6 14:39:23  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2757


Hi Eli,

Just a reminder to remove this for RC4; using last night's snapshot I 
still see it.


Or.



___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] Re: traffic jittery, send queue full reports from mthca driver

2008-02-06 Thread Or Gerlitz
>> ib_mthca :03:00.0: SQ 000404 full (756910656 head, 756910592 tail, 64 
>> max, 0 nreq)
>> ib0: failed to post zlen send

OK, Eli, taking the kernel bits from OFED-1.3-20080206-0751.tgz I don't
see these prints any more. When probing out the driver in order to replace
it with the drop, I got the following:

ib0: timing out; will leak address handles
ib0: ib_dealloc_pd failed

So, is this another issue, or is it related to the room-for-zlen-in-ring-accounting fix?

Or
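
The numbers in the "SQ ... full" message can be read as plain ring accounting. The following is a minimal userspace sketch, not the mthca or ipoib code: it assumes head and tail are free-running unsigned counters and that the ring size is a power of two. The function names are hypothetical. With head = 756910656 and tail = 756910592, the difference is 64, which equals the reported maximum, so the queue is full. The ipoib debug lines elsewhere in this digest (wrid = 4 with head = 2756) are consistent with a slot index of head & 63, i.e. a 64-entry ring, but that is an inference from the log.

```c
#include <stdbool.h>
#include <stdint.h>

/* Entries currently posted. head and tail are free-running counters, so
 * unsigned subtraction stays correct even after head wraps past 2^32. */
static uint32_t ring_used(uint32_t head, uint32_t tail)
{
	return head - tail;
}

/* True when the ring has no room left once nreq pending entries are counted. */
static bool ring_full(uint32_t head, uint32_t tail, uint32_t max, uint32_t nreq)
{
	return ring_used(head, tail) + nreq >= max;
}

/* Slot index for a counter value; size is assumed to be a power of two. */
static uint32_t ring_slot(uint32_t counter, uint32_t size)
{
	return counter & (size - 1);
}
```

Calling ring_full(756910656u, 756910592u, 64, 0) reports a full ring, matching the log line above.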



Re: [ewg][PATCH][0/2] SRP multipath failover within 60 seconds,

2008-02-06 Thread Vladimir Sokolovsky

Vu Pham wrote:
The following patches help SRP/dm-multipath fail over within 60 
seconds (bugzilla #577) without data corruption or read/write errors.


1. srp_disconnect_without_wait.patch - srp sends the disconnect request
without waiting for the CM timewait exit event, since srp currently does not 
re-use the cm_id and qp/cq of a connection (patch 
srp_1_recreate_at_reconnect.patch, already in kernel_patches/fixes, 
recreates the cm_id and qp/cq for a connection at reconnect).
2. srp_qp_in_err_timer_reconnect_target.patch - on detecting a 
post_send/post_receive error, srp sets qp_in_error, sets a timer to 
reconnect to the target, returns SCSI_MLQUEUE_HOST_BUSY to block the queue, 
and returns DID_NO_CONNECT when the target state is DEAD or REMOVED.
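
The decision described in item 2 can be sketched as a small classification function. This is a userspace illustration only, not the patch itself: the enum names, the struct, and the order of the two checks are assumptions made for the example. Only the qp_in_error flag, the DEAD/REMOVED states and the two return dispositions come from the description above.

```c
/* Hypothetical stand-ins for the target states named in the description. */
enum target_state { TARGET_LIVE, TARGET_DEAD, TARGET_REMOVED };

/* What the queuecommand path would do with an incoming command. */
enum disposition {
	DISPATCH,    /* post the command as usual */
	HOST_BUSY,   /* analogous to returning SCSI_MLQUEUE_HOST_BUSY */
	NO_CONNECT   /* analogous to completing with DID_NO_CONNECT */
};

struct target {
	enum target_state state;
	int qp_in_error;   /* set when a post_send/post_receive error is seen */
};

static enum disposition classify_command(const struct target *t)
{
	/* A dead or removed target fails the command so multipath can
	 * move the I/O to another path. */
	if (t->state == TARGET_DEAD || t->state == TARGET_REMOVED)
		return NO_CONNECT;

	/* While the QP is in error and a reconnect is pending, ask the
	 * SCSI midlayer to retry later instead of failing the command. */
	if (t->qp_in_error)
		return HOST_BUSY;

	return DISPATCH;
}
```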


Here is my multipath.conf:
defaults {
   udev_dir                /dev
   polling_interval        5
   selector                "round-robin 0"
   path_grouping_policy    multibus
   getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
   prio_callout            /bin/true
   path_checker            readsector0
   rr_min_io               100
   rr_weight               priorities
   failback                immediate
   no_path_retry           5
   user_friendly_names     no
}
I also set srp_daemon.sh to rescan the fabric every 60 seconds (instead of 
the default of 300 seconds).


I ran a data integrity test against /dev/mapper/ while running {disable 
path 1, sleep 90, enable path 1, sleep 60, disable path 2, sleep 90, enable 
path 2, sleep 60} in a loop.


RHEL 5 and 5.1 work very well (no data corruption, no read/write failures reported).
For SLES 10 SP1, it works well as long as I run *multipath* every 60 
seconds. I think I misconfigured multipathd somehow. (Here is how 
I set it up: using the same multipath.conf as above, chkconfig 
boot.multipath on and chkconfig multipathd on.)


  -vu



Applied,
kernel_patches/fixes/srp_2_disconnect_without_wait.patch
kernel_patches/fixes/srp_3_qp_err_timer_reconnect_target.patch

Regards,
Vladimir


Re: [ofa-general] Re: [ewg] OFED 1.3 rc4 update

2008-02-06 Thread Pradeep Satyanarayana
Pradeep Satyanarayana wrote:
> Tziporet Koren wrote:
>> Shirley Ma wrote:
>>> Thanks Tziporet. We will test it right after it's out.
>>>
>>>   
>> You can start using the latest build -
>> http://www.openfabrics.org/builds/ofed-1.3/OFED-1.3-20080206-0751.tgz
>>
>> Tziporet
>>
> 
> I have downloaded today's build mentioned above. I am still seeing the issue
> of ib_destroy_cq() failing for the rcq, mentioned yesterday.
> 
> Here are the steps that I follow:
> 
> 1. On a freshly booted system configure ib0
> 2. Switch to connected mode ( on HCA that supports SRQ)
> 3. ping remote interface
> 4. modprobe -r ib_ehca
> 5. I see the failures about ib_destroy_cq() failing and the
> cascading failures following that (srq and pd cannot be destroyed)

ib_destroy_qp() fails because the refcnt is not zero. On my
system it was set to 2.
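
The failure mode can be illustrated with a small userspace sketch of a reference-counted resource. This is not the verbs implementation: the struct, the function names and the choice of -EBUSY as the error code are assumptions for the example. It only shows why a destroy call refuses to proceed while other objects still hold references, and why dependent objects (srq, pd) then fail to tear down in turn.

```c
#include <errno.h>

/* Hypothetical resource with a count of users still referencing it. */
struct resource {
	int refcnt;
	int destroyed;
};

/* Drop one reference, for example when a dependent object goes away. */
static void resource_put(struct resource *r)
{
	r->refcnt--;
}

/* Destroy only succeeds once nobody references the resource any more. */
static int resource_destroy(struct resource *r)
{
	if (r->refcnt != 0)
		return -EBUSY;
	r->destroyed = 1;
	return 0;
}
```

With refcnt at 2, as reported above, resource_destroy() fails until both references have been dropped.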

Pradeep



[ewg] Re: [Stgt-devel] [ofa-general] [ANNOUNCE] open iSCSI over iSER target RPM is available

2008-02-06 Thread FUJITA Tomonori
On Wed, 06 Feb 2008 16:38:11 -0500
Joe Landman <[EMAIL PROTECTED]> wrote:

> Hi Erez
> 
> Erez Zilber wrote:
> > stgt (SCSI target) is an open-source framework for storage target
> > drivers. It supports iSCSI over iSER among other storage target drivers.
> > 
> > Voltaire added a git tree for stgt that will be added to OFED 1.4:
> > http://www2.openfabrics.org/git/?p=~dorons/tgt.git;a=summary
> > 
> > Until OFED 1.4 gets released, it is possible to install the stgt RPM on
> > top of OFED 1.3. For more details about how to install and use stgt,
> > please refer to https://wiki.openfabrics.org/tiki-index.php?page=ISER-target
> > 
> > Some performance numbers that were measured by OSC (using SDR cards):
> 
> Is there a 2TB limit on this target? It turns our 6TB partition into a 
> 2TB lun.

No, there isn't.


[ewg] Re: [ofa-general] [ANNOUNCE] open iSCSI over iSER target RPM is available

2008-02-06 Thread Joe Landman

Hi Erez

Erez Zilber wrote:
> stgt (SCSI target) is an open-source framework for storage target
> drivers. It supports iSCSI over iSER among other storage target drivers.
>
> Voltaire added a git tree for stgt that will be added to OFED 1.4:
> http://www2.openfabrics.org/git/?p=~dorons/tgt.git;a=summary
>
> Until OFED 1.4 gets released, it is possible to install the stgt RPM on
> top of OFED 1.3. For more details about how to install and use stgt,
> please refer to https://wiki.openfabrics.org/tiki-index.php?page=ISER-target
>
> Some performance numbers that were measured by OSC (using SDR cards):

Is there a 2TB limit on this target? It turns our 6TB partition into a 
2TB lun.

> * READ: 920 MB/sec
> * WRITE: 850 MB/sec

Not getting anything even remotely close to this.  Are there more 
details on configuration somewhere?  I followed the web page as indicated.

Joe

> We hope to have DDR measurements numbers soon.




--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
   http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615


Re: [ewg][PATCH][1/2] SRP multipath failover within 60 seconds,

2008-02-06 Thread Vu Pham

Roland Dreier wrote:

 > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
 > index 950228f..45a2533 100644
 > --- a/drivers/infiniband/ulp/srp/ib_srp.c
 > +++ b/drivers/infiniband/ulp/srp/ib_srp.c
 > @@ -400,7 +400,6 @@
 >  		printk(KERN_DEBUG PFX "Sending CM DREQ failed\n");
 >  		return;
 >  	}
 > -	wait_for_completion(&target->done);
 >  }
 >  
 >  static void srp_remove_work(struct work_struct *work)
 > @@ -1266,7 +1294,6 @@
 >  	case IB_CM_TIMEWAIT_EXIT:
 >  		printk(KERN_ERR PFX "connection closed\n");
 >  
 > -		comp = 1;
 >  		target->status = 0;
 >  		break;

Seems like this would leak the cm_id?


As I said in my [0/2] email, this patch should be applied on 
top of srp_1_recreate_at_reconnect.patch, which is already in the 
ofed_1_3.git tree under the kernel_patches/fixes/ directory.


I have attached it here:

Hello, Roland!
Please consider the following for 2.6.19.

---

From: Ishai Rabinovitz <[EMAIL PROTECTED]>

For some reason (possibly a firmware problem) I got a CQ overrun in SRP.
Because of that there was a QP FATAL. Since srp_reconnect_target does not
destroy the QP, the QP FATAL persists after the reconnect.
To be able to recover from such a situation, I suggest we
destroy the CQ and the QP on every reconnect.

This also corrects a minor spec non-compliance: when srp_reconnect_target
is called, srp destroys the CM ID and resets the QP, so the new connection
is retried with the same QPN, which could theoretically lead to
stale packets (for strict spec compliance I think the QPN should not be reused
until all stale packets are flushed out of the network).
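
The shape of the change (create the new resources first, restore the old pointers on failure, destroy the old ones only on success) can be sketched in plain C. This is a userspace illustration under stated assumptions, not the kernel patch: struct conn_res, the callback types and the create_ok/create_fail/destroy_res helpers are hypothetical stand-ins so that the pattern can be exercised outside the kernel.

```c
#include <stddef.h>

struct conn_res { int id; };                  /* stand-in for a QP or CQ */
struct target { struct conn_res *qp, *cq; };

typedef int (*create_fn)(struct target *t);   /* fills t->qp and t->cq */
typedef void (*destroy_fn)(struct conn_res *r);

/* Replace the target's QP/CQ. On failure the old pair is put back,
 * so the target is never left pointing at half-created resources. */
static int recreate_resources(struct target *t, create_fn create,
			      destroy_fn destroy)
{
	struct conn_res *old_qp = t->qp;
	struct conn_res *old_cq = t->cq;
	int ret = create(t);

	if (ret) {
		t->qp = old_qp;
		t->cq = old_cq;
		return ret;
	}
	destroy(old_qp);
	destroy(old_cq);
	return 0;
}

/* Hypothetical helpers used only to exercise the function above. */
static struct conn_res fresh_qp = { 1 }, fresh_cq = { 2 };
static int destroyed_count;

static int create_ok(struct target *t)
{
	t->qp = &fresh_qp;
	t->cq = &fresh_cq;
	return 0;
}

static int create_fail(struct target *t)
{
	t->qp = NULL;   /* a failed create may clobber the pointers */
	t->cq = NULL;
	return -1;
}

static void destroy_res(struct conn_res *r)
{
	(void)r;
	destroyed_count++;
}
```

Creating before destroying means a failed reconnect still leaves a usable (if errored) QP/CQ pair in place for the next attempt.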

---

IB/srp: destroy/re-create QP and CQ on each reconnect.
This makes SRP more robust in presence of hardware errors
and is closer to behaviour suggested by IB spec,
reducing chance of stale packets.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>
Signed-off-by: Michael S. Tsirkin <[EMAIL PROTECTED]>

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c	2006-08-31 12:23:52.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c	2006-08-31 12:30:48.0 +0300
@@ -495,10 +495,10 @@
 static int srp_reconnect_target(struct srp_target_port *target)
 {
 	struct ib_cm_id *new_cm_id;
-	struct ib_qp_attr qp_attr;
 	struct srp_request *req, *tmp;
-	struct ib_wc wc;
 	int ret;
+	struct ib_cq *old_cq;
+	struct ib_qp *old_qp;
 
 	spin_lock_irq(target->scsi_host->host_lock);
 	if (target->state != SRP_TARGET_LIVE) {
@@ -522,17 +522,17 @@
 	ib_destroy_cm_id(target->cm_id);
 	target->cm_id = new_cm_id;
 
-	qp_attr.qp_state = IB_QPS_RESET;
-	ret = ib_modify_qp(target->qp, &qp_attr, IB_QP_STATE);
-	if (ret)
-		goto err;
-
-	ret = srp_init_qp(target, target->qp);
-	if (ret)
+	old_qp = target->qp;
+	old_cq = target->cq;
+	ret = srp_create_target_ib(target);
+	if (ret) {
+		target->qp = old_qp;
+		target->cq = old_cq;
 		goto err;
+	}
 
-	while (ib_poll_cq(target->cq, 1, &wc) > 0)
-		; /* nothing */
+	ib_destroy_qp(old_qp);
+	ib_destroy_cq(old_cq);
 
 	spin_lock_irq(target->scsi_host->host_lock);
 	list_for_each_entry_safe(req, tmp, &target->req_queue, list)

-- 
MST

___
openib-general mailing list
[EMAIL PROTECTED]
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Re: [ewg][PATCH][1/2] SRP multipath failover within 60 seconds,

2008-02-06 Thread Vu Pham

Vu Pham wrote:

Roland Dreier wrote:
 > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
 > index 950228f..45a2533 100644
 > --- a/drivers/infiniband/ulp/srp/ib_srp.c
 > +++ b/drivers/infiniband/ulp/srp/ib_srp.c
 > @@ -400,7 +400,6 @@
 >  		printk(KERN_DEBUG PFX "Sending CM DREQ failed\n");
 >  		return;
 >  	}
 > -	wait_for_completion(&target->done);
 >  }
 >  
 >  static void srp_remove_work(struct work_struct *work)
 > @@ -1266,7 +1294,6 @@
 >  	case IB_CM_TIMEWAIT_EXIT:
 >  		printk(KERN_ERR PFX "connection closed\n");
 >  
 > -		comp = 1;
 >  		target->status = 0;
 >  		break;

Seems like this would leak the cm_id?


I said in my [0/2] email, this patch should be applied on top of 
srp_1_recreate_at_reconnect.patch which is already in ofed_1_3.git tree 
kernel_patches/fixes/ directory


I attached it here



I did not answer you correctly.

This would not leak the cm_id.

In srp_reconnect_target():
	...
	srp_disconnect_target(target);
	new_cm_id = ib_create_cm_id()
	if (IS_ERR(new_cm_id)) {
		ret = PTR_ERR(new_cm_id);
		goto err;
	}
	ib_destroy_cm_id(target->cm_id);

The cm_id gets destroyed either in srp_reconnect_target or in 
srp_remove_work.


  -vu


Re: [ofa-general] Re: [ewg] OFED 1.3 rc4 update

2008-02-06 Thread Pradeep Satyanarayana
Tziporet Koren wrote:
> Shirley Ma wrote:
>>
>> Thanks Tziporet. We will test it right after it's out.
>>
>>   
> You can start using the latest build -
> http://www.openfabrics.org/builds/ofed-1.3/OFED-1.3-20080206-0751.tgz
> 
> Tziporet
> 

I have downloaded today's build mentioned above. I am still seeing the issue
of ib_destroy_cq() failing for the rcq, mentioned yesterday.

Here are the steps that I follow:

1. On a freshly booted system configure ib0
2. Switch to connected mode ( on HCA that supports SRQ)
3. ping remote interface
4. modprobe -r ib_ehca
5. I see the failures about ib_destroy_cq() failing and the
cascading failures following that (srq and pd cannot be destroyed)
6. If I try a modprobe ib_ehca I get the error "Cannot allocate memory".
This also means someone is chewing up tons of memory. I realize that the
qp and associated pd were not freed, so some memory is "lost". However,
this system has 8 GB of memory.

Pradeep




Re: [ewg] OFED 1.3 rc4 update

2008-02-06 Thread Tziporet Koren

Shirley Ma wrote:


Thanks Tziporet. We will test it right after it's out.

  

You can start using the latest build - 
http://www.openfabrics.org/builds/ofed-1.3/OFED-1.3-20080206-0751.tgz

Tziporet






Re: [ewg][PATCH][1/2] SRP multipath failover within 60 seconds,

2008-02-06 Thread Roland Dreier
 > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
 > index 950228f..45a2533 100644
 > --- a/drivers/infiniband/ulp/srp/ib_srp.c
 > +++ b/drivers/infiniband/ulp/srp/ib_srp.c
 > @@ -400,7 +400,6 @@
 >  		printk(KERN_DEBUG PFX "Sending CM DREQ failed\n");
 >  		return;
 >  	}
 > -	wait_for_completion(&target->done);
 >  }
 >  
 >  static void srp_remove_work(struct work_struct *work)
 > @@ -1266,7 +1294,6 @@
 >  	case IB_CM_TIMEWAIT_EXIT:
 >  		printk(KERN_ERR PFX "connection closed\n");
 >  
 > -		comp = 1;
 >  		target->status = 0;
 >  		break;

Seems like this would leak the cm_id?


Re: [ewg] Re: with the ipoib patches, debug prints spam the system log

2008-02-06 Thread Eli Cohen

On Wed, 2008-02-06 at 18:42 +0200, Or Gerlitz wrote:
> On 2/6/08, Eli Cohen <[EMAIL PROTECTED]> wrote:
> > They are only visible when activating ipoib debug. I know it fills the
> > dmesg ring with messages. Do you think I should remove them?
> 
> Yes, you should remove them.
> 
> The ipoib debug prints are very useful for debugging and analysis in the
> field. However, your addition of 3 prints per second makes them useless,
> at least for me, and I use them a lot when working to debug and help
> others, so please do.
> 
> Or
> 
> 

OK



Re: [ewg] Re: with the ipoib patches, debug prints spam the system log

2008-02-06 Thread Or Gerlitz
On 2/6/08, Eli Cohen <[EMAIL PROTECTED]> wrote:
> They are only visible when activating ipoib debug. I know it fills the
> dmesg ring with messages. Do you think I should remove them?

Yes, you should remove them.

The ipoib debug prints are very useful for debugging and analysis in the
field. However, your addition of 3 prints per second makes them useless,
at least for me, and I use them a lot when working to debug and help
others, so please do.

Or
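
The thread settles on removing the prints. For readers who want to keep a high-frequency diagnostic without drowning the log, one alternative is to rate-limit it. The sketch below is a userspace illustration of a fixed-window limiter; the struct and function names are hypothetical and this is not the kernel's own mechanism (the kernel provides helpers such as printk_ratelimit() for this purpose).

```c
/* Allow at most `burst` messages per `interval` time units. */
struct ratelimit {
	long interval;       /* window length, in the caller's time unit */
	int burst;           /* messages allowed per window */
	long window_start;   /* start time of the current window */
	int printed;         /* messages already allowed in this window */
};

/* Returns 1 if a message may be printed at time `now`, 0 if it should
 * be suppressed. `now` is expected to be non-decreasing. */
static int ratelimit_allow(struct ratelimit *rl, long now)
{
	if (now - rl->window_start >= rl->interval) {
		/* A new window begins: reset the budget. */
		rl->window_start = now;
		rl->printed = 0;
	}
	if (rl->printed >= rl->burst)
		return 0;
	rl->printed++;
	return 1;
}
```

A caller would wrap the debug print in `if (ratelimit_allow(&rl, now))`, so that the original, low-frequency prints stay readable.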


>
> On Wed, 2008-02-06 at 10:38 +0200, Or Gerlitz wrote:
> > Eli,
> >
> > You have left far too many debug prints in the last patches;
> > please clean this up. See, for example, how the system log looks after less
> > than a minute with ipoib debug prints enabled: it has one original
> > print ("ib0: Send unicast ARP to 0023") and all the rest are yours.
> >
> > Or
> >
> > Feb  6 14:39:23  kernel: ib0: posting zlen send, wrid = 4: head = 2756, tail = 2752
> > Feb  6 14:39:23  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2757
> > Feb  6 14:39:25  kernel: ib0: posting zlen send, wrid = 39: head = 2919, tail = 2912
> > Feb  6 14:39:25  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2920
> > Feb  6 14:39:25  kernel: ib0: posting zlen send, wrid = 15: head = 2959, tail = 2944
> > Feb  6 14:39:25  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2960
> > Feb  6 14:39:27  kernel: ib0: posting zlen send, wrid = 8: head = 3080, tail = 3072
> > Feb  6 14:39:27  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3081
> > Feb  6 14:39:34  kernel: ib0: posting zlen send, wrid = 51: head = 3699, tail = 3696
> > Feb  6 14:39:34  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3700
> > Feb  6 14:39:35  kernel: ib0: posting zlen send, wrid = 25: head = 3737, tail = 3728
> > Feb  6 14:39:35  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3738
> > Feb  6 14:39:35  kernel: ib0: posting zlen send, wrid = 3: head = 3779, tail = 3776
> > Feb  6 14:39:35  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3780
> > Feb  6 14:39:36  kernel: ib0: posting zlen send, wrid = 48: head = 3824, tail = 3808
> > Feb  6 14:39:36  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3825
> > Feb  6 14:39:38  kernel: ib0: posting zlen send, wrid = 24: head = 3992, tail = 3984
> > Feb  6 14:39:38  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3993
> > Feb  6 14:39:38  kernel: ib0: posting zlen send, wrid = 4: head = 4036, tail = 4032
> > Feb  6 14:39:38  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 4037
> > Feb  6 14:39:46  kernel: ib0: Send unicast ARP to 0023
> > Feb  6 14:39:46  kernel: ib0: posting zlen send, wrid = 11: head = 4683, tail = 4672
> > Feb  6 14:39:46  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 4684
> > Feb  6 14:39:58  kernel: ib0: posting zlen send, wrid = 58: head = 5626, tail = 5616
> > Feb  6 14:39:58  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5627
> > Feb  6 14:39:59  kernel: ib0: posting zlen send, wrid = 56: head = 5752, tail = 5744
> > Feb  6 14:39:59  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5753
> > Feb  6 14:40:01  kernel: ib0: posting zlen send, wrid = 54: head = 5878, tail = 5872
> > Feb  6 14:40:01  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5879
> > Feb  6 14:40:01  kernel: ib0: posting zlen send, wrid = 30: head = 5918, tail = 5904
> > Feb  6 14:40:01  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5919
> > Feb  6 14:40:10  kernel: ib0: posting zlen send, wrid = 33: head = 6689, tail = 6672
> > Feb  6 14:40:10  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 6690
> > Feb  6 14:40:13  kernel: ib0: posting zlen send, wrid = 48: head = 6896, tail = 6880
> > Feb  6 14:40:13  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 6897
> > Feb  6 14:40:13  kernel: ib0: posting zlen send, wrid = 26: head = 6938, tail = 6928
> > Feb  6 14:40:13  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 6939
> > Feb  6 14:40:15  kernel: ib0: posting zlen send, wrid = 61: head = 7101, tail = 7088
> > Feb  6 14:40:15  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 7102
>


Re: [ewg] OFED 1.3 rc4 update

2008-02-06 Thread Shirley Ma
On Wed, 2008-02-06 at 18:25 +0200, Tziporet Koren wrote:
> Hi,
> 
> We will have OFED 1.3-rc4 tomorrow, after one more night of regression testing.
> 
> It will include:
> 
>    1. IPoIB: Non-SRQ for CM mode
>    2. IPoIB: 4K MTU
>    3. IPoIB: Small-message improvements
> 
> Note that today's latest build includes these features too, if 
> anyone wants to test it today.
> 
> Tziporet

Thanks Tziporet. We will test it right after it's out.

Thanks
Shirley



[ewg] OFED 1.3 rc4 update

2008-02-06 Thread Tziporet Koren

Hi,

We will have OFED 1.3-rc4 tomorrow, after one more night of regression testing.

It will include:

  1. IPoIB: Non-SRQ for CM mode
  2. IPoIB: 4K MTU
  3. IPoIB: Small-message improvements

Note that today's latest build includes these features too, if 
anyone wants to test it today.


Tziporet



Re: [ewg] Re: Updated: OFED weekly teleconference

2008-02-06 Thread Tziporet Koren

Jeff Squyres wrote:
> Note that I'm not the one who schedules the EWG teleconferences; I'm 
> just the guy who provides the phone bridge.
>
> Tziporet is the OFED release manager and coordinates the EWG 
> teleconferences.

Sorry about the conflict, but it is a one-time change due to Presidents' Day, 
which several companies take as a vacation day.


Note that this meeting usually lasts only half an hour, so there may be no 
conflict at all.


Tziporet



On Feb 6, 2008, at 10:12 AM, Rupert Dance wrote:


Jeff,

Just so you know, this conflicts with the OFA IWG meeting, which has 
always been held from 11:30 - 1:00 PM EST on Tuesdays. Since this is 
a one-time occurrence, I would not change anything, but I just thought 
you should know.


Rupert







[ewg] Re: Updated: OFED weekly teleconference

2008-02-06 Thread Jeff Squyres
Note that I'm not the one who schedules the EWG teleconferences; I'm  
just the guy who provides the phone bridge.


Tziporet is the OFED release manager and coordinates the EWG  
teleconferences.



On Feb 6, 2008, at 10:12 AM, Rupert Dance wrote:


Jeff,

Just so you know, this conflicts with the OFA IWG meeting, which has  
always been held from 11:30 - 1:00 PM EST on Tuesdays. Since this  
is a one-time occurrence, I would not change anything, but I just  
thought you should know.


Rupert

_
From:    Jeff Squyres (jsquyres) [mailto:[EMAIL PROTECTED]
Sent:    Wednesday, February 06, 2008 10:03 AM
To:      ewg@lists.openfabrics.org
Cc:      Scott Bahling; John Russo; Ryan, Jim; Ken L Johnson; [EMAIL PROTECTED];
         [EMAIL PROTECTED]; Head Bubba; Van Houten, Betty; Patrick Mullaney
Subject: Updated: OFED weekly teleconference
When:    Tuesday, February 19, 2008 12:00 PM-1:00 PM (GMT-05:00) Eastern Time (US & Canada).
Where:   ID: 210020028



__
Jeffrey Squyres has invited you to a Cisco Unified MeetingPlace  
Conference


Date/Time:   FEB 19, 2008 at 12:00PM America/New_York
Length:  60
Frequency:   3
Meeting ID:  210020028
Meeting Password:

Global Access Numbers:
http://cisco.com/en/US/about/doing_business/conferencing/index.html

US/Canada:  +1.866.432.9903     United Kingdom:   +44.20.8824.0117
India:      +91.80.4103.3979    Germany:          +49.619.6773.9002
Japan:      +81.3.5763.9394     China:            +86.10.8515.5666

TO ATTEND A WEB AND VOICE CONFERENCE:

CISCO INTRANET ATTENDEES
Join the Web & Voice Conference*
1. Go to http://meetingplaceinternal.cisco.com/join.asp?210020028
2. Enter your CEC User ID & Password then click OK
- Accept any security warnings you receive and wait for the Meeting  
Room to initialize
3. Click on CONNECT from the Meeting Room to join the Voice  
Conference portion of the meeting


EXTERNAL ATTENDEES - Outside the Cisco Intranet**
Join the Web & Voice Conference*
1. Go to http://meetingplace.cisco.com/join.asp?210020028
2. Fill in the My Name is field then click Attend Meeting
- If you have a CEC User ID, click on the Cisco icon
- Accept any security warnings you receive and wait for the Meeting  
Room to initialize
3. Click on CONNECT from the Meeting Room to join the Voice  
Conference portion of the meeting

- Note: Guest users will see a link to the Global Access Numbers.

*If this is your first time attending a Web Conference, disable any  
pop-up blockers and visit http://meetingplace.cisco.com/mpweb/scripts/browsertestupper.asp 
 to test your web browser for compatibility with the Web Conference.


**Not all meetings are scheduled to allow external attendees into  
the Web Conference portion of the meeting, if the URL does not work,  
please follow the Voice only Conference instructions below to attend.


TO ATTEND A VOICE ONLY CONFERENCE
1. Dial into Cisco Unified MeetingPlace (view the Access Numbers and  
link above)

2. Press 1 to attend the meeting
3. Follow the prompts to enter the Meeting ID 210020028 and join the  
meeting


SUPPORT
Information about this Conference: Contact Jeffrey Squyres, 85250971
Cisco IT Support Center: Attend the Voice Conference and then press  
#0 on your phone keypad


GLOBAL ACCESS NUMBERS

COUNTRY               LOCATION        LOCAL NUMBER        TOLL FREE-FREEFONE

Algeria               Algiers         +213.21.98.9047
Argentina             Buenos Aires    +54.11.4341.0101
Australia             Canberra        +61.2.6216.0643
                      Melbourne       +61.3.9659.4173
                      North Sydney    +61.2.8446.5260
Austria               Vienna          +43.12.4030.6022
Azerbaijan            Baku            +994.12.437.4829
Belgium               Brussels        +32.2.704.5072
Bosnia & Herzegovina  Sarajevo        +387.33.56.2898
Brazil                Brasilia        +55.613.424.0220
                      Rio de Janeiro  +55.21.2483.6302
                      Sao Paulo       +55.11.5508.6311
Bulgaria              Sofia           +359.2.937.5938
Canada                Calgary         +1.403.514.2435
                      Edmonton        +1.780.441.3715
                      Halifax         +1.902.474.0214
                      Kanata          +1.613.254.0005
                      Markham         +1.905.470.4810
                      Montreal        +1.514.847.6875
                      Ottawa          +1.613.788.7250
                      Quebec          +1.418.634.5645
                      Regina          +1.306.566.6410
                      Toronto         +1.416.306.7230
                      Vancouver       +1.604.647.2350
                      Winnipeg        +1.204.336.6610
Chile                 Santiago        +56.2.431.4936
China                 Beijing         +86.10.8515.5666
   

[ewg] RE: Updated: OFED weekly teleconference

2008-02-06 Thread Rupert Dance
Jeff,

Just so you know, this conflicts with the OFA IWG meeting, which has always
been held from 11:30 - 1:00 PM EST on Tuesdays. Since this is a one-time
occurrence, I would not change anything, but I just thought you should know.

Rupert

> _
> From: Jeff Squyres (jsquyres) [mailto:[EMAIL PROTECTED] 
> Sent: Wednesday, February 06, 2008 10:03 AM
> To:   ewg@lists.openfabrics.org
> Cc:   Scott Bahling; John Russo; Ryan, Jim; Ken L Johnson;
> [EMAIL PROTECTED]; [EMAIL PROTECTED]; Head Bubba; Van Houten,
> Betty; Patrick Mullaney
> Subject:  Updated: OFED weekly teleconference
> When: Tuesday, February 19, 2008 12:00 PM-1:00 PM (GMT-05:00) Eastern Time
> (US & Canada).
> Where:ID: 210020028
> 
> 
> 
> __
>  
> Jeffrey Squyres has invited you to a Cisco Unified MeetingPlace Conference
> 
> 
> Date/Time:   FEB 19, 2008 at 12:00PM America/New_York 
> Length:  60 
> Frequency:   3 
> Meeting ID:  210020028 
> Meeting Password:
> 
> Global Access Numbers: 
> http://cisco.com/en/US/about/doing_business/conferencing/index.html 
> 
> US/Canada:  +1.866.432.9903United Kingdom:   +44.20.8824.0117 
> India:  +91.80.4103.3979   Germany:  +49.619.6773.9002 
> Japan:  +81.3.5763.9394China:+86.10.8515.5666 
> 
> TO ATTEND A WEB AND VOICE CONFERENCE: 
> 
> CISCO INTRANET ATTENDEES 
> Join the Web & Voice Conference* 
> 1. Go to http://meetingplaceinternal.cisco.com/join.asp?210020028 
> 2. Enter your CEC User ID & Password then click OK 
> - Accept any security warnings you receive and wait for the Meeting Room
> to initialize 
> 3. Click on CONNECT from the Meeting Room to join the Voice Conference
> portion of the meeting 
> 
> EXTERNAL ATTENDEES - Outside the Cisco Intranet** 
> Join the Web & Voice Conference* 
> 1. Go to http://meetingplace.cisco.com/join.asp?210020028 
> 2. Fill in the My Name is field then click Attend Meeting 
> - If you have a CEC User ID, click on the Cisco icon 
> - Accept any security warnings you receive and wait for the Meeting Room
> to initialize 
> 3. Click on CONNECT from the Meeting Room to join the Voice Conference
> portion of the meeting 
> - Note: Guest users will see a link to the Global Access Numbers. 
> 
> *If this is your first time attending a Web Conference, disable any pop-up
> blockers and visit
> http://meetingplace.cisco.com/mpweb/scripts/browsertestupper.asp to test
> your web browser for compatibility with the Web Conference.
> 
> **Not all meetings are scheduled to allow external attendees into the Web
> Conference portion of the meeting, if the URL does not work, please follow
> the Voice only Conference instructions below to attend.
> 
> TO ATTEND A VOICE ONLY CONFERENCE 
> 1. Dial into Cisco Unified MeetingPlace (view the Access Numbers and link
> above) 
> 2. Press 1 to attend the meeting 
> 3. Follow the prompts to enter the Meeting ID 210020028 and join the
> meeting 
> 
> SUPPORT 
> Information about this Conference: Contact Jeffrey Squyres, 85250971 
> Cisco IT Support Center: Attend the Voice Conference and then press #0 on
> your phone keypad 
> 
> GLOBAL ACCESS NUMBERS 
> 
> COUNTRY               LOCATION         LOCAL NUMBER        TOLL FREE-FREEFONE
> 
> Algeria               Algiers          +213.21.98.9047
> Argentina             Buenos Aires     +54.11.4341.0101
> Australia             Canberra         +61.2.6216.0643
>                       Melbourne        +61.3.9659.4173
>                       North Sydney     +61.2.8446.5260
> Austria               Vienna           +43.12.4030.6022
> Azerbaijan            Baku             +994.12.437.4829
> Belgium               Brussels         +32.2.704.5072
> Bosnia & Herzegovina  Sarajevo         +387.33.56.2898
> Brazil                Brasilia         +55.613.424.0220
>                       Rio de Janeiro   +55.21.2483.6302
>                       Sao Paulo        +55.11.5508.6311
> Bulgaria              Sofia            +359.2.937.5938
> Canada                Calgary          +1.403.514.2435
>                       Edmonton         +1.780.441.3715
>                       Halifax          +1.902.474.0214
>                       Kanata           +1.613.254.0005
>                       Markham          +1.905.470.4810
>                       Montreal         +1.514.847.6875
>                       Ottawa           +1.613.788.7250
>                       Quebec           +1.418.634.5645
>                       Regina           +1.306.566.6410
>                       Toronto          +1.416.306.7230
>                       Vancouver        +1.604.647.2350
>                       Winnipeg         +1.204.336.6610
> Chile                 Santiago         +56.2.431.4936
> China                 Beijing          +86.10.8515.5666
>  

Re: [ewg] [PATCH v3 ofed-1.3] rdma_lat: Add option to support devices with different inline max values.

2008-02-06 Thread Steve Wise

Can we please pull this into ofed-1.3?

Thanks,

Steve.

Steve Wise wrote:

rdma_lat: Add option to support devices with different inline max values.

Currently the max inline value is hard-coded and too big for the
chelsio device.  This patch allows specifying the max inline as a
command line param.

Signed-off-by: Steve Wise <[EMAIL PROTECTED]>
---

 rdma_lat.c |   13 ++---
 1 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/rdma_lat.c b/rdma_lat.c
index 68c9120..30cb4a3 100755
--- a/rdma_lat.c
+++ b/rdma_lat.c
@@ -60,6 +60,7 @@
 #define PINGPONG_RDMA_WRID 3
 #define MAX_INLINE 400
 
+static int inline_size = MAX_INLINE;
 static int page_size;
 static pid_t pid;
 
@@ -603,7 +604,7 @@ static struct pingpong_context *pp_init_ctx(void *ptr, struct pp_data *data)
.max_recv_wr  = 1,
.max_send_sge = 1,
.max_recv_sge = 1,
-   .max_inline_data = MAX_INLINE
+   .max_inline_data = inline_size,
},
.qp_type = IBV_QPT_RC
};
@@ -915,6 +916,7 @@ static void usage(const char *argv0)
printf("  -s, --size=  size of message to exchange (default 1)\n");
printf("  -t, --tx-depth=   size of tx queue (default 50)\n");
printf("  -n, --iters=    number of exchanges (at least 2, default 1000)\n");
+   printf("  -I, --inline_size=  max size of message to be sent in inline mode (default 400)\n");
printf("  -C, --report-cycles    report times in cpu cycle units (default microseconds)\n");
printf("  -H, --report-histogram print out all results (default print summary only)\n");
printf("  -U, --report-unsorted  (implies -H) print out unsorted results (default sorted)\n");
@@ -1036,6 +1038,7 @@ int main(int argc, char *argv[])
{ .name = "size",   .has_arg = 1, .val = 's' },
{ .name = "iters",  .has_arg = 1, .val = 'n' },
{ .name = "tx-depth",   .has_arg = 1, .val = 't' },
+   { .name = "inline_size", .has_arg = 1, .val = 'I' },
{ .name = "report-cycles",  .has_arg = 0, .val = 'C' },
{ .name = "report-histogram",.has_arg = 0, .val = 'H' },
{ .name = "report-unsorted",.has_arg = 0, .val = 'U' },
@@ -1043,7 +1046,7 @@ int main(int argc, char *argv[])
{ 0 }
};
 
-		c = getopt_long(argc, argv, "p:d:i:s:n:t:CHUc", long_options, NULL);
+		c = getopt_long(argc, argv, "p:d:i:s:n:t:I:CHUc", long_options, NULL);
if (c == -1)
break;
 
@@ -1087,6 +1090,10 @@ int main(int argc, char *argv[])
 
 break;
 
+			case 'I':
+				inline_size = strtol(optarg, NULL, 0);
+				break;
+
case 'C':
report.cycles = 1;
break;
@@ -1192,7 +1199,7 @@ int main(int argc, char *argv[])
ctx->wr.sg_list= &ctx->list;
ctx->wr.num_sge= 1;
ctx->wr.opcode = IBV_WR_RDMA_WRITE;
-   if (ctx->size > MAX_INLINE || ctx->size == 0) {
+   if (ctx->size > inline_size || ctx->size == 0) {
ctx->wr.send_flags = IBV_SEND_SIGNALED;
} else {
ctx->wr.send_flags = IBV_SEND_SIGNALED | IBV_SEND_INLINE;
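
For readers who want to see the new option in isolation, here is a minimal standalone sketch of the two pieces the patch touches: parsing the -I value and choosing the send flags. It is not the rdma_lat code. parse_inline_size(), pick_send_flags() and the SKETCH_* flag values are hypothetical stand-ins (the real tool uses IBV_SEND_SIGNALED and IBV_SEND_INLINE from libibverbs), and the input validation is an addition of this sketch; the patch itself calls strtol(optarg, NULL, 0) without checking the result.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

#define MAX_INLINE 400	/* default value, as in the patch */

/* Hypothetical stand-ins for the libibverbs send flags. */
enum {
	SKETCH_SEND_SIGNALED = 1 << 1,
	SKETCH_SEND_INLINE   = 1 << 3
};

/*
 * Parse an inline-size argument (decimal, octal or hex, like base 0
 * in the patch). Returns 0 on success and -1 on malformed, negative
 * or out-of-range input; *out is left untouched on failure.
 */
static int parse_inline_size(const char *arg, int *out)
{
	char *end;
	long v;

	assert(arg != NULL && out != NULL);

	errno = 0;
	v = strtol(arg, &end, 0);
	if (errno != 0 || end == arg || *end != '\0' || v < 0 || v > INT_MAX)
		return -1;

	*out = (int)v;
	return 0;
}

/*
 * Same condition as the last hunk of the patch: messages larger than
 * the inline limit, or zero-length ones, are sent without the inline flag.
 */
static int pick_send_flags(int size, int inline_size)
{
	if (size > inline_size || size == 0)
		return SKETCH_SEND_SIGNALED;
	return SKETCH_SEND_SIGNALED | SKETCH_SEND_INLINE;
}
```

With this shape, an inline limit of 64 keeps messages of 1 to 64 bytes inline and sends larger (or zero-length) ones as ordinary signaled writes.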
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] Re: [GIT PULL] ~sashak/management.git

2008-02-06 Thread Vladimir Sokolovsky

Sasha Khapyorsky wrote:

Hi Vlad,

Please pull recent ofed_1_3 branch of ~sashak/management.git.

The changes are:

Ira K. Weiny (2):
  Move opensm.8 man page in prep for making config file changes.
  Update man page for configurable partition and prefix-routes file

Ira Weiny (1):
  Add node name map, partition config, and QOS policy config files to the 
"FILES" section of man page.

Sasha Khapyorsky (1):
  opensm: scripts/opensmd - fix opensm path.



Done,

Regards,
Vladimir
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] traffic jittery, send queue full reports from mthca driver

2008-02-06 Thread Shirley Ma
Hello Or,

I found that if you increase send_queue_size and recv_queue_size,
e.g. to 1K, this problem goes away.

Thanks
Shirley

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] [GIT PULL] ~sashak/management.git

2008-02-06 Thread Sasha Khapyorsky
Hi Vlad,

Please pull recent ofed_1_3 branch of ~sashak/management.git.

The changes are:

Ira K. Weiny (2):
  Move opensm.8 man page in prep for making config file changes.
  Update man page for configurable partition and prefix-routes file

Ira Weiny (1):
  Add node name map, partition config, and QOS policy config files to the 
"FILES" section of man page.

Sasha Khapyorsky (1):
  opensm: scripts/opensmd - fix opensm path.


Thanks,
Sasha
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] OFED teleconf 18 Feb

2008-02-06 Thread Jeff Squyres
Some US companies mark 18 Feb as a holiday (President's Day), so per  
request, I'm moving the OFED teleconference from 18 Feb to 19 Feb  
(same time slot).


You'll receive an Outlook meeting update shortly.

--
Jeff Squyres
Cisco Systems

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg][PATCH][0/2] SRP multipath failover within 60 seconds,

2008-02-06 Thread Tziporet Koren

Vu Pham wrote:
The following patches help SRP/dm-multipath fail over within 60 
seconds (bugzilla #577) without data corruption or read/write errors.


1. srp_disconnect_without_wait.patch - srp sends the disconnect request 
without waiting for the CM timewait exit event, since srp currently does 
not re-use the cm_id and qp/cq of a connection (the patch 
srp_1_recreate_at_reconnect.patch, already in kernel_patches/fixes, 
recreates the cm_id, qp/cq for a connection at reconnect)
2. srp_qp_in_err_timer_reconnect_target.patch - when detecting a 
post_send/post_receive error, srp sets qp_in_error, sets a timer to 
reconnect to the target, returns SCSI_MLQUEUE_HOST_BUSY to lock the queue, 
and returns DID_NO_CONNECT when the target state is DEAD or REMOVED


Here is my multipath.conf
defaults {
    udev_dir                /dev
    polling_interval        5
    selector                "round-robin 0"
    path_grouping_policy    multibus
    getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
    prio_callout            /bin/true
    path_checker            readsector0
    rr_min_io               100
    rr_weight               priorities
    failback                immediate
    no_path_retry           5
    user_friendly_names     no
}
I also set srp_daemon.sh to rescan fabric every 60 seconds (instead of 
300 secs as default setting)


I ran data integrity test to /dev/mapper/ and {disable path 
1, sleep 90, enable path 1, sleep 60, disable path 2, sleep 90, enable 
path 2, sleep 60} in the loop


RHEL5, 5.1 work very well (no data corruption or read/write failures reported).
For SLES 10 sp1, it works well as long as I run *multipath* every 60 
secs. I think that I mis-configured multipathd somehow (here is 
how I set it up: using the same multipath.conf above, chkconfig 
boot.multipath on and chkconfig multipathd on)


  -vu



This fixes issue 577, 
which was found in OFED 1.2

Vlad - please take this

Tziporet

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] RE: traffic jittery, send queue full reports from mthca driver

2008-02-06 Thread Tziporet Koren
OK - Eli found the problem; it will be fixed soon

Tziporet

-Original Message-
From: Or Gerlitz [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, February 06, 2008 2:54 PM
To: Tziporet Koren
Cc: ewg@lists.openfabrics.org
Subject: traffic jittery, send queue full reports from mthca driver

I just opened case #897 on the below, it happens with last night
snapshot.

Or

client MT25204 FW 1.2.0 two CPUs, four cores each
server MT25418 FW 2.3.0 two CPUs, four cores each

client : iperf -c $server -P 4 -d -t 3600 -i 1
server : iperf -s -i 1

[  5] 39.0-40.0 sec  29.4 MBytes   246 Mbits/sec
[  4] 39.0-40.0 sec  25.5 MBytes   214 Mbits/sec
[  3] 34.0-35.0 sec  88.0 KBytes   721 Kbits/sec
[  3] 35.0-36.0 sec  0.00 Bytes   0.00 bits/sec
[  3] 36.0-37.0 sec  0.00 Bytes   0.00 bits/sec
[  3] 37.0-38.0 sec  0.00 Bytes   0.00 bits/sec
[  3] 38.0-39.0 sec  0.00 Bytes   0.00 bits/sec
[  3] 39.0-40.0 sec  0.00 Bytes   0.00 bits/sec
[  5] 40.0-41.0 sec  38.5 MBytes   323 Mbits/sec
[  8] 40.0-41.0 sec  36.2 MBytes   304 Mbits/sec
[  9] 40.0-41.0 sec  54.3 MBytes   456 Mbits/sec
[ 10] 40.0-41.0 sec  32.1 MBytes   270 Mbits/sec
[ 11] 40.0-41.0 sec  29.4 MBytes   247 Mbits/sec
[SUM] 40.0-41.0 sec   152 MBytes  1.28 Gbits/sec

ib_mthca :03:00.0: SQ 000404 full (756910656 head, 756910592 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (756915376 head, 756915312 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757146224 head, 757146160 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757146336 head, 757146272 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757317104 head, 757317040 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757361808 head, 757361744 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757361920 head, 757361856 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757515760 head, 757515696 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757515872 head, 757515808 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757515984 head, 757515920 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757516112 head, 757516048 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757516224 head, 757516160 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757516352 head, 757516288 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757516448 head, 757516384 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757516576 head, 757516512 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757523168 head, 757523104 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757531472 head, 757531408 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757531568 head, 757531504 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757548064 head, 757548000 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757582992 head, 757582928 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758082528 head, 758082464 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758162208 head, 758162144 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758232720 head, 758232656 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758232848 head, 758232784 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758232960 head, 758232896 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758233088 head, 758233024 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758303696 head, 758303632 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758303776 head, 758303712 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758307744 head, 758307680 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758307872 head, 758307808 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758334928 head, 758334864 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758335056 head, 758334992 tail,
64 max, 0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 0004

[ewg] Re: [PATCH] call skb_orphan() after sending an SKB

2008-02-06 Thread Eli Cohen

On Wed, 2008-02-06 at 15:11 +0200, Or Gerlitz wrote:
> Eli Cohen wrote:
> > On Wed, 2008-02-06 at 10:17 +0200, Or Gerlitz wrote:
> 
> > The problem with ttcpv was that it stopped sending packets since it was
> > waiting for freeing the memory. The system did not hang, just the
> > application (ttcpv) stopped sending. Other applications could continue
> > working over the ipoib interface.
> 
> What's ttcpv, doing web-search I only find ttcp, so I would be happy to 
> get pointer plus what param you were using to see the problem.
It's a variant of ttcp we're using here in our regression. Dotan can you
send a pointer?
> 
> >> Also, I see that you have added a call to netif_stop_queue(), is this to
> >> solve another problem?
> 
> > This was just a hole that I found in code review - when I post a zero
> > length packet, I still want this to affect the net queue control.
> 
> Why posting a zero len packet is related to the net queue control logic? 
> I was thinking it has to do with releasing "unsignaled SKBs"

Yes but if I have no more room in the tx ring I would like to stop the
queue even here.
> 
> Or
> 

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] Re: [PATCH] call skb_orphan() after sending an SKB

2008-02-06 Thread Or Gerlitz

Eli Cohen wrote:
> On Wed, 2008-02-06 at 10:17 +0200, Or Gerlitz wrote:

> The problem with ttcpv was that it stopped sending packets since it was
> waiting for freeing the memory. The system did not hang, just the
> application (ttcpv) stopped sending. Other applications could continue
> working over the ipoib interface.

What's ttcpv? Doing a web search I only find ttcp, so I would be happy to 
get a pointer, plus the params you were using to see the problem.

>> Also, I see that you have added a call to netif_stop_queue(), is this to
>> solve another problem?

> This was just a hole that I found in code review - when I post a zero
> length packet, I still want this to affect the net queue control.

Why is posting a zero len packet related to the net queue control logic? 
I was thinking it has to do with releasing "unsignaled SKBs"


Or

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] RE: traffic jittery, send queue full reports from mthca driver

2008-02-06 Thread Eli Cohen
I will check this. 

-Original Message-
From: Or Gerlitz [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, 06 February 2008 14:57
To: Eli Cohen
Cc: ewg@lists.openfabrics.org
Subject: Re: traffic jittery, send queue full reports from mthca driver

> ib_mthca :03:00.0: SQ 000404 full (756910656 head, 756910592 tail, 
> 64 max, 0 nreq)
> ib0: failed to post zlen send

Eli,

can this be a bug in the send ring accounting wrt to the zlen packet you use in 
the unsig-ud-qp patch?

Or.
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] Re: traffic jittery, send queue full reports from mthca driver

2008-02-06 Thread Or Gerlitz
> ib_mthca :03:00.0: SQ 000404 full (756910656 head, 756910592 tail, 64 
> max, 0 nreq)
> ib0: failed to post zlen send

Eli,

can this be a bug in the send ring accounting wrt the zlen packet you use in 
the unsig-ud-qp patch?

Or.
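
As background for the accounting question: in the log line quoted above, head (756910656) minus tail (756910592) is exactly 64, the reported max, so the driver sees no free slot at the moment the zlen send is posted. A minimal sketch of that kind of bookkeeping follows, assuming free-running unsigned head/tail counters. The struct and function names are hypothetical and this is not the ipoib or mthca code; `stopped` merely stands in for the netif_stop_queue()/netif_wake_queue() state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical TX ring bookkeeping with free-running counters. */
struct tx_ring {
	uint32_t head;		/* count of sends posted so far */
	uint32_t tail;		/* count of sends completed so far */
	uint32_t size;		/* number of slots in the ring */
	bool     stopped;	/* stand-in for the net queue being stopped */
};

/* Outstanding sends; unsigned subtraction survives counter wraparound. */
static uint32_t tx_outstanding(const struct tx_ring *r)
{
	return r->head - r->tail;
}

/*
 * Post one send, whether it carries data or is zero-length. Every
 * post consumes a slot, so every post must take part in the
 * full/stop decision. Returns false if the ring was already full.
 */
static bool tx_post(struct tx_ring *r)
{
	assert(r->size > 0);

	if (tx_outstanding(r) >= r->size) {
		r->stopped = true;
		return false;
	}

	r->head++;
	if (tx_outstanding(r) == r->size)
		r->stopped = true;	/* last free slot just consumed */
	return true;
}

/* Retire n completed sends and reopen the queue if room appeared. */
static void tx_complete(struct tx_ring *r, uint32_t n)
{
	assert(n <= tx_outstanding(r));

	r->tail += n;
	if (tx_outstanding(r) < r->size)
		r->stopped = false;
}
```

The point of the sketch is only that a post path which skips the accounting (for example a zero-length send added on the side) can run into a full queue even though the regular path believes there is room.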
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] traffic jittery, send queue full reports from mthca driver

2008-02-06 Thread Or Gerlitz
I just opened case #897 on the below, it happens with last night snapshot.

Or

client MT25204 FW 1.2.0 two CPUs, four cores each
server MT25418 FW 2.3.0 two CPUs, four cores each

client : iperf -c $server -P 4 -d -t 3600 -i 1
server : iperf -s -i 1

[  5] 39.0-40.0 sec  29.4 MBytes   246 Mbits/sec
[  4] 39.0-40.0 sec  25.5 MBytes   214 Mbits/sec
[  3] 34.0-35.0 sec  88.0 KBytes   721 Kbits/sec
[  3] 35.0-36.0 sec  0.00 Bytes   0.00 bits/sec
[  3] 36.0-37.0 sec  0.00 Bytes   0.00 bits/sec
[  3] 37.0-38.0 sec  0.00 Bytes   0.00 bits/sec
[  3] 38.0-39.0 sec  0.00 Bytes   0.00 bits/sec
[  3] 39.0-40.0 sec  0.00 Bytes   0.00 bits/sec
[  5] 40.0-41.0 sec  38.5 MBytes   323 Mbits/sec
[  8] 40.0-41.0 sec  36.2 MBytes   304 Mbits/sec
[  9] 40.0-41.0 sec  54.3 MBytes   456 Mbits/sec
[ 10] 40.0-41.0 sec  32.1 MBytes   270 Mbits/sec
[ 11] 40.0-41.0 sec  29.4 MBytes   247 Mbits/sec
[SUM] 40.0-41.0 sec   152 MBytes  1.28 Gbits/sec

ib_mthca :03:00.0: SQ 000404 full (756910656 head, 756910592 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (756915376 head, 756915312 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757146224 head, 757146160 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757146336 head, 757146272 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757317104 head, 757317040 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757361808 head, 757361744 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757361920 head, 757361856 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757515760 head, 757515696 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757515872 head, 757515808 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757515984 head, 757515920 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757516112 head, 757516048 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757516224 head, 757516160 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757516352 head, 757516288 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757516448 head, 757516384 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757516576 head, 757516512 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757523168 head, 757523104 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757531472 head, 757531408 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757531568 head, 757531504 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757548064 head, 757548000 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (757582992 head, 757582928 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758082528 head, 758082464 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758162208 head, 758162144 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758232720 head, 758232656 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758232848 head, 758232784 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758232960 head, 758232896 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758233088 head, 758233024 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758303696 head, 758303632 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758303776 head, 758303712 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758307744 head, 758307680 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758307872 head, 758307808 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758334928 head, 758334864 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758335056 head, 758334992 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758341744 head, 758341680 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758341856 head, 758341792 tail, 64 max, 
0 nreq)
ib0: failed to post zlen send
ib_mthca :03:00.0: SQ 000404 full (758396784 

Re: [ewg][PATCH][0/2] SRP multipath failover within 60 seconds,

2008-02-06 Thread Vu Pham

EXTRA NOTES:

1. pull cable/ plug back in (or ibportstate disable/enable)
a. Within 30 seconds I/Os resume on the same path (with same 
cm_id, qp and cq)
b. Within 30-45 seconds, I/Os resume on the same path (with 
new cm_id, qp and cq)

c. >45 seconds, I/Os fail-over to next path

2. After running test for a while, I stop the test, run 
*multipath -F* and unload ib_srp module. With RHEL 5 & 5.1, 
I can unload ib_srp cleanly; however, I got *srp is in use* 
error in SLES 10 sp1


   -vu

The following patches help SRP/dm-multipath fail over within 60 
seconds (bugzilla #577) without data corruption or read/write errors.


1. srp_disconnect_without_wait.patch - srp sends the disconnect request 
without waiting for the CM timewait exit event, since srp currently does 
not re-use the cm_id and qp/cq of a connection (the patch 
srp_1_recreate_at_reconnect.patch, already in kernel_patches/fixes, 
recreates the cm_id, qp/cq for a connection at reconnect)
2. srp_qp_in_err_timer_reconnect_target.patch - when detecting a 
post_send/post_receive error, srp sets qp_in_error, sets a timer to 
reconnect to the target, returns SCSI_MLQUEUE_HOST_BUSY to lock the queue, 
and returns DID_NO_CONNECT when the target state is DEAD or REMOVED


Here is my multipath.conf
defaults {
    udev_dir                /dev
    polling_interval        5
    selector                "round-robin 0"
    path_grouping_policy    multibus
    getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
    prio_callout            /bin/true
    path_checker            readsector0
    rr_min_io               100
    rr_weight               priorities
    failback                immediate
    no_path_retry           5
    user_friendly_names     no
}
I also set srp_daemon.sh to rescan fabric every 60 seconds (instead of 
300 secs as default setting)


I ran data integrity test to /dev/mapper/ and {disable path 1, 
sleep 90, enable path 1, sleep 60, disable path 2, sleep 90, enable path 
2, sleep 60} in the loop


RHEL5, 5.1 work very well (no data corruption or read/write failures reported).
For SLES 10 sp1, it works well as long as I run *multipath* every 60 
secs. I think that I mis-configured multipathd somehow (here is how 
I set it up: using the same multipath.conf above, chkconfig 
boot.multipath on and chkconfig multipathd on)


  -vu





___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg][PATCH][2/2] SRP multipath failover within 60 seconds,

2008-02-06 Thread Vu Pham
srp_qp_in_err_timer_reconnect_target.patch - when detecting 
a post_send/post_receive error, srp sets qp_in_error, sets a 
timer to reconnect to the target, returns SCSI_MLQUEUE_HOST_BUSY 
to lock the queue, and returns DID_NO_CONNECT when the target 
state is DEAD or REMOVED


Signed-off-by: Vu Pham <[EMAIL PROTECTED]>

--- ofa_kernel-1.3.configured/drivers/infiniband/ulp/srp/ib_srp.c	2008-02-05 11:18:16.0 -0800
+++ ofa_kernel-1.3/drivers/infiniband/ulp/srp/ib_srp.c	2008-02-05 15:18:33.0 -0800
@@ -885,6 +884,26 @@
   DMA_FROM_DEVICE);
 }
 
+static void srp_reconnect_work(struct work_struct *work)
+{
+	struct srp_target_port *target =
+		container_of(work, struct srp_target_port, work);
+
+	srp_reconnect_target(target);
+}
+
+static void srp_qp_in_err_timer(unsigned long data)
+{
+	struct srp_target_port *target = (struct srp_target_port *)data;
+
+	spin_lock_irq(target->scsi_host->host_lock);
+	INIT_WORK(&target->work, srp_reconnect_work);
+	schedule_work(&target->work);
+	spin_unlock_irq(target->scsi_host->host_lock);
+
+	del_timer(&target->qp_err_timer);
+}
+
 static void srp_completion(struct ib_cq *cq, void *target_ptr)
 {
 	struct srp_target_port *target = target_ptr;
@@ -896,7 +915,16 @@
 			printk(KERN_ERR PFX "failed %s status %d\n",
 			   wc.wr_id & SRP_OP_RECV ? "receive" : "send",
 			   wc.status);
-			target->qp_in_error = 1;
+			if (!target->qp_in_error) {
+				target->qp_in_error = 1;
+				if (!timer_pending(&target->qp_err_timer)) {
+					setup_timer(&target->qp_err_timer,
+						    srp_qp_in_err_timer,
+						    (unsigned long)target);
+					target->qp_err_timer.expires = 10 * HZ + jiffies;
+					add_timer(&target->qp_err_timer);
+				}
+			}
 			break;
 		}
 
@@ -1004,12 +1032,13 @@
 	struct ib_device *dev;
 	int len;
 
-	if (target->state == SRP_TARGET_CONNECTING)
+	if (target->state == SRP_TARGET_CONNECTING ||
+	target->qp_in_error)
 		goto err;
 
 	if (target->state == SRP_TARGET_DEAD ||
 	target->state == SRP_TARGET_REMOVED) {
-		scmnd->result = DID_BAD_TARGET << 16;
+		scmnd->result = DID_NO_CONNECT << 16;
 		done(scmnd);
 		return 0;
 	}
--- ofa_kernel-1.3.configured/drivers/infiniband/ulp/srp/ib_srp.h	2008-02-05 11:18:16.0 -0800
+++ ofa_kernel-1.3/drivers/infiniband/ulp/srp/ib_srp.h	2008-02-05 11:20:49.0 -0800
@@ -160,6 +160,7 @@
 	int			status;
 	enum srp_target_state	state;
 	int			qp_in_error;
+	struct timer_list	qp_err_timer;
 };
 
 struct srp_iu {
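
The decision logic this patch adds can be summarized in a small userspace sketch. All names here (sketch_target, sketch_queuecommand and so on) are hypothetical, the numeric constant is a local stand-in rather than the kernel's definition, and the timer and work queue are reduced to a flag; the sketch only mirrors the order of the checks in the diff above.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_state {
	SKETCH_CONNECTING,
	SKETCH_LIVE,
	SKETCH_DEAD,
	SKETCH_REMOVED
};

enum sketch_action {
	SKETCH_SUBMIT,		/* hand the command to the target */
	SKETCH_REQUEUE_BUSY,	/* "host busy": the midlayer retries later */
	SKETCH_FAIL_NO_CONNECT	/* complete the command with an error */
};

/* Local stand-in for the host byte the patch reports as DID_NO_CONNECT. */
#define SKETCH_DID_NO_CONNECT 0x01

struct sketch_target {
	enum sketch_state state;
	int qp_in_error;	/* set once by the completion handler */
	int timer_armed;	/* stands in for the reconnect timer */
};

/*
 * Completion-handler side: on the first error, mark the QP and arm
 * the reconnect timer exactly once. Returns 1 if the timer was armed
 * by this call and 0 otherwise.
 */
static int sketch_on_cq_error(struct sketch_target *t)
{
	assert(t != NULL);

	if (t->qp_in_error)
		return 0;

	t->qp_in_error = 1;
	if (!t->timer_armed) {
		t->timer_armed = 1;
		return 1;
	}
	return 0;
}

/*
 * Submission side, in the same order as the patched checks: a target
 * that is connecting or has a QP in error pushes back with "busy";
 * a dead or removed target fails the command with no-connect.
 */
static enum sketch_action sketch_queuecommand(const struct sketch_target *t,
					      int *result)
{
	assert(t != NULL && result != NULL);

	if (t->state == SKETCH_CONNECTING || t->qp_in_error)
		return SKETCH_REQUEUE_BUSY;

	if (t->state == SKETCH_DEAD || t->state == SKETCH_REMOVED) {
		*result = SKETCH_DID_NO_CONNECT << 16;
		return SKETCH_FAIL_NO_CONNECT;
	}

	return SKETCH_SUBMIT;
}
```

Note that with this ordering a target that is both in error and DEAD still answers "busy" until qp_in_error is cleared, which is what the diff does as written.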
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

[ewg][PATCH][1/2] SRP multipath failover within 60 seconds,

2008-02-06 Thread Vu Pham
srp_disconnect_without_wait.patch - srp sends the disconnect 
request without waiting for the CM timewait exit event, since srp 
currently does not re-use the cm_id and qp/cq of a connection 
(the patch srp_1_recreate_at_reconnect.patch, already in 
kernel_patches/fixes, recreates the cm_id, qp/cq for a 
connection at reconnect)


Signed-off-by: Vu Pham <[EMAIL PROTECTED]>

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 950228f..45a2533 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -400,7 +400,6 @@
 		printk(KERN_DEBUG PFX "Sending CM DREQ failed\n");
 		return;
 	}
-	wait_for_completion(&target->done);
 }
 
 static void srp_remove_work(struct work_struct *work)
@@ -1266,7 +1294,6 @@
 	case IB_CM_TIMEWAIT_EXIT:
 		printk(KERN_ERR PFX "connection closed\n");
 
-		comp = 1;
 		target->status = 0;
 		break;
 
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

[ewg][PATCH][0/2] SRP multipath failover within 60 seconds,

2008-02-06 Thread Vu Pham
The following patches help SRP/dm-multipath fail over within 60 
seconds (bugzilla #577) without data corruption or read/write errors.


1. srp_disconnect_without_wait.patch - srp sends the disconnect request 
without waiting for the CM timewait exit event, since srp currently does 
not re-use the cm_id and qp/cq of a connection (the patch 
srp_1_recreate_at_reconnect.patch, already in kernel_patches/fixes, 
recreates the cm_id, qp/cq for a connection at reconnect)
2. srp_qp_in_err_timer_reconnect_target.patch - when detecting a 
post_send/post_receive error, srp sets qp_in_error, sets a timer to 
reconnect to the target, returns SCSI_MLQUEUE_HOST_BUSY to lock the queue, 
and returns DID_NO_CONNECT when the target state is DEAD or REMOVED


Here is my multipath.conf
defaults {
    udev_dir                /dev
    polling_interval        5
    selector                "round-robin 0"
    path_grouping_policy    multibus
    getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
    prio_callout            /bin/true
    path_checker            readsector0
    rr_min_io               100
    rr_weight               priorities
    failback                immediate
    no_path_retry           5
    user_friendly_names     no
}
I also set srp_daemon.sh to rescan fabric every 60 seconds (instead of 
300 secs as default setting)


I ran data integrity test to /dev/mapper/ and {disable path 1, 
sleep 90, enable path 1, sleep 60, disable path 2, sleep 90, enable path 
2, sleep 60} in the loop


RHEL5, 5.1 work very well (no data corruption or read/write failures reported).
For SLES 10 sp1, it works well as long as I run *multipath* every 60 
secs. I think that I mis-configured multipathd somehow (here is how 
I set it up: using the same multipath.conf above, chkconfig 
boot.multipath on and chkconfig multipathd on)


  -vu





___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] Re: with the ipoib patches, debug prints spam the system log

2008-02-06 Thread Eli Cohen
They are only visible when activating ipoib debug. I know it fills the
dmesg ring with messages. Do you think I should remove them?

On Wed, 2008-02-06 at 10:38 +0200, Or Gerlitz wrote:
> Eli,
> 
> You have left somewhat too many debug prints in the last patches,
> please clean this up. See for example how the system log looks after less
> than a minute when ipoib debug prints are enabled: it has one original
> print ("ib0: Send unicast ARP to 0023") and all the rest are yours.
> 
> Or
> 
> Feb  6 14:39:23  kernel: ib0: posting zlen send, wrid = 4: head = 2756, tail = 2752
> Feb  6 14:39:23  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2757
> Feb  6 14:39:25  kernel: ib0: posting zlen send, wrid = 39: head = 2919, tail = 2912
> Feb  6 14:39:25  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2920
> Feb  6 14:39:25  kernel: ib0: posting zlen send, wrid = 15: head = 2959, tail = 2944
> Feb  6 14:39:25  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2960
> Feb  6 14:39:27  kernel: ib0: posting zlen send, wrid = 8: head = 3080, tail = 3072
> Feb  6 14:39:27  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3081
> Feb  6 14:39:34  kernel: ib0: posting zlen send, wrid = 51: head = 3699, tail = 3696
> Feb  6 14:39:34  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3700
> Feb  6 14:39:35  kernel: ib0: posting zlen send, wrid = 25: head = 3737, tail = 3728
> Feb  6 14:39:35  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3738
> Feb  6 14:39:35  kernel: ib0: posting zlen send, wrid = 3: head = 3779, tail = 3776
> Feb  6 14:39:35  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3780
> Feb  6 14:39:36  kernel: ib0: posting zlen send, wrid = 48: head = 3824, tail = 3808
> Feb  6 14:39:36  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3825
> Feb  6 14:39:38  kernel: ib0: posting zlen send, wrid = 24: head = 3992, tail = 3984
> Feb  6 14:39:38  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3993
> Feb  6 14:39:38  kernel: ib0: posting zlen send, wrid = 4: head = 4036, tail = 4032
> Feb  6 14:39:38  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 4037
> Feb  6 14:39:46  kernel: ib0: Send unicast ARP to 0023
> Feb  6 14:39:46  kernel: ib0: posting zlen send, wrid = 11: head = 4683, tail = 4672
> Feb  6 14:39:46  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 4684
> Feb  6 14:39:58  kernel: ib0: posting zlen send, wrid = 58: head = 5626, tail = 5616
> Feb  6 14:39:58  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5627
> Feb  6 14:39:59  kernel: ib0: posting zlen send, wrid = 56: head = 5752, tail = 5744
> Feb  6 14:39:59  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5753
> Feb  6 14:40:01  kernel: ib0: posting zlen send, wrid = 54: head = 5878, tail = 5872
> Feb  6 14:40:01  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5879
> Feb  6 14:40:01  kernel: ib0: posting zlen send, wrid = 30: head = 5918, tail = 5904
> Feb  6 14:40:01  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5919
> Feb  6 14:40:10  kernel: ib0: posting zlen send, wrid = 33: head = 6689, tail = 6672
> Feb  6 14:40:10  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 6690
> Feb  6 14:40:13  kernel: ib0: posting zlen send, wrid = 48: head = 6896, tail = 6880
> Feb  6 14:40:13  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 6897
> Feb  6 14:40:13  kernel: ib0: posting zlen send, wrid = 26: head = 6938, tail = 6928
> Feb  6 14:40:13  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 6939
> Feb  6 14:40:15  kernel: ib0: posting zlen send, wrid = 61: head = 7101, tail = 7088
> Feb  6 14:40:15  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 7102



[ewg] Re: [PATCH] call skb_orphan() after sending an SKB

2008-02-06 Thread Eli Cohen

On Wed, 2008-02-06 at 10:17 +0200, Or Gerlitz wrote:
> > commit f17ebf3e2099257da244587f1ee33f51745f7cdb
> > Author: Eli Cohen <[EMAIL PROTECTED]>
> > Date:   Tue Feb 5 11:15:46 2008 +0200
> >
> > Call skb_orphan() after sending an SKB
> >
> > This will call the destructor of the SKB (but not free the
> > memory). It appears that some applications (ttcpv for example)
> > are sensitive to delaying the time the SKB is freed. This commit
> > fixes this problem.
> 
> Can you explain the difference, from the socket send buffer accounting
> point of view, between freeing the SKB and freeing the memory?
When you call skb_orphan(), the destructor of the SKB is called; in this
case it is a function installed by the socket. So from the socket's point of
view the packet has been sent. The memory is still not freed, since it is
needed by the HW. Once we get a completion for the send operation, the SKB
is freed.
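
To make the accounting concrete, here is a rough userspace model of the above. It is not the kernel code: every toy_* name is made up for illustration, and the only part taken from the kernel is the shape of skb_orphan() itself (run the destructor once, then detach the SKB from its socket). It shows the socket's send-buffer charge being released at orphan time while the data buffer stays allocated until the completion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy socket: counts bytes currently charged to its send buffer. */
struct toy_sock {
    size_t wmem_alloc;
};

/* Toy SKB: owning socket, destructor installed by that socket, payload. */
struct toy_skb {
    struct toy_sock *sk;
    void (*destructor)(struct toy_skb *skb);
    size_t truesize;
    unsigned char *data;
};

/* Destructor installed by the socket: give the charged bytes back. */
static void toy_sock_wfree(struct toy_skb *skb)
{
    skb->sk->wmem_alloc -= skb->truesize;
}

/* Allocate an SKB of len bytes (len > 0) and charge it to the socket. */
static struct toy_skb *toy_alloc_skb(struct toy_sock *sk, size_t len)
{
    struct toy_skb *skb = malloc(sizeof(*skb));

    if (!skb)
        return NULL;
    skb->data = malloc(len);
    if (!skb->data) {
        free(skb);
        return NULL;
    }
    skb->sk = sk;
    skb->destructor = toy_sock_wfree;
    skb->truesize = len;
    sk->wmem_alloc += len;
    return skb;
}

/* Same shape as skb_orphan(): run the destructor, then detach the owner. */
static void toy_skb_orphan(struct toy_skb *skb)
{
    if (skb->destructor)
        skb->destructor(skb);
    skb->destructor = NULL;
    skb->sk = NULL;
}

/* Called on send completion: release the memory the HW was still using. */
static void toy_free_skb(struct toy_skb *skb)
{
    if (skb->destructor)        /* only runs if the SKB was never orphaned */
        skb->destructor(skb);
    free(skb->data);
    free(skb);
}
```

In this model, a sender blocked on wmem_alloc can continue right after toy_skb_orphan(), without waiting for toy_free_skb(), which is the behaviour ttcpv appears to depend on.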


>  What was the
> problem with ttcpv, did it hang?
The problem with ttcpv was that it stopped sending packets because it was
waiting for the memory to be freed. The system did not hang, just the
application (ttcpv) stopped sending. Other applications could continue
working over the ipoib interface.

>  have you tested the unsig_udqp.patch with
> different socket buffer sizes to make sure there's no live-lock etc?
Yes, our regression system does that with different applications and
benchmarks.

>  what was the app you were using?
ttcpv


> 
> Also, I see that you have added a call to netif_stop_queue(), is this to
> solve another problem?

This was just a hole that I found in code review: when I post a
zero-length packet, I still want this to affect the net queue control.
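
As a sketch of what that means for the ring accounting (a userspace model only; the toy_* names and the ring size of 64 are assumptions, and the wake-up threshold of half the ring is my guess at a reasonable policy, not something stated in this thread): every posted send, zero-length or not, bumps tx_outstanding, and the queue is stopped as soon as the count reaches the ring size.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_SENDQ_SIZE 64   /* assumed ring size, for illustration only */

struct toy_tx_ring {
    unsigned int tx_head;         /* total sends posted */
    unsigned int tx_tail;         /* total sends completed */
    unsigned int tx_outstanding;  /* posted but not yet completed */
    bool queue_stopped;           /* stands in for the netif queue state */
};

/* Account for one posted send; zero-length sends go through here too. */
static void toy_post_send(struct toy_tx_ring *r)
{
    ++r->tx_head;
    if (++r->tx_outstanding == TOY_SENDQ_SIZE)
        r->queue_stopped = true;  /* where netif_stop_queue() would go */
}

/* Account for up to n completions and wake the queue at half occupancy. */
static void toy_complete(struct toy_tx_ring *r, unsigned int n)
{
    if (n > r->tx_outstanding)    /* never complete more than was posted */
        n = r->tx_outstanding;
    r->tx_tail += n;
    r->tx_outstanding -= n;
    if (r->queue_stopped && r->tx_outstanding <= TOY_SENDQ_SIZE / 2)
        r->queue_stopped = false; /* where netif_wake_queue() would go */
}
```

Without the check in toy_post_send(), a zero-length send could take the last ring slot while the stack still believes there is room.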
> 
> Or.
> 





[ewg] with the ipoib patches, debug prints spam the system log

2008-02-06 Thread Or Gerlitz
Eli,

You have left too many debug prints in the last patches;
please clean this up. See, for example, the system log after less
than a minute with ipoib debug prints enabled: it has one original
print ("ib0: Send unicast ARP to 0023") and all the rest are yours.

Or

Feb  6 14:39:23  kernel: ib0: posting zlen send, wrid = 4: head = 2756, tail = 2752
Feb  6 14:39:23  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2757
Feb  6 14:39:25  kernel: ib0: posting zlen send, wrid = 39: head = 2919, tail = 2912
Feb  6 14:39:25  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2920
Feb  6 14:39:25  kernel: ib0: posting zlen send, wrid = 15: head = 2959, tail = 2944
Feb  6 14:39:25  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2960
Feb  6 14:39:27  kernel: ib0: posting zlen send, wrid = 8: head = 3080, tail = 3072
Feb  6 14:39:27  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3081
Feb  6 14:39:34  kernel: ib0: posting zlen send, wrid = 51: head = 3699, tail = 3696
Feb  6 14:39:34  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3700
Feb  6 14:39:35  kernel: ib0: posting zlen send, wrid = 25: head = 3737, tail = 3728
Feb  6 14:39:35  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3738
Feb  6 14:39:35  kernel: ib0: posting zlen send, wrid = 3: head = 3779, tail = 3776
Feb  6 14:39:35  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3780
Feb  6 14:39:36  kernel: ib0: posting zlen send, wrid = 48: head = 3824, tail = 3808
Feb  6 14:39:36  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3825
Feb  6 14:39:38  kernel: ib0: posting zlen send, wrid = 24: head = 3992, tail = 3984
Feb  6 14:39:38  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 3993
Feb  6 14:39:38  kernel: ib0: posting zlen send, wrid = 4: head = 4036, tail = 4032
Feb  6 14:39:38  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 4037
Feb  6 14:39:46  kernel: ib0: Send unicast ARP to 0023
Feb  6 14:39:46  kernel: ib0: posting zlen send, wrid = 11: head = 4683, tail = 4672
Feb  6 14:39:46  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 4684
Feb  6 14:39:58  kernel: ib0: posting zlen send, wrid = 58: head = 5626, tail = 5616
Feb  6 14:39:58  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5627
Feb  6 14:39:59  kernel: ib0: posting zlen send, wrid = 56: head = 5752, tail = 5744
Feb  6 14:39:59  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5753
Feb  6 14:40:01  kernel: ib0: posting zlen send, wrid = 54: head = 5878, tail = 5872
Feb  6 14:40:01  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5879
Feb  6 14:40:01  kernel: ib0: posting zlen send, wrid = 30: head = 5918, tail = 5904
Feb  6 14:40:01  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 5919
Feb  6 14:40:10  kernel: ib0: posting zlen send, wrid = 33: head = 6689, tail = 6672
Feb  6 14:40:10  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 6690
Feb  6 14:40:13  kernel: ib0: posting zlen send, wrid = 48: head = 6896, tail = 6880
Feb  6 14:40:13  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 6897
Feb  6 14:40:13  kernel: ib0: posting zlen send, wrid = 26: head = 6938, tail = 6928
Feb  6 14:40:13  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 6939
Feb  6 14:40:15  kernel: ib0: posting zlen send, wrid = 61: head = 7101, tail = 7088
Feb  6 14:40:15  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 7102


[ewg] Re: [PATCH] call skb_orphan() after sending an SKB

2008-02-06 Thread Or Gerlitz
> commit f17ebf3e2099257da244587f1ee33f51745f7cdb
> Author: Eli Cohen <[EMAIL PROTECTED]>
> Date:   Tue Feb 5 11:15:46 2008 +0200
>
> Call skb_orphan() after sending an SKB
>
> This will call the destructor of the SKB (but not free the
> memory). It appears that some applications (ttcpv for example)
> are sensitive to delaying the time the SKB is freed. This commit
> fixes this problem.

Can you explain the difference, from the socket send buffer accounting
point of view, between freeing the SKB and freeing the memory? What was the
problem with ttcpv, did it hang? Have you tested the unsig_udqp.patch with
different socket buffer sizes to make sure there's no live-lock etc.? What
was the app you were using?

Also, I see that you have added a call to netif_stop_queue(), is this to
solve another problem?

Or.

>
> Signed-off-by: Eli Cohen <[EMAIL PROTECTED]>
>
> diff --git a/kernel_patches/fixes/ipoib_0190_unsig_udqp.patch b/kernel_patches/fixes/ipoib_0190_unsig_udqp.patch
> index b76cdab..3fbeda3 100644
> --- a/kernel_patches/fixes/ipoib_0190_unsig_udqp.patch
> +++ b/kernel_patches/fixes/ipoib_0190_unsig_udqp.patch
> @@ -10,10 +10,10 @@ UDP messages, went up from 380 mbps to 508 mbps.
>
>  Signed-off-by: Eli Cohen <[EMAIL PROTECTED]>
>  ---
> -Index: ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib.h
> +Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/ipoib/ipoib.h
>  ===
>  ofed_kernel.orig/drivers/infiniband/ulp/ipoib/ipoib.h
> -+++ ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib.h
> +--- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/ipoib/ipoib.h 
> 2008-02-05 11:04:35.0 +0200
>  ofa_1_3_dev_kernel/drivers/infiniband/ulp/ipoib/ipoib.h  2008-02-05 
> 11:05:07.0 +0200
>  @@ -373,6 +373,7 @@ struct ipoib_dev_priv {
>
>   struct ib_wc ibwc[IPOIB_NUM_WC];
> @@ -39,10 +39,10 @@ Index: ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib.h
>
>   struct ipoib_ah *ipoib_create_ah(struct net_device *dev,
>struct ib_pd *pd, struct ib_ah_attr *attr);
> -Index: ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> +Index: ofa_1_3_dev_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c
>  ===
>  ofed_kernel.orig/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> -+++ ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> +--- ofa_1_3_dev_kernel.orig/drivers/infiniband/ulp/ipoib/ipoib_ib.c  
> 2008-02-05 11:04:35.0 +0200
>  ofa_1_3_dev_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c   
> 2008-02-05 11:05:44.0 +0200
>  @@ -254,12 +254,10 @@ repost:
>  "for buf %d\n", wr_id);
>   }
> @@ -128,7 +128,7 @@ Index: ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c
>   }
>
>   int ipoib_poll(struct napi_struct *napi, int budget)
> -@@ -361,11 +372,65 @@ void ipoib_ib_rx_completion(struct ib_cq
> +@@ -361,11 +372,68 @@ void ipoib_ib_rx_completion(struct ib_cq
>   netif_rx_schedule(dev, &priv->napi);
>   }
>
> @@ -168,8 +168,11 @@ Index: ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c
>  +ipoib_warn(priv, "failed to post zlen send\n");
>  +else {
>  +++priv->tx_head;
> -+++priv->tx_outstanding;
>  +ipoib_dbg(priv, "%s-%d: head = %d\n", __func__, __LINE__, priv->tx_head);
> ++if (++priv->tx_outstanding == ipoib_sendq_size) {
> ++ipoib_dbg(priv, "TX ring full, stopping kernel net queue\n");
> ++netif_stop_queue(dev);
> ++}
>  +}
>  +}
>  +poll_tx(priv);
> @@ -197,7 +200,7 @@ Index: ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c
>   }
>
>   static inline int post_send(struct ipoib_dev_priv *priv,
> -@@ -405,6 +470,11 @@ static inline int post_send(struct ipoib
> +@@ -405,6 +473,11 @@ static inline int post_send(struct ipoib
>   } else
>   priv->tx_wr.opcode  = IB_WR_SEND;
>
> @@ -209,16 +212,18 @@ Index: ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c
>   return ib_post_send(priv->qp, &priv->tx_wr, &bad_wr);
>   }
>
> -@@ -489,7 +559,7 @@ void ipoib_send(struct net_device *dev,
> +@@ -489,7 +562,9 @@ void ipoib_send(struct net_device *dev,
>   }
>
>   if (unlikely(priv->tx_outstanding > MAX_SEND_CQE + 1))
>  -poll_tx(priv, 0);
>  +poll_tx(priv);
> ++
> ++skb_orphan(skb);
>
>   return;
>
> -@@ -530,6 +600,32 @@ void ipoib_reap_ah(struct work_struct *w
> +@@ -530,6 +605,32 @@ void ipoib_reap_ah(struct work_struct *w
>  round_jiffies_relative(HZ));
>   }
>
> @@ -251,7 +256,7 @@ Index: ofed_kernel/drivers/infiniband/ulp/ipoib/ipoib_ib.c
>   int ipoib_ib_dev_open(struct net_device *dev)
>   {
>   struct ipoib_dev_priv *priv = netdev_priv(dev);
> -@

[ewg] some comments/cleanups for the openibd service script

2008-02-06 Thread Or Gerlitz
Vlad,

I just realized that there are some old and misleading sections here, for
example bringing up/down of GEN1 drivers, the mlx4_enet driver which is not
part of this release AFAIK, kdapl which was removed, starting/stopping
the ipoib HA tools which were removed, etc.

I can send a patch to clean them up, but I thought you might prefer to do
it yourself; please let me know. This has to get in for 1.3, since I don't
want to start handling support cases with questions about nonexistent features.

This service script goes into commercial distributions, correct?

Please see below and let me know your thinking,

thanks,

Or.

On Wed, 6 Feb 2008, Or Gerlitz wrote:

> --- /dev/null 2008-02-05 10:18:44.755516936 +0200
> +++ ofed_scripts/openibd  2008-02-06 13:46:50.0 +0200
> @@ -0,0 +1,1375 @@
> +#!/bin/bash
> +
> +#
> +# Copyright (c) 2006 Mellanox Technologies. All rights reserved.
> +#
> +# This Software is licensed under one of the following licenses:
> +#
> +# 1) under the terms of the "Common Public License 1.0" a copy of which is
> +#available from the Open Source Initiative, see
> +#http://www.opensource.org/licenses/cpl.php.
> +#
> +# 2) under the terms of the "The BSD License" a copy of which is
> +#available from the Open Source Initiative, see
> +#http://www.opensource.org/licenses/bsd-license.php.
> +#
> +# 3) under the terms of the "GNU General Public License (GPL) Version 2" a
> +#copy of which is available from the Open Source Initiative, see
> +#http://www.opensource.org/licenses/gpl-license.php.
> +#
> +# Licensee has the right to choose one of the above licenses.
> +#
> +# Redistributions of source code must retain the above copyright
> +# notice and one of the license notices.
> +#
> +# Redistributions in binary form must reproduce both the above copyright
> +# notice, one of the license notices in the documentation
> +# and/or other materials provided with the distribution.
> +#
> +#
> +#  $Id: openibd 9139 2006-08-29 14:03:38Z vlad $
> +#
> +
> +# config: /etc/infiniband/openib.conf
> +CONFIG="/etc/infiniband/openib.conf"
> +
> +if [ ! -f $CONFIG ]; then
> +echo No InfiniBand configuration found
> +exit 0
> +fi
> +
> +. $CONFIG
> +
> +CWD=`pwd`
> +cd /etc/infiniband
> +WD=`pwd`
> +
> +PATH=$PATH:/sbin:/usr/bin
> +if [ -e /etc/profile.d/ofed.sh ]; then
> +. /etc/profile.d/ofed.sh
> +fi
> +
> +# Only use ONBOOT option if called by a runlevel directory.
> +# Therefore determine the base, follow a runlevel link name ...
> +base=${0##*/}
> +link=${base#*[SK][0-9][0-9]}
> +# ... and compare them
> +if [ $link == $base ] ; then
> +RUNMODE=manual
> +ONBOOT=yes
> +else
> +RUNMODE=auto
> +fi
> +
> +ACTION=$1
> +shift
> +RESTART=0
> +max_ports_num_in_hca=0
> +
> +# Check if OpenIB configured to start automatically
> +if [ "X${ONBOOT}" != "Xyes" ]; then
> +exit 0
> +fi
> +
> +if ( grep -i 'SuSE Linux' /etc/issue >/dev/null 2>&1 ); then
> +if [ -n "$INIT_VERSION" ] ; then
> +# MODE=onboot
> +if LANG=C egrep -L "^ONBOOT=['\"]?[Nn][Oo]['\"]?" ${CONFIG} > /dev/null ; then
> +exit 0
> +fi
> +fi
> +fi
> +
> +#
> +# Get a sane screen width
> +[ -z "${COLUMNS:-}" ] && COLUMNS=80
> +
> +[ -z "${CONSOLETYPE:-}" ] && [ -x /sbin/consoletype ] && CONSOLETYPE="`/sbin/consoletype`"
> +
> +if [ -f /etc/sysconfig/i18n -a -z "${NOLOCALE:-}" ] ; then
> +  . /etc/sysconfig/i18n
> +  if [ "$CONSOLETYPE" != "pty" ]; then
> +case "${LANG:-}" in
> +ja_JP*|ko_KR*|zh_CN*|zh_TW*)
> +export LC_MESSAGES=en_US
> +;;
> +*)
> +export LANG
> +;;
> +esac
> +  else
> +export LANG
> +  fi
> +fi
> +
> +# Read in our configuration
> +if [ -z "${BOOTUP:-}" ]; then
> +  if [ -f /etc/sysconfig/init ]; then
> +  . /etc/sysconfig/init
> +  else
> +# This all seem confusing? Look in /etc/sysconfig/init,
> +# or in /usr/doc/initscripts-*/sysconfig.txt
> +BOOTUP=color
> +RES_COL=60
> +MOVE_TO_COL="echo -en \\033[${RES_COL}G"
> +SETCOLOR_SUCCESS="echo -en \\033[1;32m"
> +SETCOLOR_FAILURE="echo -en \\033[1;31m"
> +SETCOLOR_WARNING="echo -en \\033[1;33m"
> +SETCOLOR_NORMAL="echo -en \\033[0;39m"
> +LOGLEVEL=1
> +  fi
> +  if [ "$CONSOLETYPE" = "serial" ]; then
> +  BOOTUP=serial
> +  MOVE_TO_COL=
> +  SETCOLOR_SUCCESS=
> +  SETCOLOR_FAILURE=
> +  SETCOLOR_WARNING=
> +  SETCOLOR_NORMAL=
> +  fi
> +fi
> +
> +if [ "${BOOTUP:-}" != "verbose" ]; then
> +   INITLOG_ARGS="-q"
> +else
> +   INITLOG_ARGS=
> +fi
> +
> +echo_success() {
> +  echo -n $@
> +  [ "$BOOTUP" = "color" ] && $MOVE_TO_COL
> +  echo -n "[  "
> +  [ "$BOOTUP" = "color" ] && $SETCOLOR_SUCCESS
> +  echo -n $"OK"
> +  [ "$BOOTUP" = "color" ] && $SETCOLOR_NORMAL
> +  echo -n "  ]"
>