Hi List,

 I am using DRBD 8.3.6 with Linux kernel 2.6.32. In my environment the
backing device on the secondary is an iSCSI device (ext3). When I run a test
case that performs synchronous writes on the mounted ext3 partition on the
primary, and at the same time the network to the iSCSI host goes down, I see
a hang on the primary for a span of ~120 seconds.

 Test case on the primary:

"
while true; do date | tee -a /mnt/drbd1/c.dat; echo -n A; sync; echo -n B; sleep 1; echo C; done
"

Initial analysis pointed us to the ext3 layer, where we observed the hang.
The sequence is:

journal_commit_transaction -> wait_for_iobuf -> wait_on_buffer *gets stuck
here*

wait_on_buffer -> buffer locked -> wait_on_bit -> sync_buffer -> io_schedule

When we debugged further, we understood that we were waiting for a
completion callback from the drbd driver:

submit_bh:

callback for the bh  = journal_end_buffer_io_sync
callback for the bio = end_bio_bh_io_sync (which calls journal_end_buffer_io_sync)

submit_bh -> register end_bio_bh_io_sync as the bio (buffer I/O) completion
callback -> submit_bio -> generic_make_request -> __generic_make_request ->
q->make_request_fn -> the corresponding drbd handler, drbd_make_request_26.


When we debugged further in the drbd and iSCSI drivers, we understood that
when the network is down, the iSCSI layer goes into a blocked state for a
period equal to the session recovery timeout, which defaults to 120 seconds.
On the secondary, while iSCSI is blocked, the path from <scsi_io_completion>
to the <asender, via wake_asender in drbd_endio_write_sec> never runs, so
the primary, which waits on a wait queue for a P_RECV_ACK from the
secondary, never gets the callback into the ext3 layer. The complete call
trace is attached for reference.
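For what it is worth, the 120-second window matches open-iscsi's session
recovery (replacement) timeout, which can be lowered as a workaround so the
blocked state ends sooner, at the cost of failing I/O up the stack earlier.
The target IQN below is a placeholder:

```
# /etc/iscsi/iscsid.conf -- default is 120 seconds
node.session.timeo.replacement_timeout = 30

# or per node record:
# iscsiadm -m node -T <target-iqn> -o update \
#     -n node.session.timeo.replacement_timeout -v 30
```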

I back-ported a set of patches from the 8.3 branch; the major ones are the
two below, and the complete list is in the attached text file:

all patches listed for "drbd: detach from frozen backing device"
and
"drbd: Implemented real timeout checking for request processing time"

 I can still see the issue with the back-ported patches, so we made a change
to the drbd driver: if there is no response from the peer, we trigger a
timeout and subsequently a state change. The patch is included below for
reference. Can anyone please suggest whether this is the right way to
resolve the issue?

Thanks & Regards,
Mukunda
---
 drivers/block/drbd/drbd_int.h      |    1 +
 drivers/block/drbd/drbd_main.c     |    1 +
 drivers/block/drbd/drbd_receiver.c |    3 +++
 drivers/block/drbd/drbd_req.c      |    8 +++++---
 4 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index 528f0bc..52e250d 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1192,6 +1192,7 @@ struct drbd_conf {
 	u64 ed_uuid; /* UUID of the exposed data */
 	struct mutex state_mutex;
 	char congestion_reason;  /* Why we where congested... */
+	int p_connect;
 };
 
 static inline struct drbd_conf *minor_to_mdev(unsigned int minor)
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 7345af4..7c55599 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -3146,6 +3146,7 @@ void drbd_init_set_defaults(struct drbd_conf *mdev)
 	mdev->agreed_pro_version = PRO_VERSION_MAX;
 	mdev->write_ordering = WO_bio_barrier;
 	mdev->resync_wenr = LC_FREE;
+	mdev->p_connect = 0;
 }
 
 void drbd_mdev_cleanup(struct drbd_conf *mdev)
diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 6ccefed..3452eb4 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -4396,8 +4396,11 @@ int drbd_asender(struct drbd_thread *thi)
 		if (signal_pending(current))
 			continue;
 		
+		mdev->p_connect = 1;
+		mod_timer(&mdev->request_timer, jiffies + 3000);
 		rv = drbd_recv_short(mdev, mdev->meta.socket,
 				     buf, expect-received, 0);
+		mdev->p_connect = 0;
 
 		clear_bit(SIGNAL_ASENDER, &mdev->flags);
 
diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index cdd15de..5e74b59 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -1301,13 +1301,15 @@ void request_timer_fn(unsigned long data)
                 !time_in_range(now, mdev->last_reconnect_jif, mdev->last_reconnect_jif + ent)) {
                 dev_warn(DEV, "Remote failed to finish a request within ko-count * timeout\n");
                 _drbd_set_state(_NS(mdev, conn, C_TIMEOUT), CS_VERBOSE | CS_HARD, NULL);
- 	}
-	 if (dt && req->rq_state & RQ_LOCAL_PENDING &&
+	} else  if (dt && req->rq_state & RQ_LOCAL_PENDING &&
                 time_after(now, req->start_time + dt) &&
                 !time_in_range(now, mdev->last_reattach_jif, mdev->last_reattach_jif + dt)) {
                 dev_warn(DEV, "Local backing device failed to meet the disk-timeout\n");
                 __drbd_chk_io_error(mdev, DRBD_FORCE_DETACH);
-	} 
+	} else if (mdev->p_connect) {
+		dev_warn(DEV, "Remote failed timeout p_connect %d\n", mdev->p_connect);
+		_drbd_set_state(_NS(mdev, conn, C_TIMEOUT), CS_VERBOSE, NULL);
+	}
 	
 	nt = (time_after(now, req->start_time + et) ? now : req->start_time) + et;
        spin_unlock_irq(&mdev->req_lock);
-- 
1.7.2.3


Attachment: call-trace
Description: Binary data

Attachment: Backported
Description: Binary data

_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user
