Comment #5 on issue 270 by [email protected]: drbd sync fails on readded node
http://code.google.com/p/ganeti/issues/detail?id=270
And some more traces. I tried reproducing this on node B by starting from
scratch, and indeed:
# gnt-instance replace-disks -a instancename
Wed Jan 29 18:18:49 2014 - INFO: Checking disk/0 on B
Failure: prerequisites not met for this operation:
error type: wrong_state, error details:
Please run activate-disks on instance instancename first
# gnt-instance activate-disks instancename
nodeA:disk/0:/dev/drbd3
# gnt-instance replace-disks -a instancename
Wed Jan 29 18:19:55 2014 - INFO: Checking disk/0 on B
Wed Jan 29 18:19:55 2014 - INFO: Checking disk/0 on A
Wed Jan 29 18:19:56 2014 Replacing disk(s) 0 for instance 'instancename'
Wed Jan 29 18:19:56 2014 Current primary node: A
Wed Jan 29 18:19:56 2014 Current seconary node: B
Wed Jan 29 18:19:56 2014 STEP 1/6 Check device existence
Wed Jan 29 18:19:56 2014 - INFO: Checking disk/0 on A
Wed Jan 29 18:19:56 2014 - INFO: Checking disk/0 on B
Wed Jan 29 18:19:56 2014 - INFO: Checking volume groups
Wed Jan 29 18:19:56 2014 STEP 2/6 Check peer consistency
Wed Jan 29 18:19:56 2014 - INFO: Checking disk/0 consistency on node A
Wed Jan 29 18:19:57 2014 STEP 3/6 Allocate new storage
Wed Jan 29 18:19:57 2014 - INFO: Adding storage on B for disk/0
Wed Jan 29 18:19:58 2014 STEP 4/6 Changing drbd configuration
Wed Jan 29 18:19:58 2014 - INFO: Detaching disk/0 drbd from local storage
Wed Jan 29 18:19:58 2014 - INFO: Renaming the old LVs on the target node
Wed Jan 29 18:19:58 2014 - INFO: Renaming the new LVs on the target node
Wed Jan 29 18:19:59 2014 - INFO: Adding new mirror component on B
Wed Jan 29 18:20:01 2014 STEP 5/6 Sync devices
Wed Jan 29 18:20:01 2014 - INFO: Waiting for instance instancename to sync disks
Wed Jan 29 18:20:13 2014 - INFO: Instance instancename's disks are in sync
[here it seems to be running for about 10 seconds]
Failure: command execution error:
DRBD device disk/0 is degraded!
In the log, during the 10-second pause:
2014-01-29 18:20:01,574: ganeti-masterd pid=9336/Jq16/Job394618/I_REPLACE_DISKS INFO Waiting for instance instancename to sync disks
2014-01-29 18:20:01,819: ganeti-masterd pid=9336/Jq16/Job394618/I_REPLACE_DISKS INFO Degraded disks found, 10 retries left
2014-01-29 18:20:02,190: ganeti-masterd pid=9336/ClientReq1 INFO Received job poll request for 394618
2014-01-29 18:20:02,990: ganeti-masterd pid=9336/Jq16/Job394618/I_REPLACE_DISKS INFO Degraded disks found, 9 retries left
2014-01-29 18:20:04,192: ganeti-masterd pid=9336/Jq16/Job394618/I_REPLACE_DISKS INFO Degraded disks found, 8 retries left
2014-01-29 18:20:05,355: ganeti-masterd pid=9336/Jq16/Job394618/I_REPLACE_DISKS INFO Degraded disks found, 7 retries left
2014-01-29 18:20:06,527: ganeti-masterd pid=9336/Jq16/Job394618/I_REPLACE_DISKS INFO Degraded disks found, 6 retries left
2014-01-29 18:20:07,690: ganeti-masterd pid=9336/Jq16/Job394618/I_REPLACE_DISKS INFO Degraded disks found, 5 retries left
2014-01-29 18:20:08,855: ganeti-masterd pid=9336/Jq16/Job394618/I_REPLACE_DISKS INFO Degraded disks found, 4 retries left
2014-01-29 18:20:10,022: ganeti-masterd pid=9336/Jq16/Job394618/I_REPLACE_DISKS INFO Degraded disks found, 3 retries left
2014-01-29 18:20:11,184: ganeti-masterd pid=9336/Jq16/Job394618/I_REPLACE_DISKS INFO Degraded disks found, 2 retries left
2014-01-29 18:20:12,348: ganeti-masterd pid=9336/Jq16/Job394618/I_REPLACE_DISKS INFO Degraded disks found, 1 retries left
2014-01-29 18:20:13,507: ganeti-masterd pid=9336/Jq16/Job394618/I_REPLACE_DISKS INFO Instance instancename's disks are in sync
2014-01-29 18:20:13,710: ganeti-masterd pid=9336/Jq16/Job394618 ERROR Op 1/1: Caught exception in INSTANCE_REPLACE_DISKS(instancename)
Traceback (most recent call last):
  File "/usr/share/ganeti/ganeti/jqueue.py", line 1115, in _ExecOpCodeUnlocked
    timeout=timeout)
  File "/usr/share/ganeti/ganeti/jqueue.py", line 1426, in _WrapExecOpCode
    return execop_fn(op, *args, **kwargs)
  File "/usr/share/ganeti/ganeti/mcpu.py", line 517, in ExecOpCode
    calc_timeout)
  File "/usr/share/ganeti/ganeti/mcpu.py", line 459, in _LockAndExecLU
    result = self._LockAndExecLU(lu, level + 1, calc_timeout)
  File "/usr/share/ganeti/ganeti/mcpu.py", line 468, in _LockAndExecLU
    result = self._LockAndExecLU(lu, level + 1, calc_timeout)
  File "/usr/share/ganeti/ganeti/mcpu.py", line 468, in _LockAndExecLU
    result = self._LockAndExecLU(lu, level + 1, calc_timeout)
  File "/usr/share/ganeti/ganeti/mcpu.py", line 459, in _LockAndExecLU
    result = self._LockAndExecLU(lu, level + 1, calc_timeout)
  File "/usr/share/ganeti/ganeti/mcpu.py", line 459, in _LockAndExecLU
    result = self._LockAndExecLU(lu, level + 1, calc_timeout)
  File "/usr/share/ganeti/ganeti/mcpu.py", line 468, in _LockAndExecLU
    result = self._LockAndExecLU(lu, level + 1, calc_timeout)
  File "/usr/share/ganeti/ganeti/mcpu.py", line 407, in _LockAndExecLU
    result = self._ExecLU(lu)
  File "/usr/share/ganeti/ganeti/mcpu.py", line 374, in _ExecLU
    result = _ProcessResult(submit_mj_fn, lu.op, lu.Exec(self.Log))
  File "/usr/share/ganeti/ganeti/cmdlib/base.py", line 250, in Exec
    tl.Exec(feedback_fn)
  File "/usr/share/ganeti/ganeti/cmdlib/instance_storage.py", line 2158, in Exec
    result = fn(feedback_fn)
  File "/usr/share/ganeti/ganeti/cmdlib/instance_storage.py", line 2453, in _ExecDrbd8DiskOnly
    self._CheckDevices(self.instance.primary_node, iv_names)
  File "/usr/share/ganeti/ganeti/cmdlib/instance_storage.py", line 2300, in _CheckDevices
    raise errors.OpExecError("DRBD device %s is degraded!" % name)
OpExecError: DRBD device disk/0 is degraded!
2014-01-29 18:20:13,831: ganeti-masterd pid=9336/Jq16/Job394618 INFO Finished job 394618, status = error
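
To make the sequence in the log easier to follow, here is a rough model of what it looks like from the outside: the sync wait burns through its 10 retries while the device stays degraded, still prints the "in sync" message, and the following device check then fails. This is only a sketch under that assumption and not Ganeti's actual code; `wait_for_sync`, `check_devices` and the `is_degraded` callback are hypothetical names, and the retry count and messages are simply copied from the log above.

```python
import time


def wait_for_sync(is_degraded, retries=10, delay=1.0, log=print, sleep=time.sleep):
    """Hypothetical model of the retry loop seen in the log (not Ganeti code).

    Polls is_degraded() until it returns False or the retries run out.
    Returns True only if the device was healthy at the last poll.
    """
    degraded = is_degraded()
    while degraded and retries > 0:
        log("Degraded disks found, %d retries left" % retries)
        retries -= 1
        sleep(delay)
        degraded = is_degraded()
    # As in the log, this message appears even when the retries ran out
    # and the device is still degraded.
    log("Instance's disks are in sync")
    return not degraded


def check_devices(is_degraded, name="disk/0"):
    """Hypothetical stand-in for the final check that raises in the traceback."""
    if is_degraded():
        raise RuntimeError("DRBD device %s is degraded!" % name)


if __name__ == "__main__":
    always_degraded = lambda: True  # assumption: the device never recovers
    wait_for_sync(always_degraded, sleep=lambda _seconds: None)
    try:
        check_devices(always_degraded)
    except RuntimeError as err:
        print("Failure:", err)
```

If that model is right, the "disks are in sync" line in the job output is misleading in this case, and the real question is why disk/0 stays degraded on the re-added node.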