mgmt
oVirt - 4.1.0.4
centos 7.3-1611
nodes ( installed from ovirt image
"ovirt-node-ng-installer-ovirt-4.1-2017030804.iso" )
OS Version: == RHEL - 7 - 3.1611.el7.centos
OS Description:== oVirt Node 4.1.0
Kernel Version:== 3.10.0 - 514.10.2.el7.x86_64
KVM Version:== 2.6.0 - 28.el7_3.3.1
LIBVIRT Version:== libvirt-2.0.0-10.el7_3.5
VDSM Version:== vdsm-4.19.4-1.el7.centos
SPICE Version:== 0.12.4 - 20.el7_3
GlusterFS Version:== glusterfs-3.8.9-1.el7 ( LVM thinprovisioning in
replica 2 - created from ovirt GUI )
concurently running
- huge import from export domain ( net workload )
- sequential write to VMs local disk ( gluster replica sequential workload )
- VMs database huge select ( random IOps )
- huge old snapshot delete ( random IOps )
In this configuration / workload is runnig one hour eg, with no exceptions
, with 70-80% disk load, but in some point VDSM freez all jobs for a
timeout and VM's are in "uknown" status .
The whole system revitalize then automaticaly in cca 20min time frame (
except the import and snapshot deleting(rollback) )
engine.log - focus 10:39:07 time ( Failed in 'HSMGetAllTasksStatusesVDS'
method )
========
n child command id: 'a8a3a4d5-cf7d-4423-8243-022911232508'
type:'RemoveSnapshotSingleDiskLive' to complete
2017-03-10 10:39:01,727+01 INFO
[org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommandCallback]
(DefaultQuartzScheduler2) [759c8e1f] Command 'RemoveSnapshotSingleDiskLive'
(id: 'a8a3a4d5-cf7d-4423-8243-022911232508') waiting on child command id:
'33df2c1e-6ce3-44fd-a39b-d111883b4c4e' type:'DestroyImage' to complete
2017-03-10 10:39:03,929+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler5) [fde51205-3e8b-4b84-a478-352dc444ccc4] START,
GlusterServersListVDSCommand(HostName = 2kvm1,
VdsIdVDSCommandParametersBase:{runAsync='true',
hostId='86876b79-71d8-4ae1-883b-ba010cd270e7'}), log id: 446d0cd3
2017-03-10 10:39:04,343+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler5) [fde51205-3e8b-4b84-a478-352dc444ccc4] FINISH,
GlusterServersListVDSCommand, return: [172.16.5.163/24:CONNECTED
<http://172.16.5.163/24:CONNECTED>,
16.0.0.164:CONNECTED], log id: 446d0cd3
2017-03-10 10:39:04,353+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
(DefaultQuartzScheduler5) [fde51205-3e8b-4b84-a478-352dc444ccc4] START,
GlusterVolumesListVDSCommand(HostName = 2kvm1,
GlusterVolumesListVDSParameters:{runAsync='true',
hostId='86876b79-71d8-4ae1-883b-ba010cd270e7'}), log id: 69ea1fda
2017-03-10 10:39:05,128+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
(DefaultQuartzScheduler5) [fde51205-3e8b-4b84-a478-352dc444ccc4] FINISH,
GlusterVolumesListVDSCommand, return:
{8ded4083-2f31-489e-a60d-a315a5eb9b22=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@7765e4ad},
log id: 69ea1fda
2017-03-10 10:39:07,163+01 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand]
(DefaultQuartzScheduler2) [759c8e1f] Failed in 'HSMGetAllTasksStatusesVDS'
method
2017-03-10 10:39:07,178+01 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler2) [759c8e1f] EVENT_ID:
VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack: null,
Custom Event ID: -1, Message: VDSM 2kvm2 command HSMGetAllTasksStatusesVDS
failed: Connection timed out
2017-03-10 10:39:07,182+01 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler2)
[759c8e1f] BaseAsyncTask::onTaskEndSuccess: Task
'f594bf69-619b-4d1b-8f6d-a9826997e478' (Parent Command 'ImportVm',
Parameters Type
'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended
successfully.
2017-03-10 10:39:07,182+01 INFO
[org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (DefaultQuartzScheduler2)
[759c8e1f] Task with DB Task ID 'a05c7c07-9b98-4ab2-ac7b-9e70a75ba7b7' and
VDSM Task ID '7c60369f-70a3-4a6a-80c9-4753ac9ed372' is in state Polling. End
action for command 8deb3fe3-4a83-4605-816c-ffdc63fd9ac1 will proceed when
all the entity's tasks are completed.
2017-03-10 10:39:07,182+01 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler2)
[759c8e1f] SPMAsyncTask::PollTask: Polling task
'f351e8f6-6dd7-49aa-bf54-650d84fc6352' (Parent Command 'DestroyImage',
Parameters Type
'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned
status 'finished', result 'cleanSuccess'.
2017-03-10 10:39:07,182+01 ERROR
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler2)
[759c8e1f] BaseAsyncTask::logEndTaskFailure: Task
'f351e8f6-6dd7-49aa-bf54-650d84fc6352' (Parent Command 'DestroyImage',
Parameters Type
'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with
failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSErrorException: Failed to
HSMGetAllTasksStatusesVDS, error = Connection timed out, code = 100',
-- Exception: 'VDSGenericException: VDSErrorException: Failed to
HSMGetAllTasksStatusesVDS, error = Connection timed out, code = 100'
2017-03-10 10:39:07,184+01 INFO
[org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler2)
[759c8e1f] CommandAsyncTask::endActionIfNecessary: All tasks of command
'33df2c1e-6ce3-44fd-a39b-d111883b4c4e' has ended -> executing 'endAction'
2017-03-10 10:39:07,185+01 INFO
[org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler2)
[759c8e1f] CommandAsyncTask::endAction: Ending action for '1' tasks (command
ID: '33df2c1e-6ce3-44fd-a39b-d111883b4c4e'): calling endAction '.
2017-03-10 10:39:07,185+01 INFO
[org.ovirt.engine.core.bll.tasks.CommandAsyncTask]
(org.ovirt.thread.pool-6-thread-31) [759c8e1f]
CommandAsyncTask::endCommandAction [within thread] context: Attempting to
endAction 'DestroyImage',
2017-03-10 10:39:07,192+01 INFO
[org.ovirt.engine.core.bll.storage.disk.image.DestroyImageCommand]
(org.ovirt.thread.pool-6-thread-31) [759c8e1f] Command
[id=33df2c1e-6ce3-44fd-a39b-d111883b4c4e]: Updating status to 'FAILED', The
command end method logic will be executed by one of its parent commands.
2017-03-10 10:39:07,192+01 INFO
[org.ovirt.engine.core.bll.tasks.CommandAsyncTask]
(org.ovirt.thread.pool-6-thread-31) [759c8e1f]
CommandAsyncTask::HandleEndActionResult [within thread]: endAction for
action type 'DestroyImage' completed, handling the result.
2017-03-10 10:39:07,192+01 INFO
[org.ovirt.engine.core.bll.tasks.CommandAsyncTask]
(org.ovirt.thread.pool-6-thread-31) [759c8e1f]
CommandAsyncTask::HandleEndActionResult [within thread]: endAction for
action type 'DestroyImage' succeeded, clearing tasks.
2017-03-10 10:39:07,192+01 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(org.ovirt.thread.pool-6-thread-31) [759c8e1f] SPMAsyncTask::ClearAsyncTask:
Attempting to clear task 'f351e8f6-6dd7-49aa-bf54-650d84fc6352'
2017-03-10 10:39:07,193+01 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand]
(org.ovirt.thread.pool-6-thread-31) [759c8e1f] START,
SPMClearTaskVDSCommand(
SPMTaskGuidBaseVDSCommandParameters:{runAsync='true',
storagePoolId='00000001-0001-0001-0001-000000000311',
ignoreFailoverLimit='false',
taskId='f351e8f6-6dd7-49aa-bf54-650d84fc6352'}), log id: 2b7080c2
2017-03-10 10:39:07,194+01 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand]
(org.ovirt.thread.pool-6-thread-31) [759c8e1f] START,
HSMClearTaskVDSCommand(HostName = 2kvm2,
HSMTaskGuidBaseVDSCommandParameters:{runAsync='true',
hostId='905375e1-6de4-4fdf-b69c-b2d546f869c8',
taskId='f351e8f6-6dd7-49aa-bf54-650d84fc6352'}), log id: 2edff460
2017-03-10 10:39:08,208+01 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand]
(org.ovirt.thread.pool-6-thread-31) [759c8e1f] FINISH,
HSMClearTaskVDSCommand, log id: 2edff460
2017-03-10 10:39:08,208+01 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand]
(org.ovirt.thread.pool-6-thread-31) [759c8e1f] FINISH,
SPMClearTaskVDSCommand, log id: 2b7080c2
2017-03-10 10:39:08,213+01 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(org.ovirt.thread.pool-6-thread-31) [759c8e1f]
BaseAsyncTask::removeTaskFromDB: Removed task
'f351e8f6-6dd7-49aa-bf54-650d84fc6352' from DataBase
2017-03-10 10:39:08,213+01 INFO
[org.ovirt.engine.core.bll.tasks.CommandAsyncTask]
(org.ovirt.thread.pool-6-thread-31) [759c8e1f]
CommandAsyncTask::HandleEndActionResult [within thread]: Removing
CommandMultiAsyncTasks object for entity
'33df2c1e-6ce3-44fd-a39b-d111883b4c4e'
2017-03-10 10:39:10,142+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler10) [a86dc7b5-52dc-40d4-a3b9-49d7eabbb93c] START,
GlusterServersListVDSCommand(HostName = 2kvm1,
VdsIdVDSCommandParametersBase:{runAsync='true',
hostId='86876b79-71d8-4ae1-883b-ba010cd270e7'}), log id: 2e7278cb
2017-03-10 10:39:11,513+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler10) [a86dc7b5-52dc-40d4-a3b9-49d7eabbb93c] FINISH,
GlusterServersListVDSCommand, return: [172.16.5.163/24:CONNECTED
<http://172.16.5.163/24:CONNECTED>,
16.0.0.164:CONNECTED], log id: 2e7278cb
2017-03-10 10:39:11,523+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
(DefaultQuartzScheduler10) [a86dc7b5-52dc-40d4-a3b9-49d7eabbb93c] START,
GlusterVolumesListVDSCommand(HostName = 2kvm1,
GlusterVolumesListVDSParameters:{runAsync='true',
hostId='86876b79-71d8-4ae1-883b-ba010cd270e7'}), log id: 43704ef2
2017-03-10 10:39:11,777+01 INFO
[org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback]
(DefaultQuartzScheduler9) [67e1d8ed] Command 'RemoveSnapshot' (id:
'13c2cb7c-0809-4971-aceb-37ae66105ab7') waiting on child command id:
'a8a3a4d5-cf7d-4423-8243-022911232508' type:'RemoveSnapshotSingleDiskLive'
to complete
2017-03-10 10:39:11,789+01 WARN
[org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand]
(DefaultQuartzScheduler9) [759c8e1f] Child command 'DESTROY_IMAGE' failed,
proceeding to verify
2017-03-10 10:39:11,789+01 INFO
[org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand]
(DefaultQuartzScheduler9) [759c8e1f] Executing Live Merge command step
'DESTROY_IMAGE_CHECK'
2017-03-10 10:39:11,832+01 INFO
[org.ovirt.engine.core.bll.DestroyImageCheckCommand] (pool-5-thread-7)
[4856f570] Running command: DestroyImageCheckCommand internal: true.
2017-03-10 10:39:11,833+01 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.SPMGetVolumeInfoVDSCommand]
(pool-5-thread-7) [4856f570] START, SPMGetVolumeInfoVDSCommand(
SPMGetVolumeInfoVDSCommandParameters:{expectedEngineErrors='[VolumeDoesNotExist]',
runAsync='true', storagePoolId='00000001-0001-0001-0001-000000000311',
ignoreFailoverLimit='false',
storageDomainId='1603cd90-92ef-4c03-922c-cecb282fd00e',
imageGroupId='7543338a-3ca6-4698-bb50-c14f0bd71428',
imageId='50b592f7-bfba-4398-879c-8d6a19a2c000'}), log id: 2c8031f8
2017-03-10 10:39:11,833+01 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.SPMGetVolumeInfoVDSCommand]
(pool-5-thread-7) [4856f570] Executing GetVolumeInfo using the current SPM
2017-03-10 10:39:11,834+01 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetVolumeInfoVDSCommand]
(pool-5-thread-7) [4856f570] START, GetVolumeInfoVDSCommand(HostName =
2kvm2,
GetVolumeInfoVDSCommandParameters:{expectedEngineErrors='[VolumeDoesNotExist]',
runAsync='true', hostId='905375e1-6de4-4fdf-b69c-b2d546f869c8',
storagePoolId='00000001-0001-0001-0001-000000000311',
storageDomainId='1603cd90-92ef-4c03-922c-cecb282fd00e',
imageGroupId='7543338a-3ca6-4698-bb50-c14f0bd71428',
imageId='50b592f7-bfba-4398-879c-8d6a19a2c000'}), log id: 79ca86cc
2017-03-10 10:39:11,846+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
(DefaultQuartzScheduler10) [a86dc7b5-52dc-40d4-a3b9-49d7eabbb93c] FINISH,
GlusterVolumesListVDSCommand, return:
{8ded4083-2f31-489e-a60d-a315a5eb9b22=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@7765e4ad},
log id: 43704ef2
2017-03-10 10:39:16,858+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler7) [d82701d9-9fa3-467d-b273-f5fe5a93062f] START,
GlusterServersListVDSCommand(HostName = 2kvm1,
VdsIdVDSCommandParametersBase:{runAsync='true',
hostId='86876b79-71d8-4ae1-883b-ba010cd270e7'}), log id: 6542adcd
2017-03-10 10:39:17,394+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler7) [d82701d9-9fa3-467d-b273-f5fe5a93062f] FINISH,
GlusterServersListVDSCommand, return: [172.16.5.163/24:CONNECTED
<http://172.16.5.163/24:CONNECTED>,
16.0.0.164:CONNECTED], log id: 6542adcd
2017-03-10 10:39:17,406+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
(DefaultQuartzScheduler7) [d82701d9-9fa3-467d-b273-f5fe5a93062f] START,
GlusterVolumesListVDSCommand(HostName = 2kvm1,
GlusterVolumesListVDSParameters:{runAsync='true',
hostId='86876b79-71d8-4ae1-883b-ba010cd270e7'}), log id: 44ec33ed
2017-03-10 10:39:18,598+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
(DefaultQuartzScheduler7) [d82701d9-9fa3-467d-b273-f5fe5a93062f] FINISH,
GlusterVolumesListVDSCommand, return:
{8ded4083-2f31-489e-a60d-a315a5eb9b22=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@7765e4ad},
log id: 44ec33ed
2017-03-10 10:39:21,865+01 INFO
[org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback]
(DefaultQuartzScheduler6) [67e1d8ed] Command 'RemoveSnapshot' (id:
'13c2cb7c-0809-4971-aceb-37ae66105ab7') waiting on child command id:
'a8a3a4d5-cf7d-4423-8243-022911232508' type:'RemoveSnapshotSingleDiskLive'
to complete
2017-03-10 10:39:21,881+01 INFO
[org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommandCallback]
(DefaultQuartzScheduler6) [4856f570] Command 'RemoveSnapshotSingleDiskLive'
(id: 'a8a3a4d5-cf7d-4423-8243-022911232508') waiting on child command id:
'b1d63b8e-19d3-4d64-8fa8-4eb3e2d1a8fc' type:'DestroyImageCheck' to complete
2017-03-10 10:39:23,611+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler6) [4856f570] START,
GlusterServersListVDSCommand(HostName = 2kvm1,
VdsIdVDSCommandParametersBase:{runAsync='true',
hostId='86876b79-71d8-4ae1-883b-ba010cd270e7'}), log id: 4c2fc22d
2017-03-10 10:39:24,616+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterTasksListVDSCommand]
(DefaultQuartzScheduler7) [d82701d9-9fa3-467d-b273-f5fe5a93062f] START,
GlusterTasksListVDSCommand(HostName = 2kvm1,
VdsIdVDSCommandParametersBase:{runAsync='true',
hostId='86876b79-71d8-4ae1-883b-ba010cd270e7'}), log id: 1f169371
2017-03-10 10:39:24,618+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler6) [4856f570] FINISH, GlusterServersListVDSCommand,
return: [172.16.5.163/24:CONNECTED <http://172.16.5.163/24:CONNECTED>,
16.0.0.164:CONNECTED], log id: 4c2fc22d
2017-03-10 10:39:24,629+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
(DefaultQuartzScheduler6) [4856f570] START,
GlusterVolumesListVDSCommand(HostName = 2kvm1,
GlusterVolumesListVDSParameters:{runAsync='true',
hostId='86876b79-71d8-4ae1-883b-ba010cd270e7'}), log id: 2ac55735
2017-03-10 10:39:24,822+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterTasksListVDSCommand]
(DefaultQuartzScheduler7) [d82701d9-9fa3-467d-b273-f5fe5a93062f] FINISH,
GlusterTasksListVDSCommand, return: [], log id: 1f169371
2017-03-10 10:39:26,836+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
(DefaultQuartzScheduler6) [4856f570] FINISH, GlusterVolumesListVDSCommand,
return:
{8ded4083-2f31-489e-a60d-a315a5eb9b22=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@7765e4ad},
log id: 2ac55735
2017-03-10 10:39:31,849+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler8) [fde51205-3e8b-4b84-a478-352dc444ccc4] START,
GlusterServersListVDSCommand(HostName = 2kvm1,
VdsIdVDSCommandParametersBase:{runAsync='true',
hostId='86876b79-71d8-4ae1-883b-ba010cd270e7'}), log id: 2e8dbcd1
2017-03-10 10:39:31,932+01 INFO
[org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback]
(DefaultQuartzScheduler6) [67e1d8ed] Command 'RemoveSnapshot' (id:
'13c2cb7c-0809-4971-aceb-37ae66105ab7') waiting on child command id:
'a8a3a4d5-cf7d-4423-8243-022911232508' type:'RemoveSnapshotSingleDiskLive'
to complete
2017-03-10 10:39:31,944+01 INFO
[org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommandCallback]
(DefaultQuartzScheduler6) [4856f570] Command 'RemoveSnapshotSingleDiskLive'
(id: 'a8a3a4d5-cf7d-4423-8243-022911232508') waiting on child command id:
'b1d63b8e-19d3-4d64-8fa8-4eb3e2d1a8fc' type:'DestroyImageCheck' to complete
2017-03-10 10:39:33,213+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler8) [fde51205-3e8b-4b84-a478-352dc444ccc4] FINISH,
GlusterServersListVDSCommand, return: [172.16.5.163/24:CONNECTED
<http://172.16.5.163/24:CONNECTED>,
16.0.0.164:CONNECTED], log id: 2e8dbcd1
2017-03-10 10:39:33,226+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
(DefaultQuartzScheduler8) [fde51205-3e8b-4b84-a478-352dc444ccc4] START,
GlusterVolumesListVDSCommand(HostName = 2kvm1,
GlusterVolumesListVDSParameters:{runAsync='true',
hostId='86876b79-71d8-4ae1-883b-ba010cd270e7'}), log id: 1fb3f9e3
2017-03-10 10:39:34,375+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
(DefaultQuartzScheduler8) [fde51205-3e8b-4b84-a478-352dc444ccc4] FINISH,
GlusterVolumesListVDSCommand, return:
{8ded4083-2f31-489e-a60d-a315a5eb9b22=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@7765e4ad},
log id: 1fb3f9e3
2017-03-10 10:39:39,392+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler9) [12d6d15f-e054-4833-bd87-58f6a51e5fa6] START,
GlusterServersListVDSCommand(HostName = 2kvm1,
VdsIdVDSCommandParametersBase:{runAsync='true',
hostId='86876b79-71d8-4ae1-883b-ba010cd270e7'}), log id: 1e0b8eeb
2017-03-10 10:39:40,753+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler9) [12d6d15f-e054-4833-bd87-58f6a51e5fa6] FINISH,
GlusterServersListVDSCommand, return: [172.16.5.163/24:CONNECTED
<http://172.16.5.163/24:CONNECTED>,
16.0.0.164:CONNECTED], log id: 1e0b8eeb
2017-03-10 10:39:40,763+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
(DefaultQuartzScheduler9) [12d6d15f-e054-4833-bd87-58f6a51e5fa6] START,
GlusterVolumesListVDSCommand(HostName = 2kvm1,
GlusterVolumesListVDSParameters:{runAsync='true',
hostId='86876b79-71d8-4ae1-883b-ba010cd270e7'}), log id: 35b04b33
2017-03-10 10:39:41,952+01 INFO
[org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
(DefaultQuartzScheduler9) [12d6d15f-e054-4833-bd87-58f6a51e5fa6] FINISH,
GlusterVolumesListVDSCommand, return:
{8ded4083-2f31-489e-a60d-a315a5eb9b22=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@7765e4ad},
log id: 35b04b33
2017-03-10 10:39:41,991+01 INFO
[org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback]
(DefaultQuartzScheduler6) [67e1d8ed] Command 'RemoveSnapshot' (id:
'13c2cb7c-0809-4971-aceb-37ae66105ab7') waiting on child command id:
'a8a3a4d5-cf7d-4423-8243-022911232508' type:'RemoveSnapshotSingleDiskLive'
to complete
gluster ( nothing in logs )
======
## "etc-glusterfs-glusterd.vol.log"
[2017-03-10 10:13:52.599019] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
[2017-03-10 10:16:48.639635] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
The message "I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1" repeated 3 times between
[2017-03-10 10:16:48.639635] and [2017-03-10 10:17:55.659379]
[2017-03-10 10:18:56.875516] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
[2017-03-10 10:19:57.204689] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
[2017-03-10 10:21:56.576879] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
[2017-03-10 10:21:57.772857] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
[2017-03-10 10:24:00.617931] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
[2017-03-10 10:30:04.918080] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
[2017-03-10 10:31:06.128638] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
[2017-03-10 10:32:07.325672] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
[2017-03-10 10:32:12.433586] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
[2017-03-10 10:32:13.544909] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
[2017-03-10 10:35:10.039213] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
[2017-03-10 10:37:19.905314] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
[2017-03-10 10:37:20.174209] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
[2017-03-10 10:38:12.635460] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
[2017-03-10 10:40:14.169864] I [MSGID: 106499]
[glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume slow1
## "rhev-data-center-mnt-glusterSD-localhost:_slow1.log"
[2017-03-10 09:43:40.346785] W [MSGID: 101159] [inode.c:1214:__inode_unlink]
0-inode:
be318638-e8a0-4c6d-977d-7a937aa84806/b6f2d08d-2441-4111-ab62-e14abdfaf602.61849:
dentry not found in 43e6968f-9c2a-40d8-8074-caf1a36f60cf
[2017-03-10 09:43:40.347076] W [MSGID: 101159] [inode.c:1214:__inode_unlink]
0-inode:
be318638-e8a0-4c6d-977d-7a937aa84806/b6f2d08d-2441-4111-ab62-e14abdfaf602.61879:
dentry not found in 902a6e3d-b7aa-439f-8262-cdc1b7f9f022
[2017-03-10 09:43:40.347145] W [MSGID: 101159] [inode.c:1214:__inode_unlink]
0-inode:
be318638-e8a0-4c6d-977d-7a937aa84806/b6f2d08d-2441-4111-ab62-e14abdfaf602.61935:
dentry not found in 846bbcfc-f2b3-4ab6-af44-aeaa10b39318
[2017-03-10 09:43:40.347211] W [MSGID: 101159] [inode.c:1214:__inode_unlink]
0-inode:
be318638-e8a0-4c6d-977d-7a937aa84806/b6f2d08d-2441-4111-ab62-e14abdfaf602.61922:
dentry not found in 66ad3bc5-26c7-4360-b33b-a084e3305cf8
[2017-03-10 09:43:40.351571] W [MSGID: 101159] [inode.c:1214:__inode_unlink]
0-inode:
be318638-e8a0-4c6d-977d-7a937aa84806/b6f2d08d-2441-4111-ab62-e14abdfaf602.61834:
dentry not found in 3b8278e1-40e5-4363-b21e-7bffcd024c62
[2017-03-10 09:43:40.352449] W [MSGID: 101159] [inode.c:1214:__inode_unlink]
0-inode:
be318638-e8a0-4c6d-977d-7a937aa84806/b6f2d08d-2441-4111-ab62-e14abdfaf602.61870:
dentry not found in 282f4c05-e09a-48e0-96a3-52e079ff2f73
[2017-03-10 09:50:38.829325] I [MSGID: 109066]
[dht-rename.c:1569:dht_rename] 0-slow1-dht: renaming
/1603cd90-92ef-4c03-922c-cecb282fd00e/images/014ca3aa-d5f5-4b88-8f84-be8d4c5dfc1e/f147532a-89fa-49e0-8225-f82343fca8be.meta.new
(hash=slow1-replicate-0/cache=slow1-replicate-0) =>
/1603cd90-92ef-4c03-922c-cecb282fd00e/images/014ca3aa-d5f5-4b88-8f84-be8d4c5dfc1e/f147532a-89fa-49e0-8225-f82343fca8be.meta
(hash=slow1-replicate-0/cache=slow1-replicate-0)
[2017-03-10 09:50:42.221775] I [MSGID: 109066]
[dht-rename.c:1569:dht_rename] 0-slow1-dht: renaming
/1603cd90-92ef-4c03-922c-cecb282fd00e/images/4cf7dd90-9dcc-428c-82bc-fbf08dbee0be/12812d56-1606-4bf8-a391-0a2cacbd020b.meta.new
(hash=slow1-replicate-0/cache=slow1-replicate-0) =>
/1603cd90-92ef-4c03-922c-cecb282fd00e/images/4cf7dd90-9dcc-428c-82bc-fbf08dbee0be/12812d56-1606-4bf8-a391-0a2cacbd020b.meta
(hash=slow1-replicate-0/cache=slow1-replicate-0)
[2017-03-10 09:50:45.956432] I [MSGID: 109066]
[dht-rename.c:1569:dht_rename] 0-slow1-dht: renaming
/1603cd90-92ef-4c03-922c-cecb282fd00e/images/3cef54b4-45b9-4f5b-82c2-fcc8def06a37/85287865-38f0-45df-9e6c-1294913cbb88.meta.new
(hash=slow1-replicate-0/cache=slow1-replicate-0) =>
/1603cd90-92ef-4c03-922c-cecb282fd00e/images/3cef54b4-45b9-4f5b-82c2-fcc8def06a37/85287865-38f0-45df-9e6c-1294913cbb88.meta
(hash=slow1-replicate-0/cache=slow1-replicate-0)
[2017-03-10 09:50:40.349563] I [MSGID: 109066]
[dht-rename.c:1569:dht_rename] 0-slow1-dht: renaming
/1603cd90-92ef-4c03-922c-cecb282fd00e/images/014ca3aa-d5f5-4b88-8f84-be8d4c5dfc1e/f147532a-89fa-49e0-8225-f82343fca8be.meta.new
(hash=slow1-replicate-0/cache=slow1-replicate-0) =>
/1603cd90-92ef-4c03-922c-cecb282fd00e/images/014ca3aa-d5f5-4b88-8f84-be8d4c5dfc1e/f147532a-89fa-49e0-8225-f82343fca8be.meta
(hash=slow1-replicate-0/cache=slow1-replicate-0)
[2017-03-10 09:50:44.503866] I [MSGID: 109066]
[dht-rename.c:1569:dht_rename] 0-slow1-dht: renaming
/1603cd90-92ef-4c03-922c-cecb282fd00e/images/4cf7dd90-9dcc-428c-82bc-fbf08dbee0be/12812d56-1606-4bf8-a391-0a2cacbd020b.meta.new
(hash=slow1-replicate-0/cache=slow1-replicate-0) =>
/1603cd90-92ef-4c03-922c-cecb282fd00e/images/4cf7dd90-9dcc-428c-82bc-fbf08dbee0be/12812d56-1606-4bf8-a391-0a2cacbd020b.meta
(hash=slow1-replicate-0/cache=slow1-replicate-0)
[2017-03-10 09:59:46.860762] W [MSGID: 101159] [inode.c:1214:__inode_unlink]
0-inode:
be318638-e8a0-4c6d-977d-7a937aa84806/6e105aa3-a3fc-4aca-be50-78b7642c4072.6684:
dentry not found in d1e65eea-8758-4407-ac2e-3605dc661364
[2017-03-10 10:02:22.500865] W [MSGID: 101159] [inode.c:1214:__inode_unlink]
0-inode:
be318638-e8a0-4c6d-977d-7a937aa84806/6e105aa3-a3fc-4aca-be50-78b7642c4072.8767:
dentry not found in e228bb28-9602-4f8e-8323-7434d77849fc
[2017-03-10 10:04:03.103839] W [MSGID: 101159] [inode.c:1214:__inode_unlink]
0-inode:
be318638-e8a0-4c6d-977d-7a937aa84806/6e105aa3-a3fc-4aca-be50-78b7642c4072.9787:
dentry not found in 6be71632-aa36-4975-b673-1357e0355027
[2017-03-10 10:06:02.406385] I [MSGID: 109066]
[dht-rename.c:1569:dht_rename] 0-slow1-dht: renaming
/1603cd90-92ef-4c03-922c-cecb282fd00e/images/2a9c1c6a-f045-4dce-a47b-95a2267eef72/6f264695-0669-4b49-a2f6-e6c92482f2fb.meta.new
(hash=slow1-replicate-0/cache=slow1-replicate-0) =>
/1603cd90-92ef-4c03-922c-cecb282fd00e/images/2a9c1c6a-f045-4dce-a47b-95a2267eef72/6f264695-0669-4b49-a2f6-e6c92482f2fb.meta
(hash=slow1-replicate-0/cache=slow1-replicate-0)
... no other record
messages
========
several times occured:
Mar 10 09:04:38 2kvm2 lvmetad: WARNING: Ignoring unsupported value for cmd.
Mar 10 09:04:38 2kvm2 lvmetad: WARNING: Ignoring unsupported value for cmd.
Mar 10 09:04:38 2kvm2 lvmetad: WARNING: Ignoring unsupported value for cmd.
Mar 10 09:04:38 2kvm2 lvmetad: WARNING: Ignoring unsupported value for cmd.
Mar 10 09:10:01 2kvm2 systemd: Started Session 274 of user root.
Mar 10 09:10:01 2kvm2 systemd: Starting Session 274 of user root.
Mar 10 09:20:02 2kvm2 systemd: Started Session 275 of user root.
Mar 10 09:20:02 2kvm2 systemd: Starting Session 275 of user root.
Mar 10 09:22:59 2kvm2 sanlock[1673]: 2017-03-10 09:22:59+0100 136031 [2576]:
s3 delta_renew long write time 11 sec
Mar 10 10:38:01 2kvm2 wdmd[1732]: /dev/watchdog0 reopen
Mar 10 10:38:10 2kvm2 journal: Cannot start job (query, none) for domain
TEST-LBS_EBSAPP; current job is (query, none) owned by (3284
remoteDispatchConnectGetAllDomainStats, 0 <null>) for (62s, 0s)
Mar 10 10:38:10 2kvm2 journal: Timed out during operation: cannot acquire
state change lock (held by remoteDispatchConnectGetAllDomainStats)
Mar 10 10:38:11 2kvm2 journal: vdsm vds.dispatcher ERROR SSL error receiving
from <yajsonrpc.betterAsyncore.Dispatcher connected ('::1', 40590, 0, 0) at
0x3acdd88>: unexpected eof
Mar 10 10:38:40 2kvm2 journal: Cannot start job (query, none) for domain
TEST1-LBS_ATRYA; current job is (query, none) owned by (3288
remoteDispatchConnectGetAllDomainStats, 0 <null>) for (47s, 0s)
Mar 10 10:38:40 2kvm2 journal: Timed out during operation: cannot acquire
state change lock (held by remoteDispatchConnectGetAllDomainStats)
Mar 10 10:38:41 2kvm2 journal: vdsm vds.dispatcher ERROR SSL error receiving
from <yajsonrpc.betterAsyncore.Dispatcher connected ('::1', 40592, 0, 0) at
0x3fd5b90>: unexpected eof
Mar 10 10:39:10 2kvm2 journal: Cannot start job (query, none) for domain
TEST-LBS_EBSAPP; current job is (query, none) owned by (3284
remoteDispatchConnectGetAllDomainStats, 0 <null>) for (122s, 0s)
Mar 10 10:39:10 2kvm2 journal: Timed out during operation: cannot acquire
state change lock (held by remoteDispatchConnectGetAllDomainStats)
Mar 10 10:39:10 2kvm2 journal: Cannot start job (query, none) for domain
TEST1-LBS_ATRYA; current job is (query, none) owned by (3288
remoteDispatchConnectGetAllDomainStats, 0 <null>) for (77s, 0s)
Mar 10 10:39:10 2kvm2 journal: Timed out during operation: cannot acquire
state change lock (held by remoteDispatchConnectGetAllDomainStats)
Mar 10 10:39:11 2kvm2 journal: vdsm vds.dispatcher ERROR SSL error receiving
from <yajsonrpc.betterAsyncore.Dispatcher connected ('::1', 40594, 0, 0) at
0x2447290>: unexpected eof
Mar 10 10:39:23 2kvm2 sanlock[1673]: 2017-03-10 10:39:23+0100 140615 [2576]:
s3 delta_renew write time 140 error -202
Mar 10 10:39:23 2kvm2 sanlock[1673]: 2017-03-10 10:39:23+0100 140615 [2576]:
s3 renewal error -202 delta_length 144 last_success 140451
Mar 10 10:39:40 2kvm2 journal: Cannot start job (query, none) for domain
TEST-LBS_EBSAPP; current job is (query, none) owned by (3284
remoteDispatchConnectGetAllDomainStats, 0 <null>) for (152s, 0s)
Mar 10 10:39:40 2kvm2 journal: Timed out during operation: cannot acquire
state change lock (held by remoteDispatchConnectGetAllDomainStats)
Mar 10 10:39:40 2kvm2 journal: Cannot start job (query, none) for domain
TEST1-LBS_ATRYA; current job is (query, none) owned by (3288
remoteDispatchConnectGetAllDomainStats, 0 <null>) for (107s, 0s)
Mar 10 10:39:40 2kvm2 journal: Timed out during operation: cannot acquire
state change lock (held by remoteDispatchConnectGetAllDomainStats)
Mar 10 10:39:41 2kvm2 journal: vdsm vds.dispatcher ERROR SSL error receiving
from <yajsonrpc.betterAsyncore.Dispatcher connected ('::1', 40596, 0, 0) at
0x2472ef0>: unexpected eof
Mar 10 10:39:49 2kvm2 kernel: INFO: task qemu-img:42107 blocked for more
than 120 seconds.