I now use: 
# linstor controller version 
linstor controller 1.12.0; GIT-hash: 8e15f3ceaa73a9217ddd644221bb8952403f7d84 

on 3 nodes. But my problem started before the upgrade from 1.11.1-1 - see att1.txt.

Now I see that while the snapshot was being taken, the linstor-satellite on another
node was restarted - see att2.txt.

I was able to remove all snapshots (with: linstor s d), so the list of snapshots
returned by linstor s l is now empty. Those snapshots were taken by the vzdump
utility, so they are created and removed without creating any new resource.
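
For reference, each backup run drives roughly this cycle (the exact calls the
Proxmox plugin makes are my assumption; resource and snapshot names are taken
from the attached logs):

# linstor snapshot create vm-107-disk-1 snap_vm-107-disk-1_vzdump
  ... vzdump copies the data ...
# linstor snapshot delete vm-107-disk-1 snap_vm-107-disk-1_vzdump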

I would be grateful for your help in fixing our db (a separate mail with the
database file is on its way).

BR, 
Michal Szamocki 
Cirrus 

> From: "Gábor Hernádi" <gabor.hern...@linbit.com>
> To: "drbd-user" <drbd-user@lists.linbit.com>
> Sent: Wednesday, April 28, 2021 13:45:00
> Subject: Re: [DRBD-user] Linstor hangs on deleting snapshots

> Hello,

> Can you please give us more details? For example, the version of the linstor
> controller:
> linstor controller version

> Also, what exactly happened before you tried to delete - and delete what, exactly?
> You are talking about snapshots, but you are showing us a list of resources and
> resource-definitions, not snapshots and snapshot-definitions.
> Please describe what happened since you created the origin resource. Did you
> create a snapshot? And afterwards restored it into a new *resource* called
> "snap_vm-107-disk-1_vzdump"?
> Did something else happen that seems unrelated (a restart of the controller,
> other resources failing, other resources being deleted, or such)?

> My goal here would be to reproduce this issue; afterwards I am quite sure we can
> figure out what happens and come up with a proper fix.

> For the database, feel free to send me a direct email (not via the mailing list)
> with the database file so I can fix it for you.
> Or if you want to try it yourself - make a BACKUP of the database first, just in
> case. Afterwards your goal should be to let Linstor delete the resource, as
> there are quite a few tables that need to be cleaned up. As the exception
> states that an entry in LDV (LAYER_DRBD_VOLUME) still has a foreign key to LRI
> (LAYER_RESOURCE_IDS), I would look for entries in LDV and see if there are orphaned
> or duplicate ones (same "target resource" but multiple IDs per KIND). After double
> checking that, you can try to delete the orphaned entries and see if Linstor
> manages to cleanly remove the rest of the resource.
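
> Something along these lines should show such entries (an untested sketch; the
> LDV table and column names are my assumption, the LRI column names are taken
> from the INSERT statement in your exception):
>
> -- orphaned LDV rows whose foreign key target is missing in LRI
> -- (verify the table name against your schema; it may be LAYER_DRBD_VOLUMES)
> SELECT ldv.LAYER_RESOURCE_ID
>   FROM LINSTOR.LAYER_DRBD_VOLUME ldv
>   LEFT JOIN LINSTOR.LAYER_RESOURCE_IDS lri
>     ON lri.LAYER_RESOURCE_ID = ldv.LAYER_RESOURCE_ID
>  WHERE lri.LAYER_RESOURCE_ID IS NULL;
>
> -- duplicates: same target resource but multiple IDs per KIND
> SELECT NODE_NAME, RESOURCE_NAME, SNAPSHOT_NAME, LAYER_RESOURCE_KIND, COUNT(*)
>   FROM LINSTOR.LAYER_RESOURCE_IDS
>  GROUP BY NODE_NAME, RESOURCE_NAME, SNAPSHOT_NAME, LAYER_RESOURCE_KIND
> HAVING COUNT(*) > 1;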

> On Wed, Apr 28, 2021 at 12:01 PM Michał Szamocki <mszamo...@cirrus.pl> wrote:

>> Hello,

>> my linstor cluster failed to delete snapshots and now I have:
>> # linstor rd l | grep DELETING
>> | snap_vm-107-disk-1_vzdump | 7021 | DfltRscGrp | DELETING |
>> | snap_vm-108-disk-1_vzdump | 7019 | DfltRscGrp | DELETING |

>> # linstor r l | grep DELETING
>> | snap_vm-107-disk-1_vzdump | debra | 7021 | | Ok | DELETING | 2021-04-28 07:09:35 | |
>> | snap_vm-107-disk-1_vzdump | elsa  | 7021 | | Ok | DELETING | 2021-04-28 07:09:36 | |
>> | snap_vm-108-disk-1_vzdump | debra | 7019 | | Ok | DELETING | 2021-04-28 07:07:23 | |
>> | snap_vm-108-disk-1_vzdump | elsa  | 7019 | | Ok | DELETING | 2021-04-28 07:07:23 | |

>> Any operation fails with an error similar to this:
>> Caused by:
>> ==========

>> Category: Exception
>> Class name: JdbcSQLException
>> Class canonical name: org.h2.jdbc.JdbcSQLException
>> Generated at: Method 'getJdbcSQLException', Source file 'DbException.java', Line #357

>> Error message: Naruszenie ograniczenia Klucza Głównego lub Indeksu Unikalnego:
>> "FK_LDV_LRI_INDEX_C ON LINSTOR.LAYER_RESOURCE_IDS(LAYER_RESOURCE_ID) VALUES (21, 116935)"
>> Unique index or primary key violation: "FK_LDV_LRI_INDEX_C ON
>> LINSTOR.LAYER_RESOURCE_IDS(LAYER_RESOURCE_ID) VALUES (21, 116935)"; SQL statement:
>> INSERT INTO LAYER_RESOURCE_IDS ( LAYER_RESOURCE_ID, NODE_NAME, RESOURCE_NAME,
>> SNAPSHOT_NAME, LAYER_RESOURCE_PARENT_ID, LAYER_RESOURCE_KIND,
>> LAYER_RESOURCE_SUFFIX, LAYER_RESOURCE_SUSPENDED ) VALUES ( ?, ?, ?, ?, ?, ?, ?, ? ) [23505-197]

>> lvs and drbdadm status don't show any information about those snapshots.
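>> For example (volume group and resource names taken from the error report
>> above), neither of these shows anything vzdump-related any more:
>> # lvs drbdpool
>> # drbdadm status vm-107-disk-1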

>> How can I safely remove information about those snapshots from the
>> linstor-controller database?

>> BR,
>> Michal Szamocki
>> Cirrus

> --
> Best regards,
> Gabor Hernadi

Apr 28 04:00:39 debra Satellite[1157]: 04:00:39.071 [MainWorkerPool-1] INFO  
LINSTOR/Satellite - SYSTEM - Snapshot 'snap_vm-107-disk-1_vzdump' of resource 
'vm-107-disk-1' registered.
Apr 28 04:00:40 debra Satellite[1157]: 04:00:40.208 [MainWorkerPool-2] INFO  
LINSTOR/Satellite - SYSTEM - Resource 'vm-107-disk-1' updated for node 'cindy'.
Apr 28 04:00:40 debra Satellite[1157]: 04:00:40.208 [MainWorkerPool-2] INFO  
LINSTOR/Satellite - SYSTEM - Resource 'vm-107-disk-1' updated for node 'debra'.
Apr 28 04:00:40 debra Satellite[1157]: 04:00:40.208 [MainWorkerPool-2] INFO  
LINSTOR/Satellite - SYSTEM - Resource 'vm-107-disk-1' updated for node 'elsa'.
Apr 28 04:00:41 debra Satellite[1157]: 04:00:41.436 [MainWorkerPool-5] INFO  
LINSTOR/Satellite - SYSTEM - Snapshot 'snap_vm-107-disk-1_vzdump' of resource 
'vm-107-disk-1' registered.
Apr 28 04:00:42 debra systemd[1]: Stopping LINSTOR Satellite Service...
Apr 28 04:00:42 debra Satellite[1157]: 04:00:42.138 [Thread-2] INFO  
LINSTOR/Satellite - SYSTEM - Shutdown in progress
Apr 28 04:00:42 debra Satellite[1157]: 04:00:42.181 [DrbdEventService] WARN  
LINSTOR/Satellite - SYSTEM - DRBD 'events2' stream ended unexpectedly
Apr 28 04:00:42 debra Satellite[1157]: 04:00:42.522 [DeviceManager] ERROR 
LINSTOR/Satellite - SYSTEM - Failed to create snapshot 
drbdpool/vm-107-disk-1_00000_snap_vm-107-disk-1_vzdump from vm-107-disk-1_00000 
within thin volume group drbdpool/drbdthinpool [Report number 
6067DA64-7097C-000000]
Apr 28 04:00:43 debra Satellite[1157]: 04:00:43.059 [Thread-2] INFO  
LINSTOR/Satellite - SYSTEM - Shutting down service instance 'DeviceManager' of 
type DeviceManager
Apr 28 04:00:43 debra Satellite[1157]: 04:00:43.059 [Thread-2] INFO  
LINSTOR/Satellite - SYSTEM - Waiting for service instance 'DeviceManager' to 
complete shutdown
Apr 28 04:00:43 debra Satellite[1157]: 04:00:43.060 [Thread-2] INFO  
LINSTOR/Satellite - SYSTEM - Shutting down service instance 
'SnapshotShippingService' of type SnapshotShippingService
Apr 28 04:00:43 debra Satellite[1157]: 04:00:43.060 [Thread-2] INFO  
LINSTOR/Satellite - SYSTEM - Waiting for service instance 
'SnapshotShippingService' to complete shutdown
Apr 28 04:00:43 debra Satellite[1157]: 04:00:43.061 [Thread-2] INFO  
LINSTOR/Satellite - SYSTEM - Shutting down service instance 
'DrbdEventPublisher-1' of type DrbdEventPublisher
Apr 28 04:00:43 debra Satellite[1157]: 04:00:43.061 [Thread-2] INFO  
LINSTOR/Satellite - SYSTEM - Waiting for service instance 
'DrbdEventPublisher-1' to complete shutdown
Apr 28 04:00:43 debra Satellite[1157]: 04:00:43.061 [Thread-2] INFO  
LINSTOR/Satellite - SYSTEM - Shutting down service instance 
'DrbdEventService-1' of type DrbdEventService
Apr 28 04:00:43 debra Satellite[1157]: 04:00:43.061 [Thread-2] INFO  
LINSTOR/Satellite - SYSTEM - Waiting for service instance 'DrbdEventService-1' 
to complete shutdown
Apr 28 04:00:43 debra Satellite[1157]: 04:00:43.061 [Thread-2] INFO  
LINSTOR/Satellite - SYSTEM - Shutting down service instance 'FileEventService' 
of type FileEventService
Apr 28 04:00:43 debra Satellite[1157]: 04:00:43.062 [Thread-2] INFO  
LINSTOR/Satellite - SYSTEM - Waiting for service instance 'FileEventService' to 
complete shutdown
Apr 28 04:00:43 debra Satellite[1157]: 04:00:43.062 [Thread-2] INFO  
LINSTOR/Satellite - SYSTEM - Shutting down service instance 'TimerEventService' 
of type TimerEventService
Apr 28 04:00:43 debra Satellite[1157]: 04:00:43.062 [Thread-2] INFO  
LINSTOR/Satellite - SYSTEM - Waiting for service instance 'TimerEventService' 
to complete shutdown
Apr 28 04:00:43 debra Satellite[1157]: 04:00:43.066 [Thread-2] INFO  
LINSTOR/Satellite - SYSTEM - Shutdown complete
Apr 28 04:00:43 debra systemd[1]: linstor-satellite.service: Succeeded.
Apr 28 04:00:43 debra systemd[1]: Stopped LINSTOR Satellite Service.
Apr 28 04:00:43 elsa vzdump[6070]: ERROR: Backup of VM 107 failed - API 
Return-Code: 500. Message: Could not create cluster wide snapshot 
snap_vm-107-disk-1_vzdump of vm-107-disk-1, 
because:#012[{"ret_code":17563649,"message":"New snapshot 
'snap_vm-107-disk-1_vzdump' of
 resource 'vm-107-disk-1' registered.","details":"Snapshot 
'snap_vm-107-disk-1_vzdump' of resource 'vm-107-disk-1' UUID is: 
4e25798a-4f64-4c3e-a888-c81585c45474","obj_refs":{"Snapshot":"snap_vm-107-disk-1_vzdump","UUID":"4e25798a-4f64-4c3e-a888-c81585c45474","RscDfn":"v
m-107-disk-1"}},{"ret_code":17563651,"message":"Suspended IO of 'vm-107-disk-1' 
on 'cindy' for 
snapshot","obj_refs":{"RscDfn":"vm-107-disk-1","Snapshot":"snap_vm-107-disk-1_vzdump"}},{"ret_code":17563651,"message":"Suspended
 IO of 'vm-107-disk-1' on 'elsa' for snapshot"
,"obj_refs":{"RscDfn":"vm-107-disk-1","Snapshot":"snap_vm-107-disk-1_vzdump"}},{"ret_code":17563651,"message":"Suspended
 IO of 'vm-107-disk-1' on 'elsa' for 
snapshot","obj_refs":{"RscDfn":"vm-107-disk-1","Snapshot":"snap_vm-107-disk-1_vzdump"}},{"ret_code":17563651,"mes
sage":"Suspended IO of 'vm-107-disk-1' on 'debra' for 
snapshot","obj_refs":{"RscDfn":"vm-107-disk-1","Snapshot":"snap_vm-107-disk-1_vzdump"}},{"ret_code":17563651,"message":"Suspended
 IO of 'vm-107-disk-1' on 'debra' for 
snapshot","obj_refs":{"RscDfn":"vm-107-disk-1","S
napshot":"snap_vm-107-disk-1_vzdump"}},{"ret_code":17563651,"message":"Took 
snapshot of 'vm-107-disk-1' on 
'elsa'","obj_refs":{"RscDfn":"vm-107-disk-1","Snapshot":"snap_vm-107-disk-1_vzdump"}},{"ret_code":-4611686018409823258,"message":"(Node:
 'debra') Failed to create 
snapshot drbdpool/vm-107-disk-1_00000_snap_vm-107-disk-1_vzdump from 
vm-107-disk-1_00000 within thin volume group 
drbdpool/drbdthinpool","details":"Command 'lvcreate --config devices { 
filter=['a|/dev/nvme0n1p4|','r|.*|'] } --snapshot --setactivationskip y 
--ignoreactiv
ationskip --activate y --name 
drbdpool/vm-107-disk-1_00000_snap_vm-107-disk-1_vzdump 
drbdpool/vm-107-disk-1_00000' returned with exitcode 5. \n\nStandard out: 
\n\n\nError message: \n  Logical Volume 
\"vm-107-disk-1_00000_snap_vm-107-disk-1_vzdump\" already exists in vol
ume group 
\"drbdpool\"\n\n","error_report_ids":["6067DA64-7097C-000000"],"obj_refs":{"RscDfn":"vm-107-disk-1","Snapshot":"snap_vm-107-disk-1_vzdump"}},{"ret_code":-9223372036837211158,"message":"Unable
 to abort snapshot process on disconnected satellite 'debra'","detail
s":"IO may be suspended until the connection to the satellite is 
re-established","obj_refs":{"RscDfn":"vm-107-disk-1","Snapshot":"snap_vm-107-disk-1_vzdump"}},{"ret_code":17563651,"message":"Aborted
 snapshot of 'vm-107-disk-1' on 'elsa'","obj_refs":{"RscDfn":"vm-107-dis
k-1","Snapshot":"snap_vm-107-disk-1_vzdump"}}]#012 at 
/usr/share/perl5/PVE/Storage/Custom/LINSTORPlugin.pm line 
428.#012#011PVE::Storage::Custom::LINSTORPlugin::volume_snapshot("PVE::Storage::Custom::LINSTORPlugin",
 HASH(0x560dd2334d80), "drbd1", "vm-107-disk-1", "vzdum
p") called at /usr/share/perl5/PVE/Storage.pm line 
293#012#011PVE::Storage::volume_snapshot(HASH(0x560dd1efabe0), 
"drbd1:vm-107-disk-1", "vzdump") called at /usr/share/perl5/PVE/LXC/Config.pm 
line 181#012#011PVE::LXC::Config::__snapshot_create_vol_snapshot("PVE::LXC::Co
nfig", 107, "rootfs", HASH(0x560dd22dd3e0), "vzdump") called at 
/usr/share/perl5/PVE/AbstractConfig.pm line 
803#012#011PVE::AbstractConfig::__ANON__("rootfs", HASH(0x560dd22dd3e0)) called 
at /usr/share/perl5/PVE/AbstractConfig.pm line 
475#012#011PVE::AbstractConfig::for
each_volume_full("PVE::LXC::Config", HASH(0x560dd2307218), undef, 
CODE(0x560dd2303910)) called at /usr/share/perl5/PVE/AbstractConfig.pm line 
484#012#011PVE::AbstractConfig::foreach_volume("PVE::LXC::Config", 
HASH(0x560dd2307218), CODE(0x560dd2303910)) called at /usr/sh
are/perl5/PVE/AbstractConfig.pm line 805#012#011eval {...} called at 
/usr/share/perl5/PVE/AbstractConfig.pm line 
793#012#011PVE::AbstractConfig::snapshot_create("PVE::LXC::Config", 107, 
"vzdump", 0, "vzdump backup snapshot") called at 
/usr/share/perl5/PVE/VZDump/LXC.pm 
line 225#012#011PVE::VZDump::LXC::__ANON__() called at 
/usr/share/perl5/PVE/AbstractConfig.pm line 
299#012#011PVE::AbstractConfig::__ANON__() called at 
/usr/share/perl5/PVE/Tools.pm line 220#012#011eval {...} called at 
/usr/share/perl5/PVE/Tools.pm line 220#012#011PVE::
Tools::lock_file_full("/run/lock/lxc/pve-config-107.lock", 10, 0, 
CODE(0x560dd22fd398)) called at /usr/share/perl5/PVE/AbstractConfig.pm line 
302#012#011PVE::AbstractConfig::__ANON__("PVE::LXC::Config", 107, 10, 0, 
CODE(0x560dd1eda968)) called at /usr/share/perl5/PVE/Ab
stractConfig.pm line 
322#012#011PVE::AbstractConfig::lock_config_full("PVE::LXC::Config", 107, 10, 
CODE(0x560dd1eda968)) called at /usr/share/perl5/PVE/AbstractConfig.pm line 
330#012#011PVE::AbstractConfig::lock_config("PVE::LXC::Config", 107, 
CODE(0x560dd1eda968)) call
ed at /usr/share/perl5/PVE/VZDump/LXC.pm line 
227#012#011PVE::VZDump::LXC::snapshot(PVE::VZDump::LXC=HASH(0x560dd1eb8a68), 
HASH(0x560dd1ed8230), 107) called at /usr/share/perl5/PVE/VZDump.pm line 
952#012#011eval {...} called at /usr/share/perl5/PVE/VZDump.pm line 738#01
2#011PVE::VZDump::exec_backup_task(PVE::VZDump=HASH(0x560dce9f0280), 
HASH(0x560dd1ed8230)) called at /usr/share/perl5/PVE/VZDump.pm line 
1160#012#011eval {...} called at /usr/share/perl5/PVE/VZDump.pm line 
1155#012#011PVE::VZDump::exec_backup(PVE::VZDump=HASH(0x560dce9f
0280), PVE::RPCEnvironment=HASH(0x560dce9f08c8), "root\@pam") called at 
/usr/share/perl5/PVE/API2/VZDump.pm line 
124#012#011PVE::API2::VZDump::__ANON__("UPID:elsa:000017B6:0CD92690:6088C1A2:vzdump::root\@pam:")
 called at /usr/share/perl5/PVE/RESTEnvironment.pm line 610#
012#011eval {...} called at /usr/share/perl5/PVE/RESTEnvironment.pm line 
601#012#011PVE::RESTEnvironment::fork_worker(PVE::RPCEnvironment=HASH(0x560dce9f08c8),
 "vzdump", undef, "root\@pam", CODE(0x560dd1eb2ad8)) called at 
/usr/share/perl5/PVE/API2/VZDump.pm line 148#012
#011PVE::API2::VZDump::__ANON__(HASH(0x560dcc6af1e8)) called at 
/usr/share/perl5/PVE/RESTHandler.pm line 
453#012#011PVE::RESTHandler::handle("PVE::API2::VZDump", HASH(0x560dd1d57830), 
HASH(0x560dcc6af1e8)) called at /usr/share/perl5/PVE/RESTHandler.pm line 
865#012#011ev
al {...} called at /usr/share/perl5/PVE/RESTHandler.pm line 
848#012#011PVE::RESTHandler::cli_handler("PVE::API2::VZDump", "vzdump", 
"vzdump", ARRAY(0x560dcc6a31d8), "vmid", undef, undef, undef) called at 
/usr/share/perl5/PVE/CLIHandler.pm line 630#012#011PVE::CLIHandler
::__ANON__(ARRAY(0x560dcc6a31d8), undef, undef) called at 
/usr/share/perl5/PVE/CLIHandler.pm line 
666#012#011PVE::CLIHandler::run_cli_handler("PVE::CLI::vzdump") called at 
/usr/bin/vzdump line 8
_______________________________________________
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
https://lists.linbit.com/mailman/listinfo/drbd-user
