No performance benefit from enabling csums-alg

2024-04-17 Thread Tim Westbrook
We are testing to see if enabling csums-alg would reduce resync times when we
have to clear metadata or invalidate a node.

I have a Primary (P) and two secondaries (S1, S2)

S2 is disabled (off) to simplify interactions. 

When I do any of the following, I see no appreciable difference in the time
from the Connecting state to Consistent between a system with `csums-alg md5`
and one without that setting [1]; the corresponding commands are sketched after this list:


* invalidate S1 on S1
* down drbd on S1, clear metadata on S1, start drbd on S1
* disconnect network, create a 1G file of zeros on P, reconnect network
* disconnect network, create a 1G file of random data on P, reconnect network
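
For reference, these scenarios roughly correspond to commands like the following
(a sketch only; the resource name "r0", mount point and network interface are
placeholders, not the actual names from our setup):

  # invalidate S1 (run on S1)
  drbdadm invalidate r0

  # recreate the metadata on S1
  drbdadm down r0
  drbdadm create-md --force r0
  drbdadm up r0

  # interrupt replication, write 1G on P, reconnect
  ip link set eth1 down                                      # break the replication link
  dd if=/dev/zero    of=/mnt/r0/zeros.bin  bs=1M count=1024  # 1G of zeros
  dd if=/dev/urandom of=/mnt/r0/random.bin bs=1M count=1024  # 1G of random data
  ip link set eth1 up
  # time from Connecting to Consistent watched via: drbdadm status r0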

This is on version 9.2.4 of the driver.

Just trying to understand this feature better.


Thanks,
Tim


[1]
  // with
  net {
    after-sb-1pri discard-secondary;
    verify-alg md5;
    csums-alg md5;
  }
  // without
  net {
    after-sb-1pri discard-secondary;
    verify-alg md5;
  }

  Note: verify works, so I assume md5 support is present in the kernel.
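
  As a quick cross-check (an aside, not part of the timing test above), the
  digest algorithms the kernel crypto API provides can be listed directly:

    grep -A2 'name *: md5$' /proc/crypto   # shows the md5 entries registered with the kernel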

Re: Unsynced blocks if replication is interrupted during initial sync

2024-04-17 Thread Tim Westbrook
Philipp


Thanks again, 

Is there a way to tell what issues were addressed between version 9.2.4 and the
current version?

We may not be able to delay our release until 9.2.9, and we would like to
understand, as much as possible, what else we may need to be concerned about.
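
One option, assuming the release tags in the LINBIT/drbd repository follow the
drbd-9.2.x naming and the ChangeLog file is maintained between releases, would
be to compare the git history between tags (substituting the latest 9.2.x tag):

  git clone https://github.com/LINBIT/drbd.git && cd drbd
  git log --oneline drbd-9.2.4..drbd-9.2.8       # commit-level overview
  git diff drbd-9.2.4..drbd-9.2.8 -- ChangeLog   # curated changelog entries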

Cheers,
Tim


From: Philipp Reisner 
Sent: Thursday, April 4, 2024 1:06 PM
To: Tim Westbrook 
Cc: drbd-user@lists.linbit.com 
Subject: Re: Unsynced blocks if replication is interrupted during initial sync
 

Hello Tim,

We were able to write a reproducer test case and fix this regression
with this commit:
https://github.com/LINBIT/drbd/commit/be9a404134acc3d167e8a7e60adce4f1910a4893

This commit will go into the drbd-9.1.20 and drbd-9.2.9 releases.

best regards,
 Philipp

On Fri, Mar 22, 2024 at 1:49 AM Tim Westbrook  wrote:
>
>
>
> Thank you
>
>
> So if "Copying bitmap of peer node_id=0" on reconnect after an interruption
> indicates the issue, then the issue still exists for me.
>
> I am able to dump the metadata, but I'm not sure it is very useful at this
> point...
>
> I have not tried invalidating it after a mount/unmount, nor have I tried 
> invalidating it after adding a node, but we were trying to avoid unmounting 
> once configured.
>
> Would you recommend against going back to a release version prior to this 
> change?
>
> Is there any other information I can provide that would help? Could I dump
> the metadata at some point to show the expected/unexpected state?
>
> Latest flow is below
>
> Thank you so much for your assistance,
> Tim
>
> 1. /dev/vg/persist mounted directly without drbd
> 2. Enable DRBD by creating a single node configuration file
> 3. Reboot
> 4. Create metadata on separate disk (--max-peers=5)
> 5. drbdadm up persist
> 6. drbdadm invalidate persist
> 7. drbdadm primary --force persist
> 8. drbdadm down persist
> 9. drbdadm up persist
> 10. drbdadm invalidate persist*
> 11. drbdadm primary --force persist
> 12. mount /dev/drbd0 to /persist
> 13. start using that mount point
> 14. some time later
> 15. Modify configuration to add new target backup node
> 16. Copy config to the remote node and reboot; it will restart as secondary
> 17. drbdadm adjust persist (on primary)
> 18. secondary comes up and initial sync starts
> 19. stop at 50% by disabling network interface
> 20. re-enable network interface
> 21. sync completes right away - node-id 0 message here
> 22. drbdadm verify persist - fails many blocks
>
>
>
>
> From: Joel Colledge 
> Sent: Wednesday, March 20, 2024 12:02 AM
> To: Tim Westbrook 
> Cc: drbd-user@lists.linbit.com 
> Subject: Re: Unsynced blocks if replication is interrupted during initial sync
>
>
> > We are still seeing the issue as described, but perhaps I am not putting
> > the invalidate in the right spot.
> >
> > Note - I've added it at step 6 below, but I'm wondering if it should be 
> > after
> > the additional node is configured and adjusted (in which case I would need 
> > to
> > unmount as apparently you can't invalidate a disk in use)
> >
> > So do I need to invalidate after every node is added?
>
> With my reproducer, the workaround at step 6 works.
>
> > Also note, the node-id in the logs from the kernel is 0 but the peers are
> > configured with 1 and 2; is this an issue, or are they separate IDs?
>
> I presume you are referring to the line:
> "Copying bitmap of peer node_id=0"
> The reason that node ID 0 appears here is that DRBD stores a bitmap of
> the blocks that have changed since it was first brought up. This is
> the "day0" bitmap. This is stored in all unused bitmap slots. All
> unused node IDs point to one of these bitmaps. In this case, node ID 0
> is unused. So this line means that it is using the day0 bitmap here.
> This is unexpected, as mentioned in my previous reply.
>
> Joel
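
(To illustrate the node-ID point above, a sketch of a two-node resource
configuration; host names, devices and addresses are placeholders, not taken
from this thread:)

  resource persist {
      device    /dev/drbd0;
      disk      /dev/vg/persist;
      meta-disk /dev/vg/persist_md;   # external metadata, as in the flow above
      on node-a { node-id 1; address 192.0.2.11:7789; }
      on node-b { node-id 2; address 192.0.2.12:7789; }
      connection-mesh { hosts node-a node-b; }
      # node-id 0 is never assigned to a host, so its bitmap slot stays
      # "unused" and carries the day0 bitmap that the log line refers to
  }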

linstor-gateway v1.5.0

2024-04-17 Thread Christoph Böhmwalder
Hello,

a new version of LINSTOR Gateway has been released, v1.5.0.

https://github.com/LINBIT/linstor-gateway/releases/tag/v1.5.0

There have been no further changes since the RC announcement last week.
For the full changelog, please see the release notes on GitHub.

Thanks,
Christoph

-- 
Christoph Böhmwalder
LINBIT | Keeping the Digital World Running
DRBD HA —  Disaster Recovery — Software defined Storage


linstor-proxmox v8.0.0

2024-04-17 Thread Roland Kammerer
Dear PVE on LINSTOR/DRBD users,

after a calmer period I'm happy to announce the final version 8.0.0,
which includes two new features. But let me first note that this
version needs LINSTOR 1.27.0 or later!

We did not find any bugs in the RC. For your convenience, here is the content
of the last RC announcement mail:

- reassigning disks from one VM to another
- online migration of external storage like LVM to LINSTOR/DRBD

Reassigning disks:
In order to allow reassigning disks between VMs we had to change the
naming scheme. New disks (also clones of old ones) will have names like
"pm-12cf742a_101" on PVE level, and "pm-12cf742a" on LINSTOR/DRBD level.
This is a static prefix ("pm-"), 8 characters of a UUID, and on PVE
level the VMID separated by an underscore ("_101"). The new naming
obsoletes the old "vm-101-disk-1" like names. Old VMs with legacy names
still work with version 8 of the plugin. For the user this change should
be completely transparent and should not require any changes besides
getting accustomed to the new names.
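
As an example (the exact CLI syntax depends on the PVE version; qm move-disk
gained --target-vmid around PVE 7.1, so treat this as a sketch), reassigning an
unused disk from VM 101 to VM 102 on the PVE side could look like:

  qm move-disk 101 scsi1 --target-vmid 102 --target-disk scsi1
  # the LINSTOR/DRBD resource keeps its VMID-independent name (e.g. "pm-12cf742a"),
  # which is what makes the reassignment possible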

Online migration from external storage:
So far it was only possible to migrate data from external storage like
LVM to LINSTOR/DRBD if the data was migrated offline. If you want to
migrate data online, you can now *temporarily* set "exactsize yes" in
your "storage.cfg" for a particular DRBD storage and then migrate disks
to it. After you are done, remove the "exactsize" option from the
"storage.cfg" again. The LINSTOR property that allowed temporary online
migration is then deleted when the disk is activated again (but not if
it is currently active). If you want to delete the property for all
active disks after migration, or you want to be extra sure, you can run
a command like this:

# linstor -m --output-version v1 rd l | jq '.[][].name' | \
  xargs -I {} linstor rd sp {} DrbdOptions/ExactSize False
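
For reference, the temporary "exactsize" setting goes into the storage entry in
"storage.cfg". A sketch, where "exactsize yes" is the line to add and remove
again after the migration (storage name, controller address and resource group
are examples, not from this release):

  drbd: drbdstorage
      content images,rootdir
      controller 10.11.12.13
      resourcegroup defaultpool
      exactsize yes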

Regards, rck

GIT: 
https://github.com/LINBIT/linstor-proxmox/commit/2da369dba902b7f540c6c36a28a29c07c3bc1030
TGZ: https://pkg.linbit.com//downloads/connectors/linstor-proxmox-8.0.0.tar.gz

Changelog:

[Roland Kammerer]
 * show diskless disks in "VM Disks"
 * allow reassigning disks
 * test: add simple test script
 * api: also check for APIAGE
 * add online mv storage
 * fix spelling
 * set allow-two-primaries in RD creation

