Re: [DRBD-user] Update DRBD in product

2023-03-23 Thread leroy.matthieu50
The target is to have both servers on Ubuntu 20.04.

Before, they were on Ubuntu 18.04, but I want to move the data to the
updated server before updating the second one.

Thank you for your help.

> On 23 March 2023 at 15:45, Roland Kammerer wrote:
>
> On Wed, Mar 22, 2023 at 11:30:09AM +0100, matthieu le roy wrote:
> > OS : Ubuntu 18.04.2 LTS
> > DRBD_KERNEL_VERSION=9.0.18
>
> vs.
>
> > OS : Ubuntu 20.04.6 LTS
> > DRBD_KERNEL_VERSION=9.2.2
>
> sorry, but I did not look further, 9.0.18 is completely outdated. Get
> some matching, *current* versions and then try again please. I assume
> these DRBD packages are from the PPA? Then sorry again, we only maintain
> the last 2 LTS releases in the PPA, as stated in the info for the PPA.
> We have current packages for 18.04.2, but that is for our customers
> only.
>
> I see the following options:  
> - upgrade the old server to a newer Ubuntu version which will then  
>  allow you to install a current DRBD from PPA.  
> - take the drbd-dkms from the PPA, but from a newer Ubuntu version. That  
>  should work, but you are obviously on your own.  
> - compile a newer DRBD version from a release tarball  
> - become a customer.  
>
> Regards, rck  
 ___
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
https://lists.linbit.com/mailman/listinfo/drbd-user


[DRBD-user] linstor-proxmox v7.0.0-rc.1

2023-03-23 Thread Roland Kammerer
Dear DRBD on PVE users,

This is RC1 of the upcoming 7.0.0 release of linstor-proxmox. First
things first: This *requires* LINSTOR 1.21.1 (or newer).

So far we tried, more or less (size reporting was always a mess), to show
a node-local view of storage. For example, we only showed pools that
actually exist on a node. This was okay, but had two problems:

- showing only the local storage in a distributed cluster does not make
  much sense. The actual data lands on whichever nodes LINSTOR decides.
- some people have the storage nodes pretty much separated from the PVE
  nodes. For them we did not show any storage information on such nodes
  (as they did not have the RG's SP deployed), which is rather confusing.

Now we have shifted the calculation of free/used storage to LINSTOR, and
that is what we show on all nodes.

tl;dr: we shifted from a node-local view to a cluster view.

For PVE nodes we have the following assumptions, which we already had:
- linstor-satellite installed (and all that comes with it: drbd-utils,
  drbd-dkms, ...)
- linstor-proxmox plugin installed

If nobody complains, this will become 7.0.0 in about a week from now, so
please test.

If you use PVE7, you can take this package for testing (LINSTOR should
already be in the stable repos):
https://packages.linbit.com/public/staging/dists/proxmox-7/drbd-9/pool/linstor-proxmox_7.0.0~rc.1-1_all.deb
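
A minimal way to try it on a PVE7 node could be the following sketch
(assuming wget is available; a recent apt can install a local .deb and
resolve its dependencies):

 wget https://packages.linbit.com/public/staging/dists/proxmox-7/drbd-9/pool/linstor-proxmox_7.0.0~rc.1-1_all.deb
 apt install ./linstor-proxmox_7.0.0~rc.1-1_all.deb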

Regards, rck

GIT: 
https://github.com/LINBIT/linstor-proxmox/commit/8589c4f41da4d78d4f8c1eb3051e83c296cc10ed
TGZ: 
https://pkg.linbit.com//downloads/connectors/linstor-proxmox-7.0.0-rc.1.tar.gz


___
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
https://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] Update DRBD in product

2023-03-23 Thread Roland Kammerer
On Wed, Mar 22, 2023 at 11:30:09AM +0100, matthieu le roy wrote:
> OS : Ubuntu 18.04.2 LTS
> DRBD_KERNEL_VERSION=9.0.18

vs.

> OS : Ubuntu 20.04.6 LTS
> DRBD_KERNEL_VERSION=9.2.2

sorry, but I did not look further, 9.0.18 is completely outdated. Get
some matching, *current* versions and then try again please. I assume
these DRBD packages are from the PPA? Then sorry again, we only maintain
the last 2 LTS releases in the PPA, as stated in the info for the PPA.
We have current packages for 18.04.2, but that is for our customers
only.

I see the following options:
- upgrade the old server to a newer Ubuntu version which will then
  allow you to install a current DRBD from PPA.
- take the drbd-dkms from the PPA, but from a newer Ubuntu version. That
  should work, but you are obviously on your own.
- compile a newer DRBD version from a release tarball
- become a customer.
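
For the tarball route, a rough sketch of the usual steps (assuming
build-essential and the linux-headers package for your running kernel are
installed; "drbd-9.x.y" is only a placeholder for whatever current
release you download):

 tar xf drbd-9.x.y.tar.gz
 cd drbd-9.x.y
 make
 make install
 modprobe drbd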

Regards, rck
___
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
https://lists.linbit.com/mailman/listinfo/drbd-user


[DRBD-user] Update DRBD in product

2023-03-23 Thread matthieu le roy
Hello,

I have two servers running in high availability.
Here is the info for the first server:

OS : Ubuntu 18.04.2 LTS
#drbdadm --version
DRBDADM_BUILDTAG=GIT-hash:\ 38a99411a8fcb883214a5300ad0ce1ef7ca37730\
build\ by\ buildd@lgw01-amd64-016\,\ 2019-05-27\ 12:45:18
DRBDADM_API_VERSION=2
DRBD_KERNEL_VERSION_CODE=0x090012
DRBD_KERNEL_VERSION=9.0.18
DRBDADM_VERSION_CODE=0x090900
DRBDADM_VERSION=9.9.0

Here is the info for the second server after the update:

OS : Ubuntu 20.04.6 LTS
# drbdadm --version
DRBDADM_BUILDTAG=GIT-hash:\ e267c4413f7cb3d8ec5e793c3fa7f518e95f23b1\
build\ by\ buildd@lcy02-amd64-101\,\ 2023-03-14\ 09:57:26
DRBDADM_API_VERSION=2
DRBD_KERNEL_VERSION_CODE=0x090202
DRBD_KERNEL_VERSION=9.2.2
DRBDADM_VERSION_CODE=0x091701
DRBDADM_VERSION=9.23.1

DRBD config:

#cat /etc/drbd.d/alfresco.conf
resource alfresco {
  handlers {
    #before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh";
    #after-resync-target "/usr/lib/drbd/unsnapshot-resync-target-lvm.sh";
  }
  on storage1 {
    device    /dev/drbd5;
    disk      /dev/datavg/alfresco;
    node-id   10;
    address   10.50.20.1:7004;
    meta-disk internal;
  }
  on storage2 {
    device    /dev/drbd5;
    disk      /dev/datavg/appli;
    node-id   11;
    address   10.50.20.2:7004;
    meta-disk internal;
  }
}

# cat /etc/drbd.d/global_common.conf
global {
  usage-count yes;
  udev-always-use-vnr;
}

common {
  handlers {
    split-brain "/usr/lib/drbd/notify-split-brain.sh root";
  }
  net {
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
    data-integrity-alg crc32c;
    timeout 90;
    ping-timeout 20;
    ping-int 15;
    connect-int 10;
  }
}


Commands run after the update:


#drbdadm create-md appli
#drbdadm up appli

The sync started and I was able to follow its progress, but now that it
has reached 100%, here is the status of the servers:

storage1 :
# drbdadm status alfresco
alfresco role:Primary
  disk:UpToDate
  storage2 role:Secondary
    replication:SyncSource peer-disk:Inconsistent

# drbdsetup status --verbose --statistics alfresco
alfresco node-id:10 role:Primary suspended:no
write-ordering:flush
  volume:0 minor:5 disk:UpToDate quorum:yes
  size:536854492 read:423078021 written:419423956 al-writes:9640
bm-writes:0 upper-pending:0 lower-pending:0
  al-suspended:no blocked:no
  storage2 node-id:11 connection:Connected role:Secondary congested:no
ap-in-flight:0 rs-in-flight:0
volume:0 replication:SyncSource peer-disk:Inconsistent
resync-suspended:no
received:0 sent:421584224 out-of-sync:0 pending:0 unacked:0

storage2 :

# drbdadm status alfresco
alfresco role:Secondary
  disk:Inconsistent
  storage1 role:Primary
    replication:SyncTarget peer-disk:UpToDate

# drbdsetup status --verbose --statistics alfresco
alfresco node-id:11 role:Secondary suspended:no force-io-failures:no
write-ordering:flush
  volume:0 minor:5 disk:Inconsistent backing_dev:/dev/datavg/alfresco
quorum:yes
  size:536854492 read:0 written:421584224 al-writes:14 bm-writes:6112
upper-pending:0 lower-pending:0
  al-suspended:no blocked:no
  storage1 node-id:10 connection:Connected role:Primary congested:no
ap-in-flight:0 rs-in-flight:0
volume:0 replication:SyncTarget peer-disk:UpToDate resync-suspended:no
received:421584224 sent:0 out-of-sync:0 pending:0 unacked:0


And while I have not had any DRBD-related logs on storage1 since the
start of the sync, on storage2 I get these logs in a loop:

Mar 22 10:22:31 storage2 kernel: [ 4713.898381] INFO: task drbd_s_alfresco:2104 blocked for more than 120 seconds.
Mar 22 10:22:31 storage2 kernel: [ 4713.898465]   Tainted: G  OE 5.4.0-144-generic #161-Ubuntu
Mar 22 10:22:31 storage2 kernel: [ 4713.898530] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 22 10:22:31 storage2 kernel: [ 4713.898604] drbd_s_alfresco D 0  2104  2 0x80004000
Mar 22 10:22:31 storage2 kernel: [ 4713.898609] Call Trace:
Mar 22 10:22:31 storage2 kernel: [ 4713.898624]  __schedule+0x2e3/0x740
Mar 22 10:22:31 storage2 kernel: [ 4713.898633]  ? update_load_avg+0x7c/0x670
Mar 22 10:22:31 storage2 kernel: [ 4713.898641]  ? sched_clock+0x9/0x10
Mar 22 10:22:31 storage2 kernel: [ 4713.898648]  schedule+0x42/0xb0
Mar 22 10:22:31 storage2 kernel: [ 4713.898656]  rwsem_down_write_slowpath+0x244/0x4d0
Mar 22 10:22:31 storage2 kernel: [ 4713.898663]  ? put_prev_entity+0x23/0x100
Mar 22 10:22:31 storage2 kernel: [ 4713.898675]  down_write+0x41/0x50
Mar 22 10:22:31 storage2 kernel: [ 4713.898703]  drbd_resync_finished+0x97/0x7c0 [drbd]
Mar 22 10:22:31 storage2 kernel: [ 4713.898735]  ? drbd_cork+0x64/0x70 [drbd]
Mar 22 10:22:31 storage2 kernel: [ 4713.898754]  ? wait_for_sender_todo+0x21e/0x240 [drbd]
Mar 22 10:22:31 storage2 kernel: [ 4713.898777]  w_resync_finished+0x2c/0x40 [drbd]
Mar 22 10:22:31 storage2 kernel: [ 4713.898795]  drbd_sender+0x13e/0x3d0 [drbd]
Mar 22 10:22:31 storage2 kernel: [ 4713.898827]  drbd_thread_setup+0x87/0x1d0 [drbd]
M

Re: [DRBD-user] Using quorum in three node cluster results in split brain

2023-03-23 Thread Markus Hochholdinger
Hi,

On Thursday, 23 March 2023, 10:21:22 CET, Philipp Reisner wrote:
> Thanks for sending this bug report, including instructions on
> reproducing it.  At first, I ignored your report because I could not
> reproduce the issue.  Thanks to your persistence, I realized that this
> issue only reproduces on the versions you reported. So it is something
> that is already fixed in the drbd-9.1 branch.

I'm very glad there's a solution for this issue. Great work, many thanks :-)

-- 
Mfg

Markus Hochholdinger


___
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
https://lists.linbit.com/mailman/listinfo/drbd-user


[DRBD-user] drbd-reactor v1.1.0

2023-03-23 Thread Roland Kammerer
Dear DRBD users,

this is drbd-reactor version 1.1.0. There have not been any reported
issues for RC1.

Noteworthy things in this release:
- the prometheus plugin now exposes a metric for DRBD versions (loaded
  kernel module and drbd-utils). Currently this is evaluated once at
  plugin start.
- a fix for the promoter plugin when used with OCF agents (see version
  1.0.1)
- I guess this is the first release where I'm not the contributor with
  the most commits. Thanks to Joel and Matt for putting a lot of effort
  into CI tests, and also to Moritz, who is "invisible" in the changelog
  but irons out most of my bugs during review.
- drbd-reactorctl is now cluster/context aware

I want to elaborate on the last point a bit, taking most of the
information from drbd-reactorctl(1):

There are now global arguments, '--context' and '--nodes', that allow you
to specify a cluster context and to filter nodes in that context.

Users can define cluster contexts via toml files, which consist of node
entries that in turn have fields for hostname and user. Usually one does
not need to set these fields, as they have sane defaults: the name of the
node entry is used as the hostname if not otherwise specified, and root
is the default user. If a cluster context is given, or default.toml
exists, commands are executed on all nodes defined for that context.
Execution is carried out in parallel via ssh.

A simple configuration can look like this:

 cat ~/.config/drbd-reactorctl/production.toml
 [nodes."alpha.mynet"]
 [nodes."beta.mynet"]
 [nodes."gamma.mynet"]

Node names should follow the output of uname -n, and please make sure to
quote node names containing dots.
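
If a node needs a non-default ssh user or address, the hostname and user
fields mentioned above can be set explicitly; a small sketch (the values
are placeholders, not defaults taken from any real setup):

 [nodes."alpha.mynet"]
 hostname = "192.0.2.10"
 user = "admin"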

It is also possible, via nodes-script, to define a command that is
executed to generate the nodes list. These commands, usually simple shell
scripts, are expected to be stored in the same directory as the toml
files and to print a valid nodes list, as documented above, on stdout.
Such a configuration would then look like this:

 cat ~/.config/drbd-reactorctl/linstor.toml
 nodes-script="linstor.sh"
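
Such a linstor.sh could be as simple as the following sketch (the node
names are static placeholders; a real script would typically derive them
from your cluster inventory or from LINSTOR itself):

 #!/bin/sh
 # print one [nodes."<name>"] entry per host on stdout
 for n in alpha.mynet beta.mynet gamma.mynet; do
     printf '[nodes."%s"]\n' "$n"
 done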

To disable a promoter plugin (linstor_db in our example) and stop its
systemd target cluster-wide, one could now execute:

 drbd-reactorctl disable --context production --now linstor_db

Regards, rck

GIT: 
https://github.com/LINBIT/drbd-reactor/commit/d07771d37bfec71880e40257d43d285d6f4209ec
TGZ: https://pkg.linbit.com//downloads/drbd/utils/drbd-reactor-1.1.0.tar.gz
PPA: https://launchpad.net/~linbit/+archive/ubuntu/linbit-drbd9-stack

Changelog:
[ Joel Colledge ]
* e2e: add initial end-to-end test infrastructure
* e2e: provide a clear error when an empty test name is given
* e2e,virter: add provisioning file for tests
* e2e,docker: add initial container configuration for test suite
* e2e,virter: add initial Virter provisioning file for running tests
* e2e,virter: add configuration and a wrapper script for running vmshed
* e2e,virter: add getting started guide
* ci: add explicit stage to existing jobs
* ci: add job to build for e2e tests
* ci: add job to build docker image for e2e tests
* ci: add job to run end-to-end tests
* ci: add e2e tests lint job
* e2e: shorten names ReactorPromoter -> Promoter etc.
* e2e: add initial test for the User Mode Helper Plug-in
* ci: add job to check e2e test typing
* ci: allow pipeline to be started from the API
* e2e: factor polling for a condition out into a function
* e2e,promoter_preferred_node: make test more reliable
* Revert "e2e: disable promoter_preferred_node"

[ Roland Kammerer ]
* ci: disable for ordinary remote branches
* e2e: disable promoter_preferred_node
* prometheus: expose drbd_version
* promoter,ocf: fix env for old systemd
* ctl: add context

[ Matt Kereczman ]
* e2e: add preferred node to promoter tests
* e2e: add prometheus test


___
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
https://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] Using quorum in three node cluster results in split brain

2023-03-23 Thread Philipp Reisner
Hi Markus,

Thanks for sending this bug report, including instructions on
reproducing it.  At first, I ignored your report because I could not
reproduce the issue.  Thanks to your persistence, I realized that this
issue only reproduces on the versions you reported. So it is something
that is already fixed in the drbd-9.1 branch.

Here is the test that reproduces your steps in an automated way:
---
#! /usr/bin/env python3
#
from python import drbdtest
from python.drbdtest import connections, log, peer_devices

resource = drbdtest.setup_resource(nodes=3)
resource.resource_options = 'quorum majority;'
A, B, C = resource.nodes
resource.add_disk('1M', diskful_nodes=[A, B])
resource.up_wait()

log('* Make up-to-date data available.')
resource.skip_initial_sync()

A.primary()
connections(to_node=A).event(r'connection .* role:Primary')
connections(A, B).disconnect(wait=False)  # wait=False -> 'peer-disk:Outdated' observable:
peer_devices(A, B).event(r'peer-device .* peer-disk:Outdated')
connections(A, C).disconnect()
try:
    B.primary()
except:
    pass
else:
    raise RuntimeError('B promoted!')

resource.down()
resource.cluster.teardown()
---
This uses the drbd-test suite (https://github.com/LINBIT/drbd9-tests).
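
For reference, outside the test harness the same quorum setting would go
into the resource's options section; a sketch only (the on-no-quorum
policy shown is just an example and should match your own setup):

 resource r0 {
     options {
         quorum majority;
         on-no-quorum io-error;
     }
     # ... volumes and connections as usual
 }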

With that, I was able to identify which of the recent changes fixes
that issue. It is
https://github.com/LINBIT/drbd/commit/057d17f455e909a75827948cd1fa932e58793a66

It will be released with drbd-9.1.14 and drbd-9.2.3 in about two weeks.

And I will add this test snippet to one of the larger quorum-* test cases.

best regards,
 Philipp

On Wed, Mar 22, 2023 at 7:50 PM Markus Hochholdinger wrote:
>
> Hi,
>
> On Tuesday, 21 March 2023, 17:59:37 CET, Markus Hochholdinger wrote:
> > Still wondering why the diskless quorum is not working for me.
>
> I've tested the following drbd versions:
> 9.1.0 ok
> 9.1.5 ok
> 9.1.6 ok
> 9.1.7 fail
> 9.1.8 fail
> 9.1.13 fail
> 9.2.0 fail
> 9.2.1 fail
> 9.2.2 fail
>
> Between 9.1.6 and 9.1.7 something changed that resulted in:
> An Outdated Secondary can become Primary (without --force)
>
> Now I'll have a look at the diff between those two versions
>
>
> --
> Mfg
>
> Markus Hochholdinger
>
>
___
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
https://lists.linbit.com/mailman/listinfo/drbd-user