Hi Ravi,

I would like to avoid an offline upgrade, since it would disrupt quite a few
services. Is there anything further I can investigate or do?
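
In case it helps, this is roughly what I was planning to check next on the
upgraded nodes (just a sketch; <vol> is a placeholder for the volume name):

    # compare the cluster-wide op-version with the maximum supported one
    gluster volume get all cluster.op-version
    gluster volume get all cluster.max-op-version

    # try to respawn the missing self-heal daemon for an already-started volume
    gluster volume start <vol> force

As far as I know, "start force" only respawns daemons that are down and
doesn't touch any data, but please correct me if that's unsafe mid-upgrade.
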
Thanks Olaf

On Tue, 3 Nov 2020 at 12:17, Ravishankar N <ravishan...@redhat.com> wrote:
>
> On 02/11/20 8:35 pm, Olaf Buitelaar wrote:
>
> Dear Gluster users,
>
> I'm trying to upgrade from gluster 6.10 to 7.8. I've currently tried this
> on 2 hosts, but on both the Self-Heal Daemon refuses to start.
> It could be because not all nodes are updated yet, but I'm a bit hesitant
> to continue without the Self-Heal Daemon running.
> I'm not using quotas, and I'm not seeing the peer reject messages that
> other users reported on the mailing list.
> In fact, gluster peer status and gluster pool list display all nodes as
> connected.
> Also, gluster v heal <vol> info shows all nodes as Status: Connected;
> however, some report pending heals, which don't really seem to progress.
> Only in gluster v status <vol> do the 2 upgraded nodes report as not
> running:
>
> Gluster process                    TCP Port  RDMA Port  Online  Pid
> Self-heal Daemon on localhost      N/A       N/A        N       N/A
> Self-heal Daemon on 10.32.9.5      N/A       N/A        Y       24022
> Self-heal Daemon on 10.201.0.4     N/A       N/A        Y       26704
> Self-heal Daemon on 10.201.0.3     N/A       N/A        N       N/A
> Self-heal Daemon on 10.32.9.4      N/A       N/A        Y       46294
> Self-heal Daemon on 10.32.9.3      N/A       N/A        Y       22194
> Self-heal Daemon on 10.201.0.9     N/A       N/A        Y       14902
> Self-heal Daemon on 10.201.0.6     N/A       N/A        Y       5358
> Self-heal Daemon on 10.201.0.5     N/A       N/A        Y       28073
> Self-heal Daemon on 10.201.0.7     N/A       N/A        Y       15385
> Self-heal Daemon on 10.201.0.1     N/A       N/A        Y       8917
> Self-heal Daemon on 10.201.0.12    N/A       N/A        Y       56796
> Self-heal Daemon on 10.201.0.8     N/A       N/A        Y       7990
> Self-heal Daemon on 10.201.0.11    N/A       N/A        Y       68223
> Self-heal Daemon on 10.201.0.10    N/A       N/A        Y       20828
>
> After the upgrade I see the file
> /var/lib/glusterd/vols/<vol>/<vol>-shd.vol being created, which doesn't
> exist on the 6.10 nodes.
>
> In the logs I see these relevant messages:
>
> log: glusterd.log
>
> 0-management: Regenerating volfiles due to a max op-version mismatch or
> glusterd.upgrade file not being present, op_version retrieved: 60000, max
> op_version: 70200
>
> I think this is because of the shd multiplex
> (https://bugzilla.redhat.com/show_bug.cgi?id=1659708) added by Rafi.
>
> Rafi, is there any workaround which can work for rolling upgrades? Or
> should we just do an offline upgrade of all server nodes for the shd to
> come online?
>
> -Ravi
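
In case a simple retry is an acceptable workaround: would it be safe to just
restart glusterd on an upgraded node after the volfiles have been
regenerated, so it retries the shd attach? Something along these lines (a
sketch; the volume name "backups" is taken from the logs below):

    # on one of the upgraded nodes
    systemctl restart glusterd

    # check whether the self-heal daemon came back for the volume
    gluster volume status backups
    pgrep -af glustershd
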
>
> [2020-10-31 21:48:42.256193] W [MSGID: 106204]
> [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management:
> Unknown key: tier-enabled
> [2020-10-31 21:48:42.256232] W [MSGID: 106204]
> [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management:
> Unknown key: brick-0
> [2020-10-31 21:48:42.256240] W [MSGID: 106204]
> [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management:
> Unknown key: brick-1
> [2020-10-31 21:48:42.256246] W [MSGID: 106204]
> [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management:
> Unknown key: brick-2
> [2020-10-31 21:48:42.256251] W [MSGID: 106204]
> [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management:
> Unknown key: brick-3
> [2020-10-31 21:48:42.256256] W [MSGID: 106204]
> [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management:
> Unknown key: brick-4
> [2020-10-31 21:48:42.256261] W [MSGID: 106204]
> [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management:
> Unknown key: brick-5
> [2020-10-31 21:48:42.256266] W [MSGID: 106204]
> [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management:
> Unknown key: brick-6
> [2020-10-31 21:48:42.256271] W [MSGID: 106204]
> [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management:
> Unknown key: brick-7
> [2020-10-31 21:48:42.256276] W [MSGID: 106204]
> [glusterd-store.c:3275:glusterd_store_update_volinfo] 0-management:
> Unknown key: brick-8
>
> [2020-10-31 21:51:36.049009] W [MSGID: 106617]
> [glusterd-svc-helper.c:948:glusterd_attach_svc] 0-glusterd: attach failed
> for glustershd(volume=backups)
> [2020-10-31 21:51:36.049055] E [MSGID: 106048]
> [glusterd-shd-svc.c:482:glusterd_shdsvc_start] 0-glusterd: Failed to
> attach shd svc(volume=backups) to pid=9262
> [2020-10-31 21:51:36.049138] E [MSGID: 106615]
> [glusterd-shd-svc.c:638:glusterd_shdsvc_restart] 0-management: Couldn't
> start shd for vol: backups on restart
> [2020-10-31 21:51:36.183133] I [MSGID: 106618]
> [glusterd-svc-helper.c:901:glusterd_attach_svc] 0-glusterd: adding svc
> glustershd (volume=backups) to existing process with pid 9262
>
> log: glustershd.log
>
> [2020-10-31 21:49:55.976120] I [MSGID: 100041]
> [glusterfsd-mgmt.c:1111:glusterfs_handle_svc_attach] 0-glusterfs: received
> attach request for volfile-id=shd/backups
> [2020-10-31 21:49:55.976136] W [MSGID: 100042]
> [glusterfsd-mgmt.c:1137:glusterfs_handle_svc_attach] 0-glusterfs: got
> attach for shd/backups but no active graph [Invalid argument]
>
> So I suspect something in the logic for the self-heal daemon has changed,
> since it now has the new *.vol configuration for the shd. The question is
> whether this is just a transitional state until all nodes are upgraded,
> and thus safe to continue the update, or whether this is something that
> should be fixed, and if so, any clues how?
>
> Thanks Olaf
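
P.S. Regarding the pending heals that don't seem to progress: these are the
commands I know of for keeping an eye on them (again a sketch, with <vol> as
a placeholder):

    # per-brick count of entries still waiting to be healed
    gluster volume heal <vol> statistics heal-count

    # summary of pending and possible split-brain entries per brick
    gluster volume heal <vol> info summary

If the counts aren't moving simply because the shd isn't running on the two
upgraded nodes, that would at least explain what I'm seeing.
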
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users