[ovirt-users] Re: unable to bring up gluster bricks after 4.5 upgrade
It appears that I may have resolved the issue after putting the host into maintenance again and rebooting a second time. I'm really not sure why, but all bricks are up now.

On Mon, Aug 29, 2022 at 3:45 PM Jayme wrote:
> [earlier message and logs snipped; quoted in full below]
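For anyone hitting the same state, one way to confirm the bricks really came back after the reboot is to count bricks reporting "Online : N" in `gluster volume status <vol> detail` output. A minimal sketch; the helper name and the piped usage are illustrative, only the gluster CLI command itself is from the thread:

```shell
#!/bin/sh
# Count bricks reported offline in `gluster volume status <vol> detail` output.
# Helper name is illustrative; each brick section of the real output contains
# a line of the form "Online               : Y" (or ": N" for a down brick).
count_offline() {
    grep -c '^Online *: *N'
}

# On a live host you would pipe the real command through it, e.g.:
#   gluster volume status engine detail | count_offline
# A result of 0 means every brick of the volume reports online.
```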
[ovirt-users] Re: unable to bring up gluster bricks after 4.5 upgrade
A bit more info from the host's brick log:

[2022-08-29 18:43:44.251198 +0000] D [MSGID: 0] [options.c:1113:xlator_reconfigure_rec] 0-engine-barrier: reconfigured
[2022-08-29 18:43:44.251203 +0000] D [MSGID: 0] [options.c:1133:xlator_reconfigure_rec] 0-engine-index: No reconfigure() found
[2022-08-29 18:43:44.251207 +0000] D [MSGID: 0] [options.c:1113:xlator_reconfigure_rec] 0-engine-index: reconfigured
[2022-08-29 18:43:44.251214 +0000] I [MSGID: 0] [options.c:1251:xlator_option_reconf_bool] 0-engine-quota: option deem-statfs using set value off
[2022-08-29 18:43:44.251221 +0000] I [MSGID: 0] [options.c:1251:xlator_option_reconf_bool] 0-engine-quota: option server-quota using set value off
[2022-08-29 18:43:44.251248 +0000] D [MSGID: 0] [options.c:1113:xlator_reconfigure_rec] 0-engine-quota: reconfigured
[2022-08-29 18:44:04.899452 +0000] E [MSGID: 113072] [posix-inode-fd-ops.c:2087:posix_writev] 0-engine-posix: write failed: offset 0, [Invalid argument]
[2022-08-29 18:44:04.899542 +0000] E [MSGID: 115067] [server-rpc-fops_v2.c:1324:server4_writev_cbk] 0-engine-server: WRITE info [{frame=358765}, {WRITEV_fd_no=5}, {uuid_utoa=c816cdf3-12e6-45c0-ae0f-2cf03e0f7299}, {client=CTX_ID:11b78775-07c9-47ff-b426-b44f3f88a3f7-GRAPH_ID:0-PID:25622-HOST:host1.x-PC_NAME:engine-client-0-RECON_NO:-2}, {error-xlator=engine-posix}, {errno=22}, {error=Invalid argument}]
[2022-08-29 18:44:14.876436 +0000] E [MSGID: 113002] [posix-entry-ops.c:769:posix_mkdir] 0-engine-posix: gfid is null for (null) [Invalid argument]
[2022-08-29 18:44:14.876503 +0000] E [MSGID: 115056] [server-rpc-fops_v2.c:497:server4_mkdir_cbk] 0-engine-server: MKDIR info [{frame=359508}, {MKDIR_path=}, {uuid_utoa=00000000-0000-0000-0000-000000000001}, {bname=}, {client=CTX_ID:37199949-8cf2-4bbe-938e-e9ef3bd98486-GRAPH_ID:3-PID:2473-HOST:host0.x-PC_NAME:engine-client-0-RECON_NO:-0}, {error-xlator=engine-posix}, {errno=22}, {error=Invalid argument}]

On Mon, Aug 29, 2022 at 3:18 PM Jayme wrote:
> Hello All,
>
> I've been struggling with a few issues upgrading my 3-node HCI cluster from 4.4 to 4.5.
>
> At present the self-hosted engine VM is properly running oVirt 4.5 on CentOS 8 Stream.
>
> I set the first host node into maintenance and installed the new node-ng image. I ran into an issue with rescue mode on boot, which appears to have been related to the LVM devices bug. I was able to work past that and get the node to boot.
>
> The node running the 4.5.2 image is booting properly, and the gluster/LVM mounts etc. all look good. I am able to activate the host and run VMs on it; however, oVirt is showing that all bricks on the host are DOWN.
>
> I was unable to get the bricks back up even after doing a force start of the volumes.
>
> Here is the glusterd log from the host in question when I try a force start on the engine volume (other volumes are similar):
>
> ==> glusterd.log <==
> The message "I [MSGID: 106568] [glusterd-svc-mgmt.c:266:glusterd_svc_stop] 0-management: bitd service is stopped" repeated 2 times between [2022-08-29 18:09:56.027147 +0000] and [2022-08-29 18:10:34.694144 +0000]
> [2022-08-29 18:10:34.695348 +0000] I [MSGID: 106618] [glusterd-svc-helper.c:909:glusterd_attach_svc] 0-glusterd: adding svc glustershd (volume=engine) to existing process with pid 2473
> [2022-08-29 18:10:34.695669 +0000] I [MSGID: 106131] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: scrub already stopped
> [2022-08-29 18:10:34.695691 +0000] I [MSGID: 106568] [glusterd-svc-mgmt.c:266:glusterd_svc_stop] 0-management: scrub service is stopped
> [2022-08-29 18:10:34.695832 +0000] I [MSGID: 106617] [glusterd-svc-helper.c:698:glusterd_svc_attach_cbk] 0-management: svc glustershd of volume engine attached successfully to pid 2473
> [2022-08-29 18:10:34.703718 +0000] E [MSGID: 106115] [glusterd-mgmt.c:119:gd_mgmt_v3_collate_errors] 0-management: Post commit failed on gluster2.x. Please check log file for details.
> [2022-08-29 18:10:34.703774 +0000] E [MSGID: 106115] [glusterd-mgmt.c:119:gd_mgmt_v3_collate_errors] 0-management: Post commit failed on gluster1.x. Please check log file for details.
> [2022-08-29 18:10:34.703797 +0000] E [MSGID: 106664] [glusterd-mgmt.c:1969:glusterd_mgmt_v3_post_commit] 0-management: Post commit failed on peers
> [2022-08-29 18:10:34.703800 +0000] E [MSGID: 106664] [glusterd-mgmt.c:2664:glusterd_mgmt_v3_initiate_all_phases] 0-management: Post commit Op Failed
>
> If I run the start command manually on the host CLI:
>
> gluster volume start engine force
> volume start: engine: failed: Post commit failed on gluster1.. Please check log file for details.
> Post commit failed on gluster2.. Please check log file for details.
>
> I feel like this may be some issue with the difference in major versions of GlusterFS on the nodes, but I am unsure. The other nodes are running ovirt-node-ng-4.4.6.3
>
> At this point I am afraid to bring down any other node to attempt upgrading it without the br
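Regarding the suspected GlusterFS major-version mismatch: before taking another node down, it may be worth comparing the installed glusterfs version and the cluster op-version across peers. A sketch, assuming passwordless SSH between hosts; the `ver_of` helper and the exact hostnames are mine, while the `glusterfs --version` and `gluster volume get all cluster.op-version` commands are standard gluster CLI:

```shell
#!/bin/sh
# Extract the version number from the first line of `glusterfs --version`
# output (e.g. "glusterfs 10.2" -> "10.2"). Helper name is illustrative.
ver_of() {
    head -n1 | awk '{print $2}'
}

# On a live cluster (hostnames assumed, adjust to your peers):
#   for h in gluster0.x gluster1.x gluster2.x; do
#       printf '%s: %s\n' "$h" "$(ssh "$h" glusterfs --version | ver_of)"
#   done
#
# The cluster-wide operating version can be checked from any node with:
#   gluster volume get all cluster.op-version
#   gluster volume get all cluster.max-op-version
# Mixed major versions can usually coexist during a rolling upgrade, but the
# op-version stays pinned to the oldest peer until all nodes are upgraded.
```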