Just the glusterd.log from each node, right?
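Before sending, it may be worth confirming each node's glusterd.log actually captured the peer-connectivity events. A minimal grep sketch (the sample line is the one quoted later in this thread; for the real thing, point `log` at /var/log/glusterfs/glusterd.log, the default path on CentOS):

```shell
# Demo against the disconnect line quoted below in this thread; on a real
# node, set log=/var/log/glusterfs/glusterd.log instead.
log=$(mktemp)
printf '%s\n' '[2019-04-03 20:19:16.802638] I [MSGID: 106004] [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer <ossuary-san> (<0ecbf953-681b-448f-9746-d1c1fe7a0978>), in state <Peer in Cluster>, has disconnected from glusterd.' > "$log"
# Count lines about peer disconnects or the listen-port option:
grep -cE 'disconnected from glusterd|listen-port' "$log"
```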
> On Apr 4, 2019, at 11:25 AM, Atin Mukherjee <amukh...@redhat.com> wrote:
>
> Darrell,
>
> I fully understand that you can't reproduce it and you don't have the bandwidth to test it again, but would you be able to send us the glusterd log from all the nodes from when this happened? We would like to go through the logs and get back. I would particularly like to see if something has gone wrong with the transport.socket.listen-port option. But without the log files we can't find out anything. Hope you understand.
>
> On Thu, Apr 4, 2019 at 9:27 PM Darrell Budic <bu...@onholyground.com> wrote:
> I didn’t follow any specific documents, just a generic rolling upgrade, one node at a time. Once the first node didn’t reconnect, I tried to follow the workaround in the bug during the upgrade. The basic procedure was:
>
> - take 3 nodes that were initially installed with 3.12.x (forget which, but a low number) and had been upgraded directly to 5.5 from 3.12.15
> - op-version was 50400
> - on node A:
>   - yum install centos-release-gluster6
>   - yum upgrade (was some oVirt cockpit components, gluster, and a lib or two this time), hit yes
>   - discover glusterd was dead
>   - systemctl restart glusterd
>   - no peer connections; try iptables -F; systemctl restart glusterd, no change
> - following the workaround in the bug, try iptables -F & restart glusterd on the other 2 nodes, no effect
> - nodes B & C were still connected to each other and all bricks were fine at this point
> - try upgrading the other 2 nodes and restarting gluster, no effect (iptables still empty)
> - lost quorum here, so all bricks went offline
> - read logs, not finding much, but looked at glusterd.vol and compared it to the newer versions
> - updated glusterd.vol on A and restarted glusterd
> - A doesn’t show any connected peers, but both other nodes show A as connected
> - updated glusterd.vol on B & C, restarted glusterd
> - all nodes show connected and volumes are active and healing
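The per-node steps above can be sketched as a script. This is only an illustration of the procedure described in the mail, not a tested upgrade tool: the DRY_RUN guard and the final op-version check are additions, and it defaults to printing the commands rather than running them (set DRY_RUN=0 and run as root on a real node):

```shell
#!/bin/sh
# Rolling-upgrade steps for one node, per the procedure above (sketch).
# Defaults to a dry run that prints each command; DRY_RUN=0 executes them.
run() {
  if [ "${DRY_RUN:-1}" != 0 ]; then echo "+ $*"; else "$@"; fi
}

run yum install -y centos-release-gluster6
run yum upgrade -y                    # pulls in the glusterfs-6 packages
run systemctl restart glusterd        # glusterd may be dead after the upgrade
run gluster peer status               # verify peers reconnect before the next node
run gluster volume get all cluster.op-version   # bump only after ALL nodes are on 6
```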
> The only odd thing in my process was that node A did not have any active bricks on it at the time of the upgrade. It doesn’t seem like this mattered, since B & C showed the same symptoms between themselves while being upgraded, but I don’t know. The only log entry that referenced anything about peer connections is included below already.
>
> Looks like it was related to my glusterd settings, since that’s what fixed it for me. Unfortunately, I don’t have the bandwidth or the systems to test different versions of that specifically, but maybe you guys can on some test resources? Otherwise, I’ve got another cluster (my production one!) that’s midway through the upgrade from 3.12.15 -> 5.5. I paused when I started getting multiple brick processes on the two nodes that had gone to 5.5 already. I think I’m going to jump the last node right to 6 to try to avoid that mess, and it has the same glusterd.vol settings. I’ll try to capture its logs during the upgrade and see if there’s any new info, or if it has the same issues as this group did.
>
> -Darrell
>
>> On Apr 4, 2019, at 2:54 AM, Sanju Rakonde <srako...@redhat.com> wrote:
>>
>> We don't hit https://bugzilla.redhat.com/show_bug.cgi?id=1694010 while upgrading to glusterfs-6. We tested it in different setups and understood that this issue is seen because of some issue in the setup.
>>
>> Regarding the issue you have faced, can you please let us know which documentation you followed for the upgrade? During our testing, we didn't hit any such issue. We would like to understand what went wrong.
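One cheap way to catch the kind of glusterd.vol drift described above is to diff the in-use file against the packaged default right after upgrading. A sketch, assuming the RPM marks glusterd.vol as a noreplace config file and therefore leaves the new default beside it as a .rpmnew (the CONF_DIR parameter is an addition for illustration):

```shell
# Report whether the upgrade left a newer default glusterd.vol beside a
# locally modified one (RPM writes foo.rpmnew for noreplace config files;
# this is an assumption about the packaging, verify on your system).
CONF_DIR="${CONF_DIR:-/etc/glusterfs}"
if [ -f "$CONF_DIR/glusterd.vol.rpmnew" ]; then
  diff -u "$CONF_DIR/glusterd.vol" "$CONF_DIR/glusterd.vol.rpmnew"
else
  echo "no glusterd.vol.rpmnew under $CONF_DIR"
fi
```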
>> On Thu, Apr 4, 2019 at 2:08 AM Darrell Budic <bu...@onholyground.com> wrote:
>> Hari-
>>
>> I was upgrading my test cluster from 5.5 to 6 and I hit this bug (https://bugzilla.redhat.com/show_bug.cgi?id=1694010) or something similar. In my case, the workaround did not work, and I was left with a gluster that had gone into no-quorum mode and stopped all the bricks. There wasn’t much in the logs either, but I noticed my /etc/glusterfs/glusterd.vol files were not the same as the newer versions, so I updated them, restarted glusterd, and suddenly the updated node showed as peer-in-cluster again. Once I updated the other nodes the same way, things started working again. Maybe a place to look?
>>
>> My old config (all nodes):
>>
>> volume management
>>     type mgmt/glusterd
>>     option working-directory /var/lib/glusterd
>>     option transport-type socket
>>     option transport.socket.keepalive-time 10
>>     option transport.socket.keepalive-interval 2
>>     option transport.socket.read-fail-log off
>>     option ping-timeout 10
>>     option event-threads 1
>>     option rpc-auth-allow-insecure on
>> #   option transport.address-family inet6
>> #   option base-port 49152
>> end-volume
>>
>> changed to:
>>
>> volume management
>>     type mgmt/glusterd
>>     option working-directory /var/lib/glusterd
>>     option transport-type socket,rdma
>>     option transport.socket.keepalive-time 10
>>     option transport.socket.keepalive-interval 2
>>     option transport.socket.read-fail-log off
>>     option transport.socket.listen-port 24007
>>     option transport.rdma.listen-port 24008
>>     option ping-timeout 0
>>     option event-threads 1
>>     option rpc-auth-allow-insecure on
>> #   option lock-timer 180
>> #   option transport.address-family inet6
>> #   option base-port 49152
>>     option max-port 60999
>> end-volume
>>
>> The only thing I found in the glusterd logs that looks relevant was the following (repeated for both of the other nodes in this cluster), so no clue why it happened:
>>
>> [2019-04-03 20:19:16.802638] I [MSGID: 106004] [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer <ossuary-san> (<0ecbf953-681b-448f-9746-d1c1fe7a0978>), in state <Peer in Cluster>, has disconnected from glusterd.
>>
>>> On Apr 2, 2019, at 4:53 AM, Atin Mukherjee <atin.mukherje...@gmail.com> wrote:
>>>
>>> On Mon, 1 Apr 2019 at 10:28, Hari Gowtham <hgowt...@redhat.com> wrote:
>>> Comments inline.
>>>
>>> On Mon, Apr 1, 2019 at 5:55 AM Sankarshan Mukhopadhyay <sankarshan.mukhopadh...@gmail.com> wrote:
>>> >
>>> > Quite a considerable amount of detail here. Thank you!
>>> >
>>> > On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham <hgowt...@redhat.com> wrote:
>>> > >
>>> > > Hello Gluster users,
>>> > >
>>> > > As you are all aware, glusterfs-6 is out. We would like to inform you that we have spent a significant amount of time testing glusterfs-6 in upgrade scenarios. We have done upgrade testing to glusterfs-6 from various releases like 3.12, 4.1 and 5.3.
>>> > >
>>> > > As glusterfs-6 has got a lot of changes, we wanted to test those portions. There were xlators (and respective options to enable/disable them) added and deprecated in glusterfs-6 relative to various versions [1].
>>> > >
>>> > > We had to check the following upgrade scenarios for all such options identified in [1]:
>>> > > 1) option never enabled and upgraded
>>> > > 2) option enabled and then upgraded
>>> > > 3) option enabled, then disabled, and then upgraded
>>> > >
>>> > > We weren't able to manually check all the combinations for all the options, so the options involving enabling and disabling xlators were prioritized. Below are the results of the ones tested.
>>> > >
>>> > > Never enabled and upgraded:
>>> > > Checked upgrades from 3.12, 4.1 and 5.3 to 6; the upgrade works.
>>> > >
>>> > > Enabled and upgraded:
>>> > > Tested for tier, which is deprecated; this is not a recommended upgrade. As expected, the volume won't be consumable and will have a few more issues as well. Tested the 3.12, 4.1 and 5.3 to 6 upgrades.
>>> > >
>>> > > Enabled, then disabled before upgrade:
>>> > > Tested for tier with 3.12, and the upgrade went fine.
>>> > >
>>> > > There is one common issue to note in every upgrade: the node being upgraded goes into a disconnected state. You have to flush the iptables rules and then restart glusterd on all nodes to fix this.
>>> > >
>>> >
>>> > Is this something that is written in the upgrade notes? I do not seem to recall; if not, I'll send a PR.
>>>
>>> No, this wasn't mentioned in the release notes. PRs are welcome.
>>>
>>> >
>>> > > The testing for enabling new options is still pending. The new options won't cause as many issues as the deprecated ones, so this was put at the end of the priority list. It would be nice to get contributions for this.
>>> > >
>>> >
>>> > Did the range of tests lead to any new issues?
>>>
>>> Yes. In the first round of testing we found an issue and had to postpone the release of 6 until the fix was made available:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1684029
>>>
>>> And then we tested it again after this patch was made available, and came across this:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1694010
>>>
>>> This isn't a bug, as we found that the upgrade worked seamlessly in two different setups. So we have no issues in the upgrade path to the glusterfs-6 release.
>>>
>>> I have mentioned in the second mail how to overcome this situation for now, until the fix is available.
>>>
>>> >
>>> > > For the disable testing, tier was used as it covers most of the xlators that were removed. And all of these tests were done on a replica 3 volume.
>>> > >
>>> >
>>> > I'm not sure if the Glusto team is reading this, but it would be pertinent to understand if the approach you have taken can be converted into a form of automated testing pre-release.
>>>
>>> I don't have an answer for this; I have CCed Vijay. He might have an idea.
>>>
>>> >
>>> > > Note: this is only for upgrade testing of the newly added and removed xlators. It does not involve the normal tests for the xlators.
>>> > >
>>> > > If you have any questions, please feel free to reach us.
>>> > >
>>> > > [1] https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing
>>> > >
>>> > > Regards,
>>> > > Hari and Sanju.
>>> > _______________________________________________
>>> > Gluster-users mailing list
>>> > Gluster-users@gluster.org
>>> > https://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>> --
>>> Regards,
>>> Hari Gowtham.
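The flush-and-restart workaround discussed in this thread, written out as commands. A sketch only: the CONFIRM guard is an addition so the snippet cannot fire accidentally, and note that `iptables -F` drops every firewall rule on the host, so re-apply your ruleset afterwards:

```shell
# Workaround for the upgrade-time peer disconnect: on EVERY node, flush
# the firewall rules and restart glusterd so peers re-handshake.
# Guarded: prints a reminder unless explicitly confirmed (root required).
if [ "${CONFIRM:-no}" = yes ]; then
  iptables -F
  systemctl restart glusterd
  gluster peer status      # expect each peer back in 'Peer in Cluster'
else
  echo "re-run with CONFIRM=yes on each node to apply the workaround"
fi
```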
>>> --
>>> --Atin
>>
>> --
>> Thanks,
>> Sanju