Can you also check the time differences between the nodes? We had a situation recently where a time mismatch between servers caused failures.
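Something like the sketch below would surface any skew without relying on the mm commands (which are failing here anyway). It assumes passwordless ssh from the node you run it on, and the node list is only illustrative (the two names from the thread plus placeholders):

#!/bin/bash
# Rough clock-skew check across cluster nodes -- a sketch, not a polished tool.
# Offsets of a second or so are mostly ssh latency noise; what matters is
# skew of many seconds or minutes.
ref=$(date +%s)
for node in ocio-gpu01 ocio-gpu02 node3 node4; do
    remote=$(ssh -o ConnectTimeout=5 "$node" date +%s 2>/dev/null) \
        || { echo "$node: unreachable"; continue; }
    echo "$node: offset $((remote - ref))s relative to this node"
done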
On Thu, Jun 28, 2018 at 2:50 AM, Kevin D Johnson <[email protected]> wrote:

> You can also try to convert to the old primary/secondary model to back it
> away from the default CCR configuration:
>
> mmchcluster --ccr-disable -p servername
>
> Then, temporarily go with only one quorum node and add more once the
> cluster comes back up. Once the cluster is back up and has at least two
> quorum nodes, do a --ccr-enable with the mmchcluster command.
>
> Kevin D. Johnson
> Spectrum Computing, Senior Managing Consultant
> MBA, MAcc, MS Global Technology and Development
> IBM Certified Technical Specialist Level 2 Expert
> Certified Deployment Professional - Spectrum Scale
> Certified Solution Advisor - Spectrum Computing
> Certified Solution Architect - Spectrum Storage Solutions
> 720.349.6199 - [email protected]
>
> "To think is to achieve." - Thomas J. Watson, Sr.
>
> ----- Original message -----
> From: "IBM Spectrum Scale" <[email protected]>
> Sent by: [email protected]
> To: [email protected], gpfsug main discussion list <[email protected]>
> Cc:
> Subject: Re: [gpfsug-discuss] gpfs client cluster, lost quorum, ccr issues
> Date: Wed, Jun 27, 2018 5:15 PM
>
> Hi Renata,
>
> You may want to reduce the set of quorum nodes. If your version supports
> the --force option, you can run
>
> mmchnode --noquorum -N <broken-nodes> --force
>
> It is a good idea to configure tiebreaker disks in a cluster that has only
> 2 quorum nodes.
>
> Regards, The Spectrum Scale (GPFS) team
>
> ------------------------------------------------------------------------------------------------------------------
> If you feel that your question can benefit other users of Spectrum Scale
> (GPFS), then please post it to the public IBM developerWorks Forum at
> https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.
>
> If your query concerns a potential software error in Spectrum Scale (GPFS)
> and you have an IBM software maintenance contract please contact
> 1-800-237-5511 in the United States or your local IBM Service Center in
> other countries.
>
> The forum is informally monitored as time permits and should not be used
> for priority messages to the Spectrum Scale (GPFS) team.
>
> From: Renata Maria Dart <[email protected]>
> To: [email protected]
> Date: 06/27/2018 02:21 PM
> Subject: [gpfsug-discuss] gpfs client cluster, lost quorum, ccr issues
> Sent by: [email protected]
> ------------------------------
>
> Hi, we have a client cluster of 4 nodes with 3 quorum nodes. One of the
> quorum nodes is no longer in service and the other was reinstalled with
> a newer OS, both without informing the gpfs admins. Gpfs is still
> "working" on the two remaining nodes, that is, they continue to have
> access to the gpfs data on the remote clusters. But I can no longer get
> any gpfs commands to work. On one of the 2 nodes that are still serving
> data:
>
> [root@ocio-gpu01 ~]# mmlscluster
> get file failed: Not enough CCR quorum nodes available (err 809)
> gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158
> mmlscluster: Command failed. Examine previous error messages to determine cause.
>
> On the reinstalled node, this fails in the same way:
>
> [root@ocio-gpu02 ccr]# mmstartup
> get file failed: Not enough CCR quorum nodes available (err 809)
> gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158
> mmstartup: Command failed. Examine previous error messages to determine cause.
>
> I have looked through the users group interchanges but didn't find anything
> that seems to fit this scenario.
>
> Is there a way to salvage this cluster? Can it be done without
> shutting gpfs down on the 2 nodes that continue to work?
>
> Thanks for any advice,
>
> Renata Dart
> SLAC National Accelerator Lab
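For what it's worth, pulling the two suggestions above together, a recovery sequence might look roughly like the sketch below. It is untested against Renata's cluster, the node names are placeholders, and the exact flags should be checked against the mmchcluster/mmchnode man pages for the installed release before touching a live cluster:

# Sketch only -- combines the suggestions from this thread; node names are placeholders.

# Option A: drop the broken nodes from the quorum set, if the release supports --force.
# (The reply above spells the flag --noquorum; some releases document it as --nonquorum,
# so confirm with 'man mmchnode' first.)
mmchnode --noquorum -N brokennode1,brokennode2 --force

# Option B: fall back to the older primary/secondary configuration model, which
# disables CCR, naming a surviving quorum node as primary:
mmchcluster --ccr-disable -p ocio-gpu01

# Run temporarily with a single quorum node; once the cluster is back up,
# designate a second quorum node and re-enable CCR:
mmchnode --quorum -N ocio-gpu02
mmchcluster --ccr-enable

# With only two quorum nodes, tiebreaker disks are recommended -- if this client
# cluster has (or can be given) local NSDs, something along the lines of:
# mmchconfig tiebreakerDisks="nsd1;nsd2;nsd3"

Either way, it might be worth saving a copy of /var/mmfs/gen/mmsdrfs from a surviving node before making any changes.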
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
