[ovirt-users] Re: rebooting an ovirt cluster

2019-02-05 Thread feral
Incidentally, ovirt-node 4.3 is going much better so far (for the
installation, anyway). The hyperconverged documentation doesn't mention
anything about setting up Cockpit, though, so you have to enable and start
it manually, and if you're not using IPv6, you have to modify
cockpit.socket to force it to listen on IPv4.
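For anyone hitting the same thing, a minimal sketch of that manual setup
(the drop-in path and port are assumptions based on Cockpit's defaults,
not something spelled out in the docs):

# enable and start socket-activated cockpit
systemctl enable --now cockpit.socket

# /etc/systemd/system/cockpit.socket.d/listen.conf
# clear the default listener, then bind explicitly on IPv4
[Socket]
ListenStream=
ListenStream=0.0.0.0:9090

# pick up the drop-in and restart the socket
systemctl daemon-reload
systemctl restart cockpit.socket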


On Tue, Feb 5, 2019 at 8:27 AM Sahina Bose  wrote:

>
>
> On Tue, Feb 5, 2019 at 7:23 AM Greg Sheremeta  wrote:
>
>>
>>
>> On Mon, Feb 4, 2019 at 4:15 PM feral  wrote:
>>
>>> I think I found the answer to glusterd not starting.
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1472267
>>>
>>> Apparently the version of gluster (3.12.15) that comes packaged with
>>> ovirt-node 4.2.8 has a known issue where gluster tries to come up before
>>> networking, fails, and crashes. This was fixed in gluster 3.13.0
>>> (apparently). Do devs peruse this list?
>>>
>>
>> Yes :)
>>
>>
>>> Any chance someone who can update the gluster package might read this?
>>>
>>
>> +Sahina might be able to help
>> The developers list is
>> https://lists.ovirt.org/archives/list/de...@ovirt.org/
>>
>
> On 4.2, we're stuck with glusterfs 3.12 due to dependency on gluster-gnfs.
>
> The bug you refer to is hit only when a hostname changes or one of the
> network interfaces is down and the brick path cannot be resolved. What's the
> error in glusterd.log for the failure to start?
>
>
>>
>>> On Mon, Feb 4, 2019 at 2:38 AM Simone Tiraboschi 
>>> wrote:
>>>


 On Sat, Feb 2, 2019 at 7:32 PM feral  wrote:

> How is an oVirt hyperconverged cluster supposed to come back to life
> after a power outage to all 3 nodes?
>
> Running ovirt-node (ovirt-node-ng-installer-4.2.0-2019013006.el7.iso)
> to get things going, but I've run into multiple issues.
>
> 1. During the gluster setup, the volume sizes I specify are not
> reflected in the deployment configuration. The auto-populated values are
> used every time. I manually hacked on the config to get the volume sizes
> correct. I also noticed if I create the deployment config with "sdb" by
> accident, but click back and change it to "vdb", again, the changes are 
> not
> reflected in the config.
> My deployment config does seem to work. All volumes are created
> (though the xfs options used don't make sense as you end up with stripe
> sizes that aren't a multiple of the block size).
> Once gluster is deployed, I deploy the hosted engine, and everything
> works.
>
> 2. Reboot all nodes. I was testing for power outage response. All
> nodes come up, but glusterd is not running (seems to have failed for some
> reason). I can manually restart glusterd on all nodes and it comes up and
> starts communicating normally. However, the engine does not come online. 
> So
> I figure out where it last lived, and try to start it manually through the
> web interface. This fails because vdsm-ovirtmgmt is not up. I figured out
> the correct way to start up the engine would be through the cli via
> hosted-engine --vm-start.
>

 This is not required at all.
 Are you sure that your cluster is not set in global maintenance mode?
 Can you please share /var/log/ovirt-hosted-engine-ha/agent.log and
 broker.log from your hosts?


> This does work, but it takes a very long time, and it usually starts
> up on any node other than the one I told it to start on.
>
> So I guess two (or three) questions. What is the expected operation
> after a full cluster reboot (ie: in the event of a power failure)? Why
> doesn't the engine start automatically, and what might be causing glusterd
> to fail, when it can be restarted manually and works fine?
>
> --
> _
> Fact:
> 1. Ninjas are mammals.
> 2. Ninjas fight ALL the time.
> 3. The purpose of the ninja is to flip out and kill people.
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/RIADNRZRXTPTRG4XBFUMNWASBWRFCG4V/
>

>>>
>>> --
>>> _
>>> Fact:
>>> 1. Ninjas are mammals.
>>> 2. Ninjas fight ALL the time.
>>> 3. The purpose of the ninja is to flip out and kill people.
>>> ___
>>> Users mailing list -- users@ovirt.org
>>> To unsubscribe send an email to users-le...@ovirt.org
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives:
>>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/WNN3PJFFP4VU5YAPDNYC7WQOTBDXDKPC/
>>>

[ovirt-users] Re: rebooting an ovirt cluster

2019-02-05 Thread Sahina Bose
On Tue, Feb 5, 2019 at 7:23 AM Greg Sheremeta  wrote:

>
>
> On Mon, Feb 4, 2019 at 4:15 PM feral  wrote:
>
>> I think I found the answer to glusterd not starting.
>> https://bugzilla.redhat.com/show_bug.cgi?id=1472267
>>
>> Apparently the version of gluster (3.12.15) that comes packaged with
>> ovirt-node 4.2.8 has a known issue where gluster tries to come up before
>> networking, fails, and crashes. This was fixed in gluster 3.13.0
>> (apparently). Do devs peruse this list?
>>
>
> Yes :)
>
>
>> Any chance someone who can update the gluster package might read this?
>>
>
> +Sahina might be able to help
> The developers list is
> https://lists.ovirt.org/archives/list/de...@ovirt.org/
>

On 4.2, we're stuck with glusterfs 3.12 due to dependency on gluster-gnfs.

The bug you refer to is hit only when a hostname changes or one of the
network interfaces is down and the brick path cannot be resolved. What's the
error in glusterd.log for the failure to start?
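For reference, the usual places to pull that from (standard glusterd
locations, assumed rather than quoted from this thread):

# systemd's view of the failed start
journalctl -b -u glusterd.service

# glusterd's own log
less /var/log/glusterfs/glusterd.log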


>
>> On Mon, Feb 4, 2019 at 2:38 AM Simone Tiraboschi 
>> wrote:
>>
>>>
>>>
>>> On Sat, Feb 2, 2019 at 7:32 PM feral  wrote:
>>>
 How is an oVirt hyperconverged cluster supposed to come back to life
 after a power outage to all 3 nodes?

 Running ovirt-node (ovirt-node-ng-installer-4.2.0-2019013006.el7.iso)
 to get things going, but I've run into multiple issues.

 1. During the gluster setup, the volume sizes I specify are not
 reflected in the deployment configuration. The auto-populated values are
 used every time. I manually hacked on the config to get the volume sizes
 correct. I also noticed if I create the deployment config with "sdb" by
 accident, but click back and change it to "vdb", again, the changes are not
 reflected in the config.
 My deployment config does seem to work. All volumes are created (though
 the xfs options used don't make sense as you end up with stripe sizes that
 aren't a multiple of the block size).
 Once gluster is deployed, I deploy the hosted engine, and everything
 works.

 2. Reboot all nodes. I was testing for power outage response. All nodes
 come up, but glusterd is not running (seems to have failed for some
 reason). I can manually restart glusterd on all nodes and it comes up and
 starts communicating normally. However, the engine does not come online. So
 I figure out where it last lived, and try to start it manually through the
 web interface. This fails because vdsm-ovirtmgmt is not up. I figured out
 the correct way to start up the engine would be through the cli via
 hosted-engine --vm-start.

>>>
>>> This is not required at all.
>>> Are you sure that your cluster is not set in global maintenance mode?
>>> Can you please share /var/log/ovirt-hosted-engine-ha/agent.log and
>>> broker.log from your hosts?
>>>
>>>
 This does work, but it takes a very long time, and it usually starts up
 on any node other than the one I told it to start on.

 So I guess two (or three) questions. What is the expected operation
 after a full cluster reboot (ie: in the event of a power failure)? Why
 doesn't the engine start automatically, and what might be causing glusterd
 to fail, when it can be restarted manually and works fine?

 --
 _
 Fact:
 1. Ninjas are mammals.
 2. Ninjas fight ALL the time.
 3. The purpose of the ninja is to flip out and kill people.
 ___
 Users mailing list -- users@ovirt.org
 To unsubscribe send an email to users-le...@ovirt.org
 Privacy Statement: https://www.ovirt.org/site/privacy-policy/
 oVirt Code of Conduct:
 https://www.ovirt.org/community/about/community-guidelines/
 List Archives:
 https://lists.ovirt.org/archives/list/users@ovirt.org/message/RIADNRZRXTPTRG4XBFUMNWASBWRFCG4V/

>>>
>>
>> --
>> _
>> Fact:
>> 1. Ninjas are mammals.
>> 2. Ninjas fight ALL the time.
>> 3. The purpose of the ninja is to flip out and kill people.
>> ___
>> Users mailing list -- users@ovirt.org
>> To unsubscribe send an email to users-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/WNN3PJFFP4VU5YAPDNYC7WQOTBDXDKPC/
>>
>
>
> --
>
> GREG SHEREMETA
>
> SENIOR SOFTWARE ENGINEER - TEAM LEAD - RHV UX
>
> Red Hat NA
>
> 
>
> gsher...@redhat.com    IRC: gshereme
> 
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 

[ovirt-users] Re: rebooting an ovirt cluster

2019-02-05 Thread feral
Ha! I was worried I was completely losing it, as I've never heard of
4.3... Released two days ago...

On Mon, Feb 4, 2019 at 11:53 PM Sandro Bonazzola 
wrote:

>
>
> On Tue, Feb 5, 2019 at 02:53 Greg Sheremeta <
> gsher...@redhat.com> wrote:
>
>>
>>
>> On Mon, Feb 4, 2019 at 4:15 PM feral  wrote:
>>
>>> I think I found the answer to glusterd not starting.
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1472267
>>>
>>> Apparently the version of gluster (3.12.15) that comes packaged with
>>> ovirt-node 4.2.8 has a known issue where gluster tries to come up before
>>> networking, fails, and crashes. This was fixed in gluster 3.13.0
>>> (apparently).
>>>
>>
> May I suggest upgrading to 4.3.0? It ships Gluster 5, which should include
> the needed fixes.
>
>
>
>> Do devs peruse this list?
>>>
>>
>> Yes :)
>>
>>
>>> Any chance someone who can update the gluster package might read this?
>>>
>>
>> +Sahina might be able to help
>> The developers list is
>> https://lists.ovirt.org/archives/list/de...@ovirt.org/
>>
>>
>>> On Mon, Feb 4, 2019 at 2:38 AM Simone Tiraboschi 
>>> wrote:
>>>


 On Sat, Feb 2, 2019 at 7:32 PM feral  wrote:

> How is an oVirt hyperconverged cluster supposed to come back to life
> after a power outage to all 3 nodes?
>
> Running ovirt-node (ovirt-node-ng-installer-4.2.0-2019013006.el7.iso)
> to get things going, but I've run into multiple issues.
>
> 1. During the gluster setup, the volume sizes I specify are not
> reflected in the deployment configuration. The auto-populated values are
> used every time. I manually hacked on the config to get the volume sizes
> correct. I also noticed if I create the deployment config with "sdb" by
> accident, but click back and change it to "vdb", again, the changes are 
> not
> reflected in the config.
> My deployment config does seem to work. All volumes are created
> (though the xfs options used don't make sense as you end up with stripe
> sizes that aren't a multiple of the block size).
> Once gluster is deployed, I deploy the hosted engine, and everything
> works.
>
> 2. Reboot all nodes. I was testing for power outage response. All
> nodes come up, but glusterd is not running (seems to have failed for some
> reason). I can manually restart glusterd on all nodes and it comes up and
> starts communicating normally. However, the engine does not come online. 
> So
> I figure out where it last lived, and try to start it manually through the
> web interface. This fails because vdsm-ovirtmgmt is not up. I figured out
> the correct way to start up the engine would be through the cli via
> hosted-engine --vm-start.
>

 This is not required at all.
 Are you sure that your cluster is not set in global maintenance mode?
 Can you please share /var/log/ovirt-hosted-engine-ha/agent.log and
 broker.log from your hosts?


> This does work, but it takes a very long time, and it usually starts
> up on any node other than the one I told it to start on.
>
> So I guess two (or three) questions. What is the expected operation
> after a full cluster reboot (ie: in the event of a power failure)? Why
> doesn't the engine start automatically, and what might be causing glusterd
> to fail, when it can be restarted manually and works fine?
>
> --
> _
> Fact:
> 1. Ninjas are mammals.
> 2. Ninjas fight ALL the time.
> 3. The purpose of the ninja is to flip out and kill people.
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/RIADNRZRXTPTRG4XBFUMNWASBWRFCG4V/
>

>>>
>>> --
>>> _
>>> Fact:
>>> 1. Ninjas are mammals.
>>> 2. Ninjas fight ALL the time.
>>> 3. The purpose of the ninja is to flip out and kill people.
>>> ___
>>> Users mailing list -- users@ovirt.org
>>> To unsubscribe send an email to users-le...@ovirt.org
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives:
>>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/WNN3PJFFP4VU5YAPDNYC7WQOTBDXDKPC/
>>>
>>
>>
>> --
>>
>> GREG SHEREMETA
>>
>> SENIOR SOFTWARE ENGINEER - TEAM LEAD - RHV UX
>>
>> Red Hat NA
>>
>> 
>>
>> gsher...@redhat.com    IRC: gshereme
>> 
>>
>
>
> --
>
> SANDRO BONAZZOLA
>
> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
>
> Red Hat EMEA 
>
> sbona...@redhat.com
> 

[ovirt-users] Re: rebooting an ovirt cluster

2019-02-04 Thread Sandro Bonazzola
On Tue, Feb 5, 2019 at 02:53 Greg Sheremeta 
wrote:

>
>
> On Mon, Feb 4, 2019 at 4:15 PM feral  wrote:
>
>> I think I found the answer to glusterd not starting.
>> https://bugzilla.redhat.com/show_bug.cgi?id=1472267
>>
>> Apparently the version of gluster (3.12.15) that comes packaged with
>> ovirt-node 4.2.8 has a known issue where gluster tries to come up before
>> networking, fails, and crashes. This was fixed in gluster 3.13.0
>> (apparently).
>>
>
May I suggest upgrading to 4.3.0? It ships Gluster 5, which should include
the needed fixes.
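A sketch of that upgrade on a plain EL7 host (the release RPM URL follows
oVirt's usual naming and is an assumption here; oVirt Node hosts update
through the node image instead):

yum install https://resources.ovirt.org/pub/yum-repo/ovirt-release43.rpm
yum update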



> Do devs peruse this list?
>>
>
> Yes :)
>
>
>> Any chance someone who can update the gluster package might read this?
>>
>
> +Sahina might be able to help
> The developers list is
> https://lists.ovirt.org/archives/list/de...@ovirt.org/
>
>
>> On Mon, Feb 4, 2019 at 2:38 AM Simone Tiraboschi 
>> wrote:
>>
>>>
>>>
>>> On Sat, Feb 2, 2019 at 7:32 PM feral  wrote:
>>>
 How is an oVirt hyperconverged cluster supposed to come back to life
 after a power outage to all 3 nodes?

 Running ovirt-node (ovirt-node-ng-installer-4.2.0-2019013006.el7.iso)
 to get things going, but I've run into multiple issues.

 1. During the gluster setup, the volume sizes I specify are not
 reflected in the deployment configuration. The auto-populated values are
 used every time. I manually hacked on the config to get the volume sizes
 correct. I also noticed if I create the deployment config with "sdb" by
 accident, but click back and change it to "vdb", again, the changes are not
 reflected in the config.
 My deployment config does seem to work. All volumes are created (though
 the xfs options used don't make sense as you end up with stripe sizes that
 aren't a multiple of the block size).
 Once gluster is deployed, I deploy the hosted engine, and everything
 works.

 2. Reboot all nodes. I was testing for power outage response. All nodes
 come up, but glusterd is not running (seems to have failed for some
 reason). I can manually restart glusterd on all nodes and it comes up and
 starts communicating normally. However, the engine does not come online. So
 I figure out where it last lived, and try to start it manually through the
 web interface. This fails because vdsm-ovirtmgmt is not up. I figured out
 the correct way to start up the engine would be through the cli via
 hosted-engine --vm-start.

>>>
>>> This is not required at all.
>>> Are you sure that your cluster is not set in global maintenance mode?
>>> Can you please share /var/log/ovirt-hosted-engine-ha/agent.log and
>>> broker.log from your hosts?
>>>
>>>
 This does work, but it takes a very long time, and it usually starts up
 on any node other than the one I told it to start on.

 So I guess two (or three) questions. What is the expected operation
 after a full cluster reboot (ie: in the event of a power failure)? Why
 doesn't the engine start automatically, and what might be causing glusterd
 to fail, when it can be restarted manually and works fine?

 --
 _
 Fact:
 1. Ninjas are mammals.
 2. Ninjas fight ALL the time.
 3. The purpose of the ninja is to flip out and kill people.
 ___
 Users mailing list -- users@ovirt.org
 To unsubscribe send an email to users-le...@ovirt.org
 Privacy Statement: https://www.ovirt.org/site/privacy-policy/
 oVirt Code of Conduct:
 https://www.ovirt.org/community/about/community-guidelines/
 List Archives:
 https://lists.ovirt.org/archives/list/users@ovirt.org/message/RIADNRZRXTPTRG4XBFUMNWASBWRFCG4V/

>>>
>>
>> --
>> _
>> Fact:
>> 1. Ninjas are mammals.
>> 2. Ninjas fight ALL the time.
>> 3. The purpose of the ninja is to flip out and kill people.
>> ___
>> Users mailing list -- users@ovirt.org
>> To unsubscribe send an email to users-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/WNN3PJFFP4VU5YAPDNYC7WQOTBDXDKPC/
>>
>
>
> --
>
> GREG SHEREMETA
>
> SENIOR SOFTWARE ENGINEER - TEAM LEAD - RHV UX
>
> Red Hat NA
>
> 
>
> gsher...@redhat.com    IRC: gshereme
> 
>


-- 

SANDRO BONAZZOLA

MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA 

sbona...@redhat.com

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 

[ovirt-users] Re: rebooting an ovirt cluster

2019-02-04 Thread Greg Sheremeta
On Mon, Feb 4, 2019 at 4:15 PM feral  wrote:

> I think I found the answer to glusterd not starting.
> https://bugzilla.redhat.com/show_bug.cgi?id=1472267
>
> Apparently the version of gluster (3.12.15) that comes packaged with
> ovirt-node 4.2.8 has a known issue where gluster tries to come up before
> networking, fails, and crashes. This was fixed in gluster 3.13.0
> (apparently). Do devs peruse this list?
>

Yes :)


> Any chance someone who can update the gluster package might read this?
>

+Sahina might be able to help
The developers list is
https://lists.ovirt.org/archives/list/de...@ovirt.org/


> On Mon, Feb 4, 2019 at 2:38 AM Simone Tiraboschi 
> wrote:
>
>>
>>
>> On Sat, Feb 2, 2019 at 7:32 PM feral  wrote:
>>
>>> How is an oVirt hyperconverged cluster supposed to come back to life
>>> after a power outage to all 3 nodes?
>>>
>>> Running ovirt-node (ovirt-node-ng-installer-4.2.0-2019013006.el7.iso)
>>> to get things going, but I've run into multiple issues.
>>>
>>> 1. During the gluster setup, the volume sizes I specify are not
>>> reflected in the deployment configuration. The auto-populated values are
>>> used every time. I manually hacked on the config to get the volume sizes
>>> correct. I also noticed if I create the deployment config with "sdb" by
>>> accident, but click back and change it to "vdb", again, the changes are not
>>> reflected in the config.
>>> My deployment config does seem to work. All volumes are created (though
>>> the xfs options used don't make sense as you end up with stripe sizes that
>>> aren't a multiple of the block size).
>>> Once gluster is deployed, I deploy the hosted engine, and everything
>>> works.
>>>
>>> 2. Reboot all nodes. I was testing for power outage response. All nodes
>>> come up, but glusterd is not running (seems to have failed for some
>>> reason). I can manually restart glusterd on all nodes and it comes up and
>>> starts communicating normally. However, the engine does not come online. So
>>> I figure out where it last lived, and try to start it manually through the
>>> web interface. This fails because vdsm-ovirtmgmt is not up. I figured out
>>> the correct way to start up the engine would be through the cli via
>>> hosted-engine --vm-start.
>>>
>>
>> This is not required at all.
>> Are you sure that your cluster is not set in global maintenance mode?
>> Can you please share /var/log/ovirt-hosted-engine-ha/agent.log and
>> broker.log from your hosts?
>>
>>
>>> This does work, but it takes a very long time, and it usually starts up
>>> on any node other than the one I told it to start on.
>>>
>>> So I guess two (or three) questions. What is the expected operation
>>> after a full cluster reboot (ie: in the event of a power failure)? Why
>>> doesn't the engine start automatically, and what might be causing glusterd
>>> to fail, when it can be restarted manually and works fine?
>>>
>>> --
>>> _
>>> Fact:
>>> 1. Ninjas are mammals.
>>> 2. Ninjas fight ALL the time.
>>> 3. The purpose of the ninja is to flip out and kill people.
>>> ___
>>> Users mailing list -- users@ovirt.org
>>> To unsubscribe send an email to users-le...@ovirt.org
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives:
>>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/RIADNRZRXTPTRG4XBFUMNWASBWRFCG4V/
>>>
>>
>
> --
> _
> Fact:
> 1. Ninjas are mammals.
> 2. Ninjas fight ALL the time.
> 3. The purpose of the ninja is to flip out and kill people.
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/WNN3PJFFP4VU5YAPDNYC7WQOTBDXDKPC/
>


-- 

GREG SHEREMETA

SENIOR SOFTWARE ENGINEER - TEAM LEAD - RHV UX

Red Hat NA



gsher...@redhat.com    IRC: gshereme

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/UKUZRETYS2TCADBUCDADPTKOLLIGRNOA/


[ovirt-users] Re: rebooting an ovirt cluster

2019-02-04 Thread feral
I think I found the answer to glusterd not starting.
https://bugzilla.redhat.com/show_bug.cgi?id=1472267

Apparently the version of gluster (3.12.15) that comes packaged with
ovirt-node 4.2.8 has a known issue where gluster tries to come up before
networking, fails, and crashes. This was fixed in gluster 3.13.0
(apparently). Do devs peruse this list? Any chance someone who can update
the gluster package might read this?
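In the meantime, a possible workaround sketch based on that bug report -
order glusterd after the network with a drop-in (an assumption on my part,
not something verified against the BZ):

# systemctl edit glusterd.service
[Unit]
Wants=network-online.target
After=network-online.target

# make network-online.target actually wait on an EL7/NetworkManager host
systemctl enable NetworkManager-wait-online.service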

On Mon, Feb 4, 2019 at 2:38 AM Simone Tiraboschi 
wrote:

>
>
> On Sat, Feb 2, 2019 at 7:32 PM feral  wrote:
>
>> How is an oVirt hyperconverged cluster supposed to come back to life
>> after a power outage to all 3 nodes?
>>
>> Running ovirt-node (ovirt-node-ng-installer-4.2.0-2019013006.el7.iso) to
>> get things going, but I've run into multiple issues.
>>
>> 1. During the gluster setup, the volume sizes I specify are not
>> reflected in the deployment configuration. The auto-populated values are
>> used every time. I manually hacked on the config to get the volume sizes
>> correct. I also noticed if I create the deployment config with "sdb" by
>> accident, but click back and change it to "vdb", again, the changes are not
>> reflected in the config.
>> My deployment config does seem to work. All volumes are created (though
>> the xfs options used don't make sense as you end up with stripe sizes that
>> aren't a multiple of the block size).
>> Once gluster is deployed, I deploy the hosted engine, and everything
>> works.
>>
>> 2. Reboot all nodes. I was testing for power outage response. All nodes
>> come up, but glusterd is not running (seems to have failed for some
>> reason). I can manually restart glusterd on all nodes and it comes up and
>> starts communicating normally. However, the engine does not come online. So
>> I figure out where it last lived, and try to start it manually through the
>> web interface. This fails because vdsm-ovirtmgmt is not up. I figured out
>> the correct way to start up the engine would be through the cli via
>> hosted-engine --vm-start.
>>
>
> This is not required at all.
> Are you sure that your cluster is not set in global maintenance mode?
> Can you please share /var/log/ovirt-hosted-engine-ha/agent.log and
> broker.log from your hosts?
>
>
>> This does work, but it takes a very long time, and it usually starts up
>> on any node other than the one I told it to start on.
>>
>> So I guess two (or three) questions. What is the expected operation after
>> a full cluster reboot (ie: in the event of a power failure)? Why doesn't
>> the engine start automatically, and what might be causing glusterd to fail,
>> when it can be restarted manually and works fine?
>>
>> --
>> _
>> Fact:
>> 1. Ninjas are mammals.
>> 2. Ninjas fight ALL the time.
>> 3. The purpose of the ninja is to flip out and kill people.
>> ___
>> Users mailing list -- users@ovirt.org
>> To unsubscribe send an email to users-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/users@ovirt.org/message/RIADNRZRXTPTRG4XBFUMNWASBWRFCG4V/
>>
>

-- 
_
Fact:
1. Ninjas are mammals.
2. Ninjas fight ALL the time.
3. The purpose of the ninja is to flip out and kill people.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WNN3PJFFP4VU5YAPDNYC7WQOTBDXDKPC/


[ovirt-users] Re: rebooting an ovirt cluster

2019-02-04 Thread Simone Tiraboschi
On Sat, Feb 2, 2019 at 7:32 PM feral  wrote:

> How is an oVirt hyperconverged cluster supposed to come back to life after
> a power outage to all 3 nodes?
>
> Running ovirt-node (ovirt-node-ng-installer-4.2.0-2019013006.el7.iso) to
> get things going, but I've run into multiple issues.
>
> 1. During the gluster setup, the volume sizes I specify are not reflected
> in the deployment configuration. The auto-populated values are used every
> time. I manually hacked on the config to get the volume sizes correct. I
> also noticed if I create the deployment config with "sdb" by accident, but
> click back and change it to "vdb", again, the changes are not reflected in
> the config.
> My deployment config does seem to work. All volumes are created (though
> the xfs options used don't make sense as you end up with stripe sizes that
> aren't a multiple of the block size).
> Once gluster is deployed, I deploy the hosted engine, and everything works.
>
> 2. Reboot all nodes. I was testing for power outage response. All nodes
> come up, but glusterd is not running (seems to have failed for some
> reason). I can manually restart glusterd on all nodes and it comes up and
> starts communicating normally. However, the engine does not come online. So
> I figure out where it last lived, and try to start it manually through the
> web interface. This fails because vdsm-ovirtmgmt is not up. I figured out
> the correct way to start up the engine would be through the cli via
> hosted-engine --vm-start.
>

This is not required at all.
Are you sure that your cluster is not set in global maintenance mode?
Can you please share /var/log/ovirt-hosted-engine-ha/agent.log and
broker.log from your hosts?
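For anyone following along, both of those can be checked from any HA host
with the standard hosted-engine CLI:

# per-host agent state; global maintenance is called out in the output
hosted-engine --vm-status

# leave global maintenance so the HA agents start the engine on their own
hosted-engine --set-maintenance --mode=none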


> This does work, but it takes a very long time, and it usually starts up on
> any node other than the one I told it to start on.
>
> So I guess two (or three) questions. What is the expected operation after
> a full cluster reboot (ie: in the event of a power failure)? Why doesn't
> the engine start automatically, and what might be causing glusterd to fail,
> when it can be restarted manually and works fine?
>
> --
> _
> Fact:
> 1. Ninjas are mammals.
> 2. Ninjas fight ALL the time.
> 3. The purpose of the ninja is to flip out and kill people.
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/RIADNRZRXTPTRG4XBFUMNWASBWRFCG4V/
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/P5YRAPW3YFGXXTYRDXCEIFFFXZEMPAKU/


[ovirt-users] Re: rebooting an ovirt cluster

2019-02-03 Thread Strahil
That's a good question... Yet, I'm new in oVirt - so I can't say for sure. I can even be wrong.

Best Regards,
Strahil Nikolov

On Feb 4, 2019 03:45, feral  wrote:

> So why is this the default behavior of the ovirt Node distro?
>
> On Sun, Feb 3, 2019 at 5:16 PM Strahil  wrote:
>
>> 2. Reboot all nodes. I was testing for power outage response. All nodes
>> come up, but glusterd is not running (seems to have failed for some
>> reason). I can manually restart glusterd on all nodes and it comes up and
>> starts communicating normally. However, the engine does not come online.
>> So I figure out where it last lived, and try to start it manually through
>> the web interface. This fails because vdsm-ovirtmgmt is not up. I figured
>> out the correct way to start up the engine would be through the cli via
>> hosted-engine --vm-start. This does work, but it takes a very long time,
>> and it usually starts up on any node other than the one I told it to
>> start on.
>>
>> If you use fstab - prepare for pain... Systemd mounts are more effective.
>> Here is a sample:
>>
>> [root@ovirt1 ~]# systemctl cat gluster_bricks-engine.mount
>> # /etc/systemd/system/gluster_bricks-engine.mount
>> [Unit]
>> Description=Mount glusterfs brick - ENGINE
>> Requires = vdo.service
>> After = vdo.service
>> Before = glusterd.service
>> Conflicts = umount.target
>>
>> [Mount]
>> What=/dev/mapper/gluster_vg_md0-gluster_lv_engine
>> Where=/gluster_bricks/engine
>> Type=xfs
>> Options=inode64,noatime,nodiratime
>>
>> [Install]
>> WantedBy=glusterd.service
>>
>> [root@ovirt1 ~]# systemctl cat glusterd.service
>> # /etc/systemd/system/glusterd.service
>> [Unit]
>> Description=GlusterFS, a clustered file-system server
>> Requires=rpcbind.service gluster_bricks-engine.mount gluster_bricks-data.mount
>> After=network.target rpcbind.service gluster_bricks-engine.mount
>> Before=network-online.target
>>
>> [Service]
>> Type=forking
>> PIDFile=/var/run/glusterd.pid
>> LimitNOFILE=65536
>> Environment="LOG_LEVEL=INFO"
>> EnvironmentFile=-/etc/sysconfig/glusterd
>> ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL
>> KillMode=process
>> SuccessExitStatus=15
>>
>> [Install]
>> WantedBy=multi-user.target
>>
>> # /etc/systemd/system/glusterd.service.d/99-cpu.conf
>> [Service]
>> CPUAccounting=yes
>> Slice=glusterfs.slice
>>
>> Note: Some of the 'After=' and 'Requires=' entries were removed during
>> copy-pasting.
>>
>> So I guess two (or three) questions. What is the expected operation after
>> a full cluster reboot (ie: in the event of a power failure)? Why doesn't
>> the engine start automatically, and what might be causing glusterd to
>> fail, when it can be restarted manually and works fine?
>>
>> Expected - everything to be up and running.
>> Root cause: the system's fstab generator runs after gluster tries to
>> start the bricks - and of course fails.
>> Then everything on the chain fails.
>>
>> Just use systemd's mount entries (I have added automount also) and you
>> won't have such issues.
>>
>> Best Regards,
>> Strahil Nikolov
>
> --
> _
> Fact:
> 1. Ninjas are mammals.
> 2. Ninjas fight ALL the time.
> 3. The purpose of the ninja is to flip out and kill people.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/N5ON5URNMFMAVSMFROTKTPBCKGNDN5ON/


[ovirt-users] Re: rebooting an ovirt cluster

2019-02-03 Thread feral
So why is this the default behavior of the ovirt Node distro?

On Sun, Feb 3, 2019 at 5:16 PM Strahil  wrote:

>
> 2. Reboot all nodes. I was testing for power outage response. All nodes
> come up, but glusterd is not running (seems to have failed for some
> reason). I can manually restart glusterd on all nodes and it comes up and
> starts communicating normally. However, the engine does not come online. So
> I figure out where it last lived, and try to start it manually through the
> web interface. This fails because vdsm-ovirtmgmt is not up. I figured out
> the correct way to start up the engine would be through the cli via
> hosted-engine --vm-start. This does work, but it takes a very long time,
> and it usually starts up on any node other than the one I told it to start
> on.
>
> If you use fstab - prepare for pain... Systemd mounts are more effective.
> Here is a sample:
>
> [root@ovirt1 ~]# systemctl cat gluster_bricks-engine.mount
> # /etc/systemd/system/gluster_bricks-engine.mount
> [Unit]
> Description=Mount glusterfs brick - ENGINE
> Requires = vdo.service
> After = vdo.service
> Before = glusterd.service
> Conflicts = umount.target
>
> [Mount]
> What=/dev/mapper/gluster_vg_md0-gluster_lv_engine
> Where=/gluster_bricks/engine
> Type=xfs
> Options=inode64,noatime,nodiratime
>
> [Install]
> WantedBy=glusterd.service
>
> [root@ovirt1 ~]# systemctl cat glusterd.service
> # /etc/systemd/system/glusterd.service
> [Unit]
> Description=GlusterFS, a clustered file-system server
> Requires=rpcbind.service gluster_bricks-engine.mount
> gluster_bricks-data.mount
> After=network.target rpcbind.service gluster_bricks-engine.mount
> Before=network-online.target
>
> [Service]
> Type=forking
> PIDFile=/var/run/glusterd.pid
> LimitNOFILE=65536
> Environment="LOG_LEVEL=INFO"
> EnvironmentFile=-/etc/sysconfig/glusterd
> ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL
> KillMode=process
> SuccessExitStatus=15
>
> [Install]
> WantedBy=multi-user.target
>
> # /etc/systemd/system/glusterd.service.d/99-cpu.conf
> [Service]
> CPUAccounting=yes
> Slice=glusterfs.slice
>
>
> Note: Some of the 'After=' and 'Requires=' entries were removed during
> copy-pasting.
>
> So I guess two (or three) questions. What is the expected operation after
> a full cluster reboot (ie: in the event of a power failure)? Why doesn't
> the engine start automatically, and what might be causing glusterd to fail,
> when it can be restarted manually and works fine?
>
>
>
> Expected - everything to be up and running.
> Root cause: the system's fstab generator runs after gluster tries to
> start the bricks - and of course fails.
> Then everything on the chain fails.
>
> Just use systemd's mount entries (I have added automount also) and you
> won't have such issues.
>
> Best Regards,
> Strahil Nikolov
>
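For completeness, a sketch of wiring these in after creating the files,
plus the automount counterpart Strahil mentions (unit names assumed from
the sample above):

[root@ovirt1 ~]# systemctl daemon-reload
[root@ovirt1 ~]# systemctl enable gluster_bricks-engine.mount gluster_bricks-data.mount glusterd.service

# /etc/systemd/system/gluster_bricks-engine.automount
[Unit]
Description=Automount glusterfs brick - ENGINE

[Automount]
Where=/gluster_bricks/engine

[Install]
WantedBy=multi-user.target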


-- 
_
Fact:
1. Ninjas are mammals.
2. Ninjas fight ALL the time.
3. The purpose of the ninja is to flip out and kill people.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZHCHWBBC2MB2UZYWIZH7UF2IQZOUKA7Q/


[ovirt-users] Re: rebooting an ovirt cluster

2019-02-03 Thread Strahil
> 2. Reboot all nodes. I was testing for power outage response. All nodes
> come up, but glusterd is not running (seems to have failed for some
> reason). I can manually restart glusterd on all nodes and it comes up and
> starts communicating normally. However, the engine does not come online.
> So I figure out where it last lived, and try to start it manually through
> the web interface. This fails because vdsm-ovirtmgmt is not up. I figured
> out the correct way to start up the engine would be through the cli via
> hosted-engine --vm-start. This does work, but it takes a very long time,
> and it usually starts up on any node other than the one I told it to
> start on.

If you use fstab - prepare for pain... Systemd mounts are more effective.
Here is a sample:

[root@ovirt1 ~]# systemctl cat gluster_bricks-engine.mount
# /etc/systemd/system/gluster_bricks-engine.mount
[Unit]
Description=Mount glusterfs brick - ENGINE
Requires = vdo.service
After = vdo.service
Before = glusterd.service
Conflicts = umount.target

[Mount]
What=/dev/mapper/gluster_vg_md0-gluster_lv_engine
Where=/gluster_bricks/engine
Type=xfs
Options=inode64,noatime,nodiratime

[Install]
WantedBy=glusterd.service

[root@ovirt1 ~]# systemctl cat glusterd.service
# /etc/systemd/system/glusterd.service
[Unit]
Description=GlusterFS, a clustered file-system server
Requires=rpcbind.service gluster_bricks-engine.mount gluster_bricks-data.mount
After=network.target rpcbind.service gluster_bricks-engine.mount
Before=network-online.target

[Service]
Type=forking
PIDFile=/var/run/glusterd.pid
LimitNOFILE=65536
Environment="LOG_LEVEL=INFO"
EnvironmentFile=-/etc/sysconfig/glusterd
ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL
KillMode=process
SuccessExitStatus=15

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/glusterd.service.d/99-cpu.conf
[Service]
CPUAccounting=yes
Slice=glusterfs.slice

Note: Some of the 'After=' and 'Requires=' entries were removed during
copy-pasting.

> So I guess two (or three) questions. What is the expected operation after
> a full cluster reboot (ie: in the event of a power failure)? Why doesn't
> the engine start automatically, and what might be causing glusterd to
> fail, when it can be restarted manually and works fine?

Expected - everything to be up and running.
Root cause: the system's fstab generator runs after gluster tries to
start the bricks - and of course fails.
Then everything on the chain fails.

Just use systemd's mount entries (I have added automount also) and you
won't have such issues.

Best Regards,
Strahil Nikolov
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/W7YCIYIZVQGXJPGMVMYCTRJUVT7YZOSE/