Re: [CentOS] Ovirt migration

2016-06-22 Thread Digimer
On 23/06/16 01:05 AM, Andrew Dent wrote:
> Hi
> 
> We have a single box running oVirt 3.5.6.2-1 + CentOS 7.
> The Engine, VDSM host and the storage are all on the single box, which
> contains 2 * RAID1 arrays.
> 
> We are looking to purchase a second box, and I'm wondering if someone
> can please help me understand how best to migrate to an HA
> environment using both boxes. We do have some additional NFS storage on
> the network that we can temporarily move VMs to.
> 
> Is it possible to migrate our current oVirt setup to be HA using
> Gluster, or would we need to start again by exporting all VMs, wiping box
> one and reinstalling CentOS & oVirt, then installing CentOS and oVirt on box two?
> 
> I'd like to move the Engine to be hosted in a VM as well.
> Can we do HA without Gluster? What are the pros and cons?
> 
> Regards
> 
> Andrew Dent

See this thread; we just discussed a setup like this. Feel free to ask
here if you have additional questions, though.

https://lists.centos.org/pipermail/centos/2016-June/160085.html

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


[CentOS] Ovirt migration

2016-06-22 Thread Andrew Dent

Hi

We have a single box running oVirt 3.5.6.2-1 + CentOS 7.
The Engine, VDSM host and the storage are all on the single box, which 
contains 2 * RAID1 arrays.


We are looking to purchase a second box, and I'm wondering if someone 
can please help me understand how best to migrate to an HA 
environment using both boxes. We do have some additional NFS storage on 
the network that we can temporarily move VMs to.


Is it possible to migrate our current oVirt setup to be HA using 
Gluster, or would we need to start again by exporting all VMs, wiping box 
one and reinstalling CentOS & oVirt, then installing CentOS and oVirt on box 
two?


I'd like to move the Engine to be hosted in a VM as well.
Can we do HA without Gluster? What are the pros and cons?

Regards

Andrew Dent

___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread Digimer
RHEV is a cloud solution with some HA features. It is not an actual HA
solution.

digimer

On 23/06/16 12:08 AM, Eero Volotinen wrote:
> How about trying commercial RHEV?
> 
> Eero
> 22.6.2016 8.02 AM "Tom Robinson"  wrote:
> 
>> Hi,
>>
>> I have two KVM hosts (CentOS 7) and would like them to operate as High
>> Availability servers,
>> automatically migrating guests when one of the hosts goes down.
>>
>> My question is: Is this even possible? All the documentation for HA that
>> I've found appears to not
>> do this. Am I missing something?
>>
>> My configuration so far includes:
>>
>>  * SAN Storage Volumes for raw device mappings for guest vms (single
>> volume per guest).
>>  * multipathing of iSCSI and Infiniband paths to raw devices
>>  * live migration of guests works
>>  * a cluster configuration (pcs, corosync, pacemaker)
>>
>> Currently when I migrate a guest, I can all too easily start it up on both
>> hosts! There must be some
>> way to fence these off but I'm just not sure how to do this.
>>
>> Any help is appreciated.
>>
>> Kind regards,
>> Tom
>>
>>
>> --
>>
>> Tom Robinson
>> IT Manager/System Administrator
>>
>> MoTeC Pty Ltd
>>
>> 121 Merrindale Drive
>> Croydon South
>> 3136 Victoria
>> Australia
>>
>> T: +61 3 9761 5050
>> F: +61 3 9761 5051
>> E: tom.robin...@motec.com.au
>>
>>
>> ___
>> CentOS mailing list
>> CentOS@centos.org
>> https://lists.centos.org/mailman/listinfo/centos
>>
>>
> ___
> CentOS mailing list
> CentOS@centos.org
> https://lists.centos.org/mailman/listinfo/centos
> 


-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread Eero Volotinen
How about trying commercial RHEV?

Eero
22.6.2016 8.02 AM "Tom Robinson"  wrote:

> Hi,
>
> I have two KVM hosts (CentOS 7) and would like them to operate as High
> Availability servers,
> automatically migrating guests when one of the hosts goes down.
>
> My question is: Is this even possible? All the documentation for HA that
> I've found appears to not
> do this. Am I missing something?
>
> My configuration so far includes:
>
>  * SAN Storage Volumes for raw device mappings for guest vms (single
> volume per guest).
>  * multipathing of iSCSI and Infiniband paths to raw devices
>  * live migration of guests works
>  * a cluster configuration (pcs, corosync, pacemaker)
>
> Currently when I migrate a guest, I can all too easily start it up on both
> hosts! There must be some
> way to fence these off but I'm just not sure how to do this.
>
> Any help is appreciated.
>
> Kind regards,
> Tom
>
>
> --
>
> Tom Robinson
> IT Manager/System Administrator
>
> MoTeC Pty Ltd
>
> 121 Merrindale Drive
> Croydon South
> 3136 Victoria
> Australia
>
> T: +61 3 9761 5050
> F: +61 3 9761 5051
> E: tom.robin...@motec.com.au
>
>
> ___
> CentOS mailing list
> CentOS@centos.org
> https://lists.centos.org/mailman/listinfo/centos
>
>
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] CPU Compatibility Question

2016-06-22 Thread Walter H.

On 23.06.2016 02:52, listmail wrote:
> According to the compatibility chart over here:
> https://access.redhat.com/support/policy/intel
> ...anything later than 6.3 (6.4 and up) should work with the E3-12xx v3
> family of processors. But those are not the results I am seeing.
> 
> Does anyone have experience or commentary on this compatibility issue?

Could it be the chipset that is causing this?

___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


[CentOS] CPU Compatibility Question

2016-06-22 Thread listmail
Hi All,

Hopefully someone with broad overview of CentOS compatibility issues can 
comment on this:

I am evaluating a Supermicro X10SLM motherboard with an Intel E3-1231 v3 
CPU. Testing with boots from Live DVDs, the CentOS 6.x family is panicking 
at boot time. I have tried 6.8, 6.5, and 6.3, and each one panics at 
slightly different points, but they all seem to fail after udev starts up 
(or tries to).

On the other hand, I was able to boot CentOS 7.0 1511 from the Live CD.

According to the compatibility chart over here:
https://access.redhat.com/support/policy/intel
...anything later than 6.3 (6.4 and up) should work with the E3-12xx v3 
family of processors. But those are not the results I am seeing.

Does anyone have experience or commentary on this compatibility issue?

Thanks,
--Bill



___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Install C7 VM on C6 Host

2016-06-22 Thread Mark LaPierre
I had no real reason to doubt.  I was just being lazy.  I figured that,
if anyone knew the correct answer, it would be the people on this list.

Thank you for your gracious forbearance.

On 06/21/16 20:01, Boris Epstein wrote:
> I would think the same as Gordon that as long as your 64-bit VM
> virtualization is running properly there should be no problem running C7 on
> a VM running under C6. May I ask what the initial doubt was based upon? Has
> anybody out there had such an issue before?
> 
> Cheers,
> 
> Boris.
> 
> 
> On Tue, Jun 21, 2016 at 7:30 PM, Gordon Messmer 
> wrote:
> 
>> On 06/21/2016 04:06 PM, Mark LaPierre wrote:
>>
>>> Before I waste myself a bunch of time trying the impossible I figured I
>>> would ask if I can install an instance of C7 in a KVM based VM on a C6
>>> host.
>>>
>>
>>
>> Yes.
>>
>>
>> ___
>> CentOS mailing list
>> CentOS@centos.org
>> https://lists.centos.org/mailman/listinfo/centos
>>
> ___
> CentOS mailing list
> CentOS@centos.org
> https://lists.centos.org/mailman/listinfo/centos
> 


-- 
_
   °v°
  /(_)\
   ^ ^  Mark LaPierre
Registered Linux user No #267004
https://linuxcounter.net/

___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread Digimer
On 22/06/16 02:36 PM, Paul Heinlein wrote:
> On Wed, 22 Jun 2016, Digimer wrote:
> 
>> The nodes are not important, the hosted services are.
> 
> The only time this isn't true is when you're using the node to heat the
> room.
> 
> Otherwise, the service is always the important thing. (The node may
> become as synonymous with the service because there's no redundancy, but
> that's a bug, not a feature.)

"(The node may become as synonymous with the service there's no
redundancy, but that's a bug, not a feature.)"

I am so stealing the hell out of that line. <3

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread Digimer
On 22/06/16 02:34 PM, m.r...@5-cent.us wrote:
> Digimer wrote:
>> On 22/06/16 02:01 PM, Chris Adams wrote:
>>> Once upon a time, John R Pierce  said:
 On 6/22/2016 10:47 AM, Digimer wrote:
> This is called "fabric fencing" and was originally the only supported
> option in the very early days of HA. It has fallen out of favour for
> several reasons, but it does still work fine. The main issue is that
> it leaves the node in an unclean state. If an admin (out of ignorance or
> panic) reconnects the node, all hell can break loose. So generally
> power cycling is much safer.
> 
>>> If the node is just disconnected and left running, and later
>>> reconnected, it can try to write out (now old/incorrect) data to the
>>> storage, corrupting things.
>>>
>>> Speaking of shared storage, another fencing option is SCSI reservations.
>>> It can be terribly finicky, but it can be useful.
>>
>> Close.
>>
>> The cluster software and any hosted services aren't running. It's not
>> that they think they're wrong, they just have no existing state so they
>> won't try to touch anything without first ensuring it is safe to do so.
> 
> Question: when y'all are saying "reconnect", is this different from
> stopping the h/a services, reconnecting to the network, and then starting
> the services (which would let you avoid a reboot)?
> 
>   mark

Expecting a lost node to behave in any predictable manner is not allowed
in HA. In theory, with fabric fencing, that is exactly how you could
recover (stop all HA software, reconnect, start), but even then a reboot
is highly recommended before reconnecting.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread Digimer
On 22/06/16 02:31 PM, John R Pierce wrote:
> On 6/22/2016 11:06 AM, Digimer wrote:
>> I know this goes against the
>> grain of sysadmins to yank power, but in an HA setup, nodes should be
>> disposable and replaceable. The nodes are not important, the hosted
>> services are.
> 
> of course, the really tricky problem is implementing an iSCSI storage
> infrastructure that's fully redundant and has no single point of
> failure.   this requires the redundant storage controllers to have
> shared write-back cache, fully redundant networking, etc.   The
> Fibre Channel SAN folks had all this down pat 20 years ago, but at an
> astronomical price point.
> 
> The more complex this stuff gets, the more points of potential failure
> you introduce.

Or use DRBD. That's what we do for our shared storage backing our VMs
and shared FS. Works like a charm.
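(For anyone curious what that looks like, below is a minimal sketch of a two-node
DRBD resource of the kind used behind such clusters. The resource name, device,
backing disk, hostnames and IPs are invented placeholders; check the syntax
against the DRBD version you actually run.)

# /etc/drbd.d/r0.res -- hypothetical resource backing the VM storage
resource r0 {
    net { protocol C; }               # synchronous replication
    device    /dev/drbd0;
    disk      /dev/vg_node/lv_vms;    # local backing device (placeholder)
    meta-disk internal;
    on node1 { address 10.10.10.1:7788; }
    on node2 { address 10.10.10.2:7788; }
}

# one-time bring-up, run on both nodes; then force the initial sync from one side only:
drbdadm create-md r0
drbdadm up r0
drbdadm primary --force r0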

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread Chris Adams
Once upon a time, John R Pierce  said:
> of course, the really tricky problem is implementing an iSCSI
> storage infrastructure that's fully redundant and has no single point
> of failure.   this requires the redundant storage controllers to
> have shared write-back cache, fully redundant networking, etc.   The
> Fibre Channel SAN folks had all this down pat 20 years ago, but at an
> astronomical price point.

Yep.  I inherited a home-brew iSCSI SAN with two CentOS servers and a
Dell MD3000 SAS storage array.  The servers run as a cluster, but you
don't get the benefits of specialized hardware.

We also have installed a few Dell EqualLogic iSCSI SANs.  They do have
the specialized hardware, like battery-backed caches and special
interconnects between the controllers.  They run active/standby, so they
can do "port mirroring" between controllers (where if port 1 on the
active controller loses link, but port 1 on the standby controller is
still up, the active controller can keep talking through the standby
controller's port).

I like the "build it yourself" approach for lots of things (sometimes
too many :) ), but IMHO you just can't reach the same level of HA and
performance as a dedicated SAN.

-- 
Chris Adams 
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread Paul Heinlein

On Wed, 22 Jun 2016, Digimer wrote:

> The nodes are not important, the hosted services are.

The only time this isn't true is when you're using the node to heat 
the room.

Otherwise, the service is always the important thing. (The node may 
become as synonymous with the service because there's no redundancy, 
but that's a bug, not a feature.)


--
Paul Heinlein
heinl...@madboa.com
45°38' N, 122°6' W
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread m . roth
Digimer wrote:
> On 22/06/16 02:01 PM, Chris Adams wrote:
>> Once upon a time, John R Pierce  said:
>>> On 6/22/2016 10:47 AM, Digimer wrote:
 This is called "fabric fencing" and was originally the only supported
 option in the very early days of HA. It has fallen out of favour for
 several reasons, but it does still work fine. The main issue is that
 it leaves the node in an unclean state. If an admin (out of ignorance or
 panic) reconnects the node, all hell can break loose. So generally
 power cycling is much safer.

>> If the node is just disconnected and left running, and later
>> reconnected, it can try to write out (now old/incorrect) data to the
>> storage, corrupting things.
>>
>> Speaking of shared storage, another fencing option is SCSI reservations.
>> It can be terribly finicky, but it can be useful.
>
> Close.
>
> The cluster software and any hosted services aren't running. It's not
> that they think they're wrong, they just have no existing state so they
> won't try to touch anything without first ensuring it is safe to do so.

Question: when y'all are saying "reconnect", is this different from
stopping the h/a services, reconnecting to the network, and then starting
the services (which would let you avoid a reboot)?

  mark


___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread John R Pierce

On 6/22/2016 11:06 AM, Digimer wrote:

> I know this goes against the
> grain of sysadmins to yank power, but in an HA setup, nodes should be
> disposable and replaceable. The nodes are not important, the hosted
> services are.


of course, the really tricky problem is implementing an iSCSI storage 
infrastructure that's fully redundant and has no single point of 
failure.   this requires the redundant storage controllers to have 
shared write-back cache, fully redundant networking, etc.   The 
Fibre Channel SAN folks had all this down pat 20 years ago, but at an 
astronomical price point.


The more complex this stuff gets, the more points of potential failure 
you introduce.






--
john r pierce, recycling bits in santa cruz

___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread Digimer
On 22/06/16 02:12 PM, Chris Adams wrote:
> Once upon a time, Digimer  said:
>> The cluster software and any hosted services aren't running. It's not
>> that they think they're wrong, they just have no existing state so they
>> won't try to touch anything without first ensuring it is safe to do so.
> 
> Well, I was being short; what I meant was, in HA, if you aren't known to
> be right, you are wrong, and you do nothing.

Ah, yes, exactly right.

>> SCSI reservations, and anything that blocks access is technically OK.
>> However, I stand by the recommendation to power cycle lost nodes. It's
>> by far the safest (and easiest) approach. I know this goes against the
>> grain of sysadmins to yank power, but in an HA setup, nodes should be
>> disposable and replaceable. The nodes are not important, the hosted
>> services are.
> 
> One advantage SCSI reservations have is that if you can access the
> storage, you can lock out everybody else.  It doesn't require access to
> a switch, management card, etc. (that may have its own problems).  If
> you can access the storage, you own it, if you can't, you don't.
> Putting a lock directly on the actual shared resource can be the safest
> path (if you can't access it, you can't screw it up).
> 
> I agree that rebooting a failed node is also good, just pointing out
> that putting the lock directly on the shared resource is also good.

The SCSI reservation protects shared storage only, which is my main
concern. A lot of folks think that fencing is only needed for storage,
when it is needed for all HA'ed services. If you know what you're doing
though, particularly if you combine it with watchdog based fencing like
fence_sanlock, you can be in good shape (if very very slow fencing times
are OK for you).

In the end though, I personally always use IPMI as the primary fence
method with a pair of switched PDUs as my backup method. Brutal, simple,
and highly effective. :P
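(To make that concrete, here is a rough sketch of how that layering looks with
pcs on EL7. Device names, addresses and credentials are invented, and the exact
agent parameters should be checked with `pcs stonith describe <agent>`.)

# primary fence device for node1: its IPMI/iLO/DRAC BMC
pcs stonith create fence_node1_ipmi fence_ipmilan \
    pcmk_host_list="node1" ipaddr="10.20.0.11" login="admin" passwd="secret" lanplus="1"

# backup: the two switched-PDU outlets feeding node1's PSUs
pcs stonith create fence_node1_pdu1 fence_apc_snmp pcmk_host_list="node1" ipaddr="10.20.0.3" port="1"
pcs stonith create fence_node1_pdu2 fence_apc_snmp pcmk_host_list="node1" ipaddr="10.20.0.4" port="1"

# try IPMI first; if that fails, cut power at both outlets
pcs stonith level add 1 node1 fence_node1_ipmi
pcs stonith level add 2 node1 fence_node1_pdu1,fence_node1_pdu2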

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread Chris Adams
Once upon a time, Digimer  said:
> The cluster software and any hosted services aren't running. It's not
> that they think they're wrong, they just have no existing state so they
> won't try to touch anything without first ensuring it is safe to do so.

Well, I was being short; what I meant was, in HA, if you aren't known to
be right, you are wrong, and you do nothing.

> SCSI reservations, and anything that blocks access is technically OK.
> However, I stand by the recommendation to power cycle lost nodes. It's
> by far the safest (and easiest) approach. I know this goes against the
> grain of sysadmins to yank power, but in an HA setup, nodes should be
> disposable and replaceable. The nodes are not important, the hosted
> services are.

One advantage SCSI reservations have is that if you can access the
storage, you can lock out everybody else.  It doesn't require access to
a switch, management card, etc. (that may have its own problems).  If
you can access the storage, you own it, if you can't, you don't.
Putting a lock directly on the actual shared resource can be the safest
path (if you can't access it, you can't screw it up).

I agree that rebooting a failed node is also good, just pointing out
that putting the lock directly on the shared resource is also good.
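(For reference, a sketch of what that looks like in practice; the device path is
a placeholder and the agent options should be verified locally.)

# inspect the SCSI-3 persistent reservation keys currently registered on the shared LUN
sg_persist --in --read-keys /dev/mapper/shared_lun
sg_persist --in --read-reservation /dev/mapper/shared_lun

# as a pacemaker fence device: each node registers a key, and fencing a node
# preempts its key so it can no longer write to the LUN
pcs stonith create fence_storage fence_scsi \
    pcmk_host_list="node1 node2" devices="/dev/mapper/shared_lun" meta provides="unfencing"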

-- 
Chris Adams 
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread Digimer
On 22/06/16 02:01 PM, Chris Adams wrote:
> Once upon a time, John R Pierce  said:
>> On 6/22/2016 10:47 AM, Digimer wrote:
>>> This is called "fabric fencing" and was originally the only supported
>>> option in the very early days of HA. It has fallen out of favour for
>>> several reasons, but it does still work fine. The main issue is that it
>>> leaves the node in an unclean state. If an admin (out of ignorance or
>>> panic) reconnects the node, all hell can break loose. So generally power
>>> cycling is much safer.
>>
>> how is that any different than said ignorant admin powering up the
>> shutdown node ?
> 
> On boot, the cluster software assumes it is "wrong" and doesn't connect
> to any resources until it can verify state.
> 
> If the node is just disconnected and left running, and later
> reconnected, it can try to write out (now old/incorrect) data to the
> storage, corrupting things.
> 
> Speaking of shared storage, another fencing option is SCSI reservations.
> It can be terribly finicky, but it can be useful.

Close.

The cluster software and any hosted services aren't running. It's not
that they think they're wrong, they just have no existing state so they
won't try to touch anything without first ensuring it is safe to do so.

SCSI reservations, and anything that blocks access is technically OK.
However, I stand by the recommendation to power cycle lost nodes. It's
by far the safest (and easiest) approach. I know this goes against the
grain of sysadmins to yank power, but in an HA setup, nodes should be
disposable and replaceable. The nodes are not important, the hosted
services are.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread Chris Adams
Once upon a time, John R Pierce  said:
> On 6/22/2016 10:47 AM, Digimer wrote:
> >This is called "fabric fencing" and was originally the only supported
> >option in the very early days of HA. It has fallen out of favour for
> >several reasons, but it does still work fine. The main issue is that it
> >leaves the node in an unclean state. If an admin (out of ignorance or
> >panic) reconnects the node, all hell can break loose. So generally power
> >cycling is much safer.
> 
> how is that any different than said ignorant admin powering up the
> shutdown node ?

On boot, the cluster software assumes it is "wrong" and doesn't connect
to any resources until it can verify state.

If the node is just disconnected and left running, and later
reconnected, it can try to write out (now old/incorrect) data to the
storage, corrupting things.

Speaking of shared storage, another fencing option is SCSI reservations.
It can be terribly finicky, but it can be useful.
-- 
Chris Adams 
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread John R Pierce

On 6/22/2016 10:47 AM, Digimer wrote:

This is called "fabric fencing" and was originally the only supported
option in the very early days of HA. It has fallen out of favour for
several reasons, but it does still work fine. The main issues is that it
leaves the node in an unclean state. If an admin (out of ignorance or
panic) reconnects the node, all hell can break lose. So generally power
cycling is much safer.


how is that any different than said ignorant admin powering up the 
shutdown node ?



--
john r pierce, recycling bits in santa cruz

___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread Digimer
On 22/06/16 01:38 PM, John R Pierce wrote:
> On 6/21/2016 10:01 PM, Tom Robinson wrote:
>> Currently when I migrate a guest, I can all too easily start it up on
>> both hosts! There must be some
>> way to fence these off but I'm just not sure how to do this.
> 
> in addition to power fencing as described by others, you can also fence
> at the ethernet switch layer, where you disable the switch port(s) that
> the dead host is on.  this of course requires managed switches that your
> cluster management software can talk to.   if you're using dedicated
> networking for iSCSI (often done for high performance), you can just
> disable that port.

This is called "fabric fencing" and was originally the only supported
option in the very early days of HA. It has fallen out of favour for
several reasons, but it does still work fine. The main issue is that it
leaves the node in an unclean state. If an admin (out of ignorance or
panic) reconnects the node, all hell can break loose. So generally power
cycling is much safer.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


[CentOS] Centos 7 boot hanging unless systemd.log_target=console is used

2016-06-22 Thread James Hartig
I have multiple VMs that are hanging on boot. Sometimes they'll boot
fine after 5 mins and other times it'll take over an hour. The problem
seems to be related to journald but I'd like to figure out how I can
get more information.
The VMs are running CentOS 7.1.1503. systemd and journald are both
version 208. We are reluctant to upgrade to the newest CentOS because
we found that journal-gatewayd is broken on boxes that have been
upgraded.

If I add the kernel parameter systemd.log_target=console then they
boot up fine without hanging at all. I can't seem to get into the
debug shell (Ctrl+Alt+F9) while the system is hanging after enabling
debug-shell. Adding systemd.log_level=debug without
systemd.log_target=console seems to make the problem worse and causes
a hang of over an hour. I tried deleting /var/log/journal but that
didn't help and I ran fsck on the root drive with no errors. I also
tried masking systemd-udev-settle but that didn't help either.
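(For reference, the combination usually suggested for this kind of hang is to
keep systemd's debug output out of journald's hands entirely and read it from
the kernel ring buffer afterwards; a sketch, adjust to taste:)

systemctl enable debug-shell.service      # root shell on tty9 during boot (Ctrl+Alt+F9)
# for one boot, append to the kernel line at the grub prompt
# (or via grubby --update-kernel=ALL --args=...):
#   systemd.log_level=debug systemd.log_target=kmsg log_buf_len=4M
# after the slow boot, pull systemd's own messages out of dmesg:
dmesg | grep -i systemd > /root/systemd-boot-debug.txt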

While the system is hanging the last console output is "Welcome to
CentOS Linux 7 (Core)!".

Here's the output from dmesg without "quiet" kernel parameter and
without "systemd.log_level=debug": http://gobin.io/zScc

systemd-analyze blame doesn't seem to help either (top 3):
10.451s nginx.service
 7.149s network.service
 2.178s etcd.service

What else can I do to debug this?

Thanks!
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


[CentOS-virt] Cannot allocate Memory

2016-06-22 Thread Shaun Reitan
Have any of you guys ever seen an issue with Xen 4.4 where xm cannot create a 
guest because of what looks like a problem allocating memory, even though 
xm info shows roughly 5x the amount of free memory needed? We are 
unfortunately still using xm... it's on my list, I know...


We've had this happen on a couple of hosts now.  The only way to resolve it seems 
to be rebooting the host.  I'm going to update the host to the latest Xen 
4.4 now, hoping this is an old bug.



Here's the relevant excerpt from the xend logs:

[2016-06-22 09:13:50 1958] DEBUG (XendDomainInfo:105) 
XendDomainInfo.create(['vm', ['name', 'xxx'], ['memory', 2048], 
['on_xend_start', 'ignore'], ['on_xend_stop', 'ignore'], ['vcpus', 2], 
['oos', 1], ['image', ['linux', ['kernel', 
'/kernels/vmlinux-2.6.18.8-4'], ['videoram', 4], ['args', 
'root=/dev/xvda ro xencons=tty console=tty1 '], ['tsc_mode', 0], 
['nomigrate', 0]]], ['s3_integrity', 1], ['device', ['vbd', ['uname', 
'phy:vg/fs_6818'], ['dev', 'xvda'], ['mode', 'w']]], ['device', ['vbd', 
['uname', 'phy:vg/fs_6819'], ['dev', 'xvdb'], ['mode', 'w']]], 
['device', ['vif', ['rate', '40mb/s'], ['mac', 'FE:FD:48:01:F1:E7')
[2016-06-22 09:13:50 1958] DEBUG (XendDomainInfo:2504) XendDomainInfo.constructDomain
[2016-06-22 09:13:50 1958] DEBUG (balloon:187) Balloon: 7602632 KiB free; need 16384; done.
[2016-06-22 09:13:50 1958] ERROR (XendDomainInfo:2566) (12, 'Cannot allocate memory')
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 2561, in _constructDomain
    target = self.info.target())
Error: (12, 'Cannot allocate memory')
[2016-06-22 09:13:50 1958] ERROR (XendDomainInfo:490) VM start failed
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 475, in start
    XendTask.log_progress(0, 30, self._constructDomain)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendTask.py", line 209, in log_progress
    retval = func(*args, **kwds)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 2572, in _constructDomain
    raise VmError(failmsg)
VmError: Creating domain failed: name=xxx
[2016-06-22 09:13:50 1958] ERROR (XendDomainInfo:110) Domain construction failed
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 108, in create
    vm.start()
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 475, in start
    XendTask.log_progress(0, 30, self._constructDomain)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendTask.py", line 209, in log_progress
    retval = func(*args, **kwds)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 2572, in _constructDomain
    raise VmError(failmsg)
VmError: Creating domain failed: name=xxx
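(For comparison when this happens, a few toolstack queries that show the
hypervisor's view of memory versus the per-domain allocations; xm and xl should
report similar numbers:)

xm info | grep -Ei 'total_memory|free_memory'   # hypervisor's view of free RAM
xl info | grep free_memory
xm list                                         # per-domain allocations, including dom0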

--
Shaun
___
CentOS-virt mailing list
CentOS-virt@centos.org
https://lists.centos.org/mailman/listinfo/centos-virt


Re: [CentOS] KVM HA

2016-06-22 Thread John R Pierce

On 6/21/2016 10:01 PM, Tom Robinson wrote:

> Currently when I migrate a guest, I can all too easily start it up on both 
> hosts! There must be some
> way to fence these off but I'm just not sure how to do this.


in addition to power fencing as described by others, you can also fence 
at the ethernet switch layer, where you disable the switch port(s) that 
the dead host is on.  this of course requires managed switches that your 
cluster management software can talk to.   if you're using dedicated 
networking for ISCSI (often done for high performance), you can just 
disable that port.
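(as a rough sketch of the pacemaker side of that: fence_ifmib drives switch
ports over SNMP. The switch address, community and interface name below are
invented; check `pcs stonith describe fence_ifmib` for the real parameter names.)

pcs stonith create fence_node1_switch fence_ifmib \
    pcmk_host_list="node1" ipaddr="10.0.0.250" community="private" \
    port="Gi1/0/12" snmp_version="2c"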




--
john r pierce, recycling bits in santa cruz

___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS-es] Puerto 4567

2016-06-22 Thread David González Romero
You could also restrict access to it to just your network; that way it will be better protected...
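(A quick way to confirm which process owns the port, and a sketch of limiting it
to the local subnet with firewalld; the subnet below is only an example.)

ss -tlnp | grep :4567        # shows the listening process (Percona/Galera group communication)
firewall-cmd --permanent \
    --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" port port="4567" protocol="tcp" accept'
firewall-cmd --reload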

On 21 June 2016 at 19:43, Eliud Cardenas
 wrote:
> Hi everyone,
>
> Replying to myself: it is the port that Percona MySQL opens for group
> communication.
>
> Nothing to worry about.
>
> Regards!
>
>> Eliud Cardenas 
>> June 21, 2016 at 6:32 PM
>>
>> Hi everyone,
>>
>> I have a question: have any of you ever seen this:
>>
>> 4567/tcp open  tram
>>
>> I have a CentOS 7 box and I see this port open, but I can't see what opens it
>> or why it is there.
>>
>> Regards!
>>
>> ___
>> CentOS-es mailing list
>> CentOS-es@centos.org
>> https://lists.centos.org/mailman/listinfo/centos-es
>>
>
> --
>
>
> ___
> CentOS-es mailing list
> CentOS-es@centos.org
> https://lists.centos.org/mailman/listinfo/centos-es
___
CentOS-es mailing list
CentOS-es@centos.org
https://lists.centos.org/mailman/listinfo/centos-es


[CentOS-virt] [Virt SIG] status of virt7-docker repository?

2016-06-22 Thread Haïkel
Hi,

I'm considering adding the virt7-docker repository as a dependency for the
openstack repo from the Cloud SIG.
It seems that it's not consumable yet and some packages like
Kubernetes are outdated.

Could you update me with the current status and if there is anything
we can do to help?

Regards,
H.
___
CentOS-virt mailing list
CentOS-virt@centos.org
https://lists.centos.org/mailman/listinfo/centos-virt


[CentOS] CentOS-announce Digest, Vol 136, Issue 4

2016-06-22 Thread centos-announce-request
Send CentOS-announce mailing list submissions to
centos-annou...@centos.org

To subscribe or unsubscribe via the World Wide Web, visit
https://lists.centos.org/mailman/listinfo/centos-announce
or, via email, send a message with subject or body 'help' to
centos-announce-requ...@centos.org

You can reach the person managing the list at
centos-announce-ow...@centos.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of CentOS-announce digest..."


Today's Topics:

   1. CEBA-2016:1266  CentOS 6 tzdata BugFix Update (Johnny Hughes)
   2. CEBA-2016:1266  CentOS 7 tzdata BugFix Update (Johnny Hughes)
   3. CESA-2016:1267 Important CentOS 6 setroubleshoot-plugins
  Security Update (Johnny Hughes)
   4. CESA-2016:1267 Important CentOS 6 setroubleshoot  Security
  Update (Johnny Hughes)
   5. CEBA-2016:1266  CentOS 5 tzdata BugFix Update (Johnny Hughes)


--

Message: 1
Date: Tue, 21 Jun 2016 18:43:34 +
From: Johnny Hughes 
To: centos-annou...@centos.org
Subject: [CentOS-announce] CEBA-2016:1266  CentOS 6 tzdata BugFix
Update
Message-ID: <20160621184334.ga56...@n04.lon1.karan.org>
Content-Type: text/plain; charset=us-ascii


CentOS Errata and Bugfix Advisory 2016:1266 

Upstream details at : https://rhn.redhat.com/errata/RHBA-2016-1266.html

The following updated files have been uploaded and are currently 
syncing to the mirrors: ( sha256sum Filename ) 

i386:
27275c88c15db6a83722068e6f998d74ccc00bafbb2a80cb4590b47b6ed9e5a2  
tzdata-2016e-1.el6.noarch.rpm
df6503b270368fa7f3b9147637e423ca6db45485f03072e9d5e273409d782007  
tzdata-java-2016e-1.el6.noarch.rpm

x86_64:
27275c88c15db6a83722068e6f998d74ccc00bafbb2a80cb4590b47b6ed9e5a2  
tzdata-2016e-1.el6.noarch.rpm
df6503b270368fa7f3b9147637e423ca6db45485f03072e9d5e273409d782007  
tzdata-java-2016e-1.el6.noarch.rpm

Source:
65814adef78cc1939847dc32f5ffbce8db34fc58acd8f1a002e002f5917ea638  
tzdata-2016e-1.el6.src.rpm



-- 
Johnny Hughes
CentOS Project { http://www.centos.org/ }
irc: hughesjr, #cen...@irc.freenode.net
Twitter: @JohnnyCentOS



--

Message: 2
Date: Tue, 21 Jun 2016 19:05:22 +
From: Johnny Hughes 
To: centos-annou...@centos.org
Subject: [CentOS-announce] CEBA-2016:1266  CentOS 7 tzdata BugFix
Update
Message-ID: <20160621190522.ga59...@n04.lon1.karan.org>
Content-Type: text/plain; charset=us-ascii


CentOS Errata and Bugfix Advisory 2016:1266 

Upstream details at : https://rhn.redhat.com/errata/RHBA-2016-1266.html

The following updated files have been uploaded and are currently 
syncing to the mirrors: ( sha256sum Filename ) 

x86_64:
35e627912852d34e84ea76e5cbcadf233c7945185320fc0a8b2fa7e5e4ee2099  
tzdata-2016e-1.el7.noarch.rpm
688596a9be955e0f481845db5b41da2232ef701d56967971e2fe382be32fdad4  
tzdata-java-2016e-1.el7.noarch.rpm

Source:
32bbb097ae98767d8b45e3b38f03ae1c205eecaef6d1bcd58912fd6f5443a5fb  
tzdata-2016e-1.el7.src.rpm



-- 
Johnny Hughes
CentOS Project { http://www.centos.org/ }
irc: hughesjr, #cen...@irc.freenode.net
Twitter: @JohnnyCentOS



--

Message: 3
Date: Tue, 21 Jun 2016 19:07:23 +
From: Johnny Hughes 
To: centos-annou...@centos.org
Subject: [CentOS-announce] CESA-2016:1267 Important CentOS 6
setroubleshoot-plugins Security Update
Message-ID: <20160621190723.ga60...@n04.lon1.karan.org>
Content-Type: text/plain; charset=us-ascii


CentOS Errata and Security Advisory 2016:1267 Important

Upstream details at : https://rhn.redhat.com/errata/RHSA-2016-1267.html

The following updated files have been uploaded and are currently 
syncing to the mirrors: ( sha256sum Filename ) 

i386:
3b8bbdeaf83bf2ae720395dd7ea13c3946f439e85bc60731f54414d85beb8a90  
setroubleshoot-plugins-3.0.40-3.1.el6_8.noarch.rpm

x86_64:
3b8bbdeaf83bf2ae720395dd7ea13c3946f439e85bc60731f54414d85beb8a90  
setroubleshoot-plugins-3.0.40-3.1.el6_8.noarch.rpm

Source:
524e4dac899bb14f96404df56f34475e037b4c9c770db40b4e039a05b5b1804f  
setroubleshoot-plugins-3.0.40-3.1.el6_8.src.rpm



-- 
Johnny Hughes
CentOS Project { http://www.centos.org/ }
irc: hughesjr, #cen...@irc.freenode.net
Twitter: @JohnnyCentOS



--

Message: 4
Date: Tue, 21 Jun 2016 19:07:49 +
From: Johnny Hughes 
To: centos-annou...@centos.org
Subject: [CentOS-announce] CESA-2016:1267 Important CentOS 6
setroubleshoot  Security Update
Message-ID: <20160621190749.ga60...@n04.lon1.karan.org>
Content-Type: text/plain; charset=us-ascii


CentOS Errata and Security Advisory 2016:1267 Important

Upstream details at : https://rhn.redhat.com/errata/RHSA-2016-1267.html

The following updated files have been uploaded and are currently 
syncing to the mirrors: ( sha256sum Filename ) 

i386:
6162d2040eee1d468be25455dff5505b881b0e843848a0d770b47f8f7b6de9fe  

Re: [CentOS-virt] PCI Passthrough not working

2016-06-22 Thread Francis Greaves
More information... 
I have pcifront showing as a module in the DomU and the usb shows in dmesg as: 
[ 3.167543] usbcore: registered new interface driver usbfs 
[ 3.167563] usbcore: registered new interface driver hub 
[ 3.167585] usbcore: registered new device driver usb 
[ 3.196056] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002 
[ 3.196060] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1 
[ 3.196064] usb usb1: Product: EHCI Host Controller 
[ 3.196068] usb usb1: Manufacturer: Linux 3.2.0-4-686-pae ehci_hcd 
[ 3.196071] usb usb1: SerialNumber: :00:00.0 
[ 3.508036] usb 1-1: new high-speed USB device number 2 using ehci_hcd 
[ 19.064072] usb 1-1: device not accepting address 2, error -110 
[ 19.176070] usb 1-1: new high-speed USB device number 3 using ehci_hcd 
[ 34.732067] usb 1-1: device not accepting address 3, error -110 
[ 34.844082] usb 1-1: new high-speed USB device number 4 using ehci_hcd 
[ 45.280073] usb 1-1: device not accepting address 4, error -110 
[ 45.392067] usb 1-1: new high-speed USB device number 5 using ehci_hcd 
[ 55.824112] usb 1-1: device not accepting address 5, error -110 

I am looking at xl dmesg in Dom0, where there are some messages relating to the 
USB PCI device: 
[VT-D] It's disallowed to assign :00:1a.0 with shared RMRR at 7b80 for 
Dom6. 
(XEN) XEN_DOMCTL_assign_device: assign :00:1a.0 to dom6 failed (-1) 
(XEN) [VT-D] It's risky to assign :00:1a.0 with shared RMRR at 7b80 for 
Dom7. 
(XEN) [VT-D] It's risky to assign :00:1a.0 with shared RMRR at 7b80 for 
Dom8. 
(XEN) [VT-D] It's risky to assign :00:1a.0 with shared RMRR at 7b80 for 
Dom9. 
(XEN) [VT-D] It's risky to assign :00:1a.0 with shared RMRR at 7b80 for 
Dom10. 

As an aside, on the console I just get blocks on the screen after the scrubbing 
message, and no text. I see there is a message: 
(XEN) Xen is relinquishing VGA console. 
How can I prevent this? Is there something wrong with my /etc/default/grub? 

... 
GRUB_CMDLINE_LINUX="crashkernel=auto intremap=no_x2apic_optout" 
GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=13312M,max:14336M dom0_max_vcpus=6 
dom0_vcpus_pin" 
GRUB_GFXMODE=1024x768 
GRUB_GFXPAYLOAD_LINUX=keep 
GRUB_CMDLINE_LINUX_XEN_REPLACE_DEFAULT="console=hvc0 earlyprintk=xen" 

Many thanks 
Francis 

From: "Francis Greaves"  
To: "centos-virt"  
Sent: Wednesday, 22 June, 2016 09:56:44 
Subject: [CentOS-virt] PCI Passthrough not working 

Further to my messages back in May I have at last got round to trying to get my 
DomU to recognise USB devices. 

I am using Xen 4.6 with CentOS kernel 3.18.34-20.el7.x86_64. 
I have to manually make the port available before creating the DomU by issuing 
the command: 
xl pci-assignable-add 00:1a.0 
otherwise nothing shows in: 
xl pci-assignable-list 

I have added this to my .cfg file as per the May Emails: 
pci=['00:1a.0,rdm_policy=relaxed'] 

and in the DomU, which starts fine, I get the following information: 
lspci 
00:00.0 USB controller: Intel Corporation Wellsburg USB Enhanced Host 
Controller #2 (rev 05) 
lsusb 
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub 

However, when I plug in a device to this USB port (I am sure it is the correct 
port in the Dom0) I see nothing in the DomU at all. 

What can I do now? 
Many thanks 
Francis 

___ 
CentOS-virt mailing list 
CentOS-virt@centos.org 
https://lists.centos.org/mailman/listinfo/centos-virt 
___
CentOS-virt mailing list
CentOS-virt@centos.org
https://lists.centos.org/mailman/listinfo/centos-virt


[CentOS-virt] PCI Passthrough not working

2016-06-22 Thread Francis Greaves
Further to my messages back in May I have at last got round to trying to get my 
DomU to recognise USB devices. 

I am using Xen 4.6 with CentOS kernel 3.18.34-20.el7.x86_64. 
I have to manually make the port available before creating the DomU by issuing 
the command: 
xl pci-assignable-add 00:1a.0 
otherwise nothing shows in: 
xl pci-assignable-list 
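(For what it's worth, one way this is commonly made persistent so the manual
pci-assignable-add isn't needed; a sketch that assumes a pvops Dom0 kernel with
the xen-pciback driver, so double-check the module/parameter names against your
kernel:)

# bind the device to xen-pciback at Dom0 boot by appending to GRUB_CMDLINE_LINUX
# in /etc/default/grub:
#   xen-pciback.hide=(0000:00:1a.0)
grub2-mkconfig -o /boot/grub2/grub.cfg
# (if xen-pciback is built as a module rather than built in, the same hide=
#  value goes in /etc/modprobe.d/ and the module has to be loaded early)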

I have added this to my .cfg file as per the May Emails: 
pci=['00:1a.0,rdm_policy=relaxed'] 

and in the DomU, which starts fine, I get the following information: 
lspci 
00:00.0 USB controller: Intel Corporation Wellsburg USB Enhanced Host 
Controller #2 (rev 05) 
lsusb 
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub 

However, when I plug in a device to this USB port (I am sure it is the correct 
port in the Dom0) I see nothing in the DomU at all. 

What can I do now? 
Many thanks 
Francis 
___
CentOS-virt mailing list
CentOS-virt@centos.org
https://lists.centos.org/mailman/listinfo/centos-virt


Re: [CentOS] KVM HA

2016-06-22 Thread Barak Korren
On 22 June 2016 at 09:03, Indunil Jayasooriya  wrote:
>
> When an UNCLEAN SHUTDOWN happens or ifdown eth0 on node1, can oVirt
> migrate VMs from node1 to node2?

Yep.

> in that case, Is power management such as ILO needed?

It needs a way to ensure the host is down to prevent storage
corruption, so yeah.

If you have some other way to determine that all the VMs on the failed
host are down, you can use the API/CLI to tell it to consider the
host as if it had already been fenced out and bring the VMs back up on
the 2nd host.

-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Any further developments on CentOS7 for i386?

2016-06-22 Thread Zdenek Sedlak
On 2015-10-07 15:21, Johnny Hughes wrote:
> On 10/06/2015 05:30 PM, Kay Schenk wrote:
>> Well I haven't tested out the CentOS 7 for i386 yet as sent in the
>> message of 06/02--
>>
>> https://lists.centos.org/pipermail/centos-devel/2015-June/013426.html
>>
>> Nor have I seen any additional information. So how is this going?
>> I'm almost ready to jump in as I would really prefer to be on  Gnome 3.
>>
> 
> We have moved it into place here, which is where it is going to live
> permanently:
> 
> http://mirror.centos.org/altarch/7/os/i386/
> 
> I am working on a wiki page now and we are still doing some testing, but
> the 32 bit arch should be completely usable right now and installable
> (in its final form) from these isos:
> 
> http://mirror.centos.org/altarch/7/isos/i386/
> 
> The 2 bugs listed in the link above are still there:
> 
> 1.  If installing on a QEMU (kvm) i386 VM, you must modify the VM cpu to
> use "copy host cpu"
> 
> http://bugs.centos.org/view.php?id=8748
> 
> 2.  The gnome desktop will not exit or log out from the menu.
> 
>  http://bugs.centos.org/view.php?id=8834
> 
> Both have workarounds listed.
> 
> We should have a release announcement fairly soon and hopefully EPEL
> will start building 32 bit packages soon(ish) as well.
> 
> Thanks,
> Johnny Hughes
> 
> 
> 
> ___
> CentOS mailing list
> CentOS@centos.org
> https://lists.centos.org/mailman/listinfo/centos
> 

Hi Johnny,

> We have moved it into place here, which is where it is going to live
> permanently:
>
> http://mirror.centos.org/altarch/7/os/i386/

Are there any mirrors (with rsync support) available in the world?

I'd like to have this repo inside the LAN to avoid unnecessary traffic
and I don't want to overload the main mirror...
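(For what it's worth, once you find a public mirror that carries altarch and
allows rsync, the usual approach is a cron'd rsync along these lines; the mirror
hostname, module path and local path below are placeholders:)

rsync -avzH --delete \
    rsync://SOME-PUBLIC-MIRROR/centos-altarch/7/os/i386/ \
    /var/www/html/centos/altarch/7/os/i386/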


//Zdenek
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread Digimer
On 22/06/16 02:03 AM, Indunil Jayasooriya wrote:
> On Wed, Jun 22, 2016 at 11:08 AM, Barak Korren  wrote:
> 
>>>
>>> My question is: Is this even possible? All the documentation for HA that
>> I've found appears to not
>>> do this. Am I missing something?
>>
>> You can use oVirt for that (www.ovirt.org).
>>
> 
> When an UNCLEAN SHUTDOWN happens or ifdown eth0 on node1, can oVirt
> migrate VMs from node1 to node2?
> 
> in that case, Is power management such as ILO needed?

I can't speak to ovirt (it's more of a cloud platform than an HA one),
but in HA in general, this is how it works...

Say node1 is hosting vm-A. Node1 stops responding for some reason (maybe
it's hung, maybe it's running but has lost its network, maybe it's a pile of flaming
rubble, you don't know). Within a moment, the other cluster node(s) will
declare it lost and initiate fencing.

Typically "fencing" means "shut the target off over IPMI (iRMC, iLO,
DRAC, RSA, etc). However, let's assume that the node lost all power
(we've seen this with voltage regulators failing on the mainboard,
shorted cable harnesses, etc). In that case, the IPMI BMC will fail as
well so this method of fencing will fail.

The cluster can't assume that "no response from fence device A == dead
node". All you know is that you still don't know what state the peer is
in. To make an assumption and boot vm-A now would be potentially
disastrous. So instead, the cluster blocks and retries the fencing
indefinitely, leaving things hung. The logic being that, as bad as it is
to hang, it is better than risking a split-brain/corruption.

What we do to mitigate this, and pacemaker supports this just fine, is
add a second layer for fencing. We do this with a pair of switched PDUs.
So say that node1's first PSU is plugged into PDU 1, Outlet 1 and its
second PSU is plugged into PDU 2, Outlet 1. Now, instead of blocking
after IPMI fails, it instead moves on and turns off the power to those
two outlets. Being that the PDUs are totally external, they should be
up. So in this case, now we can say "yes, node1 is gone" and safely boot
vm-A on node2.

Make sense?

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread Digimer
On 22/06/16 02:10 AM, Tom Robinson wrote:
> Hi Digimer,
> 
> Thanks for your reply.
> 
> On 22/06/16 15:20, Digimer wrote:
>> On 22/06/16 01:01 AM, Tom Robinson wrote:
>>> Hi,
>>>
>>> I have two KVM hosts (CentOS 7) and would like them to operate as High 
>>> Availability servers,
>>> automatically migrating guests when one of the hosts goes down.
>>>
>>> My question is: Is this even possible? All the documentation for HA that 
>>> I've found appears to not
>>> do this. Am I missing something?
>>
>> Very possible. It's all I've done for years now.
>>
>> https://alteeve.ca/w/AN!Cluster_Tutorial_2
>>
>> That's for EL 6, but the basic concepts port perfectly. In EL7, just
>> change out cman + rgmanager for pacemaker. The commands change, but the
>> concepts don't. Also, we use DRBD but you can conceptually swap that for
>> "SAN" and the logic is the same (though I would argue that a SAN is less
>> reliable).
> 
> In what way is the SAN method less reliable? Am I going to get into a world 
> of trouble going that way?

In the HA space, there should be no single point of failure. A SAN, for
all of it's redundancies, is a single thing. Google for tales of bad SAN
firmware upgrades to get an idea of what I am talking about.

We've found that using DRBD and building clusters in pairs is a far more
resilient design. First, you don't put all your eggs in one basket, as it
were. So if you have multiple failures and lose a cluster, you lose one
pair and the servers it was hosting. Very bad, but less so than losing
the storage for all your systems.

Consider this case that happened a couple of years ago;

We had a client, through no malicious intent but a misunderstanding of how
"hot swap" worked, walk up to a machine and start ejecting drives. We
got in touch and stopped him in very short order, but the damage was
done and the node's array was hosed. Certainly not a failure scenario we
had ever considered.

DRBD (which is sort of like "RAID 1 over a network") simply marked the
local storage as Diskless and routed all reads/writes to the good peer.
The hosted VM servers (and the software underpinning them) kept working
just fine. We lost the ability to live migrate because we couldn't read
from the local disk anymore, but the client continued to operate for
about an hour until we could schedule a controlled reboot to move the
servers without interrupting production.

In short, to do HA right, you have to be able to look at every piece of
your stack and say "what happens if this goes away?" and design it so
that the answer is "we'll recover".

For clients who need the performance of SANs (go big enough and the
caching and whatnot of a SAN is superior), we then recommend two SANs
and connect each one to a node and then treat them from there up as DAS.

>> There is an active mailing list for HA clustering, too:
>>
>> http://clusterlabs.org/mailman/listinfo/users
> I've had a brief look at the web-site. Lots of good info there. Thanks!

Clusterlabs is now the umbrella for a collection of different open
source HA projects, so it will continue to grow as time goes on.

>>> My configuration so far includes:
>>>
>>>  * SAN Storage Volumes for raw device mappings for guest vms (single volume 
>>> per guest).
>>>  * multipathing of iSCSI and Infiniband paths to raw devices
>>>  * live migration of guests works
>>>  * a cluster configuration (pcs, corosync, pacemaker)
>>>
>>> Currently when I migrate a guest, I can all too easily start it up on both 
>>> hosts! There must be some
>>> way to fence these off but I'm just not sure how to do this.
>>
>> Fencing, exactly.
>>
>> What we do is create a small /shared/definitions (on gfs2) to host the
>> VM XML definitions and then undefine the VMs from the nodes. This makes
>> the servers disappear on non-cluster aware tools, like
>> virsh/virt-manager. Pacemaker can still start the servers just fine and
>> pacemaker, with fencing, makes sure that the server is only ever running
>> on one node at a time.
> 
> That sounds simple enough :-P. Although, I wanted to be able to easily open 
> VM Consoles which I do
> currently through virt-manager. I also use virsh for all kinds of ad-hoc 
> management. Is there an
> easy way to still have my cake and eat it? We also have a number of Windows 
> VM's. Remote desktop is
> great but sometimes you just have to have a console.

We use virt-manager, too. It's just fine. Virsh also works just fine.
The only real difference is that once the server shuts off, it
"vanishes" from those tools. I would say about 75%+ of our clients run
some flavour of windows on our systems and both access and performance
is just fine.
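(To make the /shared/definitions approach above concrete: the pacemaker side is
roughly one VirtualDomain resource per guest pointing at the XML on the shared
mount. A sketch with invented names; `pcs resource describe VirtualDomain` lists
the full parameter set.)

pcs resource create vm-A VirtualDomain \
    hypervisor="qemu:///system" \
    config="/shared/definitions/vm-A.xml" \
    migration_transport="ssh" \
    meta allow-migrate="true" \
    op monitor interval="30s"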

>> We also have an active freenode IRC channel; #clusterlabs. Stop on by
>> and say hello. :)
> 
> Will do. I have a bit of reading now to catch up but I'm sure I'll have a few 
> more questions before
> long.
> 
> Kind regards,
> Tom

Happy to help. If you stop by and don't get a reply, please idle. Folks
there span all timezones but are 

Re: [CentOS] KVM HA

2016-06-22 Thread Tom Robinson
Hi Digimer,

Thanks for your reply.

On 22/06/16 15:20, Digimer wrote:
> On 22/06/16 01:01 AM, Tom Robinson wrote:
>> Hi,
>>
>> I have two KVM hosts (CentOS 7) and would like them to operate as High 
>> Availability servers,
>> automatically migrating guests when one of the hosts goes down.
>>
>> My question is: Is this even possible? All the documentation for HA that 
>> I've found appears to not
>> do this. Am I missing something?
> 
> Very possible. It's all I've done for years now.
> 
> https://alteeve.ca/w/AN!Cluster_Tutorial_2
> 
> That's for EL 6, but the basic concepts port perfectly. In EL7, just
> change out cman + rgmanager for pacemaker. The commands change, but the
> concepts don't. Also, we use DRBD but you can conceptually swap that for
> "SAN" and the logic is the same (though I would argue that a SAN is less
> reliable).

In what way is the SAN method less reliable? Am I going to get into a world of 
trouble going that way?

> 
> There is an active mailing list for HA clustering, too:
> 
> http://clusterlabs.org/mailman/listinfo/users
I've had a brief look at the web-site. Lots of good info there. Thanks!

> 
>> My configuration so far includes:
>>
>>  * SAN Storage Volumes for raw device mappings for guest vms (single volume 
>> per guest).
>>  * multipathing of iSCSI and Infiniband paths to raw devices
>>  * live migration of guests works
>>  * a cluster configuration (pcs, corosync, pacemaker)
>>
>> Currently when I migrate a guest, I can all too easily start it up on both 
>> hosts! There must be some
>> way to fence these off but I'm just not sure how to do this.
> 
> Fencing, exactly.
> 
> What we do is create a small /shared/definitions (on gfs2) to host the
> VM XML definitions and then undefine the VMs from the nodes. This makes
> the servers disappear on non-cluster aware tools, like
> virsh/virt-manager. Pacemaker can still start the servers just fine and
> pacemaker, with fencing, makes sure that the server is only ever running
> on one node at a time.

That sounds simple enough :-P. Although, I wanted to be able to easily open VM 
Consoles which I do
currently through virt-manager. I also use virsh for all kinds of ad-hoc 
management. Is there an
easy way to still have my cake and eat it? We also have a number of Windows 
VM's. Remote desktop is
great but sometimes you just have to have a console.

> We also have an active freenode IRC channel; #clusterlabs. Stop on by
> and say hello. :)

Will do. I have a bit of reading now to catch up but I'm sure I'll have a few 
more questions before
long.

Kind regards,
Tom



___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] KVM HA

2016-06-22 Thread Indunil Jayasooriya
On Wed, Jun 22, 2016 at 11:08 AM, Barak Korren  wrote:

> >
> > My question is: Is this even possible? All the documentation for HA that
> I've found appears to not
> > do this. Am I missing something?
>
> You can use oVirt for that (www.ovirt.org).
>

When an UNCLEAN SHUTDOWN happens or ifdown eth0 on node1, can oVirt
migrate VMs from node1 to node2?

in that case, Is power management such as ILO needed?





-- 
cat /etc/motd

Thank you
Indunil Jayasooriya
http://www.theravadanet.net/
http://www.siyabas.lk/sinhala_how_to_install.html   -  Download Sinhala
Fonts
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos