Re: [ClusterLabs] two node cluster: vm starting - shutting down 15min later - starting again 15min later ... and so on

2017-02-10 Thread Ken Gaillot
On 02/10/2017 06:49 AM, Lentes, Bernd wrote:
> 
> 
> - On Feb 10, 2017, at 1:10 AM, Ken Gaillot kgail...@redhat.com wrote:
> 
>> On 02/09/2017 10:48 AM, Lentes, Bernd wrote:
>>> Hi,
>>>
>>> i have a two node cluster with a vm as a resource. Currently i'm just 
>>> testing
>>> and playing. My vm boots and shuts down again in 15min gaps.
>>> Surely this is related to "PEngine Recheck Timer (I_PE_CALC) just popped
>>> (900000ms)" found in the logs. I googled, and it is said that this
>>> is due to time-based rule
>>> (http://oss.clusterlabs.org/pipermail/pacemaker/2009-May/001647.html). OK.
>>> But i don't have any time-based rules.
>>> This is the config for my vm:
>>>
>>> primitive prim_vm_mausdb VirtualDomain \
>>> params config="/var/lib/libvirt/images/xml/mausdb_vm.xml" \
>>> params hypervisor="qemu:///system" \
>>> params migration_transport=ssh \
>>> op start interval=0 timeout=90 \
>>> op stop interval=0 timeout=95 \
>>> op monitor interval=30 timeout=30 \
>>> op migrate_from interval=0 timeout=100 \
>>> op migrate_to interval=0 timeout=120 \
>>> meta allow-migrate=true \
>>> meta target-role=Started \
>>> utilization cpu=2 hv_memory=4099
>>>
>>> The only constraint concerning the vm i had was a location (which i didn't
>>> create).
>>
>> What is the constraint? If its ID starts with "cli-", it was created by
>> a command-line tool (such as crm_resource, crm shell or pcs, generally
>> for a "move" or "ban" command).
>>
> I deleted the one i mentioned, but now i have two again. I didn't create them.
> Does the crm create constraints itself ?
> 
> location cli-ban-prim_vm_mausdb-on-ha-idg-2 prim_vm_mausdb role=Started -inf: 
> ha-idg-2
> location cli-prefer-prim_vm_mausdb prim_vm_mausdb role=Started inf: ha-idg-2

The command-line tool you use creates them.

If you're using crm_resource, they're created by crm_resource
--move/--ban. If you're using pcs, they're created by pcs resource
move/ban. Etc.

> One location constraint inf, one -inf for the same resource on the same node.
> Isn't that senseless ?

Yes, but that's what you told it to do :-)

The command-line tools move or ban resources by setting constraints to
achieve that effect. Those constraints are permanent until you remove them.

How to clear them again depends on which tool you use ... crm_resource
--clear, pcs resource clear, etc.
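For example, a sketch with the crm shell and the resource/node names from this thread (subcommand names vary slightly across crmsh and Pacemaker versions; older releases spell these migrate/unmigrate or -U/--unmove):

```shell
# "move" and "ban" quietly write cli-* location constraints into the CIB:
crm resource move prim_vm_mausdb ha-idg-2    # adds cli-prefer-prim_vm_mausdb
crm resource ban prim_vm_mausdb ha-idg-2     # adds cli-ban-prim_vm_mausdb-on-ha-idg-2

# "unmove" removes those constraints again:
crm resource unmove prim_vm_mausdb

# Equivalents with the low-level tool or pcs:
#   crm_resource --resource prim_vm_mausdb --clear
#   pcs resource clear prim_vm_mausdb
```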

> 
> "crm resource scores" shows -inf for that resource on that node:
> native_color: prim_vm_mausdb allocation score on ha-idg-1: 100
> native_color: prim_vm_mausdb allocation score on ha-idg-2: -INFINITY
> 
> Is -inf stronger ?
> Is it true that only the values for "native_color" are notable ?
> 
> A basic question: when I have trouble starting/stopping/migrating resources,
> does it make sense to do a "crm resource cleanup" before trying again?
> (Besides finding the reason for the trouble.)

It's best to figure out what the problem is first, make sure that's
taken care of, then clean up. The cluster might or might not do anything
when you clean up, depending on what stickiness you have, your failure
handling settings, etc.
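Concretely, the order of operations might look like this (a sketch using the crm shell and the node/resource names from this thread):

```shell
# 1. Find out what actually failed
crm status
grep -i prim_vm_mausdb /var/log/messages | tail -n 20

# 2. Check the accumulated fail count for the resource on a node
crm resource failcount prim_vm_mausdb show ha-idg-1

# 3. Only after the root cause is fixed, wipe the failure history
crm resource cleanup prim_vm_mausdb
```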

> Sorry for asking basic stuff. I read a lot before, but in practice it's totally
> different.
> Although I just have a vm as a resource, and I'm only testing, I'm sometimes
> astonished by the complexity of a simple two node cluster: scores, failcounts,
> constraints, default values for a lot of variables ...
> you have to keep an eye on a lot of stuff.
> 
> Bernd
>  
> 
> Helmholtz Zentrum Muenchen
> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
> Ingolstaedter Landstr. 1
> 85764 Neuherberg
> www.helmholtz-muenchen.de
> Aufsichtsratsvorsitzende: MinDir'in Baerbel Brumme-Bothe
> Geschaeftsfuehrer: Prof. Dr. Guenther Wess, Heinrich Bassler, Dr. Alfons 
> Enhsen
> Registergericht: Amtsgericht Muenchen HRB 6466
> USt-IdNr: DE 129521671

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Trouble setting up selfcompiled Apache in a pacemaker cluster on Oracle Linux 6.8

2017-02-10 Thread Souvignier, Daniel
Hi Ken,

I don't have SELinux enabled... It's disabled by default on all machines
that I have installed here (disabled via kickstart). So that can't be the
problem.
As I already described, I assume it has something to do with the custom
location of Apache, which is installed via self-compiled RPMs, and the
handling of the PID file. But when I start Apache manually, it works and
creates a PID file in the correct place.
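In case it helps, two quick checks for where a self-compiled httpd actually expects its PID file (a sketch, using the paths from this thread):

```shell
# Defaults compiled into the binary (config file and PID log location):
/usr/local/apache2/bin/httpd -V | grep -E 'SERVER_CONFIG_FILE|DEFAULT_PIDLOG'

# Any explicit PidFile override in the config the resource agent is pointed at:
grep -i '^PidFile' /usr/local/apache2/conf/httpd.conf
```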

Regards,
Daniel 

--
Daniel Souvignier

IT Center
Gruppe: Linux-basierte Anwendungen
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel.: +49 241 80-29267
souvign...@itc.rwth-aachen.de
www.itc.rwth-aachen.de


-----Original Message-----
From: Ken Gaillot [mailto:kgail...@redhat.com]
Sent: Friday, February 10, 2017 00:56
To: users@clusterlabs.org
Subject: Re: [ClusterLabs] Trouble setting up selfcompiled Apache in a
pacemaker cluster on Oracle Linux 6.8

On 01/16/2017 10:16 AM, Souvignier, Daniel wrote:
> Hi List,
> 
>  
> 
> I’ve got trouble getting Apache to work in a Pacemaker cluster I set 
> up between two Oracle Linux 6.8 hosts. The cluster itself works just 
> fine, but Apache won’t come up. Thing is here, this Apache is 
> different from a basic setup because it is selfcompiled and therefore 
> living in /usr/local/apache2. Also it is the latest version available 
> (2.4.25), which could also be causing problems. To be able to debug, I 
> went into the file /usr/lib/ocf/resource.d/heartbeat/apache and 
> "verbosed" it by simply adding set -x. This way, I can extract the 
> script's output from the logfile /var/log/cluster/corosync.log, which I 
> appended to this email (hopefully it won't get filtered).
> 
>  
> 
> The command I used to invoke the apache script mentioned above was:
> 
> pcs resource create httpd ocf:heartbeat:apache 
> configfile=/usr/local/apache2/conf/httpd.conf
> httpd=/usr/local/apache2/bin/httpd
> statusurl=http://localhost/server-status
> envfiles=/usr/local/apache2/bin/envvars op monitor interval=60s
> 
>  
> 
> Before you ask: the paths are correct and mod_status is also 
> configured correctly (works fine when starting Apache manually). I 
> should also add that the two nodes which form this cluster are virtual 
> (Vmware vSphere) hosts and living in the same network (so no 
> firewalling between them, there is a dedicated firewall just before 
> the network). I assume that it has something to do with the handling 
> of the pid file, but I couldn’t seem to fix it until now. I pointed 
> Apache to create the pid file in /var/run/httpd.pid, but that didn’t 
> work either. Suggestions on how to solve this? Thanks in advance!

Do you have SELinux enabled? If so, check /var/log/audit/audit.log for
denials.
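A quick sketch of that check (assumes the standard audit userspace tools are installed):

```shell
# Is SELinux actually enforcing?
getenforce

# Recent AVC denials involving httpd, via the audit tools ...
ausearch -m avc -ts recent | grep -i httpd

# ... or straight from the log file:
grep 'avc:  denied' /var/log/audit/audit.log | grep -i httpd
```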

It looks like the output you attached is from the cluster's initial probe
(one-time monitor operation), and not the start operation.

>  
> 
> Kind regards,
> 
> Daniel Souvignier
> 
>  
> 
> P.S.: If you need the parameters I compiled Apache with, I can tell 
> you, but I don’t think that it is relevant here.
> 
>  
> 
> --
> 
> Daniel Souvignier
> 
>  
> 
> IT Center
> 
> Gruppe: Linux-basierte Anwendungen
> 
> Abteilung: Systeme und Betrieb
> 
> RWTH Aachen University
> 
> Seffenter Weg 23
> 
> 52074 Aachen
> 
> Tel.: +49 241 80-29267
> 
> souvign...@itc.rwth-aachen.de 
> 
> www.itc.rwth-aachen.de 


