Re: [ClusterLabs] Help needed getting DRBD cluster working

2015-10-06 Thread Gordon Ross
On 5 Oct 2015, at 15:05, Ken Gaillot  wrote:
> 
> The "rc=6" in the failed actions means the resource's Pacemaker
> configuration is invalid. (For OCF return codes, see
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#s-ocf-return-codes
> )
> 
> The "_monitor_0" means that this was the initial probe that Pacemaker
> does before trying to start the resource, to make sure it's not already
> running. As an aside, you probably want to add recurring monitors as
> well, otherwise Pacemaker won't notice if the resource fails. For
> example: op monitor interval="29s" role="Master" op monitor
> interval="31s" role="Slave"
> 
> As to why the probe is failing, it's hard to tell. Double-check your
> configuration to make sure disc0 is the exact DRBD name, Pacemaker can
> read the DRBD configuration file, etc. You can also try running the DRBD
> resource agent's "status" command manually to see if it prints a more
> detailed error message.
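
For reference, a minimal sketch of running the agent by hand (this assumes the
standard OCF layout under /usr/lib/ocf; adjust the path and resource name to
your setup, run it as root, and note the agent may also expect some
OCF_RESKEY_CRM_meta_* variables):

    OCF_ROOT=/usr/lib/ocf OCF_RESKEY_drbd_resource=disc0 \
        /usr/lib/ocf/resource.d/linbit/drbd monitor
    echo "rc=$?"

Comparing the printed rc against the OCF return codes linked above usually
narrows down why a probe fails.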

I cleared the CIB and re-created most of it with your suggested parameters. It
now looks like:

node $id="739377522" ct1
node $id="739377523" ct2
node $id="739377524" ct3 \
attributes standby="on"
primitive drbd_disc0 ocf:linbit:drbd \
    params drbd_resource="disc0" \
    meta target-role="Started" \
    op monitor interval="19s" on-fail="restart" role="Master" start-delay="10s" timeout="20s" \
    op monitor interval="20s" on-fail="restart" role="Slave" start-delay="10s" timeout="20s"
ms ms_drbd0 drbd_disc0 \
    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
location cli-prefer-drbd_disc0 ms_drbd0 inf: ct2
location cli-prefer-ms_drbd0 ms_drbd0 inf: ct2
property $id="cib-bootstrap-options" \
dc-version="1.1.10-42f2063" \
cluster-infrastructure="corosync" \
stonith-enabled="false" \
no-quorum-policy="stop" \
symmetric-cluster="false"


I think I’m missing something basic in the DRBD/Pacemaker hook-up.

As soon as Pacemaker/Corosync start, DRBD on both nodes stops. A “cat
/proc/drbd” then just returns:

version: 8.4.3 (api:1/proto:86-101)
srcversion: 6551AD2C98F533733BE558C 

with no details on the replicated disc, and the DRBD block device disappears.

GTG
-- 
Gordon Ross,


Re: [ClusterLabs] Antw: crm_report consumes all available RAM

2015-10-06 Thread Jan Pokorný
On 06/10/15 10:28 +0200, Dejan Muhamedagic wrote:
> On Mon, Oct 05, 2015 at 07:00:18PM +0300, Vladislav Bogdanov wrote:
>> 14.09.2015 02:31, Andrew Beekhof wrote:
>>> 
 On 8 Sep 2015, at 10:18 pm, Ulrich Windl  wrote:
 
>>> Vladislav Bogdanov  wrote on 08.09.2015 at 14:05 in message <55eecefb.8050...@hoster-ok.com>:
> Hi,
> 
> just discovered very interesting issue.
> If there is a system user with very big UID (8002 in my case),
> then crm_report (actually 'grep' it runs) consumes too much RAM.
> 
> Relevant part of the process tree at that moment looks like (word-wrap off):
> USER       PID %CPU %MEM     VSZ     RSS TTY STAT START   TIME COMMAND
> ...
> root     25526  0.0  0.0  106364     636 ?   S    12:37   0:00  \_ /bin/sh /usr/sbin/crm_report --dest=/var/log/crm_report -f -01-01 00:00:00
> root     25585  0.0  0.0  106364     636 ?   S    12:37   0:00    \_ bash /var/log/crm_report/collector
> root     25613  0.0  0.0  106364     152 ?   S    12:37   0:00      \_ bash /var/log/crm_report/collector
> root     25614  0.0  0.0  106364     692 ?   S    12:37   0:00        \_ bash /var/log/crm_report/collector
> root     27965  4.9  0.0  100936     452 ?   S    12:38   0:01        |   \_ cat /var/log/lastlog
> root     27966 23.0 82.9 3248996 1594688 ?   D    12:38   0:08        |   \_ grep -l -e Starting Pacemaker
> root     25615  0.0  0.0  155432     600 ?   S    12:37   0:00        \_ sort -u
> 
> ls -ls /var/log/lastlog shows:
> 40 -rw-r--r--. 1 root root 2336876 Sep  8 04:36 /var/log/lastlog
> 
> That is a sparse binary file, which consumes only 40k of disk space.
> At the same time its size is 23GB, and grep takes all the RAM trying to
> grep a string out of 23GB of mostly zeroes without newlines.
> 
> I believe this is worth fixing,
>>> 
>>> Shouldn’t this be directed to the grep folks?
>> 
>> Actually, not everything in /var/log is a textual log. Currently
>> findmsg() [z,bz,xz]cats _every_ file there and greps for a pattern.
>> Shouldn't it skip some well-known ones? btmp, lastlog and wtmp are
>> good candidates to be skipped. They are not intended to be handled
>> as text.
>> 
>> Or maybe just test in find_decompressor() that the file is text and
>> not cat it if it is not?
>> 
>> something like
>> find_decompressor() {
>>     if echo $1 | grep -qs 'bz2$'; then
>>         echo "bzip2 -dc"
>>     elif echo $1 | grep -qs 'gz$'; then
>>         echo "gzip -dc"
>>     elif echo $1 | grep -qs 'xz$'; then
>>         echo "xz -dc"
>>     elif file $1 | grep -qs 'text'; then
>>         echo "cat"
>>     else
>>         echo "echo"
> 
> Good idea.

Even better might be to use process substitution and avoid cat'ing
altogether when it is not needed, even for plain text files. This assumes
GNU grep 2.13+, which, in combination with the kernel, attempts to detect
sparse files and marks them as binary[1]; that can then be exploited via
the -I option.

But that is not expected to work under /bin/sh, and achieving the same
in a portable way would be quite clumsy.  Not to mention relying on
non-POSIX extensions to grep.

And I don't think the grep folks can do any better with piped input...

[1] 
http://git.savannah.gnu.org/cgit/grep.git/tree/NEWS?id=c528aa1da0ef1635fa48c3ec804162cf3e71cb79#n22

>> fi
>> }
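
For illustration, a rough sketch of the -I idea mentioned above (requires
GNU grep 2.13+, so not usable in the POSIX-sh collector as it stands):

    # list files containing the pattern, skipping anything grep
    # classifies as binary (including sparse files such as lastlog)
    grep -lI -e "Starting Pacemaker" /var/log/* 2>/dev/null

Note this only illustrates the binary-skip behaviour; it does not decompress
the .gz/.bz2/.xz archives that findmsg() also handles.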

-- 
Jan (Poki)




Re: [ClusterLabs] will rgmanager/ccs support the vm RA's new migrate-setspeed?

2015-10-06 Thread Jan Pokorný
On 05/10/15 22:36 -0400, Digimer wrote:
> Re: https://github.com/ClusterLabs/resource-agents/pull/629
> 
> I'd love to support this in the current gen Anvil!. Would it be hard to
> add support to ccs for this?

It all depends on the capabilities of the vm.sh resource agent for rgmanager.
Once the referenced changes are adapted there (if possible, and hopefully
they are), and the agent is installed on all the nodes (with
ccs_update_schema executed), everything should start working smoothly.

(Anyone interested?)

-- 
Jan (Poki)




Re: [ClusterLabs] corosync - CS_ERR_BAD_HANDLE when multiple nodes are starting up

2015-10-06 Thread Jan Friesse

Thomas,

Thomas Lamprecht wrote:

Hi,

thanks for the response!
I added some information and clarification below.

On 10/01/2015 09:23 AM, Jan Friesse wrote:

Hi,

Thomas Lamprecht wrote:

Hello,

we are using corosync "needle" (2.3.5) for our cluster filesystem (pmxcfs).
The situation is the following. First we start pmxcfs, which is a FUSE fs,
and if there is a cluster configuration, we also start corosync.
This allows the filesystem to exist as a one-node 'cluster' or to be forced
into a local mode. We use CPG to send our messages to all members;
the filesystem lives in RAM and all fs operations are sent 'over the
wire'.

The problem is now the following:
When we're restarting all (in my test case 3) nodes at the same time, in
1 out of 10 cases I get only CS_ERR_BAD_HANDLE back when calling

I'm really unsure how to understand what you are doing. You are
restarting all nodes and get CS_ERR_BAD_HANDLE? I mean, if you are
restarting all nodes, which node returns CS_ERR_BAD_HANDLE? Or are you
restarting just pmxcfs? Or just corosync?

Clarification, sorry, I was a bit unspecific. I can see the error behaviour
in two cases:
1) I restart three physical hosts (= nodes) at the same time; one of
them - normally the last one coming up again - successfully joins the
corosync cluster, and the filesystem (pmxcfs) notices that, but then
cpg_mcast_joined returns only CS_ERR_BAD_HANDLE errors.


Ok, that is weird. Are you able to reproduce the same behavior by just
restarting pmxcfs? Or is a membership change (= restart of a node) really
needed? Also, are you sure the network interface is up when corosync starts?


The corosync.log of the failing node may be interesting.




2) I disconnect the network interface on which corosync runs, and
reconnect it a bit later. This triggers the same behaviour as above, but
again not every time.


Just to make sure: don't do ifdown. Corosync reacts to ifdown pretty
badly. Also, NetworkManager does ifdown on cable unplug if not configured
in server mode. If you want to test a network split, either use iptables
(make sure to block all traffic needed by corosync, so if you are using
multicast make sure to block both unicast and multicast packets on input
and output -
https://github.com/jfriesse/csts/blob/master/tests/inc/fw.sh), or use
blocking on the switch.
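
For example, a rough sketch of the iptables variant (this assumes the stock
multicast transport on the default ports 5404/5405; adjust to your totem
configuration, and see the linked fw.sh for a more complete version):

    # drop all corosync totem traffic in both directions on this node
    iptables -A INPUT  -p udp -m multiport --dports 5404,5405 -j DROP
    iptables -A OUTPUT -p udp -m multiport --dports 5404,5405 -j DROP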




Currently I'm trying to get a somewhat reproducible test and to try it
also on bigger setups and other possible causes; I need to do a bit more
homework here and will report back later.


Actually, smaller clusters are better for debugging, but yes, a larger
setup may show the problem faster.





cpg_mcast_joined to send out the data, but only on one node.
corosync-quorumtool shows that we have quorum, and the logs also show
a healthy connection to the corosync cluster. The failing handle is
initialized once at the initialization of our filesystem. Should it be
reinitialized on every reconnect?


Again, I'm unsure what you mean by reconnect. On Corosync shutdown you
have to reconnect (I believe this is not the case here because you are
getting the error only with ~10% probability).

Yes, we reconnect to Corosync, and it's not a corosync shutdown; the
whole host reboots, or the network interface goes down and then comes back
up a bit later. The probability is just an estimate, but the main
problem is that I cannot reproduce it every time.



Restarting the filesystem solves this problem; the strange thing is that
it isn't reliably reproducible and often just works fine.

Are there some known problems or steps we should look for?


Hard to tell, but generally:
- Make sure cpg_init really returns CS_OK. If not, the returned handle is
invalid.
- Make sure there is no memory corruption and the handle is really valid
(valgrind may be helpful).

cpg_init checks are in place and should be OK.
Yes, I will use Valgrind, but one question ahead:

Can the handle get lost somehow? Is there a need to reinitialize the cpg
with cpg_initialize/cpg_model_initialize after we left and later
rejoined the cluster?


I'm still unsure what you mean by "after we left and later rejoined". As long
as corosync is running, the client application "doesn't need to care about"
membership changes; that is corosync's problem. So if a network split happens,
you don't have to call cpg_initialize. The only place where cpg_initialize is
needed is the initial connection and reconnection after the corosync main
process exits.



Regards,
  Honza












Re: [ClusterLabs] Help needed getting DRBD cluster working

2015-10-06 Thread Ken Gaillot
On 10/06/2015 09:38 AM, Gordon Ross wrote:
> On 5 Oct 2015, at 15:05, Ken Gaillot  wrote:
>>
>> The "rc=6" in the failed actions means the resource's Pacemaker
>> configuration is invalid. (For OCF return codes, see
>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#s-ocf-return-codes
>> )
>>
>> The "_monitor_0" means that this was the initial probe that Pacemaker
>> does before trying to start the resource, to make sure it's not already
>> running. As an aside, you probably want to add recurring monitors as
>> well, otherwise Pacemaker won't notice if the resource fails. For
>> example: op monitor interval="29s" role="Master" op monitor
>> interval="31s" role="Slave"
>>
>> As to why the probe is failing, it's hard to tell. Double-check your
>> configuration to make sure disc0 is the exact DRBD name, Pacemaker can
>> read the DRBD configuration file, etc. You can also try running the DRBD
>> resource agent's "status" command manually to see if it prints a more
>> detailed error message.
> 
> I cleared the CIB and re-created most of it with your suggested parameters.
> It now looks like:
> 
> node $id="739377522" ct1
> node $id="739377523" ct2
> node $id="739377524" ct3 \
>   attributes standby="on"
> primitive drbd_disc0 ocf:linbit:drbd \
>   params drbd_resource="disc0" \
>   meta target-role="Started" \
>   op monitor interval="19s" on-fail="restart" role="Master" start-delay="10s" timeout="20s" \
>   op monitor interval="20s" on-fail="restart" role="Slave" start-delay="10s" timeout="20s"
> ms ms_drbd0 drbd_disc0 \
>   meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"

You want to omit target-role, or set it to "Master". Otherwise both
nodes will start as slaves.
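
For example, taking the ms definition above and changing only that one meta
attribute (just a sketch of the relevant lines, not a complete config):

ms ms_drbd0 drbd_disc0 \
    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Master"

(or simply drop target-role and let Pacemaker decide where to promote).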

> location cli-prefer-drbd_disc0 ms_drbd0 inf: ct2
> location cli-prefer-ms_drbd0 ms_drbd0 inf: ct2

You've given the above constraints different names, but they are
identical: they both say ms_drbd0 can run on ct2 only.

When you're using clone/ms resources, you generally only ever need to
refer to the clone's name, not the resource being cloned. So you don't
need any constraints for drbd_disc0.

You've set symmetric-cluster=false in the cluster options, which means
that Pacemaker will not start resources on any node unless a location
constraint enables it. Here, you're only enabling ct2. Duplicate the
constraint for ct1 (or set symmetric-cluster=true and use a -INF
location constraint for the third node instead).
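
A sketch of those two options (the constraint names below are made up):

# asymmetric cluster: explicitly enable both DRBD nodes
location loc_ms_drbd0_ct1 ms_drbd0 inf: ct1
location loc_ms_drbd0_ct2 ms_drbd0 inf: ct2

# or, with symmetric-cluster=true, just keep the resource off the third node
location loc_no_drbd_ct3 ms_drbd0 -inf: ct3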

> property $id="cib-bootstrap-options" \
>   dc-version="1.1.10-42f2063" \
>   cluster-infrastructure="corosync" \
>   stonith-enabled="false" \

I'm sure you've heard this before, but stonith is the only way to avoid
data corruption in a split-brain situation. It's usually best to make
fencing the first priority rather than save it for last, because some
problems can become more difficult to troubleshoot without fencing. DRBD
in particular needs special configuration to coordinate fencing with
Pacemaker: https://drbd.linbit.com/users-guide/s-pacemaker-fencing.html
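
Roughly, per that guide, the DRBD side of it looks like the sketch below
(for DRBD 8.4; double-check the handler paths shipped with your drbd-utils):

resource disc0 {
  disk {
    fencing resource-only;   # or resource-and-stonith once stonith is enabled
  }
  handlers {
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}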

>   no-quorum-policy="stop" \
>   symmetric-cluster="false"
> 
> 
> I think I’m missing something basic between the DRBD/Pacemaker hook-up.
> 
> As soon as Pacemaker/Corosync start, DRBD on both nodes stops. A “cat
> /proc/drbd” then just returns:
> 
> version: 8.4.3 (api:1/proto:86-101)
> srcversion: 6551AD2C98F533733BE558C 
> 
> with no details on the replicated disc, and the DRBD block device disappears.
> 
> GTG
> 

