Re: [ClusterLabs] fence-virtd reach remote server serial/VM channel/TCP

2015-08-05 Thread Noel Kuntze


Hello Jan,

I know that increasing complexity reduces the
availability of a service, so it is no surprise to me
that running services that should be highly available
on virtual machines is frowned upon.

However, services are regularly run on
VMs and HA is desired there too, even if the only thing
that should be protected against is the downtime
when the kernel needs to be upgraded or
a daemon needs to be restarted.

So I think fence-virt has a use case.
My current use case is to build an HA cluster
of VMs, which host a simple mirror for software
packages. They are stored on shared storage, which has a
partition formatted with GFS2 on it. I use pcs(d), pacemaker,
corosync and fence-virt over a serial device to fence the nodes (the VMs).
Obviously, a single serial connection only reaches the hypervisor
the VM runs on. I currently only have one hypervisor, but could expand to more.
I'm doing this because I want to write a doc about clustering
on Linux in the year 2015, and clustering on VMs is definitely a use case
that I will describe.
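
For reference, a setup like that boils down to something like the following in
/etc/fence_virt.conf (a rough sketch rather than a verbatim copy; the serial
listener additionally needs its own listeners { serial { ... } } section, and
the libvirt URI shown is just the common default, so check fence_virt.conf(5)
for the exact option names):

fence_virtd {
    listener = "serial";
    backend = "libvirt";
}

backends {
    libvirt {
        uri = "qemu:///system";
    }
}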

I know that multicast should actually work in common use cases,
but I found that, for some reason, the bridge device the VMs are attached to
doesn't forward traffic for the default multicast group of fence-virt to the
other bridge ports, rendering it useless. I haven't dug deeper into why that
happens, but through Googling I found that it's a common problem that Linux
bridge devices don't forward some types of traffic. Obviously, if multicast
works, one can just relay the multicast traffic over several other interfaces
to forward the requests.
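
In case others hit the same thing: one common cause is the bridge's IGMP
snooping dropping multicast traffic when no querier is present on the segment.
Whether that is the cause here is a guess, but it is quick to check and work
around, assuming the bridge is called br0:

# current state of multicast snooping / querier on the bridge
cat /sys/class/net/br0/bridge/multicast_snooping
cat /sys/class/net/br0/bridge/multicast_querier

# workaround 1: disable snooping, so multicast is flooded to all bridge ports
echo 0 > /sys/class/net/br0/bridge/multicast_snooping
# workaround 2: keep snooping, but let the bridge act as IGMP querier
echo 1 > /sys/class/net/br0/bridge/multicast_querier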

The man page of fence_virt.conf mentions libvirt-qmf as a backend, instead
of libvirt, which should be able to route fencing requests to the correct
host by using Apache QMF. I figure that's the correct backend for such a purpose.

Mit freundlichen Grüßen/Kind Regards,
Noel Kuntze

GPG Key ID: 0x63EC6658
Fingerprint: 23CA BB60 2146 05E7 7278 6592 3839 298F 63EC 6658

On 05.08.2015 at 21:09, Jan Pokorný wrote:
 On 02/08/15 16:30 +0200, Noel Kuntze wrote:
 I would like to know if it is possible for
 fence-virtd to relay a request from a client, which it
 received via serial, VM channel or TCP connection
 from an agent to another daemon, if the VM that should
 be fenced does not run on the same host as the contacted daemon.

 First, it doesn't sound like a very commendable, or at least common, setup
 to have virtualized cluster nodes spread across multiple hosts.
 When increasing the complexity of a deployment, new points of failure
 can be introduced, defeating the purpose of HA.

 Could you please share details of your use case? 


 To your question, it might (hypothetically) be doable if you manage to
 put the guests on the first host together with the other host into
 the same multicast-friendly network or will relay multicast packets
 between those remote sides by other means.

 Alternatively, you might implement such relaying directly as a
 fence_virtd module (backend), possibly reusing some code from the
 client side (fence_virt).







___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Corosync: 100% cpu (corosync 2.3.5, libqb 0.17.1, pacemaker 1.1.13)

2015-08-05 Thread Pallai Roland
Hi,

I've built a recent cluster stack from sources on Debian Jessie and I can't
get rid of CPU spikes. Corosync blocks the entire system for seconds on
every simple transition, including blocking itself:

 drbdtest1 corosync[4734]:   [MAIN  ] Corosync main process was not
scheduled for 2590.4512 ms (threshold is 2400.0000 ms). Consider token
timeout increase.

and even drbd:
 drbdtest1 kernel: drbd p1: PingAck did not arrive in time.

My previous build (corosync 1.4.6, libqb 0.17.0, pacemaker 1.1.12) works
fine on these nodes with the same corosync/pacemaker setup.

What should I try? It's a test environment, and the issue is 100% reproducible
within seconds. Network traffic is minimal all the time and there is no I/O
load.
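
The change the log message itself suggests would be raising the token timeout
in corosync.conf, roughly like this (only the changed key shown, and the value
is just an example), though whether that addresses the underlying
scheduling/CPU problem is another question:

totem {
    ...
    token: 10000    # raised from the current 3000
    ...
}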


*Pacemaker config:*

node 167969573: drbdtest1
node 167969574: drbdtest2
primitive drbd_p1 ocf:linbit:drbd \
    params drbd_resource=p1 \
    op monitor interval=30
primitive drbd_p2 ocf:linbit:drbd \
    params drbd_resource=p2 \
    op monitor interval=30
primitive dummy_test ocf:pacemaker:Dummy \
    meta allow-migrate=true \
    params state=/var/run/activenode
primitive fence_libvirt stonith:external/libvirt \
    params hostlist=drbdtest1,drbdtest2 \
        hypervisor_uri=qemu+ssh://libvirt-fencing@mgx4/system \
    op monitor interval=30
primitive fs_boot Filesystem \
    params device=/dev/null directory=/boot fstype=* \
    meta is-managed=false \
    op monitor interval=20 timeout=40 on-fail=block OCF_CHECK_LEVEL=20
primitive fs_f1 Filesystem \
    params device=/dev/drbd/by-res/p1 directory=/mnt/p1 fstype=ext4 \
        options=commit=60,barrier=0,data=writeback \
    op monitor interval=20 timeout=40 \
    op start timeout=300 interval=0 \
    op stop timeout=180 interval=0
primitive ip_10.3.3.138 IPaddr2 \
    params ip=10.3.3.138 cidr_netmask=32 \
    op monitor interval=10s timeout=20s
primitive sysinfo ocf:pacemaker:SysInfo \
    op start timeout=20s interval=0 \
    op stop timeout=20s interval=0 \
    op monitor interval=60s
group dummy-group dummy_test
ms ms_drbd_p1 drbd_p1 \
    meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 \
        notify=true
ms ms_drbd_p2 drbd_p2 \
    meta master-max=2 master-node-max=1 clone-max=2 notify=true
clone fencing_by_libvirt fence_libvirt \
    meta globally-unique=false
clone fs_boot_clone fs_boot
clone sysinfos sysinfo \
    meta globally-unique=false
location fs1_on_high_load fs_f1 \
    rule -inf: cpu_load gte 4
colocation dummy_coloc inf: dummy-group ms_drbd_p2:Master
colocation f1a-coloc inf: fs_f1 ms_drbd_p1:Master
colocation f1b-coloc inf: fs_f1 fs_boot_clone:Started
order dummy_order inf: ms_drbd_p2:promote dummy-group:start
order orderA inf: ms_drbd_p1:promote fs_f1:start
property cib-bootstrap-options: \
    dc-version=1.1.13-6052cd1 \
    cluster-infrastructure=corosync \
    expected-quorum-votes=2 \
    no-quorum-policy=ignore \
    symmetric-cluster=true \
    placement-strategy=default \
    last-lrm-refresh=1438735742 \
    have-watchdog=false
property cib-bootstrap-options-stonith: \
    stonith-enabled=true \
    stonith-action=reboot
rsc_defaults rsc-options: \
    resource-stickiness=100


*corosync.conf:*

totem {
    version: 2
    token: 3000
    token_retransmits_before_loss_const: 10
    clear_node_high_bit: yes
    crypto_cipher: none
    crypto_hash: none
    interface {
        ringnumber: 0
        bindnetaddr: 10.3.3.37
        mcastaddr: 225.0.0.37
        mcastport: 5403
        ttl: 1
    }
}

logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/corosync/corosync.log
    to_syslog: yes
    syslog_facility: daemon
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}

quorum {
    provider: corosync_votequorum
    expected_votes: 2
}


[ClusterLabs] Failure of apache services

2015-08-05 Thread Vijay Partha
Hi.

This is my pcs status:

Online: [ node1 node2 ]

Full list of resources:

 WebSite (ocf::heartbeat:apache): Started node1

Failed actions:
WebSite_start_0 on node2 'unknown error' (1): call=28, status=complete,
last-rc-change='Wed Aug  5 08:26:47 2015', queued=0ms, exec=3158ms


Traceback (most recent call last):
  File "/usr/sbin/pcs", line 138, in <module>
    main(sys.argv[1:])
  File "/usr/sbin/pcs", line 127, in main
    status.status_cmd(argv)
  File "/usr/lib/python2.6/site-packages/pcs/status.py", line 13, in status_cmd
    full_status()
  File "/usr/lib/python2.6/site-packages/pcs/status.py", line 60, in full_status
    utils.serviceStatus("  ")
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 1504, in serviceStatus
    if is_systemctl():
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 1476, in is_systemctl
    elif re.search(r'Foobar Linux release 6\.', issue):
NameError: global name 'issue' is not defined


I am not able to run my apache service on the second node.

-- 
With Regards
P.Vijay
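
(The traceback above appears to be a bug in pcs itself rather than a cluster
problem: is_systemctl() refers to a variable named issue on a code path where
it was never assigned. A rough, hypothetical illustration of the kind of check
it is presumably trying to make, not pcs's actual code:)

import re

def looks_like_release_6():
    # Hypothetical sketch only: read the file before matching against it,
    # which is what the failing branch in pcs apparently does not do.
    try:
        with open('/etc/issue') as f:
            issue = f.read()
    except IOError:
        issue = ''
    return bool(re.search(r'release 6\.', issue))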


Re: [ClusterLabs] Disabling resources and adding apache instances

2015-08-05 Thread Vijay Partha
Cluster name: pacemaker1
Last updated: Wed Aug  5 09:07:27 2015
Last change: Wed Aug  5 08:58:24 2015
Stack: cman
Current DC: node1 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured
2 Resources configured


Online: [ node1 node2 ]

Full list of resources:

 ClusterIP (ocf::heartbeat:IPaddr2): Started node2
 WebSite (ocf::heartbeat:apache): Started node1

Failed actions:
WebSite_monitor_0 on node2 'unknown error' (1): call=96,
status=complete, last-rc-change='Wed Aug  5 08:53:24 2015', queued=1ms,
exec=51ms


Traceback (most recent call last):
  File "/usr/sbin/pcs", line 138, in <module>
    main(sys.argv[1:])
  File "/usr/sbin/pcs", line 127, in main
    status.status_cmd(argv)
  File "/usr/lib/python2.6/site-packages/pcs/status.py", line 13, in status_cmd
    full_status()
  File "/usr/lib/python2.6/site-packages/pcs/status.py", line 60, in full_status
    utils.serviceStatus("  ")
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 1504, in serviceStatus
    if is_systemctl():
  File "/usr/lib/python2.6/site-packages/pcs/utils.py", line 1476, in is_systemctl
    elif re.search(r'Foobar Linux release 6\.', issue):
NameError: global name 'issue' is not defined


This is the error that I got after adding the location constraint; ClusterIP
then started on node1.

On Wed, Aug 5, 2015 at 12:37 PM, Andrei Borzenkov arvidj...@gmail.com
wrote:

 On Wed, Aug 5, 2015 at 9:23 AM, Vijay Partha vijaysarath...@gmail.com
 wrote:
  Hi,
 
  I have 2 doubts.
 
  1.) If I disable a resource and reboot the node, will pacemaker restart
  the service?

 What exactly does "disable" mean? There is no such operation in pacemaker.

   Or: how can I stop the service so that, after rebooting, the service
  is started automatically by pacemaker?
 

 Unfortunately pacemaker does not really provide any way to temporarily
 stop a resource. You can set the target role to Stopped, which will trigger
 a resource stop. Then the resource won't be started after reboot, because
 you told it to remain Stopped. The same applies to is-managed=false.

 If I'm wrong and it is possible I would be very interested to learn it.

  2.) How do I create apache instances in such a way that one instance runs on
  one node and another instance runs on the second node?
 

 Just define two resources and set location constraints for each.





-- 
With Regards
P.Vijay


[ClusterLabs] Antw: Re: Disabling resources and adding apache instances

2015-08-05 Thread Ulrich Windl
 Andrei Borzenkov arvidj...@gmail.com wrote on 05.08.2015 at 09:07 in message
caa91j0xsx3u6xnexjts1u9pcwpovrqwjhodqpvmsexonudb...@mail.gmail.com:
 On Wed, Aug 5, 2015 at 9:23 AM, Vijay Partha vijaysarath...@gmail.com wrote:
 Hi,

 I have 2 doubts.

  1.) If I disable a resource and reboot the node, will pacemaker restart
  the service?
 
 What exactly does "disable" mean? There is no such operation in pacemaker.
 
  Or: how can I stop the service so that, after rebooting, the service
 is started automatically by pacemaker?

 
 Unfortunately pacemaker does not really provide any way to temporarily
 stop a resource. You can set the target role to Stopped, which will trigger

Actually it does: You can have time-based rules. If you add location 
constraints for a resource disallowing it to run anywhere for some time, I 
guess it will work ;-)

 a resource stop. Then the resource won't be started after reboot, because
 you told it to remain Stopped. The same applies to is-managed=false.
 
 If I'm wrong and it is possible I would be very interested to learn it.

Sometimes things happen that nobody can explain, unfortunately.
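
For a pcs-managed cluster like the one in the original question, the two
approaches above translate roughly into the following (the resource name is
illustrative and the rule syntax is from memory, so check pcs(8)):

# set target-role=Stopped and clear it again (what "disable" means in pcs terms)
pcs resource disable WebSite
pcs resource enable WebSite

# or ban the resource everywhere for a limited period with a date-based rule
pcs constraint location WebSite rule score=-INFINITY date in_range 2015-08-05 to 2015-08-07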

 
 2.) How do I create apache instances in such a way that one instance runs on
 one node and another instance runs on the second node?

 
 Just define two resources and set location constraints for each.

Regards,
Ulrich
P.S.: Thought of the day: Is there any reasonable use for a nuclear bomb? If
not, why have one?
("We use it, because we have it" is not considered to be a valid answer.)





Re: [ClusterLabs] apache services

2015-08-05 Thread Ken Gaillot
On 08/05/2015 04:05 AM, Vijay Partha wrote:
 Hi,
 
 I need to run the apache service on both nodes in a cluster. httpd is
 listening on port 80 on the first node and on port 81 on the second.
 I am not able to add these instances separately; rather, both of them
 are starting on the same node, node1. Even if I move the service I get an error:
 WebSite1_start_0 on node2 'unknown error' (1): call=27, status=complete,
 last-rc-change='Wed Aug  5 11:02:47 2015', queued=1ms, exec=3146ms.
 
 Please help me out.

You have two separate issues:

1. Both instances are starting on the same node; and

2. Moving an instance produces an error.

For #1, the answer is colocation constraints (which are distinct from
location constraints and ordering constraints). Colocation constraints
say that two resources should be kept together (if the score is
positive) or kept apart (if the score is negative).
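
With pcs, for example, that looks roughly like this (resource IDs are taken
from the earlier mails and the exact score syntax may differ between pcs
versions, so adjust to your setup):

# keep WebSite on the same node as ClusterIP (positive score)
pcs constraint colocation add WebSite with ClusterIP INFINITY

# keep the two apache instances on different nodes (negative score)
pcs constraint colocation add WebSite2 with WebSite1 -INFINITY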

For #2, pacemaker is asking the resource agent to perform an action, and
the resource agent is saying it can't. Look at the logs to try to find
the error reported by the resource agent. You can also try running the
resource agent manually.
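
Running the apache agent by hand looks roughly like this (a sketch; substitute
the parameters you actually configured for the resource):

# OCF agents take their parameters from OCF_RESKEY_* environment variables
export OCF_ROOT=/usr/lib/ocf
OCF_RESKEY_configfile=/etc/httpd/conf/httpd.conf \
    /usr/lib/ocf/resource.d/heartbeat/apache start
echo $?    # 0 means success; any other OCF return code points at the failure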



[ClusterLabs] Running pacemaker 1.1.13 with legacy plugin or heartbeat

2015-08-05 Thread Ken Gaillot
FYI to anyone running the legacy plugin or heartbeat as pacemaker's
communication layer:

Use-after-free memory issues can cause segfault crashes in the cib when
using pacemaker 1.1.13 with the legacy plugin. Heartbeat is likely to be
affected as well.

Clusters using CMAN or corosync 2 as the communication layer are not
affected.

If switching to CMAN or corosync 2 isn't an option for you, I strongly
recommend using a vendor that supports your communication layer, as they
are more likely to do thorough testing and provide fixes.

If anyone wants a targeted patch, I can provide one, but I would
recommend instead using the upstream master branch as of at least commit
0f8059e. That branch includes an overhaul of the affected code area, as
well as other bug fixes.
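
If it helps, building from that point is roughly (a sketch; adjust the
configure options to your distribution):

git clone https://github.com/ClusterLabs/pacemaker.git
cd pacemaker
git checkout 0f8059e    # or any later commit on master
./autogen.sh && ./configure && make && sudo make install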
-- 
Ken Gaillot kgail...@redhat.com



Re: [ClusterLabs] fence-virtd reach remote server serial/VM channel/TCP

2015-08-05 Thread Digimer
On 05/08/15 03:09 PM, Jan Pokorný wrote:
 On 02/08/15 16:30 +0200, Noel Kuntze wrote:
 I would like to know if it is possible for
 fence-virtd to relay a request from a client, which it
 received via serial, VM channel or TCP connection
 from an agent to another daemon, if the VM that should
 be fenced does not run on the same host as the contacted daemon.
 
 First, it doesn't sound like a very commendable, or at least common, setup
 to have virtualized cluster nodes spread across multiple hosts.
 When increasing the complexity of a deployment, new points of failure
 can be introduced, defeating the purpose of HA.
 
 Could you please share details of your use case?  

To interject;

It is not something I would do, but I've heard of cases where a separate
department handles hardware and the devops types are restricted to VMs
only. In such a case, you would want to span hosts to protect against a
host failure. Not sure if this is Noel's use-case, of course.

 To your question, it might (hypothetically) be doable if you manage to
 put the guests on the first host together with the other host into
 the same multicast-friendly network or will relay multicast packets
 between those remote sides by other means.
 
 Alternatively, you might implement such relaying directly as a
 fence_virtd module (backend), possibly reusing some code from the
 client side (fence_virt).


-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



Re: [ClusterLabs] Antw: Re: Antw: Re: [Question] About movement of pacemaker_remote.

2015-08-05 Thread Andrew Beekhof
Ok, I’ll look into it. Thanks for retesting. 

 On 5 Aug 2015, at 4:00 pm, renayama19661...@ybb.ne.jp wrote:
 
 Hi Andrew,
 
 Do you know if this behaviour still exists?
 A LOT of work went into the remote node logic in the last couple of months;
 it's possible this was fixed as a side-effect.
  
  
 I have not confirmed it with the latest version yet.
 I will confirm it.
 
 
 I confirmed it in the latest
 Pacemaker (pacemaker-eefdc909a41b571dc2e155f7b14b5ef0368f2de7).

 The phenomenon still occurs after all.
 
 
 On the first clean-up, pacemaker fails to connect to pacemaker_remote.
 The second one succeeds.

 Somehow or other, the problem does not seem to be settled.
 
 
 
 I took the latest source and incorporated my log into it again.
 
 ---
 (snip)
 static size_t
 crm_remote_recv_once(crm_remote_t * remote)
 {
     int rc = 0;
     size_t read_len = sizeof(struct crm_remote_header_v0);
     struct crm_remote_header_v0 *header = crm_remote_header(remote);

     if (header) {
         /* Stop at the end of the current message */
         read_len = header->size_total;
     }

     /* automatically grow the buffer when needed */
     if (remote->buffer_size < read_len) {
         remote->buffer_size = 2 * read_len;
         crm_trace("Expanding buffer to %u bytes", remote->buffer_size);

         remote->buffer = realloc_safe(remote->buffer, remote->buffer_size + 1);
         CRM_ASSERT(remote->buffer != NULL);
     }

 #ifdef HAVE_GNUTLS_GNUTLS_H
     if (remote->tls_session) {
         if (remote->buffer == NULL) {
             crm_info("### YAMAUCHI buffer is NULL [buffer_zie[%d] readlen[%d]",
                      remote->buffer_size, read_len);
         }
         rc = gnutls_record_recv(*(remote->tls_session),
                                 remote->buffer + remote->buffer_offset,
                                 remote->buffer_size - remote->buffer_offset);
 (snip)
 ---
 
 When Pacemaker fails on the first connection to the remote node, my log is printed.
 My log is not printed on the second connection.
 
 [root@sl7-01 ~]# tail -f /var/log/messages | grep YAMA
 Aug  5 14:46:25 sl7-01 crmd[21306]: info: ### YAMAUCHI buffer is NULL 
 [buffer_zie[1326] readlen[40]
 Aug  5 14:46:26 sl7-01 crmd[21306]: info: ### YAMAUCHI buffer is NULL 
 [buffer_zie[1326] readlen[40]
 Aug  5 14:46:28 sl7-01 crmd[21306]: info: ### YAMAUCHI buffer is NULL 
 [buffer_zie[1326] readlen[40]
 Aug  5 14:46:30 sl7-01 crmd[21306]: info: ### YAMAUCHI buffer is NULL 
 [buffer_zie[1326] readlen[40]
 Aug  5 14:46:31 sl7-01 crmd[21306]: info: ### YAMAUCHI buffer is NULL 
 [buffer_zie[1326] readlen[40]
 (snip)
 
 Best Regards,
 Hideo Yamauchi.
 
 
 
 
 - Original Message -
 From: renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp
 To: Cluster Labs - All topics related to open-source clustering welcomed 
 users@clusterlabs.org
 Cc: 
 Date: 2015/8/4, Tue 18:40
 Subject: Re: [ClusterLabs] Antw: Re: Antw: Re: [Question] About movement of 
 pacemaker_remote.
 
 Hi Andrew,
 
 Do you know if this behaviour still exists?
 A LOT of work went into the remote node logic in the last couple of months;
 it's possible this was fixed as a side-effect.
 
 
 I have not confirmed it with the latest version yet.
 I will confirm it.
 
 Many Thanks!
 Hideo Yamauchi.
 
 
 - Original Message -
 From: Andrew Beekhof and...@beekhof.net
 To: renayama19661...@ybb.ne.jp; Cluster Labs - All topics related to 
 open-source clustering welcomed users@clusterlabs.org
 Cc: 
 Date: 2015/8/4, Tue 13:16
 Subject: Re: [ClusterLabs] Antw: Re: Antw: Re: [Question] About movement of 
 pacemaker_remote.
 
 
   On 12 May 2015, at 12:12 pm, renayama19661...@ybb.ne.jp wrote:
 
   Hi All,
 
   The problem looks like the buffer becoming NULL, somehow or other, after
  running crm_resource -C following a reboot of the remote node.
 
   I added a log message to the source code and confirmed it.
 
   
    crm_remote_recv_once(crm_remote_t * remote)
    {
    (snip)
        /* automatically grow the buffer when needed */
        if (remote->buffer_size < read_len) {
            remote->buffer_size = 2 * read_len;
            crm_trace("Expanding buffer to %u bytes", remote->buffer_size);

            remote->buffer = realloc_safe(remote->buffer, remote->buffer_size + 1);
            CRM_ASSERT(remote->buffer != NULL);
        }

    #ifdef HAVE_GNUTLS_GNUTLS_H
        if (remote->tls_session) {
            if (remote->buffer == NULL) {
                crm_info("### YAMAUCHI buffer is NULL [buffer_zie[%d] readlen[%d]",
                         remote->buffer_size, read_len);
            }
            rc = gnutls_record_recv(*(remote->tls_session),
                                    remote->buffer + remote->buffer_offset,
                                    remote->buffer_size - remote->buffer_offset);
   (snip)
   
 
   May 12 10:54:01 sl7-01 crmd[30447]: info: crm_remote_recv_once: ### 
 YAMAUCHI buffer is NULL [buffer_zie[1326] readlen[40]
   May 12 10:54:02 sl7-01 crmd[30447]: info: crm_remote_recv_once: ### 
 YAMAUCHI buffer is NULL