Re: [Linux-HA] Filesystem thinks it is run as a clone
On Wed, Apr 13, 2011 at 10:57 AM, Christoph Bartoschek wrote:
> On 13.04.2011 08:26, Andrew Beekhof wrote:
>> On Tue, Apr 12, 2011 at 5:17 PM, Christoph Bartoschek wrote:
>>> Hi,
>>>
>>> today we tested some NFS cluster scenarios and the first test failed.
>>> The first test was to put the current master node into standby.
>>> Stopping the services worked, but then starting them on the other node
>>> failed: the ocf:heartbeat:Filesystem resource failed to start. In the
>>> logfile we see:
>>>
>>> Apr 12 14:08:42 laplace Filesystem[10772]: [10820]: INFO: Running start for /dev/home-data/afs on /srv/nfs/afs
>>> Apr 12 14:08:42 laplace Filesystem[10772]: [10822]: ERROR: DANGER! ext4 on /dev/home-data/afs is NOT cluster-aware!
>>> Apr 12 14:08:42 laplace Filesystem[10772]: [10824]: ERROR: DO NOT RUN IT AS A CLONE!
>>
>> To my eye the Filesystem agent looks confused.
>
> The agent is confused because OCF_RESKEY_CRM_meta_clone is set. Is this
> something that can happen?

Not unless the resource has been cloned - and looking at the config this
did not seem to be the case. Or did I miss something?

>>> The message comes from the following code in ocf:heartbeat:Filesystem:
>>>
>>>     case $FSTYPE in
>>>     ocfs2)  ocfs2_init
>>>             ;;
>>>     nfs|smbfs|none|gfs2) :  # this is kind of safe too
>>>             ;;
>>>     *)      if [ -n "$OCF_RESKEY_CRM_meta_clone" ]; then
>>>                 ocf_log err "DANGER! $FSTYPE on $DEVICE is NOT cluster-aware!"
>>>                 ocf_log err "DO NOT RUN IT AS A CLONE!"
>>>                 ocf_log err "Politely refusing to proceed to avoid data corruption."
>>>                 exit $OCF_ERR_CONFIGURED
>>>             fi
>>>             ;;
>>>     esac
>>>
>>> The message is only printed if the variable OCF_RESKEY_CRM_meta_clone
>>> is set (non-empty). Our configuration, however, does not run the
>>> filesystem as a clone. Somehow the OCF_RESKEY_CRM_meta_clone variable
>>> leaked into the start of the Filesystem resource. Is this a known bug?
>>>
>>> Or is there a configuration error on our side? Here is the current
>>> configuration:
>>>
>>> node laplace \
>>>         attributes standby="off"
>>> node ries \
>>>         attributes standby="off"
>>> primitive ClusterIP ocf:heartbeat:IPaddr2 \
>>>         params ip="192.168.143.228" cidr_netmask="24" \
>>>         op monitor interval="30s" \
>>>         meta target-role="Started"
>>> primitive p_drbd_nfs ocf:linbit:drbd \
>>>         params drbd_resource="home-data" \
>>>         op monitor interval="15" role="Master" \
>>>         op monitor interval="30" role="Slave"
>>> primitive p_exportfs_afs ocf:heartbeat:exportfs \
>>>         params fsid="1" directory="/srv/nfs/afs" \
>>>         options="rw,no_root_squash,mountpoint" \
>>>         clientspec="192.168.143.0/255.255.255.0" \
>>>         wait_for_leasetime_on_stop="false" \
>>>         op monitor interval="30s"
>>> primitive p_fs_afs ocf:heartbeat:Filesystem \
>>>         params device="/dev/home-data/afs" directory="/srv/nfs/afs" \
>>>         fstype="ext4" \
>>>         op monitor interval="10s" \
>>>         meta target-role="Started"
>>> primitive p_lsb_nfsserver lsb:nfs-kernel-server \
>>>         op monitor interval="30s"
>>> primitive p_lvm_nfs ocf:heartbeat:LVM \
>>>         params volgrpname="home-data" \
>>>         op monitor interval="30s"
>>> group g_nfs p_lvm_nfs p_fs_afs p_exportfs_afs ClusterIP
>>> ms ms_drbd_nfs p_drbd_nfs \
>>>         meta master-max="1" master-node-max="1" clone-max="2" \
>>>         clone-node-max="1" notify="true" target-role="Started"
>>> clone cl_lsb_nfsserver p_lsb_nfsserver
>>> colocation c_nfs_on_drbd inf: g_nfs ms_drbd_nfs:Master
>>> order o_drbd_before_nfs inf: ms_drbd_nfs:promote g_nfs:start
>>> property $id="cib-bootstrap-options" \
>>>         dc-version="1.0.9-unknown" \
>>>         cluster-infrastructure="openais" \
>>>         expected-quorum-votes="2" \
>>>         stonith-enabled="false" \
>>>         no-quorum-policy="ignore" \
>>>         last-lrm-refresh="1302610197"
>>> rsc_defaults $id="rsc-options" \
>>>         resource-stickiness="200"
>>>
>>> Thanks
>>> Christoph

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
Re: [Linux-HA] Filesystem thinks it is run as a clone
On 13.04.2011 08:26, Andrew Beekhof wrote:
> On Tue, Apr 12, 2011 at 5:17 PM, Christoph Bartoschek wrote:
>> Hi,
>>
>> today we tested some NFS cluster scenarios and the first test failed.
>> The first test was to put the current master node into standby.
>> Stopping the services worked, but then starting them on the other node
>> failed: the ocf:heartbeat:Filesystem resource failed to start. In the
>> logfile we see:
>>
>> Apr 12 14:08:42 laplace Filesystem[10772]: [10820]: INFO: Running start for /dev/home-data/afs on /srv/nfs/afs
>> Apr 12 14:08:42 laplace Filesystem[10772]: [10822]: ERROR: DANGER! ext4 on /dev/home-data/afs is NOT cluster-aware!
>> Apr 12 14:08:42 laplace Filesystem[10772]: [10824]: ERROR: DO NOT RUN IT AS A CLONE!
>
> To my eye the Filesystem agent looks confused.

The agent is confused because OCF_RESKEY_CRM_meta_clone is set. Is this
something that can happen?

>> The message comes from the following code in ocf:heartbeat:Filesystem:
>>
>>     case $FSTYPE in
>>     ocfs2)  ocfs2_init
>>             ;;
>>     nfs|smbfs|none|gfs2) :  # this is kind of safe too
>>             ;;
>>     *)      if [ -n "$OCF_RESKEY_CRM_meta_clone" ]; then
>>                 ocf_log err "DANGER! $FSTYPE on $DEVICE is NOT cluster-aware!"
>>                 ocf_log err "DO NOT RUN IT AS A CLONE!"
>>                 ocf_log err "Politely refusing to proceed to avoid data corruption."
>>                 exit $OCF_ERR_CONFIGURED
>>             fi
>>             ;;
>>     esac
>>
>> The message is only printed if the variable OCF_RESKEY_CRM_meta_clone
>> is set (non-empty). Our configuration, however, does not run the
>> filesystem as a clone. Somehow the OCF_RESKEY_CRM_meta_clone variable
>> leaked into the start of the Filesystem resource. Is this a known bug?
>>
>> Or is there a configuration error on our side? Here is the current
>> configuration:
>>
>> node laplace \
>>         attributes standby="off"
>> node ries \
>>         attributes standby="off"
>> primitive ClusterIP ocf:heartbeat:IPaddr2 \
>>         params ip="192.168.143.228" cidr_netmask="24" \
>>         op monitor interval="30s" \
>>         meta target-role="Started"
>> primitive p_drbd_nfs ocf:linbit:drbd \
>>         params drbd_resource="home-data" \
>>         op monitor interval="15" role="Master" \
>>         op monitor interval="30" role="Slave"
>> primitive p_exportfs_afs ocf:heartbeat:exportfs \
>>         params fsid="1" directory="/srv/nfs/afs" \
>>         options="rw,no_root_squash,mountpoint" \
>>         clientspec="192.168.143.0/255.255.255.0" \
>>         wait_for_leasetime_on_stop="false" \
>>         op monitor interval="30s"
>> primitive p_fs_afs ocf:heartbeat:Filesystem \
>>         params device="/dev/home-data/afs" directory="/srv/nfs/afs" \
>>         fstype="ext4" \
>>         op monitor interval="10s" \
>>         meta target-role="Started"
>> primitive p_lsb_nfsserver lsb:nfs-kernel-server \
>>         op monitor interval="30s"
>> primitive p_lvm_nfs ocf:heartbeat:LVM \
>>         params volgrpname="home-data" \
>>         op monitor interval="30s"
>> group g_nfs p_lvm_nfs p_fs_afs p_exportfs_afs ClusterIP
>> ms ms_drbd_nfs p_drbd_nfs \
>>         meta master-max="1" master-node-max="1" clone-max="2" \
>>         clone-node-max="1" notify="true" target-role="Started"
>> clone cl_lsb_nfsserver p_lsb_nfsserver
>> colocation c_nfs_on_drbd inf: g_nfs ms_drbd_nfs:Master
>> order o_drbd_before_nfs inf: ms_drbd_nfs:promote g_nfs:start
>> property $id="cib-bootstrap-options" \
>>         dc-version="1.0.9-unknown" \
>>         cluster-infrastructure="openais" \
>>         expected-quorum-votes="2" \
>>         stonith-enabled="false" \
>>         no-quorum-policy="ignore" \
>>         last-lrm-refresh="1302610197"
>> rsc_defaults $id="rsc-options" \
>>         resource-stickiness="200"
>>
>> Thanks
>> Christoph
Re: [Linux-HA] Filesystem thinks it is run as a clone
On Tue, Apr 12, 2011 at 5:17 PM, Christoph Bartoschek wrote:
> Hi,
>
> today we tested some NFS cluster scenarios and the first test failed.
> The first test was to put the current master node into standby.
> Stopping the services worked, but then starting them on the other node
> failed: the ocf:heartbeat:Filesystem resource failed to start. In the
> logfile we see:
>
> Apr 12 14:08:42 laplace Filesystem[10772]: [10820]: INFO: Running start for /dev/home-data/afs on /srv/nfs/afs
> Apr 12 14:08:42 laplace Filesystem[10772]: [10822]: ERROR: DANGER! ext4 on /dev/home-data/afs is NOT cluster-aware!
> Apr 12 14:08:42 laplace Filesystem[10772]: [10824]: ERROR: DO NOT RUN IT AS A CLONE!

To my eye the Filesystem agent looks confused.

> The message comes from the following code in ocf:heartbeat:Filesystem:
>
>     case $FSTYPE in
>     ocfs2)  ocfs2_init
>             ;;
>     nfs|smbfs|none|gfs2) :  # this is kind of safe too
>             ;;
>     *)      if [ -n "$OCF_RESKEY_CRM_meta_clone" ]; then
>                 ocf_log err "DANGER! $FSTYPE on $DEVICE is NOT cluster-aware!"
>                 ocf_log err "DO NOT RUN IT AS A CLONE!"
>                 ocf_log err "Politely refusing to proceed to avoid data corruption."
>                 exit $OCF_ERR_CONFIGURED
>             fi
>             ;;
>     esac
>
> The message is only printed if the variable OCF_RESKEY_CRM_meta_clone
> is set (non-empty). Our configuration, however, does not run the
> filesystem as a clone. Somehow the OCF_RESKEY_CRM_meta_clone variable
> leaked into the start of the Filesystem resource. Is this a known bug?
>
> Or is there a configuration error on our side? Here is the current
> configuration:
>
> node laplace \
>         attributes standby="off"
> node ries \
>         attributes standby="off"
> primitive ClusterIP ocf:heartbeat:IPaddr2 \
>         params ip="192.168.143.228" cidr_netmask="24" \
>         op monitor interval="30s" \
>         meta target-role="Started"
> primitive p_drbd_nfs ocf:linbit:drbd \
>         params drbd_resource="home-data" \
>         op monitor interval="15" role="Master" \
>         op monitor interval="30" role="Slave"
> primitive p_exportfs_afs ocf:heartbeat:exportfs \
>         params fsid="1" directory="/srv/nfs/afs" \
>         options="rw,no_root_squash,mountpoint" \
>         clientspec="192.168.143.0/255.255.255.0" \
>         wait_for_leasetime_on_stop="false" \
>         op monitor interval="30s"
> primitive p_fs_afs ocf:heartbeat:Filesystem \
>         params device="/dev/home-data/afs" directory="/srv/nfs/afs" \
>         fstype="ext4" \
>         op monitor interval="10s" \
>         meta target-role="Started"
> primitive p_lsb_nfsserver lsb:nfs-kernel-server \
>         op monitor interval="30s"
> primitive p_lvm_nfs ocf:heartbeat:LVM \
>         params volgrpname="home-data" \
>         op monitor interval="30s"
> group g_nfs p_lvm_nfs p_fs_afs p_exportfs_afs ClusterIP
> ms ms_drbd_nfs p_drbd_nfs \
>         meta master-max="1" master-node-max="1" clone-max="2" \
>         clone-node-max="1" notify="true" target-role="Started"
> clone cl_lsb_nfsserver p_lsb_nfsserver
> colocation c_nfs_on_drbd inf: g_nfs ms_drbd_nfs:Master
> order o_drbd_before_nfs inf: ms_drbd_nfs:promote g_nfs:start
> property $id="cib-bootstrap-options" \
>         dc-version="1.0.9-unknown" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore" \
>         last-lrm-refresh="1302610197"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="200"
>
> Thanks
> Christoph
[Linux-HA] Filesystem thinks it is run as a clone
Hi,

today we tested some NFS cluster scenarios and the first test failed.
The first test was to put the current master node into standby. Stopping
the services worked, but then starting them on the other node failed: the
ocf:heartbeat:Filesystem resource failed to start. In the logfile we see:

Apr 12 14:08:42 laplace Filesystem[10772]: [10820]: INFO: Running start for /dev/home-data/afs on /srv/nfs/afs
Apr 12 14:08:42 laplace Filesystem[10772]: [10822]: ERROR: DANGER! ext4 on /dev/home-data/afs is NOT cluster-aware!
Apr 12 14:08:42 laplace Filesystem[10772]: [10824]: ERROR: DO NOT RUN IT AS A CLONE!

The message comes from the following code in ocf:heartbeat:Filesystem:

    case $FSTYPE in
    ocfs2)  ocfs2_init
            ;;
    nfs|smbfs|none|gfs2) :  # this is kind of safe too
            ;;
    *)      if [ -n "$OCF_RESKEY_CRM_meta_clone" ]; then
                ocf_log err "DANGER! $FSTYPE on $DEVICE is NOT cluster-aware!"
                ocf_log err "DO NOT RUN IT AS A CLONE!"
                ocf_log err "Politely refusing to proceed to avoid data corruption."
                exit $OCF_ERR_CONFIGURED
            fi
            ;;
    esac

The message is only printed if the variable OCF_RESKEY_CRM_meta_clone is
set (non-empty). Our configuration, however, does not run the filesystem
as a clone. Somehow the OCF_RESKEY_CRM_meta_clone variable leaked into the
start of the Filesystem resource. Is this a known bug?

Or is there a configuration error on our side?
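The check itself is easy to exercise outside the cluster. A minimal sketch follows (the `agent_stub` function is a made-up stand-in that only reproduces the agent's `-n` test, not the real agent). Note that `-n` matches any non-empty string, so even clone instance 0, where the variable holds the string "0", trips the refusal:

```shell
#!/bin/sh
# Stand-in for the Filesystem agent's clone check: it refuses whenever
# OCF_RESKEY_CRM_meta_clone is set to any non-empty value.
agent_stub() {
    if [ -n "$OCF_RESKEY_CRM_meta_clone" ]; then
        echo "refused: looks cloned (clone instance $OCF_RESKEY_CRM_meta_clone)"
    else
        echo "started"
    fi
}

# A plain primitive gets no clone meta variable in its environment.
unset OCF_RESKEY_CRM_meta_clone
agent_stub                    # -> started

# A clone instance gets its instance number; "0" is still non-empty.
export OCF_RESKEY_CRM_meta_clone=0
agent_stub                    # -> refused: looks cloned (clone instance 0)
```

So the question reduces to: why is the cluster exporting this variable for a resource that is not configured as a clone?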
Here is the current configuration:

node laplace \
        attributes standby="off"
node ries \
        attributes standby="off"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.143.228" cidr_netmask="24" \
        op monitor interval="30s" \
        meta target-role="Started"
primitive p_drbd_nfs ocf:linbit:drbd \
        params drbd_resource="home-data" \
        op monitor interval="15" role="Master" \
        op monitor interval="30" role="Slave"
primitive p_exportfs_afs ocf:heartbeat:exportfs \
        params fsid="1" directory="/srv/nfs/afs" \
        options="rw,no_root_squash,mountpoint" \
        clientspec="192.168.143.0/255.255.255.0" \
        wait_for_leasetime_on_stop="false" \
        op monitor interval="30s"
primitive p_fs_afs ocf:heartbeat:Filesystem \
        params device="/dev/home-data/afs" directory="/srv/nfs/afs" \
        fstype="ext4" \
        op monitor interval="10s" \
        meta target-role="Started"
primitive p_lsb_nfsserver lsb:nfs-kernel-server \
        op monitor interval="30s"
primitive p_lvm_nfs ocf:heartbeat:LVM \
        params volgrpname="home-data" \
        op monitor interval="30s"
group g_nfs p_lvm_nfs p_fs_afs p_exportfs_afs ClusterIP
ms ms_drbd_nfs p_drbd_nfs \
        meta master-max="1" master-node-max="1" clone-max="2" \
        clone-node-max="1" notify="true" target-role="Started"
clone cl_lsb_nfsserver p_lsb_nfsserver
colocation c_nfs_on_drbd inf: g_nfs ms_drbd_nfs:Master
order o_drbd_before_nfs inf: ms_drbd_nfs:promote g_nfs:start
property $id="cib-bootstrap-options" \
        dc-version="1.0.9-unknown" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1302610197"
rsc_defaults $id="rsc-options" \
        resource-stickiness="200"

Thanks
Christoph
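One way to rule out a stale clone wrapper in the live configuration is to look at the raw CIB XML (in a real cluster, `cibadmin -Q`) and check whether the primitive sits inside a <clone> or <master> element. A sketch, using an inline stand-in fragment instead of real cibadmin output (element names match what the CIB uses; the fragment itself is illustrative):

```shell
#!/bin/sh
# Does a given primitive id appear inside a <clone>/<master> element?
# The fragment below stands in for `cibadmin -Q` output from this config.
cib='<resources>
  <clone id="cl_lsb_nfsserver">
    <primitive id="p_lsb_nfsserver" class="lsb" type="nfs-kernel-server"/>
  </clone>
  <master id="ms_drbd_nfs">
    <primitive id="p_drbd_nfs" class="ocf" provider="linbit" type="drbd"/>
  </master>
  <group id="g_nfs">
    <primitive id="p_fs_afs" class="ocf" provider="heartbeat" type="Filesystem"/>
  </group>
</resources>'

is_cloned() {
    echo "$cib" | awk -v id="$1" '
        /<clone|<master/       { depth++ }
        /<\/clone>|<\/master>/ { depth-- }
        /<primitive/ && $0 ~ ("id=\"" id "\"") {
            if (depth > 0) print "cloned"; else print "not cloned"
            exit
        }'
}

is_cloned p_fs_afs            # -> not cloned
is_cloned p_lsb_nfsserver     # -> cloned
```

If `p_fs_afs` really shows up outside any clone/master element while the agent still sees OCF_RESKEY_CRM_meta_clone, the variable must be coming from somewhere else in the stack.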