Re: [Pacemaker] Postgres RA won't start
On Wed, Oct 12, 2011 at 07:41:20PM -0600, Serge Dubrouski wrote:
> On Wed, Oct 12, 2011 at 9:20 AM, Amar Prasovic a...@linux.org.ba wrote:
> > Thank you all for tips and suggestions. I managed to configure postgres so it actually starts.
> >
> > First, I updated resource-agents (Florian thanks for the tip, still don't know how I managed to miss that :) )
> >
> > Second, I deleted the postgres primitive, cleared all failcounts and configured it again like this:
> >
> > primitive postgres_res ocf:heartbeat:pgsql \
> >     params pgctl=/usr/lib/postgresql/8.4/bin/pg_ctl psql=/usr/bin/psql start_opt= pgdata=/var/lib/postgresql/8.4/main config=/etc/postgresql/8.4/main/postgresql.conf pgdba=postgres \
> >     op start interval=0 timeout=120s \
> >     op stop interval=0 timeout=120s \
> >     op monitor interval=30s timeout=30s depth=0
> >
> > After that, it all worked like a charm.
> >
> > However, I noticed some strange output in the log file; it wasn't there before I updated the resource-agents.
> > Here is the extract from the syslog: http://pastebin.com/ybPi0VMp
> >
> > (postgres_res:monitor:stderr) [: 647: monitor: unexpected operator
> >
> > This error is actually reported with any operation. I tried to start the script from the CLI and got the same thing with ./pgsql start, ./pgsql status, ./pgsql stop
>
> Weird. I don't know what to tell you. The RA is basically all right, it just misses one not very important fix. On my system (CentOS 5, PostgreSQL 8.4 or 9.0) it doesn't produce any errors.
>
> If I understand your log right, the problem is in line 647 of the RA, which is:
>
> [ $1 == validate-all ] && exit $rc

== != =

Make that

    [ $1 = validate-all ] && exit $rc

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
Re: [Pacemaker] Postgres RA won't start
On Thu, Oct 13, 2011 at 4:29 AM, Lars Ellenberg lars.ellenb...@linbit.com wrote:
> On Wed, Oct 12, 2011 at 07:41:20PM -0600, Serge Dubrouski wrote:
> > On Wed, Oct 12, 2011 at 9:20 AM, Amar Prasovic a...@linux.org.ba wrote:
> > > Thank you all for tips and suggestions. I managed to configure postgres so it actually starts.
> > >
> > > First, I updated resource-agents (Florian thanks for the tip, still don't know how I managed to miss that :) )
> > >
> > > Second, I deleted the postgres primitive, cleared all failcounts and configured it again like this:
> > >
> > > primitive postgres_res ocf:heartbeat:pgsql \
> > >     params pgctl=/usr/lib/postgresql/8.4/bin/pg_ctl psql=/usr/bin/psql start_opt= pgdata=/var/lib/postgresql/8.4/main config=/etc/postgresql/8.4/main/postgresql.conf pgdba=postgres \
> > >     op start interval=0 timeout=120s \
> > >     op stop interval=0 timeout=120s \
> > >     op monitor interval=30s timeout=30s depth=0
> > >
> > > After that, it all worked like a charm.
> > >
> > > However, I noticed some strange output in the log file; it wasn't there before I updated the resource-agents.
> > > Here is the extract from the syslog: http://pastebin.com/ybPi0VMp
> > >
> > > (postgres_res:monitor:stderr) [: 647: monitor: unexpected operator
> > >
> > > This error is actually reported with any operation. I tried to start the script from the CLI and got the same thing with ./pgsql start, ./pgsql status, ./pgsql stop
> >
> > Weird. I don't know what to tell you. The RA is basically all right, it just misses one not very important fix. On my system (CentOS 5, PostgreSQL 8.4 or 9.0) it doesn't produce any errors.
> >
> > If I understand your log right, the problem is in line 647 of the RA, which is:
> >
> > [ $1 == validate-all ] && exit $rc
>
> == != =

Theoretically yes, = is for strings and == is for numbers. But why would it create a problem on Debian and not on CentOS, and why has nobody else reported this issue so far?

BTW, other RAs use the == operator as well: apache, LVM, portblock,

> Make that
>
> [ $1 = validate-all ] && exit $rc

--
Serge Dubrouski.
Re: [Pacemaker] Postgres RA won't start
On Thu, Oct 13, 2011 at 06:35:27AM -0600, Serge Dubrouski wrote:
> On Thu, Oct 13, 2011 at 4:29 AM, Lars Ellenberg lars.ellenb...@linbit.com wrote:
> > On Wed, Oct 12, 2011 at 07:41:20PM -0600, Serge Dubrouski wrote:
> > > On Wed, Oct 12, 2011 at 9:20 AM, Amar Prasovic a...@linux.org.ba wrote:
> > > > Thank you all for tips and suggestions. I managed to configure postgres so it actually starts.
> > > >
> > > > First, I updated resource-agents (Florian thanks for the tip, still don't know how I managed to miss that :) )
> > > >
> > > > Second, I deleted the postgres primitive, cleared all failcounts and configured it again like this:
> > > >
> > > > primitive postgres_res ocf:heartbeat:pgsql \
> > > >     params pgctl=/usr/lib/postgresql/8.4/bin/pg_ctl psql=/usr/bin/psql start_opt= pgdata=/var/lib/postgresql/8.4/main config=/etc/postgresql/8.4/main/postgresql.conf pgdba=postgres \
> > > >     op start interval=0 timeout=120s \
> > > >     op stop interval=0 timeout=120s \
> > > >     op monitor interval=30s timeout=30s depth=0
> > > >
> > > > After that, it all worked like a charm.
> > > >
> > > > However, I noticed some strange output in the log file; it wasn't there before I updated the resource-agents.
> > > > Here is the extract from the syslog: http://pastebin.com/ybPi0VMp
> > > >
> > > > (postgres_res:monitor:stderr) [: 647: monitor: unexpected operator
> > > >
> > > > This error is actually reported with any operation. I tried to start the script from the CLI and got the same thing with ./pgsql start, ./pgsql status, ./pgsql stop
> > >
> > > Weird. I don't know what to tell you. The RA is basically all right, it just misses one not very important fix. On my system (CentOS 5, PostgreSQL 8.4 or 9.0) it doesn't produce any errors.
> > >
> > > If I understand your log right, the problem is in line 647 of the RA, which is:
> > >
> > > [ $1 == validate-all ] && exit $rc
> >
> > == != =
>
> Theoretically yes, = is for strings and == is for numbers. But why would it create a problem on Debian and not on CentOS, and why has nobody else reported this issue so far?
>
> BTW, other RAs use the == operator as well: apache, LVM, portblock,

As you found out by now, if they are bash, that's ok. If they are /bin/sh, then that's a bug. dash, for example, does not like ==.

And no, apache and portblock use these in some embedded awk script. LVM I fixed as well.

> > Make that
> >
> > [ $1 = validate-all ] && exit $rc

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
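The portability point made in this reply can be tried out in isolation. The following is a minimal sketch, not code from the pgsql RA: the function name is_validate_all is hypothetical, and the quoting around "$1" is an addition that keeps the test well-formed when the argument is empty. It relies only on the POSIX = string comparison, so it should behave the same under dash and bash.

```sh
#!/bin/sh
# Hypothetical helper for illustration; it is not part of the pgsql RA.
# "=" is the string comparison defined by POSIX for test/[ and is accepted
# by dash, bash and ksh. "==" is an extension that dash rejects with
# "unexpected operator".
is_validate_all() {
    [ "$1" = "validate-all" ]
}

# Walk through a few action names the way an RA receives them in $1.
for action in start monitor validate-all; do
    if is_validate_all "$action"; then
        echo "$action: validate-all requested"
    else
        echo "$action: some other action"
    fi
done
```

To see which shell a #!/bin/sh script actually runs under, ls -l /bin/sh is usually enough. On Debian Squeeze it normally points to dash, while on CentOS 5 it is bash, which would explain why the same line only fails on one of the two systems.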
Re: [Pacemaker] Postgres RA won't start
Thank you all for tips and suggestions. I managed to configure postgres so it actually starts.

First, I updated resource-agents (Florian thanks for the tip, still don't know how I managed to miss that :) )

Second, I deleted the postgres primitive, cleared all failcounts and configured it again like this:

    primitive postgres_res ocf:heartbeat:pgsql \
        params pgctl=/usr/lib/postgresql/8.4/bin/pg_ctl psql=/usr/bin/psql start_opt= pgdata=/var/lib/postgresql/8.4/main config=/etc/postgresql/8.4/main/postgresql.conf pgdba=postgres \
        op start interval=0 timeout=120s \
        op stop interval=0 timeout=120s \
        op monitor interval=30s timeout=30s depth=0

After that, it all worked like a charm.

However, I noticed some strange output in the log file; it wasn't there before I updated the resource-agents.
Here is the extract from the syslog: http://pastebin.com/ybPi0VMp

    (postgres_res:monitor:stderr) [: 647: monitor: unexpected operator

This error is actually reported with any operation. I tried to start the script from the CLI and got the same thing with ./pgsql start, ./pgsql status, ./pgsql stop

Here is the pgsql script I am using: http://pastebin.com/55mKNDCM

P.S. You can ignore the nginx errors in syslog, I will open a new topic about that.
Re: [Pacemaker] Postgres RA won't start
On Wed, Oct 12, 2011 at 9:20 AM, Amar Prasovic a...@linux.org.ba wrote:
> Thank you all for tips and suggestions. I managed to configure postgres so it actually starts.
>
> First, I updated resource-agents (Florian thanks for the tip, still don't know how I managed to miss that :) )
>
> Second, I deleted the postgres primitive, cleared all failcounts and configured it again like this:
>
> primitive postgres_res ocf:heartbeat:pgsql \
>     params pgctl=/usr/lib/postgresql/8.4/bin/pg_ctl psql=/usr/bin/psql start_opt= pgdata=/var/lib/postgresql/8.4/main config=/etc/postgresql/8.4/main/postgresql.conf pgdba=postgres \
>     op start interval=0 timeout=120s \
>     op stop interval=0 timeout=120s \
>     op monitor interval=30s timeout=30s depth=0
>
> After that, it all worked like a charm.
>
> However, I noticed some strange output in the log file; it wasn't there before I updated the resource-agents.
> Here is the extract from the syslog: http://pastebin.com/ybPi0VMp
>
> (postgres_res:monitor:stderr) [: 647: monitor: unexpected operator
>
> This error is actually reported with any operation. I tried to start the script from the CLI and got the same thing with ./pgsql start, ./pgsql status, ./pgsql stop

Weird. I don't know what to tell you. The RA is basically all right, it just misses one not very important fix. On my system (CentOS 5, PostgreSQL 8.4 or 9.0) it doesn't produce any errors.

If I understand your log right, the problem is in line 647 of the RA, which is:

    [ $1 == validate-all ] && exit $rc

I do not see why it would complain on this line.

> Here is the pgsql script I am using: http://pastebin.com/55mKNDCM
>
> P.S. You can ignore the nginx errors in syslog, I will open a new topic about that.

--
Serge Dubrouski.
Re: [Pacemaker] Postgres RA won't start
On 2011-10-11 16:10, Amar Prasovic wrote:
> Hello everyone,
>
> I tried to configure postgres RA and I ran into some problems.

[...]

> in crm_mon
>
> Online: [ webnode02 webnode01 ]
>
>  Master/Slave Set: drbd_cluster
>      Masters: [ webnode01 ]
>      Slaves: [ webnode02 ]
>  Resource Group: cluster_1
>      fs_res (ocf::heartbeat:Filesystem): Started webnode01
>      ClusterIP (ocf::heartbeat:IPaddr2): Started webnode01
>      nginx_res (ocf::heartbeat:nginx): Started webnode01
>      postgres_res (ocf::heartbeat:pgsql): Stopped
>
> Failed actions:
>     postgres_res_start_0 (node=webnode01, call=84, rc=5, status=complete): not installed
>     postgres_res_start_0 (node=webnode02, call=66, rc=5, status=complete): not installed

There are just 4 scenarios in which pgsql returns OCF_ERR_INSTALLED:

- The resource agent is not installed or is not executable (unlikely);
- pgctl or psql are not installed or not executable;
- the configuration file does not exist or is not readable during a non-probe;
- the username identified by the pgdba resource parameter does not resolve to a uid.

All of those do log error messages to the log though. You can grep for ERROR in your logs; it should turn up what went wrong.

Cheers,
Florian

--
Need help with Pacemaker?
http://www.hastexo.com/now
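Three of the four scenarios listed above can be checked by hand before involving the cluster. The following is a rough sketch under stated assumptions, not the validation code of the pgsql RA: the function name check_pgsql_prereqs and its messages are hypothetical, and the paths in the example call are the ones from the primitive posted in this thread. The first scenario (the RA itself missing or not executable) has to be checked separately.

```sh
#!/bin/sh
# Hypothetical pre-flight check; the real pgsql RA does its own validation.
# Arguments mirror the pgctl, psql, config and pgdba resource parameters.
check_pgsql_prereqs() {
    pgctl=$1
    psql=$2
    config=$3
    pgdba=$4
    rc=0

    # pgctl and psql must exist and be executable.
    for bin in "$pgctl" "$psql"; do
        if [ ! -x "$bin" ]; then
            echo "ERROR: $bin is missing or not executable"
            rc=1
        fi
    done

    # The configuration file must be readable.
    if [ ! -r "$config" ]; then
        echo "ERROR: configuration file $config is missing or not readable"
        rc=1
    fi

    # The pgdba user name must resolve to a uid.
    if ! id -u "$pgdba" >/dev/null 2>&1; then
        echo "ERROR: user $pgdba does not resolve to a uid"
        rc=1
    fi

    return $rc
}

# Example call with the values from the primitive in this thread.
check_pgsql_prereqs \
    /usr/lib/postgresql/8.4/bin/pg_ctl \
    /usr/bin/psql \
    /etc/postgresql/8.4/main/postgresql.conf \
    postgres || echo "at least one prerequisite is not met"
```

For the log search itself, something like grep ERROR /var/log/syslog should do, assuming the cluster logs go to syslog in the default Debian location.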
Re: [Pacemaker] Postgres RA won't start
> What version of resource-agents package do you use? Old version of pgsql depended on fuser tool installed, otherwise it could fail with that error code.

Hello Serge,

thank you for your answer. I don't have any resource-agents installed. The system is Debian Squeeze 6.0.3 and it automatically installed cluster-agents 1.0.3-3.1

When I try to install resource-agents I run into dependency problems:

    webnode01 postgresql # apt-get install resource-agents
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    Some packages could not be installed. This may mean that you have
    requested an impossible situation or if you are using the unstable
    distribution that some required packages have not yet been created
    or been moved out of Incoming.
    The following information may help to resolve the situation:

    The following packages have unmet dependencies:
     resource-agents : Depends: libplumb2 but it is not going to be installed
                       Depends: libplumbgpl2 but it is not going to be installed
    E: Broken packages

When I try to install libplumb2, the installation wants to remove pacemaker:

    webnode01 postgresql # apt-get install libplumb2
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    The following packages were automatically installed and are no longer required:
      libsensors4 libsnmp15 libheartbeat2 corosync libnspr4-0d libtimedate-perl libsnmp-base openhpid libcurl3 libssh2-1 lm-sensors libopenhpi2 fancontrol libopenipmi0 libperl5.10 libesmtp5 libcorosync4 libnet1 libnss3-1d
    Use 'apt-get autoremove' to remove them.
    The following extra packages will be installed:
      libpils2
    The following packages will be REMOVED:
      cluster-agents cluster-glue libcluster-glue pacemaker
    The following NEW packages will be installed:
      libpils2 libplumb2
    0 upgraded, 2 newly installed, 4 to remove and 0 not upgraded.
    Need to get 115 kB of archives.
    After this operation, 5,874 kB disk space will be freed.
    Do you want to continue [Y/n]? n
    Abort.

Can I do something with the fuser tool?
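Whether fuser is available at all can be checked quickly. This is a small sketch only; the helper name have_tool is hypothetical and nothing here comes from the pgsql RA. It uses command -v, which is the POSIX way to look up a command on PATH.

```sh
#!/bin/sh
# Hypothetical helper: succeed if the named command can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool fuser; then
    echo "fuser found at $(command -v fuser)"
else
    echo "fuser not found; on Debian it is provided by the psmisc package"
fi
```

If fuser turns out to be missing, installing psmisc would only address the old fuser dependency mentioned in the quoted question. It does not update the RA itself.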
Re: [Pacemaker] Postgres RA won't start
On 2011-10-11 17:10, Amar Prasovic wrote:
> > What version of resource-agents package do you use? Old version of pgsql depended on fuser tool installed, otherwise it could fail with that error code.
>
> Hello Serge,
>
> thank you for your answer. I don't have any resource-agents installed. The system is Debian Squeeze 6.0.3 and it automatically installed cluster-agents 1.0.3-3.1
>
> When I try to install resource-agents I run into dependency problems:

Yeah, that's a bit awkward. The squeeze package is called cluster-agents, but then it was decided that the package should be named resource-agents as on all other platforms, and that's the current name in squeeze-backports.

> webnode01 postgresql # apt-get install resource-agents

Do "apt-get -t squeeze-backports install resource-agents" instead.

Florian

--
Need help with High Availability?
http://www.hastexo.com/now
Re: [Pacemaker] Postgres RA won't start
I don't have much experience with pacemaker on Debian. I'd also suggest getting the latest version of the pgsql RA from git, though if your base package is too old there could be conflicts.

On Oct 11, 2011 9:11 AM, Amar Prasovic a...@linux.org.ba wrote:
> > What version of resource-agents package do you use? Old version of pgsql depended on fuser tool installed, otherwise it could fail with that error code.
>
> Hello Serge,
>
> thank you for your answer. I don't have any resource-agents installed. The system is Debian Squeeze 6.0.3 and it automatically installed cluster-agents 1.0.3-3.1
>
> When I try to install resource-agents I run into dependency problems:
>
> webnode01 postgresql # apt-get install resource-agents
> Reading package lists... Done
> Building dependency tree
> Reading state information... Done
> Some packages could not be installed. This may mean that you have
> requested an impossible situation or if you are using the unstable
> distribution that some required packages have not yet been created
> or been moved out of Incoming.
> The following information may help to resolve the situation:
>
> The following packages have unmet dependencies:
>  resource-agents : Depends: libplumb2 but it is not going to be installed
>                    Depends: libplumbgpl2 but it is not going to be installed
> E: Broken packages
>
> When I try to install libplumb2, the installation wants to remove pacemaker:
>
> webnode01 postgresql # apt-get install libplumb2
> Reading package lists... Done
> Building dependency tree
> Reading state information... Done
> The following packages were automatically installed and are no longer required:
>   libsensors4 libsnmp15 libheartbeat2 corosync libnspr4-0d libtimedate-perl libsnmp-base openhpid libcurl3 libssh2-1 lm-sensors libopenhpi2 fancontrol libopenipmi0 libperl5.10 libesmtp5 libcorosync4 libnet1 libnss3-1d
> Use 'apt-get autoremove' to remove them.
> The following extra packages will be installed:
>   libpils2
> The following packages will be REMOVED:
>   cluster-agents cluster-glue libcluster-glue pacemaker
> The following NEW packages will be installed:
>   libpils2 libplumb2
> 0 upgraded, 2 newly installed, 4 to remove and 0 not upgraded.
> Need to get 115 kB of archives.
> After this operation, 5,874 kB disk space will be freed.
> Do you want to continue [Y/n]? n
> Abort.
>
> Can I do something with the fuser tool?