Hi Serge Thank you. I applied it to my repository.
2012/2/6 Serge Dubrouski <serge...@gmail.com>: > Takatoshi - > > Please consider apllying attached patch to your version of pgsql RA. It's > all cosmetics: replacing tabs with spaces and improving English. Thanks for > implementing recommendations I expressed earlier. > > > On Tue, Dec 13, 2011 at 3:37 AM, Takatoshi MATSUO <matsuo....@gmail.com> > wrote: >> >> Hello Serge >> >> 2011/12/13 Serge Dubrouski <serge...@gmail.com>: >> > >> > >> > On Mon, Dec 12, 2011 at 4:28 AM, Takatoshi MATSUO <matsuo....@gmail.com> >> > wrote: >> >> >> >> Hello Serge >> >> >> >> 2011/12/12 Serge Dubrouski <serge...@gmail.com>: >> >> > >> >> > >> >> > On Thu, Dec 8, 2011 at 10:04 PM, Takatoshi MATSUO >> >> > <matsuo....@gmail.com> >> >> > wrote: >> >> >> >> >> >> Hello Serge >> >> >> >> >> >> 2011/12/8 Serge Dubrouski <serge...@gmail.com>: >> >> >> > >> >> >> > >> >> >> > On Mon, Dec 5, 2011 at 9:15 PM, Takatoshi MATSUO >> >> >> > <matsuo....@gmail.com> >> >> >> > wrote: >> >> >> >> >> >> >> >> Hello Serge >> >> >> >> >> >> >> >> Serge Dubrouski <serge...@gmail.com>: >> >> >> >> > Hello - >> >> >> >> > >> >> >> >> > Takatoshi MATSUO did a tremendous job on implementing support >> >> >> >> > for >> >> >> >> > streaming >> >> >> >> > replication feature in pgsql RA. Also it looks like PostgeSQL >> >> >> >> > 9.1 >> >> >> >> > has >> >> >> >> > all >> >> >> >> > necessary interfaces to successfully implement Pacemaker's M/S >> >> >> >> > concept. >> >> >> >> > So >> >> >> >> > I think it's time to start discussion on how to merge >> >> >> >> > Takatoshi's >> >> >> >> > work >> >> >> >> > into >> >> >> >> > pgsql RA baseline. Here is the link to Takatoshi's GitHUB if >> >> >> >> > somebody >> >> >> >> > wants >> >> >> >> > to test his RA: >> >> >> >> > >> >> >> >> > https://github.com/t-matsuo/ >> >> >> >> > >> >> >> >> > So far I tested it for backward compatibility in a standard >> >> >> >> > non-replication >> >> >> >> > mode and also tested M/S model and found no real issues. >> >> >> >> > Though >> >> >> >> > it >> >> >> >> > definitely requires some more polishing and testing. >> >> >> >> > >> >> >> >> > Takatoshi, here are some changes that I want to discuss with >> >> >> >> > you: >> >> >> >> > >> >> >> >> > 1. Is it possible to add a check for PostgreSQL version and >> >> >> >> > fail >> >> >> >> > with >> >> >> >> > OCF_ERR_INSTALLED when one tries to start replication on >> >> >> >> > version >> >> >> >> > less >> >> >> >> > than >> >> >> >> > 9.1? A simple cat on PG_VERSION with some analysis would >> >> >> >> > probably >> >> >> >> > do. >> >> >> >> >> >> >> >> I'll add a check. >> >> >> >> >> >> I added a check. >> >> >> >> >> >> >> >> >> >> >> >> https://github.com/t-matsuo/resource-agents/commit/3ab7cfdcce118043cd149b348740e50e7a946eb3 >> >> >> >> >> >> >> >> >> >> >> > 2. I think that following lines should be moved from >> >> >> >> > pgsql_start >> >> >> >> > to >> >> >> >> > pgsql_validate_all >> >> >> >> > >> >> >> >> > 535 # Check whether tmpdir is readable by pgdba user >> >> >> >> > 536 if ! runasowner "test -r $OCF_RESKEY_tmpdir"; then >> >> >> >> > 537 ocf_log err "Directory $OCF_RESKEY_tmpdir is not >> >> >> >> > readable >> >> >> >> > by >> >> >> >> > $OCF_RESKEY_pgdba" >> >> >> >> > 538 return $OCF_ERR_PERM >> >> >> >> > 539 fi >> >> >> >> >> >> >> >> Thanks. I think so too. >> >> >> >> I'll fix it. >> >> >> >> >> >> >> >> >> >> I fixed it and I deleted a check for tmpdir existence >> >> >> because the checking for permittion fills the role. >> >> >> >> >> >> >> >> >> >> >> >> https://github.com/t-matsuo/resource-agents/commit/82d4939486bcca429e2deb804d7faf756099bb59 >> >> >> >> >> >> >> >> >> > On a second thought I'm not sure why we need that parameter and >> >> >> > directory at >> >> >> > all. Why not to create rep_mode.conf, PGSQL.lock and xlog.note in >> >> >> > $OCF_RESKEY_pgdata ? What problems it can create? >> >> >> > >> >> >> > One more advantage to do >> >> >> > it in $OCF_RESKEY_pgdata is an ability to handle more than >> >> >> > PostgreSQL >> >> >> > instance on the same server without a need for additional temp >> >> >> > directories. >> >> >> >> >> >> When backup is needed, customers may backup these files and restore >> >> >> it. >> >> >> It may cause problems. >> >> >> Specially PGSQL.lock causes an error on start. >> >> >> >> >> >> I think that they should be treated as a separate thing. >> >> >> because they are independent from PostgreSQL's data. >> >> > >> >> > >> >> > Ok. Then may be it should default to something like >> >> > ${OCF_RESKEY_pgdata}/temp and RA should create it and set the right >> >> > ownership and permissions if it doesn't exist? And again may be in >> >> > this >> >> > case >> >> > we don't need that parameter? >> >> > >> >> >> >> I agree that RA creates it and sets the right ownership and >> >> permissions. >> >> But considering backup using tar command, I think it's not better to >> >> create it >> >> under $OCF_RESKEY_pgdata. >> >> >> >> According to "Filesystem Hierarchy Standard" >> >> it's better to use /var/lib/somewhere. >> >> >> >> >> >> >> >> http://www.pathname.com/fhs/pub/fhs-2.3.html#VARLIBVARIABLESTATEINFORMATION, >> >> >> >> Then I designed using /var/lig/[RA Name] (=/var/lib/pgsql). >> >> What do you think? >> > >> > >> > Ahh, I see now where confusion comes from. /var/lib/pgsql is actually >> > taken >> > :-) It's used as default home directory in RedHat (at least) for >> > postgres >> > user in PostgreSQL 8.XX. When they'll start start shipping version 9 >> > they'll >> > probably use it as well. OCF_RESKEY_pg_data defaults to >> > /var/lib/pgsql/data. >> > >> > Then, /var/lib is usually used for non-temporary data, For temporary >> > files >> > it's probably better to use /var/run or /var/tmp. If you still want to >> > use >> > /var/lib/pgsql you probably need to use /var/lib/pgsql/temp or so. >> >> Files under /var/run or /var/tmp are cleared by tmpwatch or at the >> beginning >> of the boot process, aren't they? >> >> Clearing PGSQL.lock file causes problem. >> >> >> > >> >> >> >> >> >> >> >> >> >> >> Incidentally I considered to handle more than PostgreSQL instance. >> >> >> Initially I added port number to these filenames, but I deleted it >> >> >> to simplify filenames. >> >> >> >> >> >> >> >> >> >> >> >> https://github.com/t-matsuo/resource-agents/commit/b16faf2d797200048dc0fc07a45b6751cf5be190 >> >> >> >> >> >> >> >> >> > Also I think it would be good if RA was able to take care of >> >> >> > adding >> >> >> > "include >> >> >> > $WHATEVER_DIR/rep_mode.conf" in postgresql.conf. It will make the >> >> >> > RA >> >> >> > self >> >> >> > sustainable. In a current situation admin has add that directive >> >> >> > manually. >> >> >> > RA though can something like this in a start function for >> >> >> > replication >> >> >> > mode: >> >> >> > >> >> >> > if ! grep -i "include $WHATEVER_DIR/rep_mode.conf" >> >> >> > $OCF_RESKEY_config >> >> >> > then >> >> >> > echo "include $WHATEVER_DIR/rep_mode.conf" >> >> >> >> > $OCF_RESKEY_config >> >> >> > fi >> >> >> >> >> >> Sounds good. >> >> >> >> >> >> > Don't know if it makes sense to remove it on stop. >> >> >> >> >> >> I think it doesn't make sense to remove it, >> >> >> because rep_mode.conf becomes empty on stop. >> >> > >> >> > >> >> > The only problem here if admin changes temp_dir parameter but doesn't >> >> > delete >> >> > records from postgres.conf. We could end up with several include >> >> > records >> >> > then or with records pointing to the non existing files. >> >> >> >> Should we consider this situation? >> >> >> >> If same parameters exist, PostgreSQL uses last parameter. >> >> So I will implement it to add several include if temp_dir is changed. >> > >> > >> > The problem happens when file gets deleted but reference to it still >> > exists >> > in postgres.conf. They DB fails to start. >> >> >> Start manually? >> The RA manages some statuses using an attribute and PGSQL.lock file. >> I don't recommend starting it manually in the first place. >> >> >> How about just describing it in a document? meta-data? >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> > 3. I don't really like this part of pgsql_real_monitor: >> >> >> >> > >> >> >> >> > 775 if ! is_replication; then >> >> >> >> > 776 OCF_RESKEY_monitor_sql=`escape_string > >> >> >> >> > "$OCF_RESKEY_monitor_sql"` >> >> >> >> > 777 runasowner -q $loglevel "$OCF_RESKEY_psql >> >> >> >> > $psql_options >> >> >> >> > -c >> >> >> >> > > '$OCF_RESKEY_monitor_sql'" >> >> >> >> > 778 rc=$? >> >> >> >> > 779 else >> >> >> >> > 780 output=`su $OCF_RESKEY_pgdba -c "cd >> >> >> >> > $OCF_RESKEY_pgdata; >> >> >> >> > > >> >> >> >> > $OCF_RESKEY_psql $psql_options -Atc \"${CHECK_MS_SQL}\""` >> >> >> >> > 781 rc=$? >> >> >> >> > 782 fi >> >> >> >> > >> >> >> >> > I think that functional monitor (the one that uses monitor_sql) >> >> >> >> > should >> >> >> >> > run >> >> >> >> > always independently of DB mode since its primary role is to >> >> >> >> > check >> >> >> >> > data >> >> >> >> > and >> >> >> >> > fail if it's not correct or corrupted. In replication mode >> >> >> >> > there >> >> >> >> > should >> >> >> >> > be >> >> >> >> > additional monitoring. Other way it misleads customer on a >> >> >> >> > usage >> >> >> >> > of >> >> >> >> > monitor_sql. >> >> >> >> >> >> >> >> All right. >> >> >> >> >> >> >> >> Does it need to execute "select now();" if monitor_sql parameter >> >> >> >> is >> >> >> >> empty? >> >> >> >> I think it's unnecessary. >> >> >> > >> >> >> > >> >> >> > For a case of an empty parameter you are right and running "select >> >> >> > now()" >> >> >> > probably unnecessary, Non-empty monitro_sql shall be executed in >> >> >> > my >> >> >> > opinion. >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> > 4. You already populate several attributes with crm_attribute. >> >> >> >> > Does >> >> >> >> > it >> >> >> >> > make >> >> >> >> > sense to populate a name of a Master node in promote function >> >> >> >> > and >> >> >> >> > use >> >> >> >> > it >> >> >> >> > later on instead of running crm_mon on each monitor command? >> >> >> >> > >> >> >> >> >> >> >> >> Do you mean that you don't want to use crm_mon in monitor? >> >> >> > >> >> >> > >> >> >> > I prefer to have as few dependencies on external programs as >> >> >> > possible. >> >> >> >> >> >> Agree. >> >> >> >> >> >> > Using crm_attribute to communicate the name of a master node would >> >> >> > be >> >> >> > consistent with the rest of the script since you already use it >> >> >> > for >> >> >> > communicating state of the nodes. If you prefer using crm_mon than >> >> >> > you >> >> >> > have >> >> >> > to add a check that that binary exists on the server. In 99.99% of >> >> >> > the >> >> >> > case >> >> >> > it will but still the check is necessary I think. >> >> >> >> >> >> I use crm_mon to check for node existence and node online too. >> >> >> It is impossible to get it from attribute, so I prefer to using >> >> >> crm_mon. >> >> >> >> >> >> Do I have to add a check for crm_master, crm_attribute, >> >> >> crm_failcount >> >> >> too? >> >> >> >> >> > >> >> > That's a different interesting topic. What is the reason for doing >> >> > this? >> >> > >> >> > my_fail_count=`$CRM_FAILCOUNT -r $OCF_RESOURCE_INSTANCE -N $HOSTNAME >> >> > -G >> >> > -Q | >> >> > sed "s/INFINITY/1000000/g"` >> >> > if [ "$my_fail_count" != "0" ]; then >> >> > ocf_log info "I don't start PostgreSQL on post-demote because >> >> > of >> >> > my >> >> > fail-count=$my_fail_count." >> >> > return $OCF_SUCCESS >> >> > fi >> >> > >> >> > Why not to leave pacemaker to do its job on managing failcounts? >> >> >> >> Because I want to make fail-over faster. >> >> At any rate Pacemaker stops it? >> > >> > >> > I'm probably wrong here. Looks like you use this trick in case if >> > slave's >> > data is newer than new master's data. In this case it makes sense to >> > disable >> > such slave. >> >> Why do you think so ? >> The check whether slave's data is newer than new master's one is here. >> >> in pgsql_post_demote() >> ------------------------------------------------------- >> 886 is_slave_right || return $OCF_ERR_GENERIC >> ------------------------------------------------------- >> >> This line is executed after this trick. >> >> >> >> Regards, >> Takatoshi MATSUO >> >> _______________________________________________________ >> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org >> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev >> Home Page: http://linux-ha.org/ > > > > > -- > Serge Dubrouski. > -- Takatoshi MATSUO _______________________________________________________ Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev Home Page: http://linux-ha.org/