Re: [Linux-ha-dev] pgsql and streaming replcation

Takatoshi MATSUO Sun, 05 Feb 2012 18:43:09 -0800

Hi Serge

Thank you.
I applied it to my repository.



2012/2/6 Serge Dubrouski <serge...@gmail.com>:
> Takatoshi -
>
> Please consider apllying attached patch to your version of pgsql RA. It's
> all cosmetics: replacing tabs with spaces and improving English. Thanks for
> implementing recommendations I expressed earlier.
>
>
> On Tue, Dec 13, 2011 at 3:37 AM, Takatoshi MATSUO <matsuo....@gmail.com>
> wrote:
>>
>> Hello Serge
>>
>> 2011/12/13 Serge Dubrouski <serge...@gmail.com>:
>> >
>> >
>> > On Mon, Dec 12, 2011 at 4:28 AM, Takatoshi MATSUO <matsuo....@gmail.com>
>> > wrote:
>> >>
>> >> Hello Serge
>> >>
>> >> 2011/12/12 Serge Dubrouski <serge...@gmail.com>:
>> >> >
>> >> >
>> >> > On Thu, Dec 8, 2011 at 10:04 PM, Takatoshi MATSUO
>> >> > <matsuo....@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Hello Serge
>> >> >>
>> >> >> 2011/12/8 Serge Dubrouski <serge...@gmail.com>:
>> >> >> >
>> >> >> >
>> >> >> > On Mon, Dec 5, 2011 at 9:15 PM, Takatoshi MATSUO
>> >> >> > <matsuo....@gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> Hello Serge
>> >> >> >>
>> >> >> >> Serge Dubrouski <serge...@gmail.com>:
>> >> >> >> > Hello -
>> >> >> >> >
>> >> >> >> > Takatoshi MATSUO did a tremendous job on implementing support
>> >> >> >> > for
>> >> >> >> > streaming
>> >> >> >> > replication feature in pgsql RA. Also it looks like PostgeSQL
>> >> >> >> > 9.1
>> >> >> >> > has
>> >> >> >> > all
>> >> >> >> > necessary interfaces to successfully implement  Pacemaker's M/S
>> >> >> >> > concept.
>> >> >> >> > So
>> >> >> >> > I think it's time to start discussion on how to merge
>> >> >> >> > Takatoshi's
>> >> >> >> > work
>> >> >> >> > into
>> >> >> >> > pgsql RA baseline. Here is the link to Takatoshi's GitHUB if
>> >> >> >> > somebody
>> >> >> >> > wants
>> >> >> >> > to test his RA:
>> >> >> >> >
>> >> >> >> > https://github.com/t-matsuo/
>> >> >> >> >
>> >> >> >> > So far I tested it for backward compatibility in a standard
>> >> >> >> > non-replication
>> >> >> >> > mode  and also tested M/S model and found no real issues.
>> >> >> >> > Though
>> >> >> >> > it
>> >> >> >> > definitely requires some more polishing and testing.
>> >> >> >> >
>> >> >> >> > Takatoshi, here are some changes that I want to discuss with
>> >> >> >> > you:
>> >> >> >> >
>> >> >> >> > 1. Is it possible to add a check for PostgreSQL version and
>> >> >> >> > fail
>> >> >> >> > with
>> >> >> >> > OCF_ERR_INSTALLED when one tries to start replication on
>> >> >> >> > version
>> >> >> >> > less
>> >> >> >> > than
>> >> >> >> > 9.1? A simple cat on PG_VERSION with some analysis would
>> >> >> >> > probably
>> >> >> >> > do.
>> >> >> >>
>> >> >> >> I'll add a check.
>> >> >>
>> >> >> I added a check.
>> >> >>
>> >> >>
>> >> >>
>> >> >> https://github.com/t-matsuo/resource-agents/commit/3ab7cfdcce118043cd149b348740e50e7a946eb3
>> >> >>
>> >> >> >>
>> >> >> >> > 2. I think that following lines should be moved from
>> >> >> >> > pgsql_start
>> >> >> >> > to
>> >> >> >> > pgsql_validate_all
>> >> >> >> >
>> >> >> >> >  535     # Check whether tmpdir is readable by pgdba user
>> >> >> >> >  536     if ! runasowner "test -r $OCF_RESKEY_tmpdir"; then
>> >> >> >> >  537         ocf_log err "Directory $OCF_RESKEY_tmpdir is not
>> >> >> >> > readable
>> >> >> >> > by
>> >> >> >> > $OCF_RESKEY_pgdba"
>> >> >> >> >  538         return $OCF_ERR_PERM
>> >> >> >> >  539     fi
>> >> >> >>
>> >> >> >> Thanks. I think so too.
>> >> >> >> I'll fix it.
>> >> >> >>
>> >> >>
>> >> >> I fixed it and I deleted a check for tmpdir existence
>> >> >> because the checking for permittion fills the role.
>> >> >>
>> >> >>
>> >> >>
>> >> >> https://github.com/t-matsuo/resource-agents/commit/82d4939486bcca429e2deb804d7faf756099bb59
>> >> >>
>> >> >>
>> >> >> > On a second thought I'm not sure why we need that parameter and
>> >> >> > directory at
>> >> >> > all. Why not to create rep_mode.conf, PGSQL.lock and xlog.note in
>> >> >> > $OCF_RESKEY_pgdata ? What problems it can create?
>> >> >> >
>> >> >> > One more advantage to do
>> >> >> > it in $OCF_RESKEY_pgdata is an ability to handle more than
>> >> >> > PostgreSQL
>> >> >> > instance on the same server without a need for additional temp
>> >> >> > directories.
>> >> >>
>> >> >> When backup is needed, customers may backup these files and restore
>> >> >> it.
>> >> >> It may cause problems.
>> >> >> Specially PGSQL.lock causes an error on start.
>> >> >>
>> >> >> I think that they should be treated as a separate thing.
>> >> >> because they are independent from PostgreSQL's data.
>> >> >
>> >> >
>> >> > Ok. Then may be it should default to something like
>> >> > ${OCF_RESKEY_pgdata}/temp and RA should create it and set the right
>> >> > ownership and permissions if it doesn't exist? And again may be in
>> >> > this
>> >> > case
>> >> > we don't need that parameter?
>> >> >
>> >>
>> >> I agree that RA creates it and sets the right ownership and
>> >> permissions.
>> >> But considering backup using tar command, I think it's not better to
>> >> create it
>> >> under $OCF_RESKEY_pgdata.
>> >>
>> >> According to "Filesystem Hierarchy Standard"
>> >> it's better to use /var/lib/somewhere.
>> >>
>> >>
>> >>
>> >> http://www.pathname.com/fhs/pub/fhs-2.3.html#VARLIBVARIABLESTATEINFORMATION,
>> >>
>> >> Then I designed using /var/lig/[RA Name] (=/var/lib/pgsql).
>> >> What do you think?
>> >
>> >
>> > Ahh, I see now where confusion comes from. /var/lib/pgsql is actually
>> > taken
>> > :-) It's used as default home directory in RedHat (at least) for
>> > postgres
>> > user in PostgreSQL 8.XX. When they'll start start shipping version 9
>> > they'll
>> > probably use it as well. OCF_RESKEY_pg_data  defaults to
>> > /var/lib/pgsql/data.
>> >
>> > Then, /var/lib is usually used for non-temporary data, For temporary
>> > files
>> > it's probably better to use /var/run or /var/tmp. If you still want to
>> > use
>> > /var/lib/pgsql you probably need to use /var/lib/pgsql/temp or so.
>>
>> Files under /var/run or /var/tmp are cleared by tmpwatch or at the
>> beginning
>> of the boot process, aren't they?
>>
>> Clearing PGSQL.lock file causes problem.
>>
>>
>> >
>> >>
>> >> >>
>> >> >>
>> >> >> Incidentally I considered to handle more than PostgreSQL instance.
>> >> >> Initially I added port number to these filenames, but I deleted it
>> >> >> to simplify filenames.
>> >> >>
>> >> >>
>> >> >>
>> >> >> https://github.com/t-matsuo/resource-agents/commit/b16faf2d797200048dc0fc07a45b6751cf5be190
>> >> >>
>> >> >>
>> >> >> > Also I think it would be good if RA was able to take care of
>> >> >> > adding
>> >> >> > "include
>> >> >> > $WHATEVER_DIR/rep_mode.conf" in postgresql.conf. It will make the
>> >> >> > RA
>> >> >> > self
>> >> >> > sustainable. In a current situation admin has add that directive
>> >> >> > manually.
>> >> >> > RA though can something like this in a start function for
>> >> >> > replication
>> >> >> > mode:
>> >> >> >
>> >> >> > if ! grep -i "include $WHATEVER_DIR/rep_mode.conf"
>> >> >> > $OCF_RESKEY_config
>> >> >> > then
>> >> >> >      echo "include $WHATEVER_DIR/rep_mode.conf" >>
>> >> >> > $OCF_RESKEY_config
>> >> >> > fi
>> >> >>
>> >> >> Sounds good.
>> >> >>
>> >> >> > Don't know if it makes sense to remove it on stop.
>> >> >>
>> >> >> I think it doesn't make sense to remove it,
>> >> >> because rep_mode.conf becomes empty on stop.
>> >> >
>> >> >
>> >> > The only problem here if admin changes temp_dir parameter but doesn't
>> >> > delete
>> >> > records from postgres.conf. We could end up with several include
>> >> > records
>> >> > then or with records pointing to the non existing files.
>> >>
>> >> Should we consider this situation?
>> >>
>> >> If same parameters exist, PostgreSQL uses last parameter.
>> >> So I will implement it to add several include if temp_dir is changed.
>> >
>> >
>> > The problem happens when file gets deleted but reference to it still
>> > exists
>> > in postgres.conf. They DB fails to start.
>>
>>
>> Start manually?
>> The RA manages some statuses using an attribute and PGSQL.lock file.
>> I don't recommend starting it manually in the first place.
>>
>>
>> How about just describing it in a document? meta-data?
>>
>>
>> >
>> >>
>> >>
>> >>
>> >> >>
>> >> >> >> > 3. I don't really like this part of pgsql_real_monitor:
>> >> >> >> >
>> >> >> >> >  775     if ! is_replication; then
>> >> >> >> >  776         OCF_RESKEY_monitor_sql=`escape_string >
>> >> >> >> > "$OCF_RESKEY_monitor_sql"`
>> >> >> >> >  777         runasowner -q $loglevel "$OCF_RESKEY_psql
>> >> >> >> > $psql_options
>> >> >> >> > -c
>> >> >> >> > > '$OCF_RESKEY_monitor_sql'"
>> >> >> >> >  778         rc=$?
>> >> >> >> >  779     else
>> >> >> >> >  780         output=`su $OCF_RESKEY_pgdba -c "cd
>> >> >> >> > $OCF_RESKEY_pgdata;
>> >> >> >> > >
>> >> >> >> > $OCF_RESKEY_psql $psql_options -Atc \"${CHECK_MS_SQL}\""`
>> >> >> >> >  781         rc=$?
>> >> >> >> >  782     fi
>> >> >> >> >
>> >> >> >> > I think that functional monitor (the one that uses monitor_sql)
>> >> >> >> > should
>> >> >> >> > run
>> >> >> >> > always independently of DB mode since its primary role is to
>> >> >> >> > check
>> >> >> >> > data
>> >> >> >> > and
>> >> >> >> > fail if it's not correct or corrupted. In replication mode
>> >> >> >> > there
>> >> >> >> > should
>> >> >> >> > be
>> >> >> >> > additional monitoring. Other way it misleads customer on a
>> >> >> >> > usage
>> >> >> >> > of
>> >> >> >> > monitor_sql.
>> >> >> >>
>> >> >> >> All right.
>> >> >> >>
>> >> >> >> Does it need to execute "select now();" if monitor_sql parameter
>> >> >> >> is
>> >> >> >> empty?
>> >> >> >> I think it's unnecessary.
>> >> >> >
>> >> >> >
>> >> >> > For a case of an empty parameter you are right and running "select
>> >> >> > now()"
>> >> >> > probably unnecessary, Non-empty monitro_sql shall be executed in
>> >> >> > my
>> >> >> > opinion.
>> >> >> >
>> >> >> >>
>> >> >> >>
>> >> >> >> > 4. You already populate several attributes with crm_attribute.
>> >> >> >> > Does
>> >> >> >> > it
>> >> >> >> > make
>> >> >> >> > sense to populate a name of a Master node in promote function
>> >> >> >> > and
>> >> >> >> > use
>> >> >> >> > it
>> >> >> >> > later on instead of running crm_mon on each monitor command?
>> >> >> >> >
>> >> >> >>
>> >> >> >> Do you mean that you don't want to use crm_mon in monitor?
>> >> >> >
>> >> >> >
>> >> >> > I prefer to have as few dependencies  on external programs as
>> >> >> > possible.
>> >> >>
>> >> >> Agree.
>> >> >>
>> >> >> > Using crm_attribute to communicate the name of a master node would
>> >> >> > be
>> >> >> > consistent with the rest of the script since you already use it
>> >> >> > for
>> >> >> > communicating state of the nodes. If you prefer using crm_mon than
>> >> >> > you
>> >> >> > have
>> >> >> > to add a check that that binary exists on the server. In 99.99% of
>> >> >> > the
>> >> >> > case
>> >> >> > it will but still the check is necessary I think.
>> >> >>
>> >> >> I use crm_mon to check for node existence and node online too.
>> >> >> It is impossible to get it from attribute, so I prefer to using
>> >> >> crm_mon.
>> >> >>
>> >> >> Do I have to add a check for crm_master, crm_attribute,
>> >> >> crm_failcount
>> >> >> too?
>> >> >>
>> >> >
>> >> > That's a different  interesting topic. What is the reason for doing
>> >> > this?
>> >> >
>> >> > my_fail_count=`$CRM_FAILCOUNT -r $OCF_RESOURCE_INSTANCE -N $HOSTNAME
>> >> > -G
>> >> > -Q |
>> >> > sed "s/INFINITY/1000000/g"`
>> >> > if [ "$my_fail_count" != "0" ]; then
>> >> >       ocf_log info "I don't start PostgreSQL on post-demote because
>> >> > of
>> >> > my
>> >> > fail-count=$my_fail_count."
>> >> >       return $OCF_SUCCESS
>> >> > fi
>> >> >
>> >> > Why not to leave pacemaker to do its job on managing failcounts?
>> >>
>> >> Because I want to make fail-over faster.
>> >> At any rate Pacemaker stops it?
>> >
>> >
>> > I'm probably wrong here. Looks like you use this trick in case if
>> > slave's
>> > data is newer than new master's data. In this case it makes sense to
>> > disable
>> > such slave.
>>
>> Why do you think so ?
>> The check whether slave's data is newer than new master's one  is here.
>>
>> in pgsql_post_demote()
>> -------------------------------------------------------
>> 886 is_slave_right || return $OCF_ERR_GENERIC
>> -------------------------------------------------------
>>
>> This line is executed after this trick.
>>
>>
>>
>> Regards,
>> Takatoshi MATSUO
>>
>> _______________________________________________________
>> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>> Home Page: http://linux-ha.org/
>
>
>
>
> --
> Serge Dubrouski.
>

--
Takatoshi MATSUO
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/

Re: [Linux-ha-dev] pgsql and streaming replcation

Reply via email to