On Wed, Mar 10, 2010 at 10:53:54AM +0200, Marian Marinov wrote:
> +     mysql --password=$OCF_RESKEY_replication_passwd \
> +             --user=$OCF_RESKEY_replication_user \
> +             -h $master_host \
> +             -O connect_timeout=1 \
> +             -e 'SHOW MASTER STATUS\G'|while read k v; do
> +             if [ "$k" = 'File:' ]; then master_log="$v"; fi
> +             if [ "$k" = 'Position:' ]; then master_pos="$v"; fi
> +     done

This is broken, sorry I did not spot this in the previous reviews.

  k="before the subshell" v="nothing to see here"
  K="no set yet, either"  V="I'm sorry"
  echo bla blub | while read k v; do
        echo "k=$k, v=$v"
        K=$k V=$v
        echo "K=$K, V=$V"
  done
  echo "k=$k, v=$v"
  echo "K=$K, V=$V"


output:
        k=bla, v=blub
        K=bla, V=blub
        k=before the subshell, v=nothing to see here
        K=no set yet, either, V=I'm sorry

everything behind the | is executed in an implicit subshell,
so whatever variable assignments you do within the while loop,
the main shell will not see them.

Similarly for all other occurences where you do
some_command | grep | awk | sed | while :; do x=something; done
the x=something will not be seen after the "done"

Maybe, if the output of the "show master status" is stable enough
(over mysql versions, settings, and time), you could simply do
        local tmp l p
        tmp=$(mysql -N -B -e 'show master status')
        # store and test $?, if appropriate
        # if you do set -- $(mysql), you get the exit code of set,
        # which is 0 ;-)
        set -- $tmp # no quotes!
        l=$1 p=$2


That logic looks much nicer now,
only...

> +     if [ -z "$master_log" ] || [ -z "$master_pos" ]; then
> +             ocf_log err "unable to find master log or master position"
> +             return $OCF_ERR_GENERIC;

No...
you should not fail a monitor, because you cannot reach the master.
it may just have failed.
now, if you fail monitor for all slaves just because the master failed,
pacemaker will go berserk...

If you cannot reach the master, you probably should just (noisily) skip
the preference calculation, not change any preference value,
and return success, as long as the slave itself is still working fine.

If the master actually failed, then the next master monitor should
notice, and pacemaker should re-allocate the master based on the
values the various slaves set themselves on the last successful
preference update.


> +     fi
> +     master_log=$(echo $master_log|sed 's/^.\+\.0\+//')
> +     local_log=$(echo $local_log|sed 's/^.\+\.0\+//')

I'd rather say /^.*\.0*//
(though you need a lot binlogs files for this to make a difference)

> +     log_diff=$((master_log-local_log))

btw, if we had settled on bash as a shell,
instead of the sed stuff above, you could
        $(( 10#${master_log#*.} - 10#${local_log#*.} ))
even with non-bash,
        l=${l#*.}; lz=${l%%[!0]*}; l=${l#"$lz"};
would avoid sed and give the same result.
Though I doubt that it is much faster or more readable,
and I digress ;)

>       # no preference for very old slaves
> -     if [ "$master_file" != "$local_log" ]; then
> +     if [ "$log_diff" -ge 2 ]; then
>               ocf_run $CRM_MASTER -v 0
>               return 1;
>       fi


You probably need to

case $log_diff
 2)
        very low, assuming that there won't be too log file wraps
        whithin a few seconds
 1) 
        think.
        may be just wrapped "now".
        so if master position is "small" and slave position
        in previous log is some high number, ...

No. Too much guesswork again.
Hm.

How about instead of (or in addition to) "show master status",
do "show binary logs"?
To me at least it *looks* as if "show binary logs" would be a superset
of "show master status". It has been a while, though, that I knew
something about mysql replication, so beware ;-)

If I'm right, then 

get_latest_two_log_files_and_position()
{
        local M
        # below _not_ local, as those are our "return values"
        prev_log= prev_size= cur_log= cur_pos=

        M=$(mysql -N -B -e 'show binary logs')
        # test exit code, return, abort, log, whatever

        # four dummy parameters, in case we have only one logfile,
        # or no output at all
        set -- "" "" "" "" $M
        # skip to the last four values
        # (may even be our dummy empty parameters!)
        shift $(( $# - 4 ))

        # assign
        prev_log=$1 prev_size=$2 cur_log=$3 cur_pos=$4
}

There. If my assumption is correct that "show binary logs"
 - always gives correctly sorted output,
 - file size and binlog position are the same thing,
 - the last line is equivalent with file and position
   from "show master status",
then we're done:

        now you could assert that, if log_diff is 1, local_log is prev_log.
        or simply to a string compare instead.
        if [ $local_log = $cur_log ] ; ...
                based on position difference
        elif [ $local_log = $prev_log ]; ...

                and you could base your calculation on the actual
                position diff:
                $cur_pos + $(( prev_size - local_posision ))
                
        else
                nope.
        fi

 0)
        same logs, do the preference thingy based on position difference

 -*)
        SHOULD NOT HAPPEN, master overtaken my slave!
 *)
        master more than two log files ahead,
        don't become master.

Did that braindump make any sense to anyone?

        ;-)

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/

Reply via email to