[Please Cc me on replies, I'm not subscribed to this list]

Hello All,

The anonymous rsync mirroring script available from
http://www.debian.org/mirror/anonftpsync
has several problems:

- It fails to exclude some arch-specific files, namely *.changes and
  Contents-*.gz
- It also doesn't exclude the arch-specific parts of debian-installer,
  that is *.udeb and installer-*
- The "set +e" before the rsync prevents the removal of the lockfile
  from being triggered, so stale lockfiles may remain.
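
As a rough illustration of the first two points, the per-architecture
exclude list could be generated along these lines. This is only a sketch:
arch_excludes is a hypothetical helper name, and the patterns are simply
the ones named in the list above plus the usual binary-ARCH/ and
*_ARCH.deb ones.

```shell
#!/bin/sh
# Hypothetical helper: print rsync --exclude options for every
# architecture given as an argument.
arch_excludes ()
{
        for _arch in "$@"; do
                # printf reuses the '%s ' format for each argument in turn.
                printf '%s ' \
                        "--exclude=binary-${_arch}/" \
                        "--exclude=*_${_arch}.deb" \
                        "--exclude=*_${_arch}.changes" \
                        "--exclude=Contents-${_arch}.gz" \
                        "--exclude=*_${_arch}.udeb" \
                        "--exclude=installer-${_arch}/"
        done
}

# Print the options for two architectures.
arch_excludes alpha m68k
echo
```

printf is used here instead of "echo -n" only because its behaviour varies
less between shells.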

I guess the "set +e" was there so that the log still gets saved in case of
errors, but saving the log earlier, before the rsync runs, is better, and
also prevents timestamp updates when the rsync went wrong.
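
The combination of "set -e" and an EXIT trap can be tried out in isolation
with something like the following. It is a minimal sketch under stated
assumptions: the LOCK path is made up for the demo, and "false" merely
stands in for a failing rsync.

```shell
#!/bin/sh
# Hypothetical lock path, used only for this demo.
LOCK="${TMPDIR:-/tmp}/Archive-Update-in-Progress-demo.$$"
export LOCK

# Run the demo in a child shell so its failure does not end this one.
sh <<'EOF' || echo "sync failed"
set -e                      # abort on the first failing command
: > "$LOCK"                 # take the lock
trap 'rm -f "$LOCK"' EXIT   # remove it on any exit, clean or not
false                       # stands in for a failed rsync
date -u                     # not reached, so no timestamp update
EOF

# The trap has already removed the lock by the time we get here.
[ -e "$LOCK" ] || echo "lock removed"
```

Because the child shell aborts at the failing command, the timestamp step
after it is never run, while the trap still cleans up the lock.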

In the end, I rewrote the whole script in order to fix those flaws.
It's appended to this mail, and I think it's a better example of
a mirror script. I haven't tested it on non-Debian systems, though,
so I may have missed some portability issues.


Thiemo


#! /bin/sh
#
# Copyright 2004  Thiemo Seufer <[EMAIL PROTECTED]>
#
# This file is distributed under the terms of the GNU General Public License,
# Version 2, as published at http://www.gnu.org/licenses/gpl.html
#
# This script originates from http://www.debian.org/mirror/anonftpsync
#
# Note: You MUST have rsync 2.0.16-1 or newer, which is available in slink
# and all newer Debian releases, or at http://rsync.samba.org/
#
# Set the variables at the end of the file to fit your site. You can then use
# cron to have this script run daily to automatically update your copy of the
# archive.
#
# Don't forget:
# chmod 744 anonftpsync

# ---------------------------------------------------------------------------
# There should be no need to edit anything in this section, unless there are
# problems. See the end of the file for customization.
set -e

exclude_arch ()
{
        while [ $# -ge 1 ]; do
                echo -n "--exclude binary-${1}/ --exclude *_${1}.deb --exclude *_${1}.changes --exclude Contents-${1}.gz --exclude disks-${1}/ --exclude *_${1}.udeb --exclude installer-${1}/ "
                shift
        done
        return 0
}

exclude_source ()
{
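        case "$EXCLUDE_SOURCE" in
        # The patterns must stay unquoted, otherwise the "*" is matched
        # literally and EXCLUDE_SOURCE="y" never takes effect.
        y* | Y*) echo -n "--exclude source/ --exclude *.orig.tar.gz --exclude *.diff.gz --exclude *.dsc" ;;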
        esac
}

do_mirror ()
{
        _rsync_host=$1
        _rsync_dir=$2
        _target_dir=$3

        # Get in the right directory and set the umask to be group writable.
        cd $HOME
        umask 002

        # Note: on some non-Debian systems, hostname doesn't accept -f option.
        # If that's the case on your system, make sure hostname prints the full
        # hostname, and remove the -f option. If there's no hostname command,
        # explicitly replace `hostname -f` with the hostname.
        _target_host=`hostname -f`

        LOCK="${_target_dir}/Archive-Update-in-Progress-${_target_host}"

        mkdir -p ${_target_dir}/project/trace

        # Check to see if another sync is in progress
        if lockfile -! -l 43200 -r 0 "$LOCK"; then
                echo ${_target_host} is unable to start rsync, lock file exists
                return 1
        fi

        # Note: on some non-Debian systems, trap doesn't accept "EXIT"
        # as signal specification. If that's the case on your system,
        # try using "0".
        trap "rm -f $LOCK > /dev/null 2>&1" EXIT

        # Note: if you don't have savelog, use any other log rotation
        # facility, or comment this out; the log will then simply be
        # overwritten each time.
        savelog ${_target_dir}/project/trace/rsync.log > /dev/null 2>&1

        rsync --recursive --links --hard-links --times --verbose \
                --compress --delete $RSYNC_OTHEROPTS \
                --exclude "Archive-Update-in-Progress-${_target_host}" \
                --exclude "project/trace/*" \
                `exclude_arch $EXCLUDE_ARCH` \
                `exclude_source` \
                $EXCLUDE_OTHER \
                ${_rsync_host}::${_rsync_dir} ${_target_dir} \
                > ${_target_dir}/project/trace/rsync.log 2>&1
        date -u > "${_target_dir}/project/trace/${_target_host}"
        rm -f $LOCK > /dev/null 2>&1

        return 0
}

# ---------------------------------------------------------------
# Customize this section

# Things to exclude.
# With blank EXCLUDE_* you will mirror the entire archive.

# This sample would exclude all architectures:
# EXCLUDE_ARCH="alpha arm m68k mips mipsel powerpc sparc i386 ia64 hppa sh s390 hurd-i386"
EXCLUDE_ARCH=

# This sample would exclude the source code:
# EXCLUDE_SOURCE="y"
EXCLUDE_SOURCE="n"

# Exclude other things, like some symlinks or sections.
# The --exclude option must be given in this case.
# Samples:
# EXCLUDE_OTHER="--exclude stable/ --exclude testing/ --exclude unstable/"
# EXCLUDE_OTHER="--exclude /contrib/ --exclude /non-free/"
#
# Exclude old distributions. Note that this won't work well for newer
# ones which have their files in a pool directory.
# EXCLUDE_OTHER="--exclude /slink/ --exclude /slink-proposed-updates/"
# EXCLUDE_OTHER="--exclude /potato/ --exclude /potato-proposed-updates/"
EXCLUDE_OTHER=

# Additional options for rsync
# Common options might include bwlimit (but see Debian bug #181336)
# Sample:
# RSYNC_OTHEROPTS="--bwlimit=23 --safe-links --delete-after"
RSYNC_OTHEROPTS=

# Trees to mirror
#
#   do_mirror RSYNC_HOST RSYNC_DIR TARGET_DIR
#
# RSYNC_HOST is the site you have chosen from the mirrors file.
# (http://www.debian.org/mirror/list-full)
#
# RSYNC_DIR is the directory given in the "Packages over rsync:" line of
# the mirrors file for the site you have chosen to mirror.
#
# TARGET_DIR is the destination for the base of the Debian mirror directory
# (the dir that holds dists/ and ls-lR).
#
# Samples:
# do_mirror rsync.example.org debian/ /bigdisk/mirror/debian
# do_mirror rsync.example.org debian-non-US/ /bigdisk/mirror/debian-non-US
# do_mirror rsync.example.org debian-security/ /bigdisk/mirror/debian-security

# End of customize section
# ---------------------------------------------------------------
exit 0
