The one thing the documentation does not make clear is when the
deletion happens. rsync has the option '--delete-after', which makes
sure all files are transferred first and deletion only happens
afterwards. If 'aws s3 sync' deletes during or before the transfer,
there may again be a small time window in which the repodata has not
been updated yet but files have already been deleted.

To be on the safe side, the repodata probably needs to be transferred
before any files are deleted.

Deletion should probably happen only after the invalidation, to be
sure that the caches no longer reference files which are being
deleted.
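
As a rough, untested sketch of the ordering I have in mind: upload new
files without touching repodata, then upload repodata, then invalidate,
and only then delete. The helper names (run, sync_in_safe_order), the
DRY_RUN switch and all arguments are made up for illustration, and the
long exclude lists from the real scripts are left out. By default it
only prints the commands.

```shell
#!/bin/bash
# Hypothetical helper: print the command by default, execute it only
# when DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Arguments (all placeholders): local source directory, S3 destination,
# CloudFront distribution ID, invalidation path pattern.
sync_in_safe_order() {
  local src="$1" dst="$2" dist="$3" inval="$4"

  # 1. Upload new files, leaving repodata alone and deleting nothing.
  run aws s3 sync --exclude '*repodata/*' "$src" "$dst" || return 1

  # 2. Upload only the repodata; still no deletion.
  run aws s3 sync --exclude '*' --include '*repodata/*' "$src" "$dst" || return 1

  # 3. Invalidate the cached repodata.
  run aws cloudfront create-invalidation --distribution-id "$dist" --paths "$inval" || return 1

  # 4. Only now remove remote files which no longer exist locally.
  run aws s3 sync --delete "$src" "$dst"
}
```

For example, 'sync_in_safe_order /srv/pub/epel/ s3://example-bucket/pub/epel/
EXAMPLEDISTID "/pub/epel/*"' prints the four commands in that order.
Note that create-invalidation returns before the invalidation has
actually finished, so a real script would probably have to wait for it
to complete before step 4.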

                Adrian

On Thu, Mar 05, 2020 at 01:05:20PM -0800, Kevin Fenzi wrote:
> First, I noticed we are running the full sync twice right now, at the
> same time: 
> 
> [root@mm-backend01 cron.d][PROD]# cat /etc/cron.d/s3.sh 
> #Ansible: s3sync
> 0 0,11 * * * s3-mirror /usr/local/bin/lock-wrapper s3sync /usr/local/bin/s3.sh 2>&1 | /usr/local/bin/nag-once s3.sh 1d 2>&1
> #Ansible: s3sync-main
> 0 0 * * * s3-mirror /usr/local/bin/lock-wrapper s3sync-main /usr/local/bin/s3.sh 2>&1 | /usr/local/bin/nag-once s3.sh 1d 2>&1
> 
> Second, the attached patch changes the sync scripts to: 
> 
> * do one sync with no --delete and excluding repodata
> * do another one with --delete and including repodata
> * invalidate the repodata
> 
> I adjusted the cron jobs to handle the repodata invalidate (I think). 
> 
> TODO: only sync when things have changed. 
> 
> +1s?
> 
> kevin
> --

> From 72a86a9f4d80344b96f8752ead2cc2877de292ef Mon Sep 17 00:00:00 2001
> From: Kevin Fenzi <ke...@scrye.com>
> Date: Thu, 5 Mar 2020 20:46:53 +0000
> Subject: [PATCH] s3-mirror: Split things into 2 sync runs, one without
>  repodata and delete, the other with both.
> 
> In order to make sure the s3 mirror always is consistent, split out the commands
> to make it sync without repodata and delete, then do another run with those, then finally
> invalidate all the repodata/* files.
> 
> Some of the cron jobs are adjusted to allow the repodata invalidation.
> 
> Signed-off-by: Kevin Fenzi <ke...@scrye.com>
> ---
>  roles/s3-mirror/files/s3-sync-path.sh |  38 ++++++++----
>  roles/s3-mirror/files/s3.sh           | 110 +++++++++++++++++++++++++++++++++-
>  roles/s3-mirror/tasks/main.yml        |   4 +-
>  3 files changed, 136 insertions(+), 16 deletions(-)
> 
> diff --git a/roles/s3-mirror/files/s3-sync-path.sh b/roles/s3-mirror/files/s3-sync-path.sh
> index e6ac994..40a0f90 100644
> --- a/roles/s3-mirror/files/s3-sync-path.sh
> +++ b/roles/s3-mirror/files/s3-sync-path.sh
> @@ -9,7 +9,30 @@ if [[ "$1" == "" ]] || [[ $1 != /pub* ]] || [[ $1 != */ ]]; then
>    exit 1
>  fi
>  
> -CMD="aws s3 sync                   \
> +# first run do not delete anything or copy the repodata.
> +CMD1="aws s3 sync                   \
> +  --exclude */repodata/*         \
> +  --exclude *.snapshot/*          \
> +  --exclude *source/*             \
> +  --exclude *SRPMS/*              \
> +  --exclude *debug/*              \
> +  --exclude *beta/*               \
> +  --exclude *ppc/*                \
> +  --exclude *ppc64/*              \
> +  --exclude *repoview/*           \
> +  --exclude *Fedora/*             \
> +  --exclude *EFI/*                \
> +  --exclude *core/*               \
> +  --exclude *extras/*             \
> +  --exclude *LiveOS/*             \
> +  --exclude *development/rawhide/* \
> +  --no-follow-symlinks            \
> +  --only-show-errors              \
> +  "
> +  #--dryrun                         \
> +
> +# second we delete old content and also copy the repodata
> +CMD2="aws s3 sync                   \
>    --delete                         \
>    --exclude *.snapshot/*          \
>    --exclude *source/*             \
> @@ -32,19 +55,12 @@ CMD="aws s3 sync                   \
>  
>  #echo "$CMD /srv$1 s3://s3-mirror-us-west-1-02.fedoraproject.org$1"
>  echo "Starting $1 sync at $(date)" >> /var/log/s3-mirror/timestamps
> -$CMD /srv$1 s3://s3-mirror-us-west-1-02.fedoraproject.org$1
> +$CMD1 /srv$1 s3://s3-mirror-us-west-1-02.fedoraproject.org$1
> +$CMD2 /srv$1 s3://s3-mirror-us-west-1-02.fedoraproject.org$1
>  echo "Ending $1 sync at $(date)" >> /var/log/s3-mirror/timestamps
>  
>  # Always do the invalidations because they are quick and prevent issues
>  # depending on which path is synced.
> -for file in $(echo /srv/pub/epel/6/*/repodata/repomd.xml | sed 's#/srv##g'); do
> -  aws cloudfront create-invalidation --distribution-id E2KJMDC0QAJDMU --paths "$file" > /dev/null
> -done
> -
> -for file in $(echo /srv/pub/epel/7/*/repodata/repomd.xml | sed 's#/srv##g'); do
> -  aws cloudfront create-invalidation --distribution-id E2KJMDC0QAJDMU --paths "$file" > /dev/null
> -done
> -
> -for file in $(echo /srv/pub/fedora/linux/updates/*/*/*/repodata/repomd.xml | sed 's#/srv##g'); do
> +for file in $(echo $1/repodata/* ); do
>    aws cloudfront create-invalidation --distribution-id E2KJMDC0QAJDMU --paths "$file" > /dev/null
>  done
> diff --git a/roles/s3-mirror/files/s3.sh b/roles/s3-mirror/files/s3.sh
> index 55c1940..e36744e 100644
> --- a/roles/s3-mirror/files/s3.sh
> +++ b/roles/s3-mirror/files/s3.sh
> @@ -3,7 +3,96 @@
>  # LGPL
>  # Author: Rick Elrod <rel...@redhat.com>
>  
> -CMD="aws s3 sync                   \
> +# first run this command that syncs, but does not delete.
> +# It also excludes repodata. 
> +CMD1="aws s3 sync                   \
> +  --exclude */repodata/*           \
> +  --exclude */.snapshot/*          \
> +  --exclude */source/*             \
> +  --exclude */SRPMS/*              \
> +  --exclude */debug/*              \
> +  --exclude */beta/*               \
> +  --exclude */ppc/*                \
> +  --exclude */ppc64/*              \
> +  --exclude */repoview/*           \
> +  --exclude */Fedora/*             \
> +  --exclude */EFI/*                \
> +  --exclude */core/*               \
> +  --exclude */extras/*             \
> +  --exclude */LiveOS/*             \
> +  --exclude */development/rawhide/* \
> +  --exclude */releases/8/*         \
> +  --exclude */releases/9/*         \
> +  --exclude */releases/10/*        \
> +  --exclude */releases/11/*        \
> +  --exclude */releases/12/*        \
> +  --exclude */releases/13/*        \
> +  --exclude */releases/14/*        \
> +  --exclude */releases/15/*        \
> +  --exclude */releases/16/*        \
> +  --exclude */releases/17/*        \
> +  --exclude */releases/18/*        \
> +  --exclude */releases/19/*        \
> +  --exclude */releases/20/*        \
> +  --exclude */releases/21/*        \
> +  --exclude */releases/22/*        \
> +  --exclude */releases/23/*        \
> +  --exclude */releases/24/*        \
> +  --exclude */releases/25/*        \
> +  --exclude */releases/26/*        \
> +  --exclude */releases/27/*        \
> +  --exclude */releases/28/*        \
> +  --exclude */releases/29/*        \
> +  --exclude */updates/8/*          \
> +  --exclude */updates/9/*          \
> +  --exclude */updates/10/*         \
> +  --exclude */updates/11/*         \
> +  --exclude */updates/12/*         \
> +  --exclude */updates/13/*         \
> +  --exclude */updates/14/*         \
> +  --exclude */updates/15/*         \
> +  --exclude */updates/16/*         \
> +  --exclude */updates/17/*         \
> +  --exclude */updates/18/*         \
> +  --exclude */updates/19/*         \
> +  --exclude */updates/20/*         \
> +  --exclude */updates/21/*         \
> +  --exclude */updates/22/*         \
> +  --exclude */updates/23/*         \
> +  --exclude */updates/24/*         \
> +  --exclude */updates/25/*         \
> +  --exclude */updates/26/*         \
> +  --exclude */updates/27/*         \
> +  --exclude */updates/28/*         \
> +  --exclude */updates/29/*         \
> +  --exclude */updates/testing/8/*  \
> +  --exclude */updates/testing/9/*  \
> +  --exclude */updates/testing/10/* \
> +  --exclude */updates/testing/11/* \
> +  --exclude */updates/testing/12/* \
> +  --exclude */updates/testing/13/* \
> +  --exclude */updates/testing/14/* \
> +  --exclude */updates/testing/15/* \
> +  --exclude */updates/testing/16/* \
> +  --exclude */updates/testing/17/* \
> +  --exclude */updates/testing/18/* \
> +  --exclude */updates/testing/19/* \
> +  --exclude */updates/testing/20/* \
> +  --exclude */updates/testing/21/* \
> +  --exclude */updates/testing/22/* \
> +  --exclude */updates/testing/23/* \
> +  --exclude */updates/testing/24/* \
> +  --exclude */updates/testing/25/* \
> +  --exclude */updates/testing/26/* \
> +  --exclude */updates/testing/27/* \
> +  --exclude */updates/testing/28/* \
> +  --exclude */updates/testing/29/* \
> +  --no-follow-symlinks             \
> +  "
> +  #--dryrun                         \
> +
> +# Next we run this command which deletes old content and also includes repodata.
> +CMD2="aws s3 sync                   \
>    --delete                         \
>    --exclude */.snapshot/*          \
>    --exclude */source/*             \
> @@ -38,6 +127,9 @@ CMD="aws s3 sync                   \
>    --exclude */releases/24/*        \
>    --exclude */releases/25/*        \
>    --exclude */releases/26/*        \
> +  --exclude */releases/27/*        \
> +  --exclude */releases/28/*        \
> +  --exclude */releases/29/*        \
>    --exclude */updates/8/*          \
>    --exclude */updates/9/*          \
>    --exclude */updates/10/*         \
> @@ -57,6 +149,9 @@ CMD="aws s3 sync                   \
>    --exclude */updates/24/*         \
>    --exclude */updates/25/*         \
>    --exclude */updates/26/*         \
> +  --exclude */updates/27/*         \
> +  --exclude */updates/28/*         \
> +  --exclude */updates/29/*         \
>    --exclude */updates/testing/8/*  \
>    --exclude */updates/testing/9/*  \
>    --exclude */updates/testing/10/* \
> @@ -76,6 +171,9 @@ CMD="aws s3 sync                   \
>    --exclude */updates/testing/24/* \
>    --exclude */updates/testing/25/* \
>    --exclude */updates/testing/26/* \
> +  --exclude */updates/testing/27/* \
> +  --exclude */updates/testing/28/* \
> +  --exclude */updates/testing/29/* \
>    --no-follow-symlinks             \
>    "
>    #--dryrun                         \
> @@ -83,7 +181,8 @@ CMD="aws s3 sync                   \
>  # Sync EPEL
>  #echo $CMD /srv/pub/epel/ s3://s3-mirror-us-west-1-02.fedoraproject.org/pub/epel/
>  echo "Starting EPEL sync at $(date)" >> /var/log/s3-mirror/timestamps
> -$CMD /srv/pub/epel/ s3://s3-mirror-us-west-1-02.fedoraproject.org/pub/epel/
> +$CMD1 /srv/pub/epel/ s3://s3-mirror-us-west-1-02.fedoraproject.org/pub/epel/
> +$CMD2 /srv/pub/epel/ s3://s3-mirror-us-west-1-02.fedoraproject.org/pub/epel/
>  echo "Ending EPEL sync at $(date)" >> /var/log/s3-mirror/timestamps
>  
>  for file in $(echo /srv/pub/epel/6/*/repodata/repomd.xml | sed 's#/srv##g'); do
> @@ -94,10 +193,15 @@ for file in $(echo /srv/pub/epel/7/*/repodata/repomd.xml | sed 's#/srv##g'); do
>    aws cloudfront create-invalidation --distribution-id E2KJMDC0QAJDMU --paths "$file"
>  done
>  
> +for file in $(echo /srv/pub/epel/8/*/repodata/repomd.xml | sed 's#/srv##g'); do
> +  aws cloudfront create-invalidation --distribution-id E2KJMDC0QAJDMU --paths "$file"
> +done
> +
>  # Sync Fedora
>  #echo $CMD /srv/pub/fedora/ s3://s3-mirror-us-west-1-02.fedoraproject.org/pub/fedora/
>  echo "Starting Fedora sync at $(date)" >> /var/log/s3-mirror/timestamps
> -$CMD /srv/pub/fedora/ s3://s3-mirror-us-west-1-02.fedoraproject.org/pub/fedora/
> +$CMD1 /srv/pub/fedora/ s3://s3-mirror-us-west-1-02.fedoraproject.org/pub/fedora/
> +$CMD2 /srv/pub/fedora/ s3://s3-mirror-us-west-1-02.fedoraproject.org/pub/fedora/
>  echo "Ending Fedora sync at $(date)" >> /var/log/s3-mirror/timestamps
>  
>  for file in $(echo /srv/pub/fedora/linux/updates/*/*/*/repodata/repomd.xml | sed 's#/srv##g'); do
> diff --git a/roles/s3-mirror/tasks/main.yml b/roles/s3-mirror/tasks/main.yml
> index 12351cb..5da7a02 100644
> --- a/roles/s3-mirror/tasks/main.yml
> +++ b/roles/s3-mirror/tasks/main.yml
> @@ -69,7 +69,7 @@
>  
>  - name: s3sync cron - updates for current
>    cron: name="s3sync-updates-current" minute="0" hour="3,9,15,21" user="s3-mirror"
> -        job='/usr/local/bin/lock-wrapper s3sync-updates-current "/usr/local/bin/s3-sync-path.sh /pub/fedora/linux/updates/{{ FedoraCycleNumber|int }}/" 2>&1 | /usr/local/bin/nag-once s3-updates-current.sh 1d 2>&1'
> +        job='/usr/local/bin/lock-wrapper s3sync-updates-current "/usr/local/bin/s3-sync-path.sh /pub/fedora/linux/updates/{{ FedoraCycleNumber|int }}/Everything/x86_64/os" 2>&1 | /usr/local/bin/nag-once s3-updates-current.sh 1d 2>&1'
>          cron_file=s3-updates-current.sh
>    when: env != 'staging' and inventory_hostname.startswith('mm-backend01.')
>    tags:
> @@ -95,7 +95,7 @@
>  
>  - name: s3sync cron - updates for current-1
>    cron: name="s3sync-updates-previous" minute="30" hour="0,6,12,18" user="s3-mirror"
> -        job='/usr/local/bin/lock-wrapper s3sync-updates-previous "/usr/local/bin/s3-sync-path.sh /pub/fedora/linux/updates/{{ FedoraCycleNumber|int - 1 }}/" 2>&1 | /usr/local/bin/nag-once s3-updates-previous.sh 1d 2>&1'
> +        job='/usr/local/bin/lock-wrapper s3sync-updates-previous "/usr/local/bin/s3-sync-path.sh /pub/fedora/linux/updates/{{ FedoraCycleNumber|int - 1 }}/Everything/x86_64/" 2>&1 | /usr/local/bin/nag-once s3-updates-previous.sh 1d 2>&1'
>          cron_file=s3-updates-previous.sh
>    when: env != 'staging' and inventory_hostname.startswith('mm-backend01.')
>    tags:
> -- 
> 1.8.3.1
> 
