Re: Freeze Break: fullefiletime list
On Wed, 11 May 2016 11:30:00 -0500 Dennis Gilmore wrote:

> I am trying to catch up with email from the last week or so; I am
> still 13000 behind, so I did not catch this. This only works if you
> do not care about hardlinking, which is going to mean that people are
> using an extra 500G+ of disk on the mirrors, an issue some mirrors
> have hit due to what I am assuming are bad mirroring practices. The
> only way to fix it properly is going to mean re-evaluating how we
> push content and how we message the pushing, and having tooling to
> either do push mirroring or enable intelligent pull-based
> mirroring, including information about what's hardlinked where and
> what content we have pushed. This is like a bandaid when the sore
> under it is still festering away.

Well, this change was simply to allow us to explore using more data for
syncing. Hopefully we can come up with a way to express hardlinks with
it.

If you have the fullfiletimelist file and there is a new one, you can
diff them. Once you have that list of files that were deleted or
changed, you can sort them and possibly hardlink the ones with the same
name/timestamp/size. All of our hardlinked files should have the same
name/timestamp/size, I think.

But failing all that, we could easily have people rsync just the
changed files (saving us LOTS AND LOTS of iops) without getting
hardlinks, and then once every week or two do a full traditional sync
that would delete any removed files and hardlink everything. Doing
this, they would not have an extra 500GB; they would only get back
those files changed in the last week that were hardlinked, so it would
be much smaller, I suspect. This would save us tons of iops, make their
syncs super fast, and have only a slight bad effect on space.

If this all turns out to not work out, no harm done, but I think it
might well help us out a great deal.

kevin
___
infrastructure mailing list
infrastructure@lists.fedoraproject.org
http://lists.fedoraproject.org/admin/lists/infrastructure@lists.fedoraproject.org
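
Editorial sketch: the idea in the message above (diff an old and a new
fullfiletimelist, then look for entries sharing name/timestamp/size) might
look roughly like the following. This is not the tooling the list ended up
with. It assumes each line has the form "<perms> <size> <YYYY/MM/DD>
<HH:MM:SS> <path>", which is what rsync --no-h --list-only is expected to
print; the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only. Assumed line format of a fullfiletimelist:
#   <perms> <size> <YYYY/MM/DD> <HH:MM:SS> <path>

# changed_paths OLD NEW
# Print paths whose line is new or different (size/date/time) in NEW.
changed_paths() {
    awk '
        FILENAME == ARGV[1] { old[$0] = 1; next }
        !($0 in old) {
            path = $0
            # Drop the four leading fields, keep the path (may contain spaces).
            sub(/^[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ /, "", path)
            print path
        }
    ' "$1" "$2"
}

# deleted_paths OLD NEW
# Print paths listed in OLD that no longer appear in NEW.
deleted_paths() {
    awk '
        { path = $0; sub(/^[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ /, "", path) }
        FILENAME == ARGV[1] { in_new[path] = 1; next }
        !(path in in_new) { print path }
    ' "$2" "$1"
}

# hardlink_candidates LIST
# Print one line per group of regular files that share basename, size and
# timestamp: "<basename> <size> <date> <time>" followed by tab-separated paths.
hardlink_candidates() {
    awk '
        $1 ~ /^-/ {
            path = $0
            sub(/^[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ /, "", path)
            n = split(path, parts, "/")
            key = parts[n] " " $2 " " $3 " " $4
            count[key]++
            paths[key] = paths[key] "\t" path
        }
        END {
            for (key in count)
                if (count[key] > 1)
                    print key paths[key]
        }
    ' "$1"
}
```

Matching name/timestamp/size only suggests that two files are hardlinks of
each other; a cautious script would compare contents before linking them.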
Re: Freeze Break: fullefiletime list
On Wednesday, May 11, 2016 10:31:11 AM CDT Stephen John Smoogen wrote:
> On 11 May 2016 at 10:21, Dennis Gilmore wrote:
> > -1
> >
> > It is not a good solution.
>
> What is your alternative? Sorry, but coming in a week later with a -1
> needs more than one sentence. We have backend storage which is
> increasingly having to be turned down because the rsync enchilada is
> causing major issues with other users. We have many mirrors who can't
> get in sync with the mirrors because the level 1 and level 2 mirrors
> are not able to finish an rsync from the download servers.

I am trying to catch up with email from the last week or so; I am still
13000 behind, so I did not catch this. This only works if you do not
care about hardlinking, which is going to mean that people are using an
extra 500G+ of disk on the mirrors, an issue some mirrors have hit due
to what I am assuming are bad mirroring practices. The only way to fix
it properly is going to mean re-evaluating how we push content and how
we message the pushing, and having tooling to either do push mirroring
or enable intelligent pull-based mirroring, including information about
what's hardlinked where and what content we have pushed. This is like a
bandaid when the sore under it is still festering away.

Dennis
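
Editorial sketch: one way the "information about what's hardlinked where"
mentioned above could be produced on the master side. Nothing like this is
proposed in the thread itself. It assumes GNU find (for -printf) and a tree
that lives on a single filesystem, so that an inode number identifies one
file; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: requires GNU find and a tree on a single filesystem.

# hardlink_map DIR
# Print one line per set of hardlinked regular files under DIR:
# the relative paths of all links to the same inode, tab-separated.
hardlink_map() {
    tab=$(printf '\t')
    find "$1" -type f -links +1 -printf '%i\t%P\n' |
        sort -t "$tab" -k1,1n -k2 |
        awk -F '\t' '
            # A new inode starts a new output line.
            $1 != inode {
                if (NR > 1) printf "\n"
                inode = $1
                printf "%s", $2
                next
            }
            # Same inode as the previous line: append the path.
            { printf "\t%s", $2 }
            END { if (NR > 0) printf "\n" }
        '
}
```

A mirror could use such a map after a partial sync to recreate links with
ln instead of storing a second copy. Links whose other names lie outside
DIR show up as a line with a single path.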
Re: Freeze Break: fullefiletime list
On 11 May 2016 at 10:21, Dennis Gilmore wrote:
> -1
>
> It is not a good solution.

What is your alternative? Sorry, but coming in a week later with a -1
needs more than one sentence. We have backend storage which is
increasingly having to be turned down because the rsync enchilada is
causing major issues with other users. We have many mirrors who can't
get in sync with the mirrors because the level 1 and level 2 mirrors
are not able to finish an rsync from the download servers.

-- 
Stephen J Smoogen.
Re: Freeze Break: fullefiletime list
On Wed, May 11, 2016 at 09:21:27AM -0500, Dennis Gilmore wrote:
> On Tuesday, May 3, 2016 1:29:50 PM CDT Kevin Fenzi wrote:
> > [original proposal and patch snipped]
>
> -1
>
> It is not a good solution.

Could you maybe explain a little more what this means? How is it not a
good solution? What would be a better one?

After all, this is coming as a solution to a problem raised by one or
two mirror admins. It sounds reasonable to me, and I'm not seeing how it
would be a bad solution.

Pierre
Re: Freeze Break: fullefiletime list
On Tuesday, May 3, 2016 1:29:50 PM CDT Kevin Fenzi wrote:
> [original proposal and patch snipped]

-1

It is not a good solution.

Dennis
Re: Freeze Break: fullefiletime list
+1.

On 4 May 2016 at 11:44, Patrick Uiterwijk wrote:
> +1
>
> On Tue, May 3, 2016 at 7:29 PM, Kevin Fenzi wrote:
>> [original proposal and patch snipped]

-- 
Stephen J Smoogen.
Re: Freeze Break: fullefiletime list
+1

On Tue, May 3, 2016 at 7:29 PM, Kevin Fenzi wrote:
> [original proposal and patch snipped]
Freeze Break: fullefiletime list
Greetings.

There was some talk on the mirror lists about making a fullfilelist
version that also had timestamps in it so mirrors could look for all
the files that changed since the last time they synced and only sync
those specific files. This would be MUCH faster for them as they don't
need to pull the entire metadata over. This would be MUCH better for us
as they don't have to hammer our netapps to get all the metadata.

Of course this is not a 100% solution, as they won't see deleted files
or hardlinks. So, they will likely still need to do a real full sync
every once in a while (monthly?) to get those things synced up.

I'd like to push the following change to create the file and then we
can look at writing up some simple scripting for people to use and see
if we can get people to use it and lower our netapp metadata hammering.

+1s?

kevin
--
diff --git a/roles/bodhi2/backend/files/update-fullfilelist b/roles/bodhi2/backend/files/update-fullfilelist
index 0302c6a..bac3f9c 100755
--- a/roles/bodhi2/backend/files/update-fullfilelist
+++ b/roles/bodhi2/backend/files/update-fullfilelist
@@ -1,6 +1,6 @@
 #!/bin/bash
 
-# currently runs on releng2.fedora.phx.redhat.com
+# currently runs on bodhi-backend01 after updates pushes
 
 MOD=$1
 [ -z "$MOD" ] && {
@@ -8,6 +8,8 @@ MOD=$1
     exit 1
 }
 
+# This is the old traditional fullfilelist with no timestamps
+
 TMPFILE=$(mktemp -p /tmp/)
 pushd /pub/$MOD > /dev/null
 find * -print > $TMPFILE
@@ -18,3 +20,16 @@ else
 fi
 chmod 0644 fullfilelist
 popd > /dev/null
+
+# This is the new list with timestamps
+
+TMPFILE=$(mktemp -p /tmp/)
+pushd /pub/$MOD > /dev/null
+/usr/bin/rsync --no-h --list-only -r . > $TMPFILE
+if diff $TMPFILE fullfiletimelist > /dev/null; then
+    rm -f $TMPFILE
+else
+    mv $TMPFILE fullfiletimelist
+fi
+chmod 0644 fullfiletimelist
+popd > /dev/null
diff --git a/roles/releng/files/update-fullfilelist b/roles/releng/files/update-fullfilelist
index 0302c6a..bac3f9c 100755
--- a/roles/releng/files/update-fullfilelist
+++ b/roles/releng/files/update-fullfilelist
@@ -1,6 +1,6 @@
 #!/bin/bash
 
-# currently runs on releng2.fedora.phx.redhat.com
+# currently runs on bodhi-backend01 after updates pushes
 
 MOD=$1
 [ -z "$MOD" ] && {
@@ -8,6 +8,8 @@ MOD=$1
     exit 1
 }
 
+# This is the old traditional fullfilelist with no timestamps
+
 TMPFILE=$(mktemp -p /tmp/)
 pushd /pub/$MOD > /dev/null
 find * -print > $TMPFILE
@@ -18,3 +20,16 @@ else
 fi
 chmod 0644 fullfilelist
 popd > /dev/null
+
+# This is the new list with timestamps
+
+TMPFILE=$(mktemp -p /tmp/)
+pushd /pub/$MOD > /dev/null
+/usr/bin/rsync --no-h --list-only -r . > $TMPFILE
+if diff $TMPFILE fullfiletimelist > /dev/null; then
+    rm -f $TMPFILE
+else
+    mv $TMPFILE fullfiletimelist
+fi
+chmod 0644 fullfiletimelist
+popd > /dev/null
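
Editorial sketch: the "simple scripting" for mirrors mentioned in the
proposal could start from something like the following, which picks the
files in a fullfiletimelist whose timestamp is later than the mirror's last
sync. This is an illustration under assumptions, not the script the
proposal refers to. It assumes lines of the form "<perms> <size>
<YYYY/MM/DD> <HH:MM:SS> <path>"; the function name, host, module and local
paths are placeholders.

```shell
#!/bin/sh
# Sketch only. Assumed line format of a fullfiletimelist:
#   <perms> <size> <YYYY/MM/DD> <HH:MM:SS> <path>

# changed_since LIST "YYYY/MM/DD HH:MM:SS"
# Print paths of regular files whose listed timestamp is later than the
# given one. The fixed-width date format sorts correctly as a plain string.
changed_since() {
    awk -v since="$2" '
        ($1 ~ /^-/) && (($3 " " $4) > since) {
            path = $0
            # Drop the four leading fields, keep the path (may contain spaces).
            sub(/^[^ ]+ +[^ ]+ +[^ ]+ +[^ ]+ /, "", path)
            print path
        }
    ' "$1"
}

# Hypothetical usage (host, module and paths are placeholders):
#   changed_since fullfiletimelist "2016/05/03 00:00:00" > changed.txt
#   rsync -a --files-from=changed.txt rsync://mirror.example.org/module/ /srv/mirror/
```

The listed timestamps are file modification times, so a file that was built
some time ago but only published recently would be missed by this filter.
Comparing the previous and the current list line by line avoids that, and as
the proposal notes, an occasional full sync is still needed for deletions
and hardlinks.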