Re: My Backup Script
Markus Schaber wrote on Wed, Jul 27, 2011 at 09:37:38 +0200:

> Hi, Andreas,
>
> Von: Andreas Krey [mailto:a.k...@gmx.de]
>
> > On Tue, 26 Jul 2011 13:55:17 +, Les Mikesell wrote:
> > ...
> > > How could it possibly be up to date if there has been a commit since
> > > the last time it was updated?
> >
> > Because the commit came from my WC. My WC was up to date before the
> > commit, and the only things that change have been in my WC already, so
> > there is no possible way my WC can not be up to date.
>
> That assumption is wrong, I guess. As far as I know, commit hooks can
> modify the commit. (This behavior is discouraged, but nevertheless it is
> possible.)

Modifying the txn handed to the hook is discouraged because it's not possible to communicate the changes to the client, so its working copy becomes inconsistent with the repository.

It's trickier but possible to write a hook that inspects the provided txn, /recreates/ it (possibly with modifications), commits /that/, and then rejects the originally-provided txn. The client working copy is consistent and will remain consistent after an 'svn up' (which will merge cleanly or conflict, as usual).

> Best regards
> Markus Schaber
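The recreate-and-reject trick described above is fairly involved, but its simpler building block, inspecting the in-flight txn from a hook with svnlook, is easy to demonstrate. This is a hedged sketch, not anyone's production hook: the temporary repository and the "no .exe files" rule are invented purely for the demo.

```shell
set -e
command -v svnadmin >/dev/null 2>&1 || { echo "Subversion not installed; skipping demo"; exit 0; }
TMP=$(mktemp -d); cd "$TMP"
svnadmin create repo

# A pre-commit hook that inspects the provided transaction and rejects it.
cat > repo/hooks/pre-commit <<'EOF'
#!/bin/sh
PATH=/bin:/usr/bin:/usr/local/bin; export PATH   # hooks run with an empty environment
REPOS="$1"; TXN="$2"
# Inspect the in-flight transaction; reject commits that touch .exe files.
if svnlook changed -t "$TXN" "$REPOS" | grep -q '\.exe$'; then
  echo "binaries are not allowed" >&2
  exit 1
fi
exit 0
EOF
chmod +x repo/hooks/pre-commit

svn checkout -q "file://$TMP/repo" wc
cd wc
touch tool.exe notes.txt
svn add -q tool.exe notes.txt
if ! svn commit -q -m "includes a binary" 2>/dev/null; then
  echo "commit rejected by hook, as expected"
fi
svn revert -q tool.exe             # drop the offending file from the changeset
svn commit -q -m "just the notes"  # now succeeds as revision 1
```

Note that the hook only rejects; it never modifies the txn, so the client's working copy stays consistent with the repository.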
Re: My Backup Script
On 7/26/11 5:14 PM, Andy Canfield wrote:
> I shy away from svnsync right now because it requires me to get TWO of
> these Subversion systems running. At present I am almost able to get one
> running. Almost.

You don't need a 2nd server. Svnsync is a client to both repos, but the side it is writing can use file:/// access to avoid the need to have another server.

> Suppose we do a backup every night at midnight, copying it to a safe
> place. And suppose that the server dies at 8PM Tuesday evening. Then all
> submits that occurred on Tuesday have been lost. Presumably we'd find out
> about this on Wednesday. But a working copy is a valid working copy until
> you delete it. Assuming that the working copies still exist, all we need
> to do is
> * Restore the working SVNParent repository collection on a replacement
>   computer.
> * Have everyone 'svn commit' from their working copies.
> * Unscramble the merge problems, which should be few.

But you can't do this if the WC was updated past the rev of your restored repo. Or if you have a different uuid.

--
Les Mikesell
lesmikes...@gmail.com
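To illustrate Les's point that no second server is needed, here is a hedged sketch of an svnsync mirror driven entirely over file:// access. Both repositories live on one machine, and all paths are invented for the demo; in real use the source URL would be your server's http:// or svn:// URL.

```shell
set -e
command -v svnsync >/dev/null 2>&1 || { echo "Subversion not installed; skipping demo"; exit 0; }
TMP=$(mktemp -d); cd "$TMP"

# A stand-in for the live repository, with one revision of history.
svnadmin create src
svn mkdir -q -m "some history" "file://$TMP/src/trunk"

# The mirror lives on the same machine; no second server required.
svnadmin create mirror

# svnsync needs permission to set revision properties on the mirror.
cat > mirror/hooks/pre-revprop-change <<'EOF'
#!/bin/sh
exit 0
EOF
chmod +x mirror/hooks/pre-revprop-change

svnsync init "file://$TMP/mirror" "file://$TMP/src"
svnsync sync "file://$TMP/mirror"
svnlook youngest mirror     # mirror now matches src
```

Re-running `svnsync sync` (e.g. from cron or a post-commit hook) keeps the mirror current with no server process on the backup side.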
Re: My Backup Script
On 7/26/11 4:23 PM, Andreas Krey wrote:
> > > Because the commit came from my WC. My WC was up to date before the
> > > commit, and the only things that change have been in my WC already, so
> > > there is no possible way my WC can not be up to date. Except that it
> > > 'forgets' to update the WC revision info, and requires a separate
> > > update for that.
> >
> > It doesn't 'forget', it knows that doing it right would, in the general
> > case, involve changing files in your working copy that you might not
> > want to have changed.
>
> In what case? When svn lets me commit at all, it is when the WC is up to
> date; that is, there is nothing that needs to be merged into my WC. What
> files could need to be modified, under the assumption that the WC wasn't
> mixed-revision to begin with?

In the typical case of concurrent work, someone else will have committed a different set of files before you. Your commit only pushes your changed files. Now your workspace and the repo head are very different.

> > While in this case you may 'know' that no one else has made any changes
> > in the repository, it is probably a bad idea to get into habits that
> > won't work when you are part of a team.
>
> I seriously don't know what you mean here. If an 'svn up' wouldn't change
> anything in my WC before I do a commit, an 'svn up' immediately after my
> commit (to the revision I committed) wouldn't do either, and there is no
> reason why that shouldn't be reflected in the WC by the commit instead of
> requiring me to do a separate update.

In the general case, you don't know that. Any number of other things may have happened in the repo before or after your commit. It is sort-of a special case when there have been no other changes, and I suppose it wouldn't hurt anything in that case if subversion marked your WC as updated to the next rev, but then it would be very confusing when there were changes and it didn't.

> In any case, there is always the chance of someone beating me to the
> commit, so I need to update beforehand. But that can happen before the
> first commit just as well. And for most cases two consecutive commits
> just work; I did need some time to come up with the scenario.

Consider changes to different files than the set you commit. Your commit won't have conflicts, but you won't have (and may not want) the other changes yet.

> Btw, is there a way to undo an update that led to merges with my changes
> (or even conflicts), and get back my original modified sandbox?

If you accepted their changes in a conflict, probably not. But if you can figure out the revisions before/after the changes you want to back out you should be able to reverse-merge that change.

--
Les Mikesell
lesmikes...@gmail.com
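Les's reverse-merge suggestion can be sketched end to end. This is a self-contained toy repository; the file names and revision numbers are just what the demo produces, not anything from the thread.

```shell
set -e
command -v svnadmin >/dev/null 2>&1 || { echo "Subversion not installed; skipping demo"; exit 0; }
TMP=$(mktemp -d); cd "$TMP"
svnadmin create repo
svn checkout -q "file://$TMP/repo" wc
cd wc

echo first > file.txt
svn add -q file.txt
svn commit -q -m "add file"        # r1
echo second > file.txt
svn commit -q -m "unwanted change" # r2

svn update -q                      # avoid the mixed-revision trap discussed above
svn merge -c -2 .                  # apply r2 in reverse to the working copy
svn commit -q -m "back out r2"     # r3
svn update -q
cat file.txt                       # back to "first"
```

The history is preserved: r2 is still in the repository, and r3 records the backout, so nothing is lost if you change your mind again.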
Re: My Backup Script
On 07/27/2011 01:34 AM, Nico Kadel-Garcia wrote:
> On Tue, Jul 26, 2011 at 2:33 AM, Andy Canfield wrote:
> > For your information, this is my backup script. It produces a zip file
> > that can be transported to another computer. The zip file unpacks into
> > a repository collection, giving, for each repository, a hotcopy of the
> > repository and a dump of the repository. The hotcopy can be reloaded on
> > a computer with the same characteristics as the original server; the
> > dumps can be loaded onto a different computer. Comments are welcome.
>
> Andy, can we love you to pieces for giving us a new admin to educate in
> subtleties?

Sure! I'm good at being ignorant. FYI I have a BS in Computer Science from about 1970 and an MS in Operations Research from 1972, and worked in Silicon Valley until I moved to Thailand in 1990. So although I am not stupid, I can be very ignorant. And also the IT environment here is quite different. For example, MySQL can sync databases if you've got a 100Mbps link. Ha ha. I invented a way to sync two MySQL databases hourly over an unreliable link that ran at about modem speeds. I can remember making a driver climb a flagpole to make a cell phone call because the signal didn't reach the ground. To this day we run portable computers out in the field and communicate via floppynet. In this region hardware costs more than people, and software often costs nothing.

> > #! /bin/bash
> > # requires root access
> > if [ ! `whoami` == root ]
> > then
> >     sudo $0
> >     exit
> > fi
> > # controlling parameters
> > SRCE=/data/svn
> > ls -ld $SRCE
> > DEST=/data/svnbackup
> > APACHE_USER=www-data
> > APACHE_GROUP=www-data
>
> Unless the repository is readable only by root, this should *NOT* run as
> root. Seriously. Never do things as the root user that you don't have to.
> If the repository owner is "svn" or "www-data" as you've described
> previously, execute this as the relevant repository owner.
There are reasonable justifications for running it as root:

[1] Other maintenance scripts must be run as root, and this puts all maintenance in a central pool. My maintenance scripts are crontab jobs of the form /root/bin/TaskName.job, which runs /root/bin/TaskName.sh and pipes all stderr and stdout to /root/TaskName.out. Thus I can skim /root/*.out and have all the job status information at my fingertips.

[2] For some tasks, /root/bin/TaskName.job is also responsible for appending /root/TaskName.out to /root/TaskName.all so that I can see earlier outputs. There is a job that erases /root/*.all the first of every month.

[3] I have heard for a long time to never run a GUI as root. None of these maintenance scripts are GUI.

[4] There are many failure modes that will only arise if it is run as non-root. For example, if run as root, the command "rm -rf /data/svnbackup" will absolutely, for sure, get rid of any existing /data/svnbackup, whoever it is owned by, whatever junk is inside it.

> > # Construct a new empty SVNParent repository collection
> > rm -rf $DEST
> > mkdir $DEST
> > chown $APACHE_USER $DEST
> > chgrp $APACHE_GROUP $DEST
> > chmod 0700 $DEST
> > ls -ld $DEST
>
> And do... what? You've not actually confirmed that this has succeeded
> unless you do something if these bits fail. Many of your comments seem to
> imply that this script has not been tested.

Of course it's been tested already, and in any production environment it will be tested again. And if stdout and stderr are piped to /root/SVNBackup.out then I can check that output text reasonably often and see that it is still running. In this case I would check it daily for a week, weekly for a month or two, yearly forever, and every time somebody creates a new repository. Also, by the standards of this part of the world, losing a day's work is not a catastrophe. Most people can remember what they did, and do it again, and it probably only takes a half-day to redo.
> > # Get all the names of all the repositories
> > # (Also gets names of any other entry in the SVNParent directory)
> > cd $SRCE
> > ls -d1 * >/tmp/SVNBackup.tmp
>
> And *HERE* is where you start becoming a dead man if mkdir $DEST failed.
> I believe that it works in your current environment, but if the parent of
> $DEST does not exist, you're now officially in deep danger, executing
> these operations in whatever directory the script was run from.

As noted above, $DEST is /data/svnbackup. The parent of $DEST is /data. /data is a partition on the server. If that partition is gone, that's a failure that we're talking about recovering from.

> > # Process each repository
> > for REPO in `cat /tmp/SVNBackup.tmp`
>
> And again you're in trouble. If any of the repositories have whitespace
> in their names, or funky EOL characters, the individual words will be
> parsed as individual arguments.

This is Linux. Anyone who creates a repository with white space in the name gets shot.

> > do
> >     # some things are not repositories; ignore them
> >     if [ -d $SRCE/$REPO ]

Here is a likely bug in the script. I treat every subdirectory of the SVNParent repository collection as if it were a repository.
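For what it's worth, the whitespace problem Nico describes can be sidestepped entirely with a shell glob instead of parsing `ls` output, and no /tmp file is needed. This is a hedged sketch: a throwaway directory stands in for /data/svn, and checking for a `format` file is one way to recognize a real repository.

```shell
set -e
SRCE=$(mktemp -d)                    # stand-in for /data/svn
mkdir "$SRCE/alpha" "$SRCE/with space"
touch "$SRCE/alpha/format" "$SRCE/with space/format"   # repositories contain a 'format' file
touch "$SRCE/README"                 # not a repository; must be skipped

backed_up=0
for REPO in "$SRCE"/*/; do           # the glob handles spaces safely
  REPO=${REPO%/}                     # strip the trailing slash
  [ -f "$REPO/format" ] || continue  # skip non-repository directories
  echo "Backing up ${REPO##*/}"
  backed_up=$((backed_up + 1))
done
echo "$backed_up repositories backed up"
```

Quoting `"$REPO"` everywhere it is used keeps names with spaces intact; `ls | for ... in \`cat\`` splits them into separate words.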
Re: My Backup Script
On Tue, 26 Jul 2011 15:59:59 +, Les Mikesell wrote:
...
> > Because the commit came from my WC. My WC was up to date before the
> > commit, and the only things that change have been in my WC already,
> > so there is no possible way my WC can not be up to date. Except that it
> > 'forgets' to update the WC revision info, and requires a separate update
> > for that.
>
> It doesn't 'forget', it knows that doing it right would, in the general
> case, involve changing files in your working copy that you might not
> want to have changed.

In what case? When svn lets me commit at all, it is when the WC is up to date; that is, there is nothing that needs to be merged into my WC. What files could need to be modified, under the assumption that the WC wasn't mixed-revision to begin with?

...
> While in this case you may 'know' that no one else has made any changes
> in the repository, but it is probably a bad idea to get into habits that
> won't work when you are part of a team.

I seriously don't know what you mean here. If an 'svn up' wouldn't change anything in my WC before I do a commit, an 'svn up' immediately after my commit (to the revision I committed) wouldn't do either, and there is no reason why that shouldn't be reflected in the WC by the commit instead of requiring me to do a separate update.

In any case, there is always the chance of someone beating me to the commit, so I need to update beforehand. But that can happen before the first commit just as well. And for most cases two consecutive commits just work; I did need some time to come up with the scenario.

Btw, is there a way to undo an update that led to merges with my changes (or even conflicts), and get back my original modified sandbox?

Andreas

--
"Totally trivial. Famous last words."
From: Linus Torvalds
Date: Fri, 22 Jan 2010 07:29:21 -0800
Re: My Backup Script
On 7/26/2011 3:03 PM, Andreas Krey wrote:
> On Tue, 26 Jul 2011 13:55:17 +, Les Mikesell wrote:
> ...
> > How could it possibly be up to date if there has been a commit since
> > the last time it was updated?
>
> Because the commit came from my WC. My WC was up to date before the
> commit, and the only things that change have been in my WC already, so
> there is no possible way my WC can not be up to date. Except that it
> 'forgets' to update the WC revision info, and requires a separate update
> for that.

It doesn't 'forget', it knows that doing it right would, in the general case, involve changing files in your working copy that you might not want to have changed.

> (But probably I'm too much in the whole-tree mindset to see the problems
> when a commit paints new revisions outside the area it committed. There
> is no 'the' revision info.)

While in this case you may 'know' that no one else has made any changes in the repository, it is probably a bad idea to get into habits that won't work when you are part of a team.

--
Les Mikesell
lesmikes...@gmail.com
Re: My Backup Script
On Tue, 26 Jul 2011 13:55:17 +, Les Mikesell wrote:
...
> How could it possibly be up to date if there has been a commit since the
> last time it was updated?

Because the commit came from my WC. My WC was up to date before the commit, and the only things that change have been in my WC already, so there is no possible way my WC can not be up to date. Except that it 'forgets' to update the WC revision info, and requires a separate update for that.

(But probably I'm too much in the whole-tree mindset to see the problems when a commit paints new revisions outside the area it committed. There is no 'the' revision info.)

Andreas

--
"Totally trivial. Famous last words."
From: Linus Torvalds
Date: Fri, 22 Jan 2010 07:29:21 -0800
Re: My Backup Script
On 7/26/2011 1:47 PM, Andreas Krey wrote:
> > This is one of the high-strangitude svn behaviour artefacts: that you
> > can't do two consecutive commits without getting an error (in some
> > relatively popular cases).
>
> And you generally shouldn't be doing that unless there is some special
> need to avoid picking up concurrent changes

No, there are no concurrent changes. I'm the only one working the repo. For the second of the back-to-back commits I got the error, which is easily explained when you know about mixed revisions and that. But at the same time it looks utterly idiotic, since the WC was up to date before, so how can it possibly be not up to date after the first commit?

How could it possibly be up to date if there has been a commit since the last time it was updated?

--
Les Mikesell
lesmikes...@gmail.com
Re: My Backup Script
On Tue, 26 Jul 2011 11:53:15 +, Les Mikesell wrote:
> On 7/26/2011 11:42 AM, Andreas Krey wrote:
> ...
> > This is one of the high-strangitude svn behaviour artefacts: that you
> > can't do two consecutive commits without getting an error (in some
> > relatively popular cases).
>
> And you generally shouldn't be doing that unless there is some special
> need to avoid picking up concurrent changes

No, there are no concurrent changes. I'm the only one working the repo. For the second of the back-to-back commits I got the error, which is easily explained when you know about mixed revisions and that. But at the same time it looks utterly idiotic, since the WC was up to date before, so how can it possibly be not up to date after the first commit?

Script:

    set -xe
    rm -rf repo wc
    svnadmin create repo
    svn checkout file:///`pwd`/repo wc
    cd wc
    mkdir D
    touch A D/B D/C E
    # svn add .   # <- That nuisance 'already under control'...
    svn add A D E
    svn commit -m 'initial'
    svn up
    date >D/B
    date >A
    svn propset blub blah .
    svn commit D -m green
    svn commit . -m blau

And output (I'm on 1.6.6, however):

    + rm -rf repo wc
    + svnadmin create repo
    ++ pwd
    + svn checkout file:///Users/andreaskrey/svnt/wc/repo wc
    Checked out revision 0.
    + cd wc
    + mkdir D
    + touch A D/B D/C E
    + svn add A D E
    A         A
    A         D
    A         D/B
    A         D/C
    A         E
    + svn commit -m initial
    Adding         A
    Adding         D
    Adding         D/B
    Adding         D/C
    Adding         E
    Transmitting file data ....
    Committed revision 1.
    + svn up
    At revision 1.
    + date
    + date
    + svn propset blub blah .
    property 'blub' set on '.'
    + svn commit D -m green
    Sending        D/B
    Transmitting file data .
    Committed revision 2.
    + svn commit . -m blau
    Sending        .
    svn: Commit failed (details follow):
    svn: Directory '/' is out of date

Andreas
Re: My Backup Script
On Tue, Jul 26, 2011 at 06:42:39PM +0200, Andreas Krey wrote:
> On Tue, 26 Jul 2011 10:09:47 +, Les Mikesell wrote:
> ...
> > Yes, but it is then a mixed rev and needs an update. That is, the
> > changes you committed belong to the rev the commit creates while the
> > unchanged files belong to the rev of the prior update or checkout.
>
> This is one of the high-strangitude svn behaviour artefacts: that you
> can't do two consecutive commits without getting an error (in some
> relatively popular cases).

There were some annoying bugs in this regard which have been fixed partly in 1.6.16 and completely in 1.7:

http://subversion.tigris.org/issues/show_bug.cgi?id=3525
http://subversion.tigris.org/issues/show_bug.cgi?id=3526 (<-- this one in particular was annoying)
http://subversion.tigris.org/issues/show_bug.cgi?id=3533
Re: My Backup Script
On 7/26/2011 11:42 AM, Andreas Krey wrote:
> On Tue, 26 Jul 2011 10:09:47 +, Les Mikesell wrote:
> ...
> > Yes, but it is then a mixed rev and needs an update. That is, the
> > changes you committed belong to the rev the commit creates while the
> > unchanged files belong to the rev of the prior update or checkout.
>
> This is one of the high-strangitude svn behaviour artefacts: that you
> can't do two consecutive commits without getting an error (in some
> relatively popular cases).

And you generally shouldn't be doing that unless there is some special need to avoid picking up concurrent changes (and if it were handled automatically, you wouldn't be able to).

--
Les Mikesell
lesmikes...@gmail.com
Re: My Backup Script
On Tue, 26 Jul 2011 10:09:47 +, Les Mikesell wrote:
...
> Yes, but it is then a mixed rev and needs an update. That is, the
> changes you committed belong to the rev the commit creates while the
> unchanged files belong to the rev of the prior update or checkout.

This is one of the high-strangitude svn behaviour artefacts: that you can't do two consecutive commits without getting an error (in some relatively popular cases).

Andreas

--
"Totally trivial. Famous last words."
From: Linus Torvalds
Date: Fri, 22 Jan 2010 07:29:21 -0800
Re: My Backup Script
On 7/26/2011 7:48 AM, Andy Canfield wrote:
> As I understand Subversion,
> [a] The server has no idea who has a working copy.
> [b] The checkout builds a working copy on the workstation from the
>     server's repository.
> [c] What is on the developer's hard disk is a working copy.
> [d] What is on the developer's hard disk continues to be a working copy,
>     even after a commit.

Yes, but it is then a mixed rev and needs an update. That is, the changes you committed belong to the rev the commit creates while the unchanged files belong to the rev of the prior update or checkout.

> [e] If the developer tries to make revisions to his working copy six
>     months after his last commit, then tries to commit, he's going to
>     have a major mess on his hands trying to reconcile things. The
>     working copy is still a valid working copy.

Normally you would do an update to pick up and reconcile other changes, then commit.

> [f] Unlike a lock, which he grabs and then releases, he never gives up
>     his working copy; it is valid perpetually.

But, in the context of backups, note that you can only commit to the repository where the checkout originated (determined by uuid), and the revision of the repo can't go backwards. So, if you restore a backup older than the rev of your working copy's checkout or last update, you won't be able to commit the changes from that working copy. The only reasonable approach to doing that would be to check out the closest match from what is currently in the restored repo, copy over your changed files, and then commit. This could be cumbersome with a large tree with many changes.

--
Les Mikesell
lesmikes...@gmail.com
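Les's "check out the closest match, copy your changed files, commit" recovery can be sketched end to end. This is a toy simulation: the dump/load pair stands in for restoring an older backup (which, like a real restore to a new repository, yields a different uuid and an older head), and all names are invented for the demo.

```shell
set -e
command -v svnadmin >/dev/null 2>&1 || { echo "Subversion not installed; skipping demo"; exit 0; }
TMP=$(mktemp -d); cd "$TMP"

# The "live" repository and a working copy with a day's work in it.
svnadmin create repo
svn mkdir -q -m "r1: create trunk" "file://$TMP/repo/trunk"
svn checkout -q "file://$TMP/repo/trunk" old-wc
echo work > old-wc/file.txt
svn add -q old-wc/file.txt
svn commit -q -m "r2: a day's work" old-wc

# Simulate restoring yesterday's backup: a fresh repo that only has r1.
svnadmin dump repo -q -r 0:1 > old.dump
svnadmin create restored
svnadmin load -q restored < old.dump

# old-wc cannot commit to 'restored' (different uuid, older head), so:
svn checkout -q "file://$TMP/restored/trunk" new-wc
cp old-wc/file.txt new-wc/          # copy the changed files over by hand
cd new-wc
svn add -q file.txt
svn commit -q -m "reapply the work lost in the restore"
```

With a large tree, a recursive copy plus `svn status` to spot adds and deletes replaces the single `cp`, which is exactly why Les calls this cumbersome.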
Re: My Backup Script
On 07/26/2011 04:20 PM, Stefan Sperling wrote:
> On Tue, Jul 26, 2011 at 01:33:09PM +0700, Andy Canfield wrote:
> > For your information, this is my backup script. It produces a zip file
> > that can be transported to another computer. The zip file unpacks into
> > a repository collection, giving, for each repository, a hotcopy of the
> > repository and a dump of the repository. The hotcopy can be reloaded on
> > a computer with the same characteristics as the original server; the
> > dumps can be loaded onto a different computer. Comments are welcome.
>
> Please also make a backup of every individual revision from the
> post-commit hook, like this:
>
> [[[
> #!/bin/sh
> REPOS="$1"
> REV="$2"
> svnadmin dump "${REPOS}" -q --incremental --deltas -r "${REV}" > /backup/${REV}.dmp
> ]]]
>
> And make /backup a separate filesystem, preferably on a different host or
> some disk storage that runs independently from the host.

In Linux a separate "filesystem" is often another partition on the hard disk, and thus not to be trusted too much. For safety an external hard disk, flushed, should be good enough. No need for an entire other host. Yes?

> You will thank me one day when your server's disks die at the wrong
> moment, e.g. because of power failure or overheating. In such cases it is
> possible that not all data has been flushed to disk yet. The only good
> data is in the OS buffer cache, which the above svnadmin dump command
> will get to see. However, even revisions committed several *minutes*
> before such a crash can appear corrupted on disk when you reboot.

Thank you very much. Linux ext3 filesystems used to be pretty much immune to unflushed buffers; anything older than 5 seconds would be written to disk. But now, with ext4, there's no guarantee, and I've seen the thing have unflushed buffers after one minute. And I'm not supposed to be able to see that! Of course, even Linux isn't immune to a burned-out hard disk controller.

As for overheating, I can't help but joke that perhaps the overheating was caused by all the backups you were doing? Ha ha. Of course not!

> I've seen this happening (on Windows, with NTFS -- lots of damage; but
> other operating systems aren't immune to this either). We could tell that
> the buffer cache data was good because there were multiple corrupted
> revision files (one repository had more than 20 broken revisions), each
> with random junk in some parts, and all broken parts were 512 byte
> blocks, i.e. disk sectors. But in the parts that were not broken they
> referred to each other in ways that made perfect sense. So before the
> crash they were all OK. There were no backups so we had to manually
> repair the revisions (this requires intricate knowledge about the
> revision file format and takes time...)
>
> When this happens you have an unusable repository. Anything referring to
> the broken revisions will fail (commit, diff, update, ...). Assuming the
> incremental dumps weren't destroyed in the catastrophe you can load the
> incremental dumps on top of your last full backup and get to a good state
> that is very close to the point in time when the crash happened.
>
> Without the incremental dumps you'll have the last full backup. But
> anything committed since could be lost forever.

Anything committed since the last full backup would be lost only if it no longer exists in the developer's working copy. The size of the mess depends on how many commits you lost. Ten per developer per day, and 20 developers, is a massive headache. Two per developer per week, and three developers, is not a catastrophe.

As I understand Subversion,
[a] The server has no idea who has a working copy.
[b] The checkout builds a working copy on the workstation from the server's repository.
[c] What is on the developer's hard disk is a working copy.
[d] What is on the developer's hard disk continues to be a working copy, even after a commit.
[e] If the developer tries to make revisions to his working copy six months after his last commit, then tries to commit, he's going to have a major mess on his hands trying to reconcile things. The working copy is still a valid working copy.
[f] Unlike a lock, which he grabs and then releases, he never gives up his working copy; it is valid perpetually.
[g] The usual way a working copy goes away is with the "rm -rf" command.

Thanks for the great information!
Re: My Backup Script
On Tue, Jul 26, 2011 at 01:33:09PM +0700, Andy Canfield wrote:
> For your information, this is my backup script. It produces a zip
> file that can be transported to another computer. The zip file
> unpacks into a repository collection, giving, for each repository, a
> hotcopy of the repository and a dump of the repository. The hotcopy
> can be reloaded on a computer with the same characteristics as the
> original server; the dumps can be loaded onto a different computer.
> Comments are welcome.

Please also make a backup of every individual revision from the post-commit hook, like this:

[[[
#!/bin/sh
REPOS="$1"
REV="$2"
svnadmin dump "${REPOS}" -q --incremental --deltas -r "${REV}" > /backup/${REV}.dmp
]]]

And make /backup a separate filesystem, preferably on a different host or some disk storage that runs independently from the host.

You will thank me one day when your server's disks die at the wrong moment, e.g. because of power failure or overheating. In such cases it is possible that not all data has been flushed to disk yet. The only good data is in the OS buffer cache, which the above svnadmin dump command will get to see. However, even revisions committed several *minutes* before such a crash can appear corrupted on disk when you reboot.

I've seen this happening (on Windows, with NTFS -- lots of damage; but other operating systems aren't immune to this either). We could tell that the buffer cache data was good because there were multiple corrupted revision files (one repository had more than 20 broken revisions), each with random junk in some parts, and all broken parts were 512 byte blocks, i.e. disk sectors. But in the parts that were not broken they referred to each other in ways that made perfect sense. So before the crash they were all OK. There were no backups so we had to manually repair the revisions (this requires intricate knowledge about the revision file format and takes time...)

When this happens you have an unusable repository. Anything referring to the broken revisions will fail (commit, diff, update, ...). Assuming the incremental dumps weren't destroyed in the catastrophe you can load the incremental dumps on top of your last full backup and get to a good state that is very close to the point in time when the crash happened.

Without the incremental dumps you'll have the last full backup. But anything committed since could be lost forever.
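The recovery path described above, loading per-revision incremental dumps on top of the last full backup, looks roughly like this. It is a self-contained sketch with invented paths; the incremental dump command matches the post-commit hook shown earlier.

```shell
set -e
command -v svnadmin >/dev/null 2>&1 || { echo "Subversion not installed; skipping demo"; exit 0; }
TMP=$(mktemp -d); cd "$TMP"
mkdir backup

# A repository with one revision, fully backed up...
svnadmin create repo
svn mkdir -q -m "r1" "file://$TMP/repo/trunk"
svnadmin dump repo -q > backup/full.dump            # the nightly full backup

# ...then one more commit, captured only as an incremental dump
# (as the post-commit hook above would produce).
svn mkdir -q -m "r2" "file://$TMP/repo/branches"
svnadmin dump repo -q --incremental --deltas -r 2 > backup/2.dmp

# Disaster recovery: rebuild from the full backup plus the incrementals.
svnadmin create restored
svnadmin load -q restored < backup/full.dump
for f in backup/*.dmp; do
  svnadmin load -q restored < "$f"
done
svnlook youngest restored          # restored is back at revision 2
```

The incrementals must be loaded in revision order; with more than nine of them, zero-padded file names (0002.dmp, 0010.dmp, ...) keep the glob sorted correctly.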
Re: My Backup Script
On 7/25/2011 11:33 PM, Andy Canfield wrote:
> For your information, this is my backup script. It produces a zip file
> that can be transported to another computer. The zip file unpacks into a
> repository collection, giving, for each repository, a hotcopy of the
> repository and a dump of the repository. The hotcopy can be reloaded on a
> computer with the same characteristics as the original server; the dumps
> can be loaded onto a different computer. Comments are welcome.

The dump should use the hot copy as its source. Otherwise it may differ from the hot copy. See my note inline.

> #! /bin/bash
> # requires root access
> if [ ! `whoami` == root ]
> then
>     sudo $0
>     exit
> fi
> # controlling parameters
> SRCE=/data/svn
> ls -ld $SRCE
> DEST=/data/svnbackup
> APACHE_USER=www-data
> APACHE_GROUP=www-data
> # Construct a new empty SVNParent repository collection
> rm -rf $DEST
> mkdir $DEST
> chown $APACHE_USER $DEST
> chgrp $APACHE_GROUP $DEST
> chmod 0700 $DEST
> ls -ld $DEST
> # Get all the names of all the repositories
> # (Also gets names of any other entry in the SVNParent directory)
> cd $SRCE
> ls -d1 * >/tmp/SVNBackup.tmp
> # Process each repository
> for REPO in `cat /tmp/SVNBackup.tmp`
> do
>     # some things are not repositories; ignore them
>     if [ -d $SRCE/$REPO ]
>     then
>         # back up this repository
>         echo "Backing up $REPO"
>         # use hotcopy to get an exact copy
>         # that can be reloaded onto the same system
>         svnadmin hotcopy $SRCE/$REPO $DEST/$REPO
>         # use dump to get an inexact copy
>         # that can be reloaded anywhere
>         svnadmin dump $SRCE/$REPO >$DEST/$REPO.dump

Change that last line to:

    svnadmin dump $DEST/$REPO >$DEST/${REPO}.dump

I generally use curly braces when punctuation is present to make sure variable substitution occurs the way I want it.

>     fi
> done
> # Show the contents
> echo "Contents of the backup:"
> ls -ld $DEST/*
> # zip up the result
> cd $DEST
> zip -r -q -y $DEST.zip .
> # Talk to the user
> echo "Backup is in file $DEST.zip:"
> ls -ld $DEST.zip
> # The file $DEST.zip can now be transported to another computer.
--
David Chapman
dcchap...@acm.org
Chapman Consulting -- San Jose, CA
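David's correction (dump from the hotcopy rather than the live repository, so both backup artifacts describe exactly the same revisions) can be sketched in isolation. A self-contained demo with invented paths:

```shell
set -e
command -v svnadmin >/dev/null 2>&1 || { echo "Subversion not installed; skipping demo"; exit 0; }
TMP=$(mktemp -d); cd "$TMP"
svnadmin create repo
svn mkdir -q -m "some content" "file://$TMP/repo/trunk"

# Take the hotcopy first, then dump *from the hotcopy*, not the live repo.
# A commit landing between the two steps can no longer make them disagree.
svnadmin hotcopy repo repo-backup
svnadmin dump repo-backup -q > repo-backup.dump
svnlook youngest repo-backup       # same youngest revision as when the hotcopy was taken
```

If the dump were taken from the live repository instead, a commit arriving after the hotcopy but before the dump would leave the two files describing different histories.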