> From: Ben Reser
> Sent: Wednesday, 21 August 2013 12:12 PM
> On Tue Aug 20 16:44:08 2013, Geoff Field wrote:
> > I've seen some quite large dump files already - one got up 
> > to about 28GB.  The svnadmin 1.2.3 tool managed to cope with 
> > that quite successfully.  Right now, our largest repository 
> > (some 19,000 revisions with many files, including 
> > installation packages) is dumping.  In the 5300 range of 
> > revisions, the dump file has just passed 9GB.

Overall, it got to about 29GB.  Dump and load worked fine, although they got a 
bit slow towards the end.  (In fact, I delayed sending this until it had 
actually finished.)

> Shouldn't be a problem within the limits of the OS and filesystem.  

I've just realised that my concern was based on a power-of-2 limitation: a file 
size held in a signed 32-bit integer rolls over at the 2GB mark, and an unsigned 
one at 4GB.  It's possible the Windows Server 2003 file system might have 
started to complain once it ran out of block indices/counters or some such, but 
once a file has safely passed the 4GB mark there's no particular reason a 
32GB+ file won't work.
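
For the record, the rollover points in question:

  2^31 bytes = 2,147,483,648 bytes (2 GiB) - signed 32-bit limit
  2^32 bytes = 4,294,967,296 bytes (4 GiB) - unsigned 32-bit limit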

> However, I'd say why are you bothering to produce dump files? 
>  Why not simply pipe the output of your dump command to a 
> load command, e.g.
> 
> svnadmin create newrepo
> svnadmin dump --incremental oldrepo | svnadmin load newrepo

I've been working in Windoze too long - I failed to think of that option.  I'll 
use that for the rest of the repositories (about 19 remain to be done).  Thank 
you for that application of the clue-by-four.  You've made the rest of my task 
a lot easier.

I really should have done it all with a scripting language of some sort, too.  
I'd told myself it was too close to the end of the process to consider *that* 
change now, but I've just managed to throw together a quick batch file to do the 
job.  I could probably have done it in Python or some other scripting language, 
but batch files are quick and easy.  Again, thanks Ben for the prompt to use my 
head a bit better (even though you didn't explicitly suggest this aspect).

CopyBDBToFSFS.bat:

  rem Create a new repository - using the OLD format just in case we need to switch back to the old server
  "C:\Program Files\Subversion\bin\svnadmin.exe" create "%1_FSFS"
  rem Copy the data from the old repository to the new one
  "C:\Program Files\Subversion\bin\svnadmin.exe" dump --incremental "%1" | "C:\Program Files\Subversion\bin\svnadmin.exe" load "%1_FSFS"
  rem Rename so the new repository is accessible via the existing authentication and URLs, and the old one stays available for emergency use
  ren "%1" "%1_BDB"
  ren "%1_FSFS" "%1"
  rem Check the new repository with the current tools to confirm it's OK
  svnadmin verify "%1"
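
Since about 19 repositories remain, a tiny driver batch file along these lines 
should handle the lot.  This is just a sketch: it assumes every subdirectory of 
the current directory is one of the BDB repositories, that it's run from that 
parent directory, and that the names contain no spaces (%%R is passed unquoted 
so the "%1_FSFS" expansion inside the script stays valid).

ConvertAll.bat (hypothetical):

  rem Run the conversion script once for each repository directory under the current directory
  for /d %%R in (*) do call CopyBDBToFSFS.bat %%R

Note it's strictly a run-once affair as written: a second run would pick up the 
freshly renamed *_BDB directories as well.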


Note that we have the old version 1.2.3 server software installed at the 
C:\Program Files\Subversion location, while later versions are installed 
elsewhere, with the PATH pointing at the newest one.  I'm creating the new 
repositories with the old version for those (hopefully rare) occasions when we 
need to switch back to the old server version.
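
That's also why the script spells out the full path for create/dump/load but 
calls the bare svnadmin (the one on the PATH) for the final verify.  A quick 
way to confirm the two invocations really do hit different versions 
(--version is a standard svnadmin option):

  "C:\Program Files\Subversion\bin\svnadmin.exe" --version
  svnadmin --version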

> You'll need space for two repos but that should be less than 
> the space the dump file will take.  

We're keeping the old repos anyway, just in case.  We're an automotive parts 
company with support requirements for some quite old versions, so we can't 
afford to throw away too much history.  Even though it's a RAID system (using 
Very Expensive disk drives, so it's actually a RAVED system), there's lots of 
space available on the drive where the repositories live.

> I included the 
> --incremental option above because there's no reason to 
> describe the full tree for every revision when you're doing a 
> dump/load cycle.

That makes sense.

>  You can save space with --deltas if you 
> really want the dump files, but at the cost of extra CPU time.  
> If you're just piping to load the CPU to calculate the delta 
> isn't worth it since you're not saving the dump file.

I agree.  The server's not particularly new, so if I can save on processor time 
that's a good thing.  I'm discarding/reusing the dump files anyway, since we're 
keeping the original repositories (and we have a separate backup system for the 
servers - I know it works too, because I've had to restore some of the BDB 
repositories from it).
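
For completeness, if we ever do want a compact dump file kept on disk, the 
delta variant would look something like this (a sketch using the same flags 
discussed above; the redirect target is just an example name):

  "C:\Program Files\Subversion\bin\svnadmin.exe" dump --incremental --deltas "%1" > "%1.dump"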

Regards,

Geoff
