[ 
https://issues.apache.org/jira/browse/SVN-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812021#comment-15812021
 ] 

Julian Foad commented on SVN-4668:
----------------------------------

Hello. I fixed some bugs in the various kinds of dump output code (including 
svnadmin dump, svndumpfilter, and svnrdump dump), and that did lead to some 
ordering changes. There were stability guarantees about certain things but not 
about others. The general issue is that we never guaranteed a stable 
representation of the data, and there are more ways it can vary than just the 
ordering of headers. My changes weren't the first to affect the 
representation, even if this is the first time a change has affected your 
particular cases.

It's not just that a particular code change changed the output; there were 
already different code paths that would output the same thing in different 
ways. The "TODO: use a stable order" comments meant that a stable order would 
be a nice enhancement, not that something was temporarily broken.

I agree with you that it would be good to have a stable representation, but 
unfortunately we never did. In the absence of that, I agree it would be good to 
have ways to verify the integrity of a dump and to compare two dumps for 
equality. It might be possible to achieve both, although the design of dump 
files is not well suited to either.
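
As a rough illustration of comparing dump records without depending on header 
order, here is a minimal Python sketch. It assumes a record's header block is 
available as a string of "Name: value" lines; the function names 
parse_header_block and same_headers are hypothetical and not part of 
Subversion. It addresses only header ordering, which is just one of the ways 
the representation can vary, so it is not a full dump comparison.

```python
def parse_header_block(block: str) -> dict:
    """Parse 'Name: value' lines into a dict, discarding their order."""
    headers = {}
    for line in block.splitlines():
        if not line.strip():
            continue  # skip blank separator lines
        name, sep, value = line.partition(": ")
        if not sep:
            raise ValueError(f"not a header line: {line!r}")
        headers[name] = value
    return headers


def same_headers(block_a: str, block_b: str) -> bool:
    """True if both blocks carry the same headers, in any order."""
    return parse_header_block(block_a) == parse_header_block(block_b)


# Two header blocks that differ only in line order compare as equal.
old = "Node-path: trunk/a.txt\nNode-kind: file\nNode-action: add\n"
new = "Node-path: trunk/a.txt\nNode-action: add\nNode-kind: file\n"
print(same_headers(old, new))
```

A plain "diff" of the two blocks above would report a difference, while the 
dictionary comparison does not.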

I believe the best way to solve this class of problems for the future will be 
the idea known as Merkle tree hashes: hashes that (sufficiently uniquely) 
represent the entire state of the repository and of well-defined subsets of it 
(property, properties-list, file-revision, directory-revision, 
directory-tree-revision, whole revision with revprops, etc.). That is quite a 
big undertaking to retrofit to Subversion, but should be possible if anyone is 
prepared to devote enough effort to it. In principle, this would have the same 
effect as converting the Subversion repository to Git and seeing what commit 
ids Git generates.
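
To make the Merkle tree idea concrete, here is a minimal Python sketch. It is 
an illustration under stated assumptions, not a proposed Subversion design: 
the choice of SHA-256, the tagging and length-prefix scheme, and all function 
names (hash_property, hash_props, hash_file, hash_dir) are hypothetical. Each 
level hashes the hashes of its parts in a sorted order, so the result does not 
depend on the order in which anything happened to be stored or written out.

```python
import hashlib


def _h(kind: str, *parts: bytes) -> bytes:
    """Hash a kind tag followed by length-prefixed byte strings."""
    digest = hashlib.sha256()
    digest.update(kind.encode("utf-8") + b"\0")
    for part in parts:
        # The length prefix keeps (b"ab", b"c") distinct from (b"a", b"bc").
        digest.update(str(len(part)).encode("ascii") + b":" + part)
    return digest.digest()


def hash_property(name: str, value: bytes) -> bytes:
    return _h("property", name.encode("utf-8"), value)


def hash_props(props: dict) -> bytes:
    # Sort by name so the hash is independent of storage or output order.
    return _h("properties-list",
              *(hash_property(n, props[n]) for n in sorted(props)))


def hash_file(props: dict, content: bytes) -> bytes:
    return _h("file-revision", hash_props(props),
              hashlib.sha256(content).digest())


def hash_dir(props: dict, children: dict) -> bytes:
    """children maps each entry name to the already-computed child hash."""
    parts = [hash_props(props)]
    for name in sorted(children):
        parts.append(name.encode("utf-8"))
        parts.append(children[name])
    return _h("directory-tree-revision", *parts)


# The same file described with properties in a different order.
a = hash_file({"svn:eol-style": b"native", "svn:mime-type": b"text/plain"},
              b"hello\n")
b = hash_file({"svn:mime-type": b"text/plain", "svn:eol-style": b"native"},
              b"hello\n")
root = hash_dir({}, {"README": a})
print(a == b, root.hex())
```

Two repositories (or two dumps loaded into repositories) could then be 
compared by comparing a single root hash per revision, and a mismatch could be 
narrowed down by descending into whichever child hashes differ.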

> svnserve dump format order has changed
> --------------------------------------
>
>                 Key: SVN-4668
>                 URL: https://issues.apache.org/jira/browse/SVN-4668
>             Project: Subversion
>          Issue Type: Bug
>          Components: svnserve
>    Affects Versions: 1.9.3
>         Environment:  Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-53-generic x86_64)
>            Reporter: Luke Perkins
>         Attachments: SvnserveDumpIssue_20170107.jpg
>
>
> The format of the svnserve dump file has changed somewhere between version 
> 1.8 and 1.9.3 ( version 1.9.3 (r1718519)). I routinely perform svnserve dump 
> operations of my repositories and compare them against archived copies of 
> dump files to be used for emergency recovery operations.
> It appears the content order difference is benign, other than that "diff" 
> operations fail. I have a file illustrating the difference.
> The version information for svnserve dump is:
> svnserve, version 1.9.3 (r1718519)
>    compiled Mar 14 2016, 07:39:01 on x86_64-pc-linux-gnu
> Copyright (C) 2015 The Apache Software Foundation.
> This software consists of contributions made by many people;
> see the NOTICE file for more information.
> Subversion is open source software, see http://subversion.apache.org/
> The following repository back-end (FS) modules are available:
> * fs_fs : Module for working with a plain file (FSFS) repository.
> * fs_x : Module for working with an experimental (FSX) repository.
> * fs_base : Module for working with a Berkeley DB repository.
> Cyrus SASL authentication is available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
