Extensible changeset format proposal

2010-08-26 Thread anatoly techtonik
Hello,

Don't you think it is time to design an extensible changeset format
for exchanging information about changesets between systems?

Right now I am struggling to extract full information from an
uncommitted Subversion changeset in order to upload it for review (in
the Rietveld project). The Rietveld code review tool was initially
designed to work with Subversion, but so far it is still impossible to
get a complete diff of changes from SVN that a reviewer can apply to
their working copy and commit after review. The problem of getting a
complete diff is twofold:

1. Subversion data for an uncommitted changeset is scattered, and it
is hard to say whether it is ever complete.
2. The svn diff format is too limited.

For the first part, I can give an example of the problem I am
currently trying to solve: 'Rietveld code review data is missing files
that were created as a result of svn copy or svn move operation'. If a
text file is added with svn add, its contents will appear in the svn
diff output, but text files created as a result of an svn move or svn
copy operation will not. To get this missing information one needs to
run svn status, check for the presence of copied or moved files
(marked with A  +), check that these files are not binary, manually
reconstruct the change chunk for them and append the missing data to
the output of svn diff. But even after that the reviewer still won't
be able to reproduce the changeset exactly, because the svn diff
format does not contain information about the source of a copied or
moved file. And here comes the second part.
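
As a rough illustration of the svn status step described above, here
is a minimal Python sketch that picks out the entries marked with
A  + from captured status text. The function name copied_paths and
the sample paths are hypothetical. The sketch assumes the item status
is in the first column, the history marker + is in the fourth column,
and the path follows the seven status columns; it does not check
whether a file is binary and does not rebuild any diff chunk.

```python
def copied_paths(status_text):
    """Return paths that svn status reports as added with history ('A  +').

    Assumed layout: column 1 holds the item status, column 4 holds '+'
    for items scheduled with history, and the path comes after the
    seven single-character status columns.
    """
    paths = []
    for line in status_text.splitlines():
        # Skip blank lines and anything too short to carry a path.
        if len(line) < 8:
            continue
        if line[0] == "A" and line[3] == "+":
            # lstrip() tolerates the separator space before the path.
            paths.append(line[7:].lstrip())
    return paths


# Hypothetical sample of captured `svn status` output.
sample = (
    "M       README\n"
    "A  +    src/new_name.py\n"
    "A       docs/added.txt\n"
    "D       src/old_name.py\n"
)
print(copied_paths(sample))
```

Only src/new_name.py is reported here; docs/added.txt is a plain svn
add, so its contents already show up in svn diff.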

The svn diff format doesn't record enough information to reproduce a
committed changeset. For example, it doesn't have data about the
source of copied and moved files. This is believed to be solved by the
git diff format, but that won't be a panacea either, because
Subversion changesets also contain information about properties, mime
types etc. It is also impossible to include binary files (if needed),
original author info (can be useful for contribulyzer), or any other
information that a given VCS (Subversion in this case) needs to
completely reconstruct its own changeset.

For code reviews, ideally, a code review system such as Rietveld
should grab the changeset, parse it and extract the relevant
information for the reviewer (skipping or filtering uninteresting
parts and giving a warning about unknown parts). It should also save
the original or filtered changeset file to be imported and committed
if the review is successful.


That's why an extensible changeset format is required. It will not
only be useful for sending changesets for review, but also for
synchronizing changes with other VCSes. With the new changeset format,
a mirroring tool could automatically analyze incoming data to find
Subversion-related attributes, save them into the repository directly,
and automatically save all other attributes to properties.

I see this format as an XML format that resembles an Atom feed, with
a logical order of events (i.e. a file removed after it was copied
etc.). Subversion already uses XML formats internally, so I assume
that folks here possess the required experience and may even have some
ready pieces to work out an initial draft of such a format.
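
To make the idea a bit more concrete, here is a small Python sketch of
what an ordered, extensible XML changeset might look like and how a
consumer could separate the parts it understands from the ones it
does not. Nothing here is an existing format: the element names
(changeset, copy, delete, x-review-note), the attribute names and the
KNOWN_OPS set are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of operations a given consumer understands.
KNOWN_OPS = {"copy", "delete", "modify", "set-property"}


def build_changeset(author, operations):
    """Serialize (tag, attributes, text) tuples in the order given."""
    root = ET.Element("changeset", {"author": author})
    for tag, attrs, text in operations:
        op = ET.SubElement(root, tag, attrs)
        op.text = text
    return ET.tostring(root, encoding="unicode")


def read_changeset(xml_text):
    """Split operations into known and unknown, keeping document order."""
    known, unknown = [], []
    for op in ET.fromstring(xml_text):
        target = known if op.tag in KNOWN_OPS else unknown
        target.append((op.tag, dict(op.attrib)))
    return known, unknown


# The copy is listed before the delete, so a move can be replayed in order.
xml_text = build_changeset("anatoly", [
    ("copy", {"from": "a.txt", "to": "b.txt"}, None),
    ("delete", {"path": "a.txt"}, None),
    ("x-review-note", {"tool": "example"}, "extension data"),
])
known, unknown = read_changeset(xml_text)
for tag, _ in unknown:
    print("warning: unknown changeset part:", tag)
```

A review tool could show the known list to the reviewer and warn
about the rest, while a mirroring tool could store the unknown
entries as properties instead of dropping them.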

Please, CC.
--
anatoly t.


Re: Extensible changeset format proposal

2010-08-26 Thread Greg Hudson
On Thu, 2010-08-26 at 05:57 -0400, anatoly techtonik wrote:
> Don't you think it is time to design an extensible changeset format
> for exchanging information about changesets between systems?

Mostly for your entertainment, see:

http://www.red-bean.com/pipermail/changesets/2003-April/thread.html

There was an attempt to create a unified cross-system changeset format
seven years ago, but it didn't get very far.  However, the principals
are different today and more is known about the space of successful DVCS
tools.