SVN Branching Language
Hi, I'm reproducing an email I found interesting on the Git list [0] especially for those invested in the git-svn effort. [0]: http://thread.gmane.org/gmane.comp.version-control.git/192801/focus=193737 -- This is the second draft of the SVN branching language. If you're interested in how SVN revisions map to git commits, please read it and let me know what you think. You might also be interested in the reference implementation and nascent collection of tests[1], although these wouldn't yet withstand a thorough review. I plan to pause language work and concentrate on SVN export for a while now, because it will be easier to e.g. write tests once I have some code to test. Here's a little background for people new to the list: I'm interested in ways to map SVN revisions to git commits, but have found it surprisingly hard to grasp the problem. Having trouble grasping a problem is generally a good sign that your solution is too small, so I split the problem in three: 1. A language to describe SVN branching and merging 2. A program to export such a file from an SVN repository 3. A program to import such a file into a git repository This has let me concentrate on manageable parts of the problem. For example, Sam Vilain recently pointed out[2] so-called piecemeal merges that split a merge across several revisions. There's no way I could have figured out a plan to solve that in one go, but between us we were able to find a representation and a way of importing it into git. I still don't know how to detect and export it from Subversion, but I can worry about that another day. This is the second draft I've brought to the git mailing list for discussion. The list has a long and proud tradition of including patches in e-mails so people can review code from their e-mail client, but in this case the patch would be so large that it's better simply to include the new text. I've deliberately left the reference implementation and tests out of this e-mail because they're not ready for review. - Andrew [1] https://github.com/andrew-sayers/SVN-Branching-Language [2] http://article.gmane.org/gmane.comp.version-control.git/192418 SVN Branching Language v0.1 === Andrew Sayers andrew-sbl at pileofstuff.org This file specifies a simple language for describing how SVN revisions relate to branches. The goals of this language are to be a communication protocol for programs operating on SVN history and to provide a lingua franca for developers discussing unusual SVN use cases. Overview The SVN Branching Language (``SBL'') is a line-based UTF-8 format, where each line specifies one action in the SVN history. An SBL file contains two sections: the first is the ``header'' section that provides information about the file as a whole, the second is the ``body'' section that provides information about things that happened in specific revisions. Any line in an SBL file that begins with a hash or semicolon character, or that contains only whitespace (including empty lines) is considered to be a comment, and must be ignored. Clients should use a leading '#' to create ordinary comments to be read by users, and a leading ';' for commented-out actions the client wishes to suggest to a user. This makes it easier for a user to accept/reject actions by search-and-replace. Here is an example file: # the file must begin by specifying the revision of the language being used: This is a version 0.1 SVN Branching Language file Body: In r1, create branch trunk In r10, create branch branches/1.0 from trunk r9 In r20, create tag tags/version_1 from branches/1.0 r19 In r20, deactivate tags/version_1 # User intervention is required to confirm this action: ; In r25, merge trunk up to r24 into branches/1.0 Terminology --- The following words should be interpreted as specified below: Directory:: A string representing a directory in the SVN tree. A directory can have several flows (i.e. it can be created and deleted multiple times), some of which may have names and others not. The exact rules for valid directory names are discussed below, but should be as close as possible to the rules for a valid SVN directory name. Name:: A string representing a branch or tag. This is the conventional name for a tag as described by users, and is not represented anywhere in SVN. For example, a directory branches/v1.x/v1.0 might have the name v1.0. Note that although tags and branches are functionally identical, clients must treat tag names and branch names as being in different namespaces - for example, a branch name v1.0 and a tag name v1.0 can both exist at the same time. Exist:: A directory is said to exist in a given revision if the directory would appear in an `svn ls` command for that revision. Clients can fully implement this language without tracking whether directories exist - the term is defined here purely to make
Re: SVN Branching Language
Hi Mark, Mark Phippard wrote: On Thu, Mar 29, 2012 at 10:39 AM, Ramkumar Ramachandra artag...@gmail.com wrote: [0]: http://thread.gmane.org/gmane.comp.version-control.git/192801/focus=193737 Isn't this similar to what Eric Raymond has already done? http://esr.ibiblio.org/?p=4071 Quite similar- the difference is that reposurgeon parses the dump stream to figure out the mapping, while this file format makes no such effort. Thanks for the pointer nevertheless. Ram
Re: SVN Branching Language
Hi, Ramkumar Ramachandra wrote: Quite similar- the difference is that reposurgeon parses the dump stream to figure out the mapping, while this file format makes no such effort. Thanks for the pointer nevertheless. Er, sorry; looks like both of them parse the dumpstream. Ram
Re: Performance comparison of svnrdump and svnsync
Hi Stefan, Stefan Sperling writes: On Wed, Aug 17, 2011 at 01:55:27PM -0400, Mark Phippard wrote: I could just be basing it on the original proposal. Maybe that planted the seed with me that it was going to be faster: http://svn.haxx.se/dev/archive-2010-07/0154.shtml The proposal says: it currently performs significantly better than `svnsync` because it doesn't need to touch the filesystem or apply any deltas. I would guess this benefit is probably only measurable with file:// URLs. Yes, I recall initially comparing their performances with file:// URLs, but that was a long time ago. It's possible that svnsync is equally fast now. And yes, today's concerns are related to the usecase of importing from/ exporting into Git: so, for feedback-driven-optimization, we'll need to develop a new benchmark. Thanks. -- Ram
Re: [PROPOSAL] commandline switch to override native eol style
Hi, Guenter Knauf writes: I would like to suggest adding a commandline switch for svn to override the native eol style. Why do I want this? On windows platform the native eol style defaults to CRLF which is usual what we want. Now if you want to work from MSYS bash shell and want to check out sources from a repo with a normal win32 svn binary you may end up with autoconf shell scripts in CRLF format - of course depending on how the files are marked in the repo, but lets here assume that the files are marked as 'svn:eol-stype native', and that we have no control over the remote repo; so as workaround to get working autoconf files you now either need to convert these files with dos2unix, or you need to use 'svn export --native-eol LF' which then requires another export again and again every time you want to update from repo. If '--native-eol LF' would also work with checkout then all would be fine, and you could just add a bash alias to profile like alias svn='svn --native-eol LF' and you could use standard windows binaries, and there would be no need to create special MSYS binaries. I'd just like to add to this: Git has solved this problem elegantly using three configuration variables, namely core.eol, core.autocrlf and core.safecrlf. The motivation is described in the git-config manpage [1]. Even if we don't copy this design, I think we should look into the motivation behind this design and see what we can take from it. [1]: http://www.kernel.org/pub/software/scm/git/docs/git-config.html -- Ram
Re: svn commit: r1070980 - /subversion/trunk/subversion/svnrdump/load_editor.c
Hi Daniel, Daniel Shahaf writes: CC += artagnon Ramkumar, I think the lesson from Mike and I's recent fixes is: Mark unreachable code with a run-time assertion, not with a source comment. i.e., if those places were marked with SVN__NOT_IMPLEMENTED(), debugging would have been easier. (Tests pass if I add a SVN__NOT_IMPLEMENTED() to that function.) Cool, thanks for pointing this out. Since this is such a common task, maybe add this information to HACKING? -- Ram
Re: Bindings use is *painful*
Hi, Stefan Sperling writes: On Wed, Feb 16, 2011 at 03:08:39PM +, Hyrum K Wright wrote: This is because the bindings are generated with SWIG. We cannot really fix this without using a different approach. We could improve the documentation. An introductory document that explains how to apply the C API docs to the python bindings would be helpful. Maybe someone with some experience in using the python bindings could write something like this? Also, there's an API built on top of the bindings that is easier to use and has better documentation: http://pysvn.tigris.org/ Does it make sense to write bindings ground up by hand and hand-craft a nice API? Since the C API is guaranteed to be relatively stable and backward compatible, maintainability shouldn't be much of an issue. Plus, it'll probably make a good GSoC project. -- Ram
Re: 'svnrdump load' tests unmirrored
Hi Daniel, On Tue, Jan 25, 2011 at 7:24 PM, Daniel Shahaf d...@daniel.shahaf.name wrote: In svnrdump_tests.py, nearly all tests are paired --- dump test and corresponding load test --- except a handful (basic_dump, only_trunk_dump, only_trunk_A_with_changes_dump, copy_bad_line_endings_dump). Is there any particular reason that these don't have corresponding _load tests? I can't check right now, but it looks like I just forgot to write them. Feel free to fill them in. I'll be able to pitch in next week- I'm currently on vacation in Nepal. Thanks! -- Ram
Re: [l10n] Translation status report for trunk r1055292
Hi, Hyrum K Wright writes: I'm working on getting a web-based translation system installed on ASF hardware for use by Subversion, and possibly other projects, and will report progress here as it happens. I am in the unfortunate position of being a monoglot, and hence unable to actually do the translation, but I would like to lower the barrier for other people to contribute. I believe Transifex has been quite successful in this regard. I'm considering using it for translations in git.git too. -- Ram
Re: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c
Hi Julian and Daniel, Here's another revision after your suggestions on IRC. Thanks. [[[ Determine default perms in an elegant thread-safe way, not racily. * subversion/libsvn_subr/io.c (default_perms_baton, perms_init_state): New struct, variable. (get_default_file_perms): Remove all functional code into init_default_file_perms, and use apr_atomic__init_once so that code is only executed once. (init_default_file_perms): New function to determine default file permissions by creating a temporary file and extracting permissions from it. Default permissions are not determined racily anymore, since this function is an apr_atomic__init_once callback. ]]] Index: subversion/libsvn_subr/io.c === --- subversion/libsvn_subr/io.c (revision 1057166) +++ subversion/libsvn_subr/io.c (working copy) @@ -1300,6 +1300,39 @@ reown_file(const char *path, return svn_error_return(svn_io_remove_file2(unique_name, FALSE, pool)); } +static volatile svn_atomic_t perms_init_state = 0; +struct default_perms_baton +{ + apr_fileperms_t *default_perms; +}; + + +/* Helper function to discover default file permissions; it does this + by creating a temporary file and extracting the permissions from + it. Passed to svn_atomic__init_once. All temporary allocations are + done in SCRATCH_POOL. */ +static svn_error_t * +init_default_file_perms(void *baton, apr_pool_t *scratch_pool) +{ + apr_fileperms_t *default_perms = +((struct default_perms_baton *) baton)-default_perms; + apr_finfo_t finfo; + apr_file_t *fd; + + if (*default_perms != 0) +/* Nothing to do */ +return SVN_NO_ERROR; + + SVN_ERR(svn_io_open_uniquely_named(fd, NULL, NULL, svn-tempfile, .tmp, + svn_io_file_del_on_pool_cleanup, + scratch_pool, scratch_pool)); + SVN_ERR(svn_io_file_info_get(finfo, APR_FINFO_PROT, fd, scratch_pool)); + SVN_ERR(svn_io_file_close(fd, scratch_pool)); + *default_perms = finfo.protection; + + return SVN_NO_ERROR; +} + /* Determine what the PERMS for a new file should be by looking at the permissions of a temporary file that we create. Unfortunately, umask() as defined in POSIX provides no thread-safe way @@ -1310,46 +1343,13 @@ reown_file(const char *path, static svn_error_t * get_default_file_perms(apr_fileperms_t *perms, apr_pool_t *scratch_pool) { - /* the default permissions as read from the temp folder */ static apr_fileperms_t default_perms = 0; + struct default_perms_baton baton; - /* Technically, this racy: Multiple threads may use enter here and - try to figure out the default permisission concurrently. That's fine - since they will end up with the same results. Even more technical, - apr_fileperms_t is an atomic type on 32+ bit machines. - */ - if (default_perms == 0) -{ - apr_finfo_t finfo; - apr_file_t *fd; - - /* Get the perms for a newly created file to find out what bits -should be set. - -Normally del_on_close can be problematic because APR might -delete the file if we spawned any child processes. In this -case, the lifetime of this file handle is about 3 lines of -code, so we can safely use del_on_close here. - -Not so fast! If some other thread forks off a child, then the -APR cleanups run, and the file will disappear. So use -del_on_pool_cleanup instead. - -Using svn_io_open_uniquely_named() here because other tempfile -creation functions tweak the permission bits of files they create. - */ - SVN_ERR(svn_io_open_uniquely_named(fd, NULL, NULL, svn-tempfile, .tmp, -svn_io_file_del_on_pool_cleanup, -scratch_pool, scratch_pool)); - SVN_ERR(svn_io_file_info_get(finfo, APR_FINFO_PROT, fd, scratch_pool)); - SVN_ERR(svn_io_file_close(fd, scratch_pool)); - - *perms = finfo.protection; - default_perms = finfo.protection; -} - else -*perms = default_perms; - + baton.default_perms = default_perms; + SVN_ERR(svn_atomic__init_once(perms_init_state, init_default_file_perms, +baton, scratch_pool)); + *perms = default_perms; return SVN_NO_ERROR; }
Re: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c
Hi Julian and Daniel, Julian Foad writes: Just a few comments on the implementation from me... Thanks for the wonderful review! I've also included a few of Daniel's suggestions (on IRC)- I hope I've got it right this time. I'm not sure the volatile qualifier is necessary either- I'm just using it for extra safety, making sure that no compiler optimizations mangle it. [[[ Determine default perms in an elegant thread-safe way, not racily. * subversion/libsvn_subr/io.c (default_perms_baton, perms_init_state): New struct, variable. (get_default_file_perms): Remove all functional code into init_default_file_perms, and use apr_atomic__init_once so that code is only executed once. (init_default_file_perms): New function to determine default file permissions by creating a temporary file and extracting permissions from it. Default permissions are not determined racily anymore, since this function is an apr_atomic__init_once callback. ]]] Index: subversion/libsvn_subr/io.c === --- subversion/libsvn_subr/io.c (revision 1057166) +++ subversion/libsvn_subr/io.c (working copy) @@ -1300,6 +1300,39 @@ reown_file(const char *path, return svn_error_return(svn_io_remove_file2(unique_name, FALSE, pool)); } +static volatile svn_atomic_t perms_init_state = 0; +struct default_perms_baton +{ + volatile apr_fileperms_t *default_perms; +}; + + +/* Helper function to discover default file permissions; it does this + by creating a temporary file and extracting the permissions from + it. Passed to svn_atomic__init_once. All temporary allocations are + done in SCRATCH_POOL. */ +static svn_error_t * +init_default_file_perms(void *baton, apr_pool_t *scratch_pool) +{ + volatile apr_fileperms_t *default_perms = +((struct default_perms_baton *) baton)-default_perms; + apr_finfo_t finfo; + apr_file_t *fd; + + if (*default_perms != 0) +/* Nothing to do */ +return SVN_NO_ERROR; + + SVN_ERR(svn_io_open_uniquely_named(fd, NULL, NULL, svn-tempfile, .tmp, + svn_io_file_del_on_pool_cleanup, + scratch_pool, scratch_pool)); + SVN_ERR(svn_io_file_info_get(finfo, APR_FINFO_PROT, fd, scratch_pool)); + SVN_ERR(svn_io_file_close(fd, scratch_pool)); + *default_perms = finfo.protection; + + return SVN_NO_ERROR; +} + /* Determine what the PERMS for a new file should be by looking at the permissions of a temporary file that we create. Unfortunately, umask() as defined in POSIX provides no thread-safe way @@ -1310,46 +1343,13 @@ reown_file(const char *path, static svn_error_t * get_default_file_perms(apr_fileperms_t *perms, apr_pool_t *scratch_pool) { - /* the default permissions as read from the temp folder */ - static apr_fileperms_t default_perms = 0; + static volatile apr_fileperms_t default_perms = 0; + struct default_perms_baton baton; - /* Technically, this racy: Multiple threads may use enter here and - try to figure out the default permisission concurrently. That's fine - since they will end up with the same results. Even more technical, - apr_fileperms_t is an atomic type on 32+ bit machines. - */ - if (default_perms == 0) -{ - apr_finfo_t finfo; - apr_file_t *fd; - - /* Get the perms for a newly created file to find out what bits -should be set. - -Normally del_on_close can be problematic because APR might -delete the file if we spawned any child processes. In this -case, the lifetime of this file handle is about 3 lines of -code, so we can safely use del_on_close here. - -Not so fast! If some other thread forks off a child, then the -APR cleanups run, and the file will disappear. So use -del_on_pool_cleanup instead. - -Using svn_io_open_uniquely_named() here because other tempfile -creation functions tweak the permission bits of files they create. - */ - SVN_ERR(svn_io_open_uniquely_named(fd, NULL, NULL, svn-tempfile, .tmp, -svn_io_file_del_on_pool_cleanup, -scratch_pool, scratch_pool)); - SVN_ERR(svn_io_file_info_get(finfo, APR_FINFO_PROT, fd, scratch_pool)); - SVN_ERR(svn_io_file_close(fd, scratch_pool)); - - *perms = finfo.protection; - default_perms = finfo.protection; -} - else -*perms = default_perms; - + baton.default_perms = default_perms; + SVN_ERR(svn_atomic__init_once(perms_init_state, init_default_file_perms, +(void *) baton, scratch_pool)); + *perms = default_perms; return SVN_NO_ERROR; }
Re: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c
Hi Daniel, Daniel Shahaf writes: If you send another patch that indents/dedents a block, could you please attach a 'svn diff -x-w' version of the patch too? It would make reviewing easier. Sure. Here's the (hopefully) final patch. Sorry about the slopiness earlier -- I was in a hurry to get the concept right. diff -x-w version (for review): Index: subversion/libsvn_subr/io.c === --- subversion/libsvn_subr/io.c (revision 1056662) +++ subversion/libsvn_subr/io.c (working copy) @@ -1299,56 +1299,47 @@ return svn_error_return(svn_io_remove_file2(unique_name, FALSE, pool)); } -/* Determine what the PERMS for a new file should be by looking at the - permissions of a temporary file that we create. - Unfortunately, umask() as defined in POSIX provides no thread-safe way - to get at the current value of the umask, so what we're doing here is - the only way we have to determine which combination of write bits - (User/Group/World) should be set by default. - Make temporary allocations in SCRATCH_POOL. */ +static volatile apr_fileperms_t default_perms = 0; +static volatile svn_atomic_t perms_init_state; + +/* Helper function to discover default file permissions; it does this + by creating a temporary file and extracting the permissions from + it. Passed to svn_atomic__init_once. All temporary allocations are + done in SCRATCH_POOL. */ static svn_error_t * -get_default_file_perms(apr_fileperms_t *perms, apr_pool_t *scratch_pool) +get_tmpfile_perms(void *baton, apr_pool_t *scratch_pool) { - /* the default permissions as read from the temp folder */ - static apr_fileperms_t default_perms = 0; - - /* Technically, this racy: Multiple threads may use enter here and - try to figure out the default permisission concurrently. That's fine - since they will end up with the same results. Even more technical, - apr_fileperms_t is an atomic type on 32+ bit machines. - */ - if (default_perms == 0) -{ + apr_fileperms_t *perms = (apr_fileperms_t *) baton; apr_finfo_t finfo; apr_file_t *fd; - /* Get the perms for a newly created file to find out what bits -should be set. + if (*perms != 0) +/* Nothing to do */ +return SVN_NO_ERROR; -Normally del_on_close can be problematic because APR might -delete the file if we spawned any child processes. In this -case, the lifetime of this file handle is about 3 lines of -code, so we can safely use del_on_close here. - -Not so fast! If some other thread forks off a child, then the -APR cleanups run, and the file will disappear. So use -del_on_pool_cleanup instead. - -Using svn_io_open_uniquely_named() here because other tempfile -creation functions tweak the permission bits of files they create. - */ SVN_ERR(svn_io_open_uniquely_named(fd, NULL, NULL, svn-tempfile, .tmp, svn_io_file_del_on_pool_cleanup, scratch_pool, scratch_pool)); SVN_ERR(svn_io_file_info_get(finfo, APR_FINFO_PROT, fd, scratch_pool)); SVN_ERR(svn_io_file_close(fd, scratch_pool)); - *perms = finfo.protection; - default_perms = finfo.protection; + + return SVN_NO_ERROR; } - else -*perms = default_perms; +/* Determine what the PERMS for a new file should be by looking at the + permissions of a temporary file that we create. + Unfortunately, umask() as defined in POSIX provides no thread-safe way + to get at the current value of the umask, so what we're doing here is + the only way we have to determine which combination of write bits + (User/Group/World) should be set by default. + Make temporary allocations in SCRATCH_POOL. */ +static svn_error_t * +get_default_file_perms(apr_fileperms_t *perms, apr_pool_t *scratch_pool) +{ + SVN_ERR(svn_atomic__init_once(perms_init_state, get_tmpfile_perms, +(void *) default_perms, scratch_pool)); + *perms = default_perms; return SVN_NO_ERROR; } @@ -1360,9 +1351,9 @@ apr_pool_t *scratch_pool) { apr_finfo_t finfo; - apr_fileperms_t default_perms; - SVN_ERR(get_default_file_perms(default_perms, scratch_pool)); + SVN_ERR(get_default_file_perms((apr_fileperms_t *) default_perms, + scratch_pool)); SVN_ERR(svn_io_file_info_get(finfo, APR_FINFO_PROT, fd, scratch_pool)); /* Glom the perms together. */ --8-- [[[ Determine default perms in an elegant thread-safe way, not racily * subversion/libsvn_subr/io.c (): Add two static volatile global variables to be used for storing permissions: default_perms and perms_init_state. (get_default_file_perms): Remove all functional code into get_tmpfile_perms, and use apr_atomic__init_once so that code is only executed once. (get_tmpfile_perms): New function to
Re: [PATCH] Makefile.svn: Make stock SVN executable configurable
Hi Stefan, Stefan Sperling writes: LD_LIBRARY_PATH in the Makefile is only used by targets that are supposed to run the executables compiled by the Makefile, and need to *dynamically* load some libraries at run time (e.g. bindings) via dlopen(). This doesn't affect the checkout/export steps at all. They don't set or require any special LD_LIBRARY_PATH. Very odd. When executing from the Makefile script, `svn` kept complaining about APR not being compiled with threading support (because it picked up the APR in ~/svn/prefix), but when I executed the same command plainly from command-line, it worked fine. It sounds like you had LD_LIBRARY_PATH set in your environment. Why did you have it set? Maybe it's because the Linux runtime linker doesn't find the svn libraries installed in ~/svn/prefix otherwise? If so, that's not a problem with the Makefile. The binaries installed in your home dir should be linked with -rpath by libtool to make sure that they find their libraries in the right place. Then you won't have to set LD_LIBRARY_PATH at all if all you want to do is run an svn trunk binary. If there is anything to fix here, linking with -rpath is what you'd need to look at. Hm, but I didn't have LD_LIBRARY_PATH set. That's what's odd. The trunk svn works fine -- when I add ~/svn/prefix/svn-trunk/bin to $PATH, `svn` picks up the right APR and just works. Thanks for the help in diagnosing the problem -- I'll read a little more about the Linux linker's behavior and try to find out what the real problem is. -- Ram
[PATCH] Makefile.svn: Make stock SVN executable configurable
[[[ Makefile.svn: Make stock SVN binary configurable * tools/dev/unix-build/Makefile.svn: Introduce a new variable $STOCK_SVN that defaults to `env LD_LIBRARY_PATH= svn`. Change references to `svn export` and `svn co` to use $STOCK_SVN. ]]] Index: tools/dev/unix-build/Makefile.svn === --- tools/dev/unix-build/Makefile.svn (revision 1056662) +++ tools/dev/unix-build/Makefile.svn (working copy) @@ -10,6 +10,9 @@ ENABLE_JAVA_BINDINGS ?= no # they don't build with USE_APR_ICONV ?= no # set to yes to use APR iconv instead of GNU iconv PARALLEL ?= PARALLEL=1 CLEANUP=1 +# Assumes that `svn` is in $PATH and uses distribution APR libraries +STOCK_SVN = env LD_LIBRARY_PATH= svn + PWD= $(shell pwd) UNAME = $(shell uname) @@ -230,7 +233,7 @@ apr-clean: $(APR_OBJDIR)/.retrieved: [ -d $(APR_OBJDIR) ] || mkdir -p $(APR_OBJDIR) if [ ! -d $(APR_SRCDIR) ]; then \ - svn export $(APR_URL)/tags/$(APR_VER)/ $(APR_SRCDIR); \ + $(STOCK_SVN) export $(APR_URL)/tags/$(APR_VER)/ $(APR_SRCDIR); \ fi touch $@ @@ -426,7 +429,7 @@ apr-util-clean: $(APR_UTIL_OBJDIR)/.retrieved: [ -d $(APR_UTIL_OBJDIR) ] || mkdir -p $(APR_UTIL_OBJDIR) if [ ! -d $(APR_UTIL_SRCDIR) ]; then \ - svn export $(APR_UTIL_URL)/tags/$(APR_UTIL_VER)/ \ + $(STOCK_SVN) export $(APR_UTIL_URL)/tags/$(APR_UTIL_VER)/ \ $(APR_UTIL_SRCDIR); \ fi touch $@ @@ -622,7 +625,7 @@ serf-clean: $(SERF_OBJDIR)/.retrieved: [ -d $(SERF_OBJDIR) ] || mkdir -p $(SERF_OBJDIR) if [ ! -d $(SERF_SRCDIR) ]; then \ - svn export $(SERF_URL) $(SERF_SRCDIR); \ + $(STOCK_SVN) export $(SERF_URL) $(SERF_SRCDIR); \ fi touch $@ @@ -663,7 +666,7 @@ serf-old-clean: $(SERF_OLD_OBJDIR)/.retrieved: [ -d $(SERF_OLD_OBJDIR) ] || mkdir -p $(SERF_OLD_OBJDIR) if [ ! -d $(SERF_OLD_SRCDIR) ]; then \ - svn export $(SERF_OLD_URL) $(SERF_OLD_SRCDIR); \ + $(STOCK_SVN) export $(SERF_OLD_URL) $(SERF_OLD_SRCDIR); \ fi touch $@ @@ -842,7 +845,7 @@ $(SVN_OBJDIR)/.retrieved: co=co; \ fi; \ if [ ! -d $(SVN_WC) ] [ ! -h $(SVN_WC) ]; then \ - svn $${co} $(SUBVERSION_REPOS_URL)/$${branchdir} \ + $(STOCK_SVN) $${co} $(SUBVERSION_REPOS_URL)/$${branchdir} \ $(SVN_WC); \ fi touch $@
Re: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c
Hi Daniel, Daniel Shahaf writes: Ping, this doesn't seem to have been fixed yet? Thanks for the ping, and sorry about the delay. I think this should work -- I'll write a commit message if you think this is okay. Index: subversion/libsvn_subr/io.c === --- subversion/libsvn_subr/io.c (revision 1056662) +++ subversion/libsvn_subr/io.c (working copy) @@ -1299,59 +1299,64 @@ reown_file(const char *path, return svn_error_return(svn_io_remove_file2(unique_name, FALSE, pool)); } -/* Determine what the PERMS for a new file should be by looking at the - permissions of a temporary file that we create. - Unfortunately, umask() as defined in POSIX provides no thread-safe way - to get at the current value of the umask, so what we're doing here is - the only way we have to determine which combination of write bits - (User/Group/World) should be set by default. - Make temporary allocations in SCRATCH_POOL. */ +/* the default permissions as read from the temp folder */ +static volatile apr_fileperms_t default_perms = 0; +static volatile svn_atomic_t perms_init_state; + +/* Helper function to set default permissions. Passed to svn_atomic__init_once */ static svn_error_t * -get_default_file_perms(apr_fileperms_t *perms, apr_pool_t *scratch_pool) +set_default_perms(void *baton, apr_pool_t *scratch_pool) { - /* the default permissions as read from the temp folder */ - static apr_fileperms_t default_perms = 0; + apr_fileperms_t *default_perms = (apr_fileperms_t *) baton; - /* Technically, this racy: Multiple threads may use enter here and - try to figure out the default permisission concurrently. That's fine - since they will end up with the same results. Even more technical, - apr_fileperms_t is an atomic type on 32+ bit machines. - */ - if (default_perms == 0) -{ - apr_finfo_t finfo; - apr_file_t *fd; + if (default_perms != 0) +/* Nothing to do */ +return SVN_NO_ERROR; - /* Get the perms for a newly created file to find out what bits -should be set. + apr_finfo_t finfo; + apr_file_t *fd; -Normally del_on_close can be problematic because APR might -delete the file if we spawned any child processes. In this -case, the lifetime of this file handle is about 3 lines of -code, so we can safely use del_on_close here. + /* Get the perms for a newly created file to find out what bits + should be set. -Not so fast! If some other thread forks off a child, then the -APR cleanups run, and the file will disappear. So use -del_on_pool_cleanup instead. + Normally del_on_close can be problematic because APR might + delete the file if we spawned any child processes. In this + case, the lifetime of this file handle is about 3 lines of + code, so we can safely use del_on_close here. -Using svn_io_open_uniquely_named() here because other tempfile -creation functions tweak the permission bits of files they create. - */ - SVN_ERR(svn_io_open_uniquely_named(fd, NULL, NULL, svn-tempfile, .tmp, -svn_io_file_del_on_pool_cleanup, -scratch_pool, scratch_pool)); - SVN_ERR(svn_io_file_info_get(finfo, APR_FINFO_PROT, fd, scratch_pool)); - SVN_ERR(svn_io_file_close(fd, scratch_pool)); + Not so fast! If some other thread forks off a child, then the + APR cleanups run, and the file will disappear. So use + del_on_pool_cleanup instead. - *perms = finfo.protection; - default_perms = finfo.protection; -} - else -*perms = default_perms; + Using svn_io_open_uniquely_named() here because other tempfile + creation functions tweak the permission bits of files they create. + */ + SVN_ERR(svn_io_open_uniquely_named(fd, NULL, NULL, svn-tempfile, .tmp, + svn_io_file_del_on_pool_cleanup, + scratch_pool, scratch_pool)); + SVN_ERR(svn_io_file_info_get(finfo, APR_FINFO_PROT, fd, scratch_pool)); + SVN_ERR(svn_io_file_close(fd, scratch_pool)); + default_perms = finfo.protection; return SVN_NO_ERROR; } +/* Determine what the PERMS for a new file should be by looking at the + permissions of a temporary file that we create. + Unfortunately, umask() as defined in POSIX provides no thread-safe way + to get at the current value of the umask, so what we're doing here is + the only way we have to determine which combination of write bits + (User/Group/World) should be set by default. + Make temporary allocations in SCRATCH_POOL. */ +static svn_error_t * +get_default_file_perms(apr_fileperms_t *perms, apr_pool_t *scratch_pool) +{ + SVN_ERR(svn_atomic__init_once(perms_init_state, set_default_perms, +default_perms, scratch_pool)); + *perms = default_perms; +
Re: [PATCH] Makefile.svn: Make stock SVN executable configurable
Hi Stefan, Stefan Sperling writes: On Sat, Jan 08, 2011 at 02:33:46PM +0530, Ramkumar Ramachandra wrote: [[[ Makefile.svn: Make stock SVN binary configurable * tools/dev/unix-build/Makefile.svn: Introduce a new variable $STOCK_SVN that defaults to `env LD_LIBRARY_PATH= svn`. Change references to `svn export` and `svn co` to use $STOCK_SVN. ]]] Hi Ram, Why don't you just put the svn binary you want to run into the first entry of $PATH? I did that, but it still doesn't solve the problem*: It picks up the wrong APR libraries because Makefile.svn sets LD_LIBRARY_PATH in several places. * Actually I encountered the problem before I had a compiled version of SVN. I'd only just compiled the APR libraries. -- Ram
Re: New VM for committers
Hi Lieven, Lieven Govaerts writes: Yes, what will you be using it for? Any specific ideas? I originally requested it to run callgraph and heap profilers on svnrdump. I suppose others can use it to plug leaks and tune the performance of several other Subversion components. -- Ram
[PATCH] Fix Makefile.svn to build APR with threads
Hi, Here's a quick patch to fix Makefile.svn. I need a +1 to commmit this. Thanks. [[[ Makefile.svn: Fix build * tools/dev/unix-build/Makefile.svn: Configure APR to build with threads enabled to prevent breaking the Subversion build. ]]] Index: tools/dev/unix-build/Makefile.svn === --- tools/dev/unix-build/Makefile.svn (revision 1040428) +++ tools/dev/unix-build/Makefile.svn (working copy) @@ -238,8 +238,7 @@ cd $(APR_OBJDIR) \ env CFLAGS=-O0 -g $(APR_SRCDIR)/configure \ --prefix=$(PREFIX)/apr \ - --enable-maintainer-mode \ - --disable-threads + --enable-maintainer-mode touch $@ # compile apr
Re: If RCS can stand it, why can ’t your system?
Hi Eric, Greg Stein writes: We have an import/export format that has existed since svn 1.0, and that is our dumpfile format. I do not think we are required or need to be shamed into supporting additional formats. And for what it's worth, a *git* GSoC student wrote a remote dumpfile generator to feed into git's fast-import. We gave the guy commit access to *our* trunk and let him build the tool. Yep, that's completely true. I am that GSoC student, and subversion/svnrdump is complete. The Subversion community was supportive of it. And yes, I wrote both dump and load functionality. So there is some factual inaccuracy in your statement: Subversion is typical in that it has a proof-of-concept third-party exporter that loses some metadata, and no importer We have also finished building a converter from dumpfilev3 - git fast-import stream, and are currently working on building the reverse. That way, there should be no bias and people should be able to move into Subversion as easily as they can move out of it. At most, you could criticize the fact that we weren't able to build it earlier/ faster. Yes, it was a very non-trivial task (atleast for me). Also: we have extended our diff generation to support git's diff markers. Indeed. Daniel did this as part of another GSoC project. `svn diff --git` works perfectly now. In short, there is NO WAY anybody can say we have something against git. If somebody came along with code to deal with git's streams, then we'd consider it, and the utility it may provide over our own dumpfile format. Since Subversion already has so much infrastructure to deal with deltified dumpfile, I'd vote to keep that as the native import/export format. -- Ram
Re: If RCS can stand it, why can ’t your system?
Hi Eric, Eric Raymond writes: Ramkumar Ramachandra artag...@gmail.com: Yep, that's completely true. I am that GSoC student, and subversion/svnrdump is complete. The Subversion community was supportive of it. And yes, I wrote both dump and load functionality. So there is some factual inaccuracy in your statement: Subversion is typical in that it has a proof-of-concept third-party exporter that loses some metadata, and no importer Thanks for the information. Where can I find documentation on this capability? I'm particularly interested in how it handles the two issues I mentioned - mapping between Subversion's server-loxcal commit identities and import=-stream's full addresses, and mapping between Subversion merge-tracking info and the input-stream's commit-ancestry fields. `svnrdump (dump|load)` behaves almost exactly like `svnadmin (dump|load)` over the network. The conversion to git fast-import is merged into git.git: the infrastructure is in ^/vcs-svn and a small end-user tool is available in ^/contrib/svn-fe -- usage information is in svn-fe.txt: http://git.kernel.org/?p=git/git.git;a=blob;f=contrib/svn-fe/svn-fe.txt It currently doesn't handle any of the issues you mentioned- we haven't built the mapper yet, and it's just a dumb converter at the moment. We are currently working on some plumbing to support incremental imports, and build a remote helper. After that, we will work on the mapper and the git fast-import - dumpfile v3 converter to feed `svnrdump load`. Also, there has been a lot of discussion about the mapper on the mailing list, and here's one particularly huge thread: http://thread.gmane.org/gmane.comp.version-control.git/158940/focus=159054 Hope that helps :) Since Subversion already has so much infrastructure to deal with deltified dumpfile, I'd vote to keep that as the native import/export format. That decision is orthogonal to what I'm interested in, and I have no position on it. Ok. -- Ram
Re: If RCS can stand it, why can ’t your system?
Hi, Eric Raymond writes: Ramkumar Ramachandra artag...@gmail.com: `svnrdump (dump|load)` behaves almost exactly like `svnadmin (dump|load)` over the network. What do you mean 'over the network'? Can I get a exported stream on stdout from the dump mode? Can I feed a stream on stdin to the load mode? Yep, it does *exactly* that :) $ svnrdump help dump dump: usage: svnrdump dump URL [-r LOWER[:UPPER]] Dump revisions LOWER to UPPER of repository at remote URL to stdout in a 'dumpfile' portable format. If only LOWER is given, dump that one revision. $ svnrdump help load load: usage: svnrdump load URL Load a 'dumpfile' given on stdin to a repository at remote URL. http://git.kernel.org/?p=git/git.git;a=blob;f=contrib/svn-fe/svn-fe.txt Looks like svnadmin dump --incremental REPO | svn-fe the export command reposurgeon needs, but I'm not clear what the corresponding import command would be. Yes, that is correct. You might consider using svnrdump for local repositories as well though- just use 'file://path' in place of 'path'; it's faster than svnadmin in some bechmarks. You can import once we finish building the git fast-import stream - dumpfile converter :) -- Ram
[PATCH] Include a reference to Makefile.svn in HACKING
Hi, I didn't know what else to write, so it's fairly short :p Thanks. [[[ community-guide: Add new section referring to ^/tools/dev/unix-build * publish/docs/community-guide/building.part.html: Add a new quickstart section pointing users to Makefile.svn * publish/docs/community-guide/building.toc.html: Update the table of contents to include this new section. ]]] Index: publish/docs/community-guide/building.toc.html === --- publish/docs/community-guide/building.toc.html (revision 1040464) +++ publish/docs/community-guide/building.toc.html (working copy) @@ -1,5 +1,6 @@ !--#if expr='$DOCUMENT_NAME = building.html || $DOCUMENT_NAME = community-guide.html' -- ul + lia href=#quickstartQuickly setting up a build environment under unix/a/li lia href=#configuryThe configuration/build system under unix/a/li lia href=#automated-testsAutomated tests/a/li lia href=#build-farmBuild farm/a/li @@ -7,6 +8,7 @@ /ul !--#else -- ul + lia href=building.html#quickstartQuickly setting up a build environment under unix/a/li lia href=building.html#configuryThe configuration/build system under unix/a/li lia href=building.html#automated-testsAutomated tests/a/li lia href=building.html#build-farmBuild farm/a/li Index: publish/docs/community-guide/building.part.html === --- publish/docs/community-guide/building.part.html (revision 1040464) +++ publish/docs/community-guide/building.part.html (working copy) @@ -6,6 +6,27 @@ !--#include virtual=building.toc.html -- +div class=h2 id=quickstart +h2Quickly setting up a build environment under unix + a class=sectionlink href=!--#echo var=GUIDE_BUILDING_PAGE --#quickstart +title=Link to this sectionpara;/a +/h2 + +pStefan Sperling wrote a nice script that starts from scratch to set +up a nice build environment under unix; it fetches, configures and +compiles the latest versions of dependencies as required. To start +development from scratch, simply +download a +href=http://svn.apache.org/repos/asf/subversion/trunk/tools/dev/unix-build/Makefile.svn; +Makefile.svn/a, rename it to `Makefile` and invoke `make`. For more +information, read the a +href=http://svn.apache.org/repos/asf/subversion/trunk/tools/dev/unix-build/READNE; +documentation/a supplied along with it. +/p + +/div !-- quickstart -- + + div class=h2 id=configury h2The configuration/build system under unix a class=sectionlink href=!--#echo var=GUIDE_BUILDING_PAGE --#configury
[PATCH] community-guide: Update buildbot section
Hi, I need a +1 to commit this. Also, please suggest a better link for the ASF Infra team (if there is one). Thanks. [[[ community-guide: Rewrite the Buildbot section * publish/docs/community-guide/building.part.html: Remove historical cruft and update with relevant information. ]]] Index: publish/docs/community-guide/building.part.html === --- publish/docs/community-guide/building.part.html (revision 1040482) +++ publish/docs/community-guide/building.part.html (working copy) @@ -278,32 +278,15 @@ title=Link to this sectionpara;/a /h2 -pLieven Govaerts has set up a -a href=http://buildbot.sourceforge.net/; BuildBot/a build/test -farm at a href=http://buildbot.subversion.org/buildbot/; -http://buildbot.subversion.org/buildbot//a, see this message:/p +pThe a href=http://apache.org/dev/#infrastructure;ASF Infra/a +team manages a a href=http://buildbot.sourceforge.net/;BuildBot/a +build/test farm. The Buildbot waterfall for the Subversion project is +located at a href=http://subversion.apache.org/buildbot/; +http://subversion.apache.org/buildbot//a. For more information +about build services, head over +to a href=http://ci.apache.org;ci.apache.org/a. +/p -pre - a href=http://subversion.tigris.org/servlets/ReadMsg?list=devamp;msgNo=114212;http://subversion.tigris.org/servlets/ReadMsg?list=devamp;msgNo=114212/a - (Thread URL: a href=http://subversion.tigris.org/servlets/BrowseList?list=devamp;by=threadamp;from=450110;http://subversion.tigris.org/servlets/BrowseList?list=devamp;by=threadamp;from=450110/a) - Message-ID: 20060326205918.f3c5070...@adicia.telenet-ops.be - From: Lieven Govaerts lt;l...@mobsol.begt; - To: lt;d...@subversion.tigris.orggt; - Subject: Update: Subversion build and test farm with Buildbot. - Date: Sun, 26 Mar 2006 22:56:11 +0200 -/pre - -pfor more details. (a href=http://buildbot.sourceforge.net/; -BuildBot/a is a system for centrally managing multiple automated -testing environments; it's especially useful for portability testing, -including of uncommitted changes.)/p - -p class=todoOnce the details of the build farm get finalized, -integrate those instructions here. Perhaps the information at -a href=https://svn.apache.org/repos/asf/subversion/trunk/www/build-farm.html?p=867958; -https://svn.apache.org/repos/asf/subversion/trunk/www/build-farm.html?p=867958/a -can serve as a good starting point?/p - /div !-- build-farm --
Re: [PATCH] Fix Makefile.svn to build APR with threads
Hi, Stefan Sperling writes: The first is that I use OpenBSD and OpenBSD's threading implementation is a pure userspace implementation. It has to set stdin and stdout to non-blocking -- otherwise, whenever a thread blocks for i/o in the kernel the userspace thread-scheduler would also be blocked and couldn't switch to another thread. However, non-blocking stdio makes it very inconvenient to do things like svn diff | less because less expects blocking reads. The lines get garbled when you start scrolling in less. (Kernel thread support is being worked on by OpenBSD developers.) The second reason is that virtually nobody else is compiling and running without thread support in APR. This sometimes causes thread-less cases to be overlooked. Most recently I found a bug in the performance branch due to this. So it's good to have test coverage for the thread-less case. Because of this I will likely continue compiling APR thread-less even when OpenBSD finally gets kernel thread support. Thanks for the explanation! Unfortunately, I'm not able to get the build to break with threading disabled anymore :| Have to run a few more experiments before I can conclusively say what went wrong back then. Your patch is too short. It's missing at least one hunk. Because if threading is enabled for APR, you should also enable it for sqlite and any other dependencies affected. Ok, got it. Here's my second try. [[[ Makefile.svn: Optionally allow building with threading support * tools/dev/unix-build/Makefile.svn: Add a new ENABLE_THREADING variable to control whether APR and sqlite should be built with threading support. ]]] Index: tools/dev/unix-build/Makefile.svn === --- tools/dev/unix-build/Makefile.svn (revision 1040658) +++ tools/dev/unix-build/Makefile.svn (working copy) @@ -3,6 +3,7 @@ # WARNING: This may or may not work on your system. This Makefile is # an example, rather than a ready-made universal solution. +ENABLE_THREADING ?= no # OpenBSD doesn't have kernel threads for example ENABLE_PYTHON_BINDINGS ?= yes ENABLE_RUBY_BINDINGS ?= yes ENABLE_PERL_BINDINGS ?= yes @@ -241,12 +242,21 @@ | sed -e '/^.*APR_ADDTO(CPPFLAGS, \[-D_POSIX_THREADS\]).*$$/d' \ $(APR_SRCDIR)/build/apr_hints.m4 cd $(APR_SRCDIR) ./buildconf - cd $(APR_OBJDIR) \ -env CFLAGS=-O0 -g $(PROFILE_CFLAGS) \ - $(APR_SRCDIR)/configure \ - --prefix=$(PREFIX)/apr \ - --enable-maintainer-mode \ - --disable-threads + if [ $(ENABLE_THREADING) = yes ]; then \ + cd $(APR_OBJDIR) \ +env CFLAGS=-O0 -g $(PROFILE_CFLAGS) \ + $(APR_SRCDIR)/configure \ + --prefix=$(PREFIX)/apr \ + --enable-maintainer-mode \ + --enable-threads; \ + else \ + cd $(APR_OBJDIR) \ +env CFLAGS=-O0 -g $(PROFILE_CFLAGS) \ + $(APR_SRCDIR)/configure \ + --prefix=$(PREFIX)/apr \ + --enable-maintainer-mode \ + --disable-threads; \ + fi; touch $@ # compile apr @@ -706,12 +716,21 @@ # configure sqlite $(SQLITE_OBJDIR)/.configured: $(SQLITE_OBJDIR)/.retrieved - cd $(SQLITE_OBJDIR) \ -env CFLAGS=-g $(PROFILE_CFLAGS) \ - $(SQLITE_SRCDIR)/configure \ - --prefix=$(PREFIX)/sqlite \ - --disable-tcl \ - --disable-threadsafe + if [ $(ENABLE_THREADING) = yes ]; then \ + cd $(SQLITE_OBJDIR) \ +env CFLAGS=-g $(PROFILE_CFLAGS) \ + $(SQLITE_SRCDIR)/configure \ + --prefix=$(PREFIX)/sqlite \ + --disable-tcl \ + --enable-threadsafe; \ + else \ + cd $(SQLITE_OBJDIR) \ +env CFLAGS=-g $(PROFILE_CFLAGS) \ + $(SQLITE_SRCDIR)/configure \ + --prefix=$(PREFIX)/sqlite \ + --disable-tcl \ + --disable-threadsafe; \ + fi; touch $@ # compile sqlite
Re: [PATCH] Fix Makefile.svn to build APR with threads
Hi Stefan, Thanks for the suggestions. How about this? [[[ Makefile.svn: Optionally allow building with threading support * tools/dev/unix-build/Makefile.svn: Add a new THREADING variable to control whether APR and sqlite should be built with threading support. Suggested by: stsp Review by: stsp ]]] Index: tools/dev/unix-build/Makefile.svn === --- tools/dev/unix-build/Makefile.svn (revision 1040690) +++ tools/dev/unix-build/Makefile.svn (working copy) @@ -233,6 +233,12 @@ fi touch $@ +ifdef THREADING +THREADS_FLAG=--enable-threads +else +THREADS_FLAG=--disable-threads +endif + # configure apr $(APR_OBJDIR)/.configured: $(APR_OBJDIR)/.retrieved cp $(APR_SRCDIR)/build/apr_hints.m4 \ @@ -246,7 +252,7 @@ $(APR_SRCDIR)/configure \ --prefix=$(PREFIX)/apr \ --enable-maintainer-mode \ - --disable-threads + $(THREADS_FLAG) touch $@ # compile apr @@ -704,6 +710,12 @@ tar -C $(SRCDIR) -zxf $(DISTDIR)/$(SQLITE_DIST) touch $@ +ifdef THREADING +THREADSAFE_FLAG=--enable-threadsafe +else +THREADSAFE_FLAG=--disable-threadsafe +endif + # configure sqlite $(SQLITE_OBJDIR)/.configured: $(SQLITE_OBJDIR)/.retrieved cd $(SQLITE_OBJDIR) \ @@ -711,7 +723,7 @@ $(SQLITE_SRCDIR)/configure \ --prefix=$(PREFIX)/sqlite \ --disable-tcl \ - --disable-threadsafe + $(THREADSAFE_FLAG) touch $@ # compile sqlite
Re: Announcing reposurgeon, and requesting fast-import support.
Hi, [+CC: Daniel, for making me notice this email in the first place] Eric Raymond writes: My interest in tools for repository surgery has continued, and I recently spotted an opportunity in the increasing use of git-fast-import streams as a history-interchange format. I have written what I believe is the first *native* application for fast-import streams, a repository editor I call reposurgeon. Cool tool! Which brings me to my feature request. Please add native support for fast-export and fast-import to svndump. This would be a good idea in general, but my specific reason for wanting it is to enable reposurgeon to edit Subversion repositories. The export side is, of course, almost trivial. Proof of concept under MIT license is here: http://c133.org/code/svn-fast-export.c. It needs a bit of extension work around tags and branches; I won't belabor the obvious (and easily solvable) issues with those. There are two more substantive ones: With svnrdump (merged into Subversion trunk in subversion/svnrdump) and svn-fe (merged into git.git in contrib/svn-fe + vcs-svn/), it's possible to produce a fast-import stream from a remote repository without the need for any local mirroring. Unforutunately, svnrdump can only produce a deltified dumpfile v3, and the patch series that adds dumpfile v3 support to svn-fe hasn't been merged into git.git yet- you can pick up the branch `dumpfilev3` from David's repository http://github.com/barrbrain/git though. 1) Whatever merge-tracking hair you represent internally should be dumped 'as 'merge' commit properties. 2) User commit properties (e.g. those not in the svn: namespace) should be exported using the bzr properties extension, which reposurgeon handles now and which seems likely to make it into git core at some point. Syntax: property space NAME space VALUE-LENGTH space VALUE LF or, if the value is empty: property space NAME LF NAME and VALUE are utf8-encoded. The properties for each commit are sorted by the property name. Also note that an import stream actually containing commit-property declarations should have a line reading feature commit-properties before the first commit. Actually, the objective of svn-fe is to produce a conformant fast-import stream (so Git can import it into its object store): some information is lost in the process. Does reposurgeon require all the information, or can it operate on the stream that svn-fe produces? That brings us to another point: a fast-import stream is probably not the most faithful representation of a Subversion repository, and I think a dumpfile v3 fits this bill. Subversion already supports native export/ import of this format: svnadmin (dump|load) when mirrored locally and svnrdump (dump|load) when it's not :) The import side is less trivial, but given that you've already got internal representations for merge-tracking it shouldn't be too difficult either. http://www.kernel.org/pub/software/scm/git/docs/git-fast-import.html. svn-fe already supports converting a dumpfile v3 to a fast-import stream. Getting it to do the reverse shouldn't be too hard- we are already working on it :) Finally, I will note that I think this feature could be significant for Subversion's competitive posture. Because exporters are easy while importers are more difficult, supporting import streams only with exporters and only through sketchy third-party tools tends to encourage migration to git while discouraging migration away from it. Are you happy with having a combination of svnrdump and svn-fe for this, or do you think Subversion should natively support fast-import? I don't think it'll be very difficult to support natively, but it's kind of a hack because Subversion already has so much infrastructure to deal with dumpfile v3. p.s- I'm one of the students who did a GSoC project with Git this summer. If you recall, you even commented on the proposal I posted to the Git list :) svnrdump and svn-fe are the products of that same project. -- Ram
Re: Some tips on profiling
Hi Stefan, Stefan Fuhrmann writes: Hm. Works here (LINUX ggc 4.3.3 and Win32 MS VisualStudio 2010). I'll check this -- must be some issue at my end; saw Daniel's response too. Thanks. That would be awesome! Any weekend would do ;) :) -- Ram
Re: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c
Hi Philip, Philip Martin writes: Most of the functions don't share any state between threads, so they have no atomic issues. svn_io_temp_dir is one of the few that does, so it uses svn_atomic__init_once. I think the file perms stuff should also use it. I see. Could you point me to some functions that share some state between threads or some sort of manual? I want to see how to do it right- I'm not satisfied with my current understanding. -- Ram
Re: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c
Hi Philip, [sorry about the delayed reply; was ill] Philip Martin writes: Ramkumar Ramachandra artag...@gmail.com writes: Hm. I read up a little more about this, but what confuses me is- shouldn't the rest of the code already be needing this? I don't understand your questions. To what does rest of the code refer? What I meant is that the rest of the functions in io.c should also have to handle atomicity. I see svn_io_temp_dir using svn_atomic__init_once for example. Why are we re-thinking everything from scratch? To what does everything refer? Frankly, I don't understand this too well. Could the author of io.c (or other similar files) please correct this? -- Ram
Re: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c
Hi Philip, [sorry about the delayed reply: I had a bad internet connection] Philip Martin writes: If we are going to use the APR atomic interface then the two reads should use apr_atomic_read32. It would be better to use svn_atomic__init_once. It's a clear indication that we are doing once only initialisation, so we don't need all the comments, and it avoids any problems related to the size of apr_fileperms_t. Also if enhancements are required (more memory barriers say) then svn_atomic__init_once is the place to do it. Hm. I read up a little more about this, but what confuses me is- shouldn't the rest of the code already be needing this? Why are we re-thinking everything from scratch? -- Ram
Re: svn commit: r1004286 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c
Hi Bert and Julian, Bert Huijben writes: -Original Message- From: Julian Foad [mailto:julian.f...@wandisco.com] On Mon, 2010-10-11, Julian Foad wrote: http://stackoverflow.com/questions/54188/are-c-reads-and-writes-of- an-int-atomic. On an IA32 a correctly aligned address will be an atomic operation. [... otherwise... can't assume it is]. Sorry, I pressed Send too early. That's not the most important bit of information. (That paragraph talks mostly about = 32-bit CPUs, where of course there will be problems.) Bert explained to me on IRC that atomicity is not guaranteed even on = 32 bit architectures, and the highest-ranked answer on that web page agrees. I'm no expert in this so I'll go away now. Let me add that calling apr_atomic_set32() instead of the 'x = value' part of the pattern will fix this issue in the way that was documented by the comment in the code: all threads do the same thing and one of the results is left in the static variable. Another option is to use an atomic initialization function to initialize the value just once. I see; I don't think I know enough to comment. Bert: So does this solve the issue or did I misunderstand something? Index: subversion/libsvn_subr/io.c === --- subversion/libsvn_subr/io.c (revision 1005706) +++ subversion/libsvn_subr/io.c (working copy) @@ -1269,7 +1269,8 @@ static svn_error_t * get_default_file_perms(apr_fileperms_t *perms, apr_pool_t *scratch_pool) { /* the default permissions as read from the temp folder */ - static apr_fileperms_t default_perms = 0; + static apr_fileperms_t default_perms; + apr_atomic_set32(default_perms, 0); /* Technically, this racy: Multiple threads may use enter here and try to figure out the default permisission concurrently. That's fine @@ -1303,7 +1304,7 @@ get_default_file_perms(apr_fileperms_t *perms, apr SVN_ERR(svn_io_file_close(fd, scratch_pool)); *perms = finfo.protection; - default_perms = finfo.protection; + apr_atomic_set32(default_perms, finfo.protection); } else *perms = default_perms;
Re: [PATCH] Fix svnrdump test 27 on Windows
Hi Gavin, Gavin Beau Baumanis writes: Ping. This patch has received no comments. This patch has been committed. See r1003064- the directory baton is now allocated in `eb-pool`. I hoped that it would fix test 27, but it didn't- that is fixed in r1005035/ r1005050. Thanks for picking this up. -- Ram
Re: Some tips on profiling
Hi Stefan, Stefan Fuhrmann writes: I enabled it, but there's still some issue: subversion/svnadmin/main.c:1892: undefined reference to `svn_fs_get_cache_config' It builds here. Did you run autogen.sh before ./configure? Yep, I did. I tried it several times again; same issue. Is the build.conf broken? For the MD5 stuff, try r986459 (performance branch). It should eliminate 1 of the 3 MD5 calculations. Why doesn't STATUS document this and everything else consistently? Because there is no simple mapping rev-feature / improvement. In particular, there are a number of intermediate steps that were replaced by new code later on. There is no point in reviewing these earlier revisions but the newer ones can't be reviewed and merged on their own. Hence the integration branch for the first major change. Ah, I saw that. As soon as a larger number of patches got reviewed and merged, I will prepare further changes for integration. So far, nobody had free cycles to do the reviews. I'm being stretched really thin myself- I sometimes have to sacrifice several hours of sleep to keep up :| I'll try my best but I can't promise. Also, there's the additional overhead of having to wait for approvals- if I can't pull it off, I'll request a full committer to take over. I had the chance to check them out and apply them just now. They work as expected. I'll submit some (uneducated) patch reviews to the list and request a merge. I don't have access to the areas your patches touch. I really appreciate that. It would be great if someone had the time to review the 3 commits to the membuffer cache integration branch. The review should not require too much context knowledge. An in-depth review will take a full day or so (like for an average sized C++ class). Thanks for the estimate- Instead of jumping between classes and attempting to review it bit-by-bit, I'll try to allocate a Saturday or Sunday to this task. -- Ram
Re: [Merge request] Merge r985477 from performance branch
Hi Julian, Julian Foad writes: Hi Ram. I wasn't comfortable with giving a +1 for this change just then, but now I've satisfied myself. The only potential negative impact I can imagine is if a user has a very long-running instance of Subversion and is accustomed to Subversion tracking changes of umask. To such a user, this might be seen as a regression, but the impact on such a user is low. In all other respects, determining the permissions just once per execution is more correct as well as more efficient. +1 to merge, with minor tweaks if you wish. Thanks. Committed in r1004286. (Sorry for the delay: classes and labs) -- Ram
Re: svn commit: r1003924 - /subversion/trunk/subversion/svnrdump/dump_editor.c
Hi Bert, Bert Huijben writes: Do you really need a 'named' file here? The named file algorithm of svn_io_open_uniquely_named() is very slow compared to the 'just get me a temporary file' implemented by svn_io_open_unique_file3(). Especially when you need more than a few tempfiles with the same basename. (It tries the generated name itself, then it tries a 1 suffix, a 2 suffix, etc. Until it finds a name that doesn't exist) Ah, I see. The named temporary made debugging easier initially- I could easily see if the file was being removed or not, but I suppose it doesn't matter anymore. Fixed in r1003958. Thanks for the suggestion. -- Ram
Re: [Merge request] Merge r985477 from performance branch
Hi Julian, Julian Foad writes: Looks good to me. I wondered if it is safe in a long-running Subversion process, like TortoiseSvn or a Linux equivalent. It seems to me that it won't really matter much in practice. If someone changes their umask and finds that Subversion carries on creating files with the 'old' permissions that were in effect when Subversion was started... that's fine, as far as I'm concerned. Can I get an explicit +1 to commit this? I just want to get as many of Stefan's changes merged in quickly so that there's enough time before the 1.7 release to test them. Technical detail: How do I merge? I want to make some modifications to this patch before committing, but I want to preserve authorship. -- Ram
Re: [Merge request] Merge r985477 from performance branch
Hi Stefan, Stefan Sperling writes: Can I get an explicit +1 to commit this? I just want to get as many of Stefan's changes merged in quickly so that there's enough time before the 1.7 release to test them. Technical detail: How do I merge? cd svn-trunk-working-copy svn merge -c r985477 ^/subversion/branches/performance # edit if necessary svn commit # Take note of committed revision, e.g. rN. # I'd recommend doing the following to avoid a cyclic (aka reflective) merge. # The commit just made should not bounce back to the performance branch # when someone runs svn merge ^/subversion/trunk on that branch cd svn-performance-branch-working-copy svn merge --record-only -rN ^/subversion/trunk svn commit -m Block rN from being merged into the performance branch to avoid a cyclic merge If you don't understand why the record-only merge is necessary, see http://mail-archives.apache.org/mod_mbox/subversion-dev/201004.mbox/%3c20100406152724.gm19...@noel.stsp.name%3e Thanks for the elaborate explanation! :) I want to make some modifications to this patch before committing What kinds of modifications do you want to make? You'll need a +1 for those, too. Please mail the diff which shows the results of the merge plus your modifications. Just minor indentation/ comment changes (some mentioned in my review). Nothing functional. but I want to preserve authorship. r985477 lists stefan2 as author. There is no concept of authorship in svn beyond that. You'll be listed as author of the merge commit, and your log message should say which change you're merging (the mergeinfo will also say it, but it doesn't hurt to mention it in the commit message anyway). Ok, so when the build on trunk breaks, I'll be blamed, not stefan2- I think I can live with that. I'll also need to merge 5~10 more commits: will do shortly. My objective is to get whatever I think I can evaluate into trunk as quickly as possible so that they're well-tested and shipped with Subversion 1.7. I'm inexperienced, and will only attempt to evaluate a subset of patches that directly affect svnrdump: I would request someone else to pick up the other patches. -- Ram
Re: Some tips on profiling
Hi Stefan, Stefan Fuhrmann writes: You obviously don't have APR threads enabled. Thanks for finding this. Fixed in r1003430. I enabled it, but there's still some issue: subversion/svnadmin/main.c:1892: undefined reference to `svn_fs_get_cache_config' The server caches almost everything. So, the first (cold) run demonstrates the worst-case, the second run (hot) the best case. Real world performance depends on server memory and load. Ah. Thanks for the clarification. For the merge part: please review the integrate-membuffer branch (only 3 patches). You may also review and merge any of the patches listed in /branches/performance/STATUS. integrate-cache-membuffer, you mean? Thanks! Your emails contains references exactly to the patches I'm looking for :) For the MD5 stuff, try r986459 (performance branch). It should eliminate 1 of the 3 MD5 calculations. Why doesn't STATUS document this and everything else consistently? As for the temp files, there are some improvements listed in /branches/performance/STATUS. These would also reduce your system CPU time. I had the chance to check them out and apply them just now. They work as expected. I'll submit some (uneducated) patch reviews to the list and request a merge. I don't have access to the areas your patches touch. -- Ram
[Merge request] Merge r985477 from performance branch
Hi, I would like to get r985477 merged into trunk. I've applied and used it successfully and checked that all tests pass. Warning: I have no background knowledge. I'm just reviewing the patch as-it-is, because Stefan asked me to. [[[ r985477 | stefan2 | 2010-08-14 18:02:04 +0530 (Sat, 14 Aug 2010) | 9 lines Eliminate all file system access in get_default_file_perms() for all but the first execution. The only downside is that we don't detect FS permission changes made while the (client) process runs. Because such changes would cause race conditions and file I/O errors anyway, we are not make things worse by omitting those tests. * subversion/libsvn_subr/io.c (get_default_file_perms): store result in a singleton in the first run and bypass the FS access in all later runs I looked at this after reading the patch, and I must admit that I didn't get the idea from the log message: how about using static variable instead of singleton. ]]] Index: subversion/libsvn_subr/io.c === --- subversion/libsvn_subr/io.c (revision 985476) +++ subversion/libsvn_subr/io.c (revision 985477) @@ -1297,30 +1297,47 @@ reown_file(const char *path, static svn_error_t * get_default_file_perms(apr_fileperms_t *perms, apr_pool_t *scratch_pool) { - apr_finfo_t finfo; - apr_file_t *fd; + /* the default permissions as read from the temp folder */ + static apr_fileperms_t default_perms = 0; From the APR documentation, this is just an int32. Hm, you've got rid of the file description and finfo -- let's see how this works out. - /* Get the perms for a newly created file to find out what bits - should be set. + /* Technically, this racy: Multiple threads may use enter here and + try to figure out the default permisission concurrently. That's fine + since they will end up with the same results. Even more technical, + apr_fileperms_t is an atomic type on 32+ bit machines. + */ + if (default_perms == 0) +{ + apr_finfo_t finfo; + apr_file_t *fd; Ah, so you didn't remove them. - NOTE: normally del_on_close can be problematic because APR might - delete the file if we spawned any child processes. In this case, - the lifetime of this file handle is about 3 lines of code, so - we can safely use del_on_close here. + /* Get the perms for a newly created file to find out what bits +should be set. - NOTE: not so fast, shorty. if some other thread forks off a child, - then the APR cleanups run, and the file will disappear. sigh. +NOTE: normally del_on_close can be problematic because APR might + delete the file if we spawned any child processes. In this case, + the lifetime of this file handle is about 3 lines of code, so + we can safely use del_on_close here. Bogus diff due to re-indent. The only real addition is Get the perms for a newly created file to find out what bits should be set. - Using svn_io_open_uniquely_named() here because other tempfile - creation functions tweak the permission bits of files they create. - */ - SVN_ERR(svn_io_open_uniquely_named(fd, NULL, NULL, svn-tempfile, .tmp, - svn_io_file_del_on_pool_cleanup, - scratch_pool, scratch_pool)); - SVN_ERR(svn_io_file_info_get(finfo, APR_FINFO_PROT, fd, scratch_pool)); - SVN_ERR(svn_io_file_close(fd, scratch_pool)); +NOTE: not so fast, shorty. if some other thread forks off a child, + then the APR cleanups run, and the file will disappear. sigh. Ok. You've moved this comment down to say why the del_on_close idea in the previous comment is a bad idea. + *perms = finfo.protection; + default_perms = finfo.protection; Very simple patch. Basically, if the permissions were found in the last function call, simply return the found value, using a static variable for remembering it. Otherwise, re-calculate permissions. -- Ram
[PATCH] Fix svnrdump test 27 on Windows
Hi, This is an experimental patch to fix svnrdump test 27 dump: subdirectory with changes on root on Windows. It's experimental because it's based simply on a hunch, and I don't have Windows to test it. Could someone who has Windows test it for me? Thanks. Index: subversion/svnrdump/dump_editor.c === --- subversion/svnrdump/dump_editor.c (revision 1002954) +++ subversion/svnrdump/dump_editor.c (working copy) @@ -147,7 +147,7 @@ make_dir_baton(const char *path, new_db-copyfrom_rev = copyfrom_rev; new_db-added = added; new_db-written_out = FALSE; - new_db-deleted_entries = apr_hash_make(eb-pool); + new_db-deleted_entries = apr_hash_make(pool); return new_db; } @@ -375,7 +375,7 @@ open_root(void *edit_baton, eb-propstring = svn_stringbuf_create(, eb-pool); *root_baton = make_dir_baton(NULL, NULL, SVN_INVALID_REVNUM, - edit_baton, NULL, FALSE, pool); + edit_baton, NULL, FALSE, eb-pool); LDR_DBG((open_root %p\n, *root_baton)); return SVN_NO_ERROR; @@ -415,12 +415,14 @@ add_directory(const char *path, { struct dir_baton *pb = parent_baton; void *val; - struct dir_baton *new_db -= make_dir_baton(path, copyfrom_path, copyfrom_rev, pb-eb, pb, TRUE, pool); + struct dir_baton *new_db; svn_boolean_t is_copy; LDR_DBG((add_directory %s\n, path)); + new_db = make_dir_baton(path, copyfrom_path, copyfrom_rev, pb-eb, + pb, TRUE, pb-eb-pool); + /* Some pending properties to dump? */ SVN_ERR(dump_props(pb-eb, (pb-eb-dump_props), TRUE, pool)); @@ -483,7 +485,7 @@ open_directory(const char *path, } new_db = make_dir_baton(path, copyfrom_path, copyfrom_rev, pb-eb, pb, - FALSE, pool); + FALSE, pb-eb-pool); *child_baton = new_db; return SVN_NO_ERROR; }
Re: svn commit: r1002470 - /subversion/trunk/subversion/svnrdump/dump_editor.c
Hi Daniel, Daniel Shahaf writes: So, you use a top-level pool but never destroy it? That's not good. Can you please find another solution? (Either get the caller to guarantee something about the lifetime of the pool they provide (there is precedent for this), or figure out why close_edit() isn't called (and possibly patch the faulty driver). Fixed in r1002502 :) Thanks for the review. -- Ram
Re: svn commit: r1002503 - /subversion/trunk/subversion/svnrdump/dump_editor.c
Hi Daniel, Daniel Shahaf writes: == --- subversion/trunk/subversion/svnrdump/dump_editor.c (original) +++ subversion/trunk/subversion/svnrdump/dump_editor.c Wed Sep 29 07:52:55 2010 @@ -143,8 +143,7 @@ make_dir_baton(const char *path, new_db-eb = eb; new_db-parent_dir_baton = pb; new_db-abspath = abspath; - new_db-copyfrom_path = copyfrom_path ? -apr_pstrdup(pool, copyfrom_path) : NULL; + new_db-copyfrom_path = copyfrom_path; Does this function now assume that COPYFROM_PATH has a certain lifetime? If so, should that assumption go in the docstring? The function simply sets a parameter: there are many functions that assume many things, so I think it would be best to document it in the struct itself. Thanks for this review :) -- Ram
Re: Some tips on profiling
Hi Philip, Philip Martin writes: The performance of svnrdump is likely to be dominated by IO from the repository, network or disk depending on the RA layer. strace is a useful tool to see opens/reads/writes. You can see what order the calls occur, how many there are, how big they are and how long they take. Ah, thanks for the tip. Valgrind/Callgrind is good and doesn't require you to instrument the code, but it does help to build with debug information. It does impose a massive runtime overhead. I don't mind -- I'm mostly using some remote machines to gather the profiling data :) This is what I get when dumping 1000 revisions from a local mirror of the Subversion repository over ra_neon: CPU: Core 2, speed 1200 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 10 samples %app name symbol name 4738 41.1893 no-vmlinux (no symbols) 1037 9.0150 libxml2.so.2.6.32(no symbols) 700 6.0854 libneon.so.27.1.2(no symbols) 238 2.0690 libc-2.7.so _int_malloc 228 1.9821 libc-2.7.so memcpy 221 1.9212 libc-2.7.so memset 217 1.8865 libc-2.7.so strlen 191 1.6604 libsvn_subr-1.so.0.0.0 decode_bytes 180 1.5648 libc-2.7.so vfprintf 171 1.4866 libc-2.7.so strcmp 153 1.3301 libapr-1.so.0.2.12 apr_hashfunc_default 134 1.1649 libapr-1.so.0.2.12 apr_vformatter 130 1.1301 libapr-1.so.0.2.12 apr_palloc That's on my Debian desktop. At the recent Apache Retreat I tried to demonstrate OProfile on my Ubuntu laptop and could not get it to work properly, probably because I forgot about -fno-omit-frame-pointer. Ah, now I see why it didn't work for me. The data from Callgrind is very interesting- it seems to suggest that APR hashtables are prohibitively expensive. @Stefan: Thoughts on hacking APR hashtables directly? Finally there is traditional gprof. It's a long time since I used it so I don't remember the details. You instrument the code at compile time using CFLAGS=-pg. If an instrumented function foo calls into a library bar that is not instrumented then bar is invisible, all you see is how long foo took to execute. Yes, I used gprof initially. Callgrind is WAY more useful. -- Ram
What happened to svndiff1
Hi, I'd like to know what happened to svndiff version 1 format. All the specs and code including notes/dump-load-format.txt seem to refer only to svndiff0. -- Ram
Re: Some tips on profiling
Hi Stefan, Stefan Fuhrmann writes: My measurements seem to support what Philip wrote: The expensive part is run on the server. Even with my optimized server, the svnrdump CPU usage is less than the time taken by the server. Some numbers (hot file cache): svnadmin dump 1.7 trunk 70s real 66s user 4s system perf-branch 30s real 28s user 2s system 1.7 trunk svnrdump ra-local 88s real 81s user 7s system svn: (1.7 trunk) 99s real 6s user 4s system svn: (perf-branch, cold) 72s real 5s user 6s system svn: (perf-branch, hot) 17s real 5s user 5s system Thus, svnrdump is slower only for ra-local where it is of no particular use in the first place. To really speed things up, the caching infrastructure from the performance branch should be merged into /trunk. Wow, that looks awesome. Yeah, Daniel suggested that I run svnrdump against your perf branch yesterday, but I wasn't able to get your branch to build: subversion/libsvn_subr/svn_file_handle_cache.c:890: error: 'svn_file_handle_cache_t' has no member named 'mutex' What exactly are the changes that hot introduces- can you point me to the specific revision numbers that we intend to merge? @Stefan: Thoughts on hacking APR hashtables directly? Are you sure?! Which operation is the most expensive one and how often is it called? Who calls it and why? My bad. I got muddled up when using ra_local: when I saw lots of MD5 functions, I assumed it had to do something with the hashtable since the checksum was caculated by server-side. The profiler profiled both server-side and client-side. Over svn://, I got a much cleaner output. The new results indicate: 1. Server-client IO is the major bottleneck, as Philip and you pointed out. 2. On the client side, the major bottlneck is the IO during textdelta application. David (CC'ed now) and I are trying some experiments with a single temporary file. Thoughts? At least the results are much more useful when there is a tool like Kcachegrind that allows easy navigation though the huge amount of information that was gathered. Yep! The visualizer is quite awesome :) -- Ram
Re: svnrdump tool
Hi Daniel, Daniel Shahaf writes: $ cat /test.dump Node-path: test.txt Node-path: trunk Node-path: trunk/A Node-path: trunk/A/B Node-path: trunk/A/B/E Node-path: trunk/A/B/E/alpha Node-path: trunk/A/B/E/beta Node-path: trunk/A/B/F Node-path: trunk/A/B/lambda Node-path: trunk/A/C Node-path: trunk/A/D Node-path: trunk/A/D/G Node-path: trunk/A/D/G/pi Node-path: trunk/A/D/G/rho Node-path: trunk/A/D/G/tau Node-path: trunk/A/D/H Node-path: trunk/A/D/H/chi Node-path: trunk/A/D/H/omega Node-path: trunk/A/D/H/psi Node-path: trunk/A/D/gamma Node-path: trunk/A/mu Node-path: trunk/iota Node-path: trunk/A/D/H/psi Node-path: trunk/A Node-path: trunk/iota Node-path: trunk/B Node-path: trunk/A Node-path: trunk/A Node-path: trunk/B/D Node-path: trunk Why does Node-path: trunk appear twice? Some prop change. Unrelated anyway. -- Ram
Re: Some tips on profiling
Hi Stefan, Stefan Fuhrmann writes: Under Linux, I'm using Valgrind / Callgrind and visualize in KCachegrind. That gives me a good idea of what code gets executed too often, how often a jump (loop or condition) has been taken etc. It will not show the non-user and non-CPU runtime, e.g. wait for disk I/O. And it will slow the execution be a factor of 100 (YMMV). Ah, they're brilliant! Just what I was looking for :D I'm yet to interpret the output and interpret it to opimize svnrdump though- might need some more help there; are you available on #svn-dev or something sometime? Will keep you updated on this either way. Under Windows, I'm using the VisualStudio sampling profiler. The measurements are pretty accurate and the overhead is low. It does not tell me, how often a certain code path has been executed. Due to the low overhead, it is well-suited for long running (1 min) operations. Interesting. I didn't know Visual Studio had a profiler. Then again, I've not used Windows for several years now. Also, I find a simple time command very useful to get a first impression whether my code is bound by user-runtime, I/O wait or system runtime. Ah, I want to know how to interpret this correctly. For equivalent operations svnrdump says: 7.49s user 1.34s system 91% cpu 9.682 total svnadmin says: 4.84s user 0.44s system 97% cpu 5.417 total To collect data on I/O activity, compression rates and other not readily available information, I use simple static counters and printf() their values at the end of each client call, for instance. I see. Hope that helps. A ton! Thanks a lot :) -- Ram
Re: place of svnrdump
Hi Neels, Neels J Hofmeyr writes: On a side note, svnsync happens to be relatively slow. I tried to svnsync the ASF repos once (for huge test data). The slowness of svnsync made it practically unfeasible to pull off. I ended up downloading a zipped dump and 'svnadmin load'ing that dump. Even with a zipped dump already downloaded, 'unzip | svnadmin load' took a few *days* to load the 950.000+ revisions. (And someone rebooted that box after two days, halfway through, grr. Took some serious hacking to finish up without starting over.) Yeah, we had a tough time obtaining the complete undeltified ASF dump for testing purposes as well. So, that experience tells me that svnsync and svnadmin dump/load aren't close to optimal, for example compared to a straight download of 34 gigs that the ASF repos is... Anything that could speed up a remote dump/load process would probably be good -- while I don't know any details about svnrdump. I just benchmarked it recently and found that it dumps 1 revisions of the ASF repository in 106 seconds: that's about 94 revisions per second. It used to be faster than `svnadmin` in an older benchmark: I'll work on perf issues this week. I estimate that it should be possible to get it to dump at ~140 revisions/second. @Daniel and others: I'd recommend a feature freeze. I'm currently profiling svnrdump and working on improving especially the I/O profile. My two cents: Rephrasing everything into the dump format and back blows up both data size and ETA. Maybe a remote backup mechanism could even break loose from discrete revision boundaries during transfer/load... I've been thinking about this too: we'll have to start attacking the RA layer itself to make svnrdump even faster. The replay API isn't optimized for this kind of operation. P.S.: If the whole ASF repos were a single Git WC, how long would that take to pull? (Given that Git tends to take up much more space than a Subversion repos, I wonder.) The gzipped undeltified dump of the complete ASF repository comes to about 25 GiB and it takes ~70 minutes to import it into the Git object store using a tool which is currently under development in Git. Thanks to David for these statistics. Cloning takes as long as it takes to transmit this data. After a repack, it'll probably shrink in size, but that's besides the point. Git was never designed to handle this- each project being a separate repository would be a fairer comparison. Even linux-2.6.git contains just 210887 revisions, and it tests Git's limits. -- Ram
Re: place of svnrdump (was: Re: svnmucc multiline property issue)
Hi Stefan, Stefan Sperling writes: I don't really mind where svnrdump lives. The community is committed to supporting both the tools/ and subversion/ directories. tools and subversion are merely directory names. All I'm saying is this: I don't want packaging/ distribution overheads for such a simple package; users should be able to use whatever Subversion-interop tools that other developers build by just having Subversion installed. If backwards-compat rules automatically apply to everything below subversion/ (they do?), and that people feel that svnrdump may still change in backwards-incompatible ways, we might as well decide to make the subversion/svnrdump directory exempt from such guarantees during the 1.7 release. It is a simple technical decision: Do we think that the current feature set is worth supporting under our backwards-compat rules? I do. Hm, I still don't understand this back-porting thing very well. Does it mean that the svnrdump should always do what it currently does functionally (plus anything additional)? Does it mean that any improvements to the command-line UI should ensure that the current command-line UI still works? Does it mean that the public API it exposes through the headers should not break- do we instead have to write corresponding '_2' functions? It seems to be quite sane at the moment, and I don't think backporting is an issue; I'm not very experienced in this though, so I wouldn't take my own opinion too seriously. Hyrum K. Wright writes: I would add the further thought that as this is a tool rather than a first-class component, I think we can play a little bit looser with the rules. For what it's worth, I have a different opinion on this issue. It really doesn't matter. It's just the name of a directory and a set of promises we give to our users about backwards-compat. There's no need for hard feelings. Hey, no hard feelings! I was merely citing how other version control systems make it easy for users to get access to their own history, and suggesting that Subversion should too. In the long-term, I think of svnrdump as just a simple dumping/ loading functionality of Subversion: I don't really care *how* it's present; I just think it should be present: either as part of svnrdump, svnadmin, svnsync, or something else. -- Ram
Re: place of svnrdump (was: Re: svnmucc multiline property issue)
Hi Stefan, Stefan Sperling writes: tools and subversion are merely directory names. All I'm saying is this: I don't want packaging/ distribution overheads for such a simple package; users should be able to use whatever Subversion-interop tools that other developers build by just having Subversion installed. There are many interoperability tools that are built on top of Subversion, and they're hosted as independent projects. By the above logic, we'd have to ship all those, and host them in our repository. No. That's what I tried to point out in my email: the interop software are tools like fast-import for hg and bzr, or even git-p4. That's not what svnrdump is: svnrdump itself is *far* from being able to provide interop. It's the *infrastructure* that's necessary for interop tools to be built in a sane and maintainable manner. If you're interested, check out git.git contrib/svn-fe leveraging the infrastructure in vcs-svn/ to convert a Subverion dumpfile v2 into a fast-import stream. It's VERY non-trivial. And what does having Subversion installed really mean? E.g. in the Linux world, the Subversion client/server packages are often separate, but not always. It's also possible for svnsync to live in a separate package from the svn binary. And if you install from source, you get whatever the make install target installs. And maybe you also run make install-tools? Who knows. The point is that this is something packagers need to worry about, not us. With a well-maintained distribution, you can also install svnmucc easily, just like you can install svn easily. Of course, svnrdump is more likely to end up being installed if it gets installed with the default make install target. But packagers might as well decide to split the result of make install into separate packages, and we can't do anything about it. In practice, I don't think any of this is very important. People who need the svnrdump tool will find it no matter what. Even if it was hosted as an entirely separate project. I see- that's an interesting perspective. Hm, I still don't understand this back-porting thing very well. Does it mean that the svnrdump should always do what it currently does functionally (plus anything additional)? Does it mean that any improvements to the command-line UI should ensure that the current command-line UI still works? Does it mean that the public API it exposes through the headers should not break- do we instead have to write corresponding '_2' functions? It means all of the above. We'll promise to support its current state until Subversion 2.0 breaks the world. That's why it's important to make sure everyone agrees that it is ready for that level of dedication. If it isn't, then we need to make sure our users understand that (by moving it to tools/, or by declaring it as experimental, or whatever). Ah. Yeah, I think it's sane enough for this. I'll put in as much work as I can before the release to fix perf issues. For now, it passes the complete svnsync testsuite but for Issue #3717. -- Ram
Re: svnrdump tool
Hi Daniel, Daniel Shahaf writes: svnsync allows you to sync a subdir of a repository (i.e., svnsync $REPOS/trunk/A/B $MIRROR ), but does it also create /trunk/A/B in the mirror? Yes, it does. But for now I still think that svnrdump invocation I quoted above shouldn't have outputted a 'create /trunk' entry in the dumpfile :-). @all, what do you think? Again, it works exactly like svnsync- you might like to read the tests. Here's a verbose example. $ cat /test.dump Node-path: test.txt Node-path: trunk Node-path: trunk/A Node-path: trunk/A/B Node-path: trunk/A/B/E Node-path: trunk/A/B/E/alpha Node-path: trunk/A/B/E/beta Node-path: trunk/A/B/F Node-path: trunk/A/B/lambda Node-path: trunk/A/C Node-path: trunk/A/D Node-path: trunk/A/D/G Node-path: trunk/A/D/G/pi Node-path: trunk/A/D/G/rho Node-path: trunk/A/D/G/tau Node-path: trunk/A/D/H Node-path: trunk/A/D/H/chi Node-path: trunk/A/D/H/omega Node-path: trunk/A/D/H/psi Node-path: trunk/A/D/gamma Node-path: trunk/A/mu Node-path: trunk/iota Node-path: trunk/A/D/H/psi Node-path: trunk/A Node-path: trunk/iota Node-path: trunk/B Node-path: trunk/A Node-path: trunk/A Node-path: trunk/B/D Node-path: trunk $ # svnadmin create /test-repo, /mirror and enable pre-revprop-change $ svnadmin load /test-repo /test.dump $ svnsync init file:///mirror file:///test-repo/trunk/A/B $ svnsync sync file:///mirror $ svnadmin dump /mirror | grep Node-path: Node-path: trunk Node-path: trunk/A Node-path: trunk/A/B Node-path: trunk/A/B/E Node-path: trunk/A/B/E/alpha Node-path: trunk/A/B/E/beta Node-path: trunk/A/B/F Node-path: trunk/A/B/lambda Node-path: trunk/A Node-path: trunk/A Node-path: trunk/A Node-path: trunk/A/G Node-path: trunk/A/G/pi Node-path: trunk/A/G/rho Node-path: trunk/A/G/tau Node-path: trunk/A/H Node-path: trunk/A/H/chi Node-path: trunk/A/H/omega Node-path: trunk/A/H/psi Node-path: trunk/A/gamma Node-path: trunk $ svnrdump dump file:///test-repo/trunk/A/B Node-path: trunk Node-path: trunk/A Node-path: trunk/A/B Node-path: trunk/A/B/E Node-path: trunk/A/B/E/alpha Node-path: trunk/A/B/E/beta Node-path: trunk/A/B/F Node-path: trunk/A/B/lambda Node-path: trunk/A Node-path: trunk/A Node-path: trunk/A Node-path: trunk/A/G Node-path: trunk/A/G/pi Node-path: trunk/A/G/rho Node-path: trunk/A/G/tau Node-path: trunk/A/H Node-path: trunk/A/H/chi Node-path: trunk/A/H/omega Node-path: trunk/A/H/psi Node-path: trunk/A/gamma Node-path: trunk Hope that helps. -- Ram
Some tips on profiling
Hi Stefan, Could you tell me which tools you use to profile the various applications in trunk? I'm looking to profile svnrdump to fix some perf issues, but OProfile doesn't seem to work for me. Thanks. -- Ram
Re: svnmucc multiline property issue
Hi, Daniel Shahaf writes: Would svnrdump benefit if, once 1.7.x branched and RC's start being rolled, it were subjected to a more relaxed backporting policy? If so, we might consider moving it to tools/ for 1.7.x, with intent to move it back to subversion/svnrdump/ for 1.8.x (as soon as 1.7.x is branched from trunk). Hm? I don't understand how it'll help backporting. I already maintain an out-of-tree build that successfully compiles against 1.6 [1]. It'll be fairly trivial to write an svn18_compat module. Hyrum K. Wright writes: I would add the further thought that as this is a tool rather than a first-class component, I think we can play a little bit looser with the rules. For what it's worth, I have a different opinion on this issue. Many of the modern DVCS's speak the Git fast-import protocol: See hg-fast-import or even BzrFastImport for example [2] [3]. Even for those that don't, backing up a repository is as simple as tar'ing up a local checkout. It's a problem in the case of many centralized versioning systems, but there are third-party scripts to even get the data out of Perforce [4]. Agreed, these modules should not be part of the core. However, in the case of Subversion, there absolutely NO way to get/ back up the revision history data* [5]. Getting it out in a version-control independent format is a secondary challenge- the primary challenge is to get the data out in /any/ format. We're currently building a module that'll assist the second task- the infrastructure is already in vcs-svn/ in Git 1.7.3. I'll propose to merge /that/ into the Subversion trunk as a tool when it's ready, however svnrdump should be part of core. Even if you don't agree with the above and claim that `svnadmin (dump|load)` fits the bill, svnrdump can provide the same functionality- it can do what svnadmin can, only faster. There's been some speculation about getting svnrdump merged into svnadmin, but let's not get into that right now. As I see it, the reason we should have svnrdump in trunk is to build it and distribute it as a part of core Subversion: to enable people to get access to their own history, and to encourage them write additional tools on top of it. -- Ram [1]: http://github.com/artagnon/svnrdump [2]: http://mercurial.selenic.com/wiki/FastImportExtension [3]: http://wiki.bazaar.canonical.com/BzrFastImport [4]: http://repo.or.cz/w/git.git?a=blob_plain;f=contrib/fast-import/git-p4;hb=maint [5]: There's a Subversion equivalent of the git-p4 script called git-svn. It's a long, ugly, unmaintainable 5000-line Perl script http://repo.or.cz/w/git.git?a=blob_plain;f=git-svn.perl * I suppose it's worth mentioning svnsync here although it takes years to finish and eats up eons of disk space. Mirroring an entire repository is NOT the way to get the data of a centralized versioining system- it was not designed to be used that way.
Re: place of svnrdump (was: Re: svnmucc multiline property issue)
Hi, Daniel Shahaf writes: I agree that svnrdump does something very useful and that it belongs in Subversion. But I'm not sure whether it's mature enough today to belong in subversion/svnrdump/. svnrdump is still young (less than, how much, 6 months old?). The code still needs a bit of cleanup to remove rough edges (e.g., the most recent one I recall is pool usage), and hasn't been tested widely. Yet, svnrdump is in subversion/ and distributed as part of the core, while svnmucc --- with 5 years of core developers' support under its belt --- doesn't get built by 'make' by default. Less than 6 months, and I'm not a very experienced developer. The code in dump_editor definitely needs a little cleanup. Except for a few tests, it passes the complete svnsync testssuite. Ofcourse it's not perfect. By the time 1.7 is released, it'll probably be better than what it is now. I have no interest in politics, and I don't want to speculate why svnmucc isn't built by `make` by default. I would like to keep this discussion focused purely on the benefits and trade-offs of including svnrdump in subversion/svnrdump. If the discussion deviates from this, I would NOT like to be included in it- I'm a just a new partial committer and I don't know how your organization works. Just to repeat: it's not a question of 'whether' svnrdump belongs in subversion/svnrdump/, it's a question of whether it belongs there right now. As far as I'm concerned, development will happily chug along and whoever wants to use svnrdump will use it- the out-of-tree build will build against both Subversion 1.6 and Subversion 1.7 nicely and everyone's happy. The major downside of including svnrdump now is that users might complain that it doesn't work and will file some bugs. However, being a simple client-side utility, I doubt it'll cause any major catastrophies, even if it's used in production. The benefit is that it'll get more exposure- developers will find out about it and start writing tools that use it earlier. They will eventually discover more bugs and file them which will speed up its development. This is not speculation- the Git tool is almost ready, and will certainly be merged within the next couple of months. There are some projects using rsvndump- they will also benefit immediately. Ofcourse I understand that having authored svnrdump, it's possible that my personal interests have clouded my judgement. I've kept this in mind, and tried to be as unbiased as possible. I'll appeal to everyone else to do the same- do what you think is in the best interest of the greater good. -- Ram
Re: svnrdump tool
Hi Karl, Karl Heinz Marbaise writes: i just want to give an other information which i found about the subject which seemed to be related to the svnrdump tool... http://rsvndump.sourceforge.net/ Yes, I saw this tool before I decided to write svnrdump. The main difference is that rsvndump produces a non-deltified dump, while svnrdump can only produce a deltified dump (dumpfile v3). As a consequence, rsvndump is slow and needs a lot of memory while svnrdump is fast and uses little memory. -- Ram
Re: svnrdump tool
Hi Daniel, Daniel Shahaf writes: Does svnrdump dump -r 100 https://svn.apache.org/repos/asf/subversion also produce a deltified dump? (of r100 to its predecessor) Yes, it does. Is it currently possible to cause svnrdump to dump rN in full? (like svnadmin dump --incremental) No, it's not. -- Ram
Re: [WIP PATCH] svnrdump dump --incremental
Hi Daniel, Daniel Shahaf writes: Here's a stab at implementing 'svnrdump dump --incremental'. It does what 'svn checkout' does to get the first revision --- namely, calls svn_ra_do_update2() with start_empty=TRUE. TODO: it passes send_copyfrom_args=TRUE to svn_ra_do_update2(), so it still has to go by itself and requests fulltexts (or, copyfrom fulltexts and txdeltas against those) for files that have been copied. Currently, for a cp-with-changes, it dumps nothing (neither fulltext nor delta). TODO: dump revprops for the first revision. (haven't tested this) Overall, the path appending logic seems to be alright. I'd thought that rebasing the RA session would be cleaner, but you explained to me over IRC that it's not possible due to authz issues- issue #3242. My primary concern is handling copyfrom references where copyfrom_rev start_revision. We'd have to fetch the full text of the file from history to apply the text delta to, and this requires us to make an RA call in the middle of a replay. Using two RA sessions (and possibly keeping both open so we can reuse them) might work if there are few of these references. However, we discussed that either of the sessions might timeout if one operation takes too long to complete. The whole business seems messy to me, and we should ideally like to be able to avoid having two RA sessions. I'd like to attack the problem at the RA API level itself, but this can be very non-trivial. Can we get some inputs from the others on this? -- Ram
Re: svnrdump tool
Hi Daniel, Daniel Shahaf writes: Daniel Shahaf wrote on Sat, Sep 25, 2010 at 16:49:08 +0200: 0:/tmp/svn% $svnrdump dump file://$PWD/r1/trunk/A/B 21 | grep Node-path Node-path: trunk Node-path: trunk/A Node-path: trunk/A/B Node-path: trunk/A/B/E Node-path: trunk/A/B/E/alpha Node-path: trunk/A/B/E/beta Node-path: trunk/A/B/F Node-path: trunk/A/B/lambda Nice :-) Is it? If I try to dump a subdir, then I /shouldn't/ get a Create /trunk entry in the dumpfile... instead, it should be the user's business to create that directory in the target repository out of band, before loading the dump into it. svnrdump behaves exactly like svnsync in this manner. See the tests only_trunk_dump, only_trunk_A_with_changes_dump, and descend_into_replace_dump that I imported from svnsync. They all exercise this subdirectory feature. Maybe the tests are not exhaustive? -- Ram
Re: [PATCH] Infrastructure to exclude comparing lines
Hi Daniel, Daniel Shahaf writes: Ramkumar Ramachandra wrote on Mon, Sep 20, 2010 at 09:39:57 +0530: Daniel Shahaf writes: * ignore Prop-delta... why? This is because svnrdump outputs Prop-delta: true everytime, whether it's really a delta or a delta from /dev/null. This is quite harmless, and the logic for omitting it when it's a delta from /dev/null can be fairly non-trivial and unnecessary. Again, svnadmin dump does it right. Okay, but if the Prop-delta: header is generated by one but not by the other, then the rest of the dump'd props hash should be different too. Huh? Why? i.e., if the tests ignore difference in the Prop-delta: header, they should also be finding more differences later in the props hash --- and should somehow be ignoring those correctly... Ok, I think you've misunderstood this. Both of them DO dump the exact same properties. Think of the Prop-delta header as just an extra cosmetic line. In dumpfile v3, everything is a delta against the previous revision including props. When the Prop-delta header is absent, it doesn't mean that the full props are being listed- it just means that the file or directory didn't exist in a previous revision. For a file or directory that didn't exist in a previous revision- * [Prop-delta missing] is a cosmetic way of saying: Here's a new file or directory along with it's properties. * Prop-delta: true is a cosmetic way of saying: Here's a file or directory. If it's new, the delta is against nothing. If it's old, the delta is against a previous version. Check it for yourself- information about whether the file or directory is new is already present in the dumpfile. I hope that clears it up? -- Ram
Re: [PATCH] Infrastructure to exclude comparing lines
Hi Daniel, Daniel Shahaf writes: Ramkumar Ramachandra wrote on Wed, Sep 22, 2010 at 22:24:10 +0530: Ok, I think you've misunderstood this. Both of them DO dump the exact same properties. Think of the Prop-delta header as just an extra cosmetic line. In dumpfile v3, everything is a delta against the previous revision including props. Did you mean: s/In dumpfile v3/In 'svnadmin dump --deltas'/ ? IOW, the dumpfile format version doesn't enforce that, but svnadmin never produces non-delta'd props when it dumps as v3 --- right? Right. Technically, I meant dumpfile v3 with all the v3 features exercised :) (I think/assume that you can take any v2 dumpfile and just change the header from format: 2 to format: 3 --- without making any other changes --- and still have the result a valid dumpfile...) That's right. A dumpfile v3 just adds some extra features to v2. When the Prop-delta header is absent, it doesn't mean that the full props are being listed- it just means that the file or directory didn't exist in a previous revision. For a file or directory that didn't exist in a previous revision- * [Prop-delta missing] is a cosmetic way of saying: Here's a new file or directory along with it's properties. * Prop-delta: true is a cosmetic way of saying: Here's a file or directory. If it's new, the delta is against nothing. If it's old, the delta is against a previous version. Check it for yourself- information about whether the file or directory is new is already present in the dumpfile. In other words, the only case where svnrdump and svnadmin disagree on the presence of the Prop-delta: header is in the case of added-without-history files; and in that case, svnrdump prints it and svnadmin doesn't, and this doesn't matter since interpreting the dumped hash as a delta or as an absolute full properties list gives the same result... Right? Exactly! P.S. Doesn't the Node-action: header allow you to distinguish whether it's a newly-added file or not, and thus be able to generate Prop-delta: only in the cases that svnadmin generates it? Yes. Unfortunately, the actual dumping is separated from determining *what* to dump which is why dump_editor needs ugly hacks like dump_props and dump_newlines. To solve this problem, we have to reverse-engineer what `svnadmin dump` dumps. It should then be possible to determine what happens in the edge cases: is_copy replace, is_copy add, is_copy change etc. Then we have to set another flag, extend dump_props to accept another argument and conditionally write that Prop-delta header. To put it tersely, it's non-trivial, inelegant and time consuming- plus, it's not priority, considering that many other headers mismatch anyway. -- Ram
Re: svn commit: r999507 - in /subversion/trunk/subversion: svnrdump/dump_editor.c tests/cmdline/svnrdump_tests.py
Hi Daniel, Daniel Shahaf writes: It actually makes sense: either there is a parent baton, or I'm creating the root baton (and therefore the caller passed path=NULL). Choosing path=NULL to represent the root seems odd to me (that's not how we usually represent it), but that's for another day. Ah. I guess that explains it was there in the first place :p I think I'm using to represent the root somewhere, but I'll have to check this. I'll figure something out during the cleanup. True. However, abspath is a `char *` with no memory allocated to it: I want whatever its value is to be allocated in `pool` since it's passed around. I think you've got your pointers confused. Can you drop the pstrdup(), see that nothing breaks, and then we can discuss why nothing broke? Hehe, I wouldn't be surprised if this is the case -- I've been working with APR pools day and night now :p Okay, I'll drop this during the cleanup. Do explain this to me though :) -- Ram
Re: svn commit: r999507 - in /subversion/trunk/subversion: svnrdump/dump_editor.c tests/cmdline/svnrdump_tests.py
Hi again, Ramkumar Ramachandra writes: I think you've got your pointers confused. Can you drop the pstrdup(), see that nothing breaks, and then we can discuss why nothing broke? Hehe, I wouldn't be surprised if this is the case -- I've been working with APR pools day and night now :p Okay, I'll drop this during the cleanup. Do explain this to me though :) Just to be clear: I know that nothing will break. Memory for the string / will be allocated statically by the compiler- its lifetime is therefore the lifetime of the program, and it cannot be freed before that. I made this change merely for the purpose of elegance- except in this case, abspath always points to a dynamically allocated memory location owned by a `pool` that can be freed. So I thought: why not make make it always point to a dynamically allocated memory location? Does that make sense or am I just being silly? :p -- Ram
Re: svn commit: r999507 - in /subversion/trunk/subversion: svnrdump/dump_editor.c tests/cmdline/svnrdump_tests.py
Hi, Daniel Shahaf writes: abspath is a pointer, lives on the stack, and is passed by value (like any old int) in function calls. What you allocate from the pool is the value it points to --- i.e., the two bytes of /. However, literal string constants are static strings --- they are loaded into memory from the binary image (ever tried 'strings /usr/bin/svn'?) and stay there until that is unloaded --- while pool-allocated strings allocated on the heap, and stays there until the pool is cleared. The other difference is that literal strings are not writable (they are const char *) while pool-allocate strings are mutable (non-const char *). (waiting to hear from you that dropping the strdup() causes a segfault) Hehe. See my other email- don't worry, I wasn't confused about static and dynamic allocation :p I just got muddled up in my whole elegance argument. So, the only reasons you'd have to duplicate a static string is if the static string lives in a library which you know you'll unload[1] or if you need a non-const char * for some reason. In your case, 'abspath' is a const char *, so neither of these cases applies. Right, it's been justified then- I was being silly. Okay? Thanks for the explanation. Will drop during the cleanup. -- Ram
Re: svn commit: r999507 - in /subversion/trunk/subversion: svnrdump/dump_editor.c tests/cmdline/svnrdump_tests.py
Hi Daniel, Thanks for the review. Daniel Shahaf writes: - /* Disallow a path relative to nothing. */ - SVN_ERR_ASSERT_NO_RETURN(!path || pb); - Why did you drop this assertion? I thought about it for a bit, and I can't seem to figure out why I added it in the first place. My current sanity logic: Without a pb, this function (make_dir_baton) generates a root baton. Whatever `path` the user passes is ignored anyway- the root baton has `abspath` set to `/`. With a `pb`, I MUST pass a non-null `path` -- there's probably something about this I haven't thought about properly that's making test 14 fail (there's an empty Node-path there). Either way, I don't see where an assert fits into the picture. Also; if you intend to move it back, SVN_ERR_ASSERT() would be better (for when this code becomes a library function). Ok, I'll keep that in mind when I write more asserts. /* Construct the full path of this node. */ if (pb) abspath = svn_uri_join(/, path, pool); else -abspath = /; +abspath = apr_pstrdup(pool, /); It is most likely unnecessary to duplicate a static string constant. :-) True. However, abspath is a `char *` with no memory allocated to it: I want whatever its value is to be allocated in `pool` since it's passed around. -- Ram
Re: svn commit: r998502 - /subversion/trunk/subversion/svnrdump/load_editor.c
Hi, Daniel Shahaf writes: Stefan Sperling wrote on Sun, Sep 19, 2010 at 11:40:49 +0200: On Sun, Sep 19, 2010 at 10:29:58AM +0100, Daniel Shahaf wrote: 1. Please don't duplicate code. I think it's fine for svnrdump to have its own copy of this for now. We could at some point merge them into libsvn_subr/ of course, but I don't see a huge problem with just 2 instances. Ok, let's have this copy for now. 2. If you do duplicate code, then add big comments (in all instances of the duplication) pointing to the other instances. +1 I've mentioned it in the commit message, but alright- I'll add a comment in the file. 3. Incidentally, I have modified the svnsync instance of this function on the atomic-revprops branch. So the desire to avoid duplication isn't just theoretical in this case... We should fix the race in svnrdump on the atomic-revprop branch as well. No problem; r998622. (The svnsync patch isn't committed yet, but it should be easy to port it to svnrdump.) Ok, let me know when it's done. I'll port it to svnrdump too. -- Ram
[PATCH] Infrastructure to exclude comparing lines
Hi, I thought this patch might need a little bit of special attention. Let me know quickly if something is obviously wrong with it- as far as I know, all tests pass and this patch works as expected. I'm going ahead and adding lots of tests and fixing whatever bugs there may be shortly. [[[ Add some testing infrastructure to exclude comparison of certain lines specified by a regular expression so that svnrdump dump tests can report actual failures. Exercise this infrastructure in svnrdump_tests.py. This is necessary because there is some mismatch between the headers that `svnadmin dump` and `svnrdump dump` are able to provide. * subversion/tests/cmdline/svntest/verify.py (ExpectedOutput.matches_except): Write new function that resembling ExpectedOutput.display_lines with an extra except_re argument. It compares EXPECTED and ACTUAL on an exact line-by-line basis, excluding lines that match the regular expression specified in except_re. (ExceptedOutput.matches): Modify function to accept one more optional argument: except_re. Call matches_except if except_re is not None. (compare_and_display_lines): Take an extra argument except_re that defaults to None, and call ExcpectedOutput.matches with this argument. * subversion/tests/cmdline/svnrdump_tests.py (mismatched_headers_re): Declare a new global string that specifies the headers in which a mismatch is expected during a dumping operation. (run_dump_test): Pass the above regular expression string to svntest.verify.compare_and_display_lines). (test_list): Mark copy_and_modify_dump as a passing test. ]]] Index: subversion/tests/cmdline/svnrdump_tests.py === --- subversion/tests/cmdline/svnrdump_tests.py (revision 998490) +++ subversion/tests/cmdline/svnrdump_tests.py (working copy) @@ -41,6 +41,11 @@ XFail = svntest.testcase.XFail Item = svntest.wc.StateItem Wimp = svntest.testcase.Wimp +# Mismatched headers during dumping operation +mismatched_headers_re = \ +Prop-delta: |Text-content-sha1: |Text-copy-source-md5: | \ +Text-copy-source-sha1: |Text-delta-base-sha1: .* + ## # Helper routines @@ -80,7 +85,8 @@ def run_dump_test(sbox, dumpfile_name): # Compare the output from stdout svntest.verify.compare_and_display_lines( -Dump files, DUMP, svnadmin_dumpfile, svnrdump_dumpfile) +Dump files, DUMP, svnadmin_dumpfile, svnrdump_dumpfile, +None, mismatched_headers_re) def run_load_test(sbox, dumpfile_name): Load a dumpfile using 'svnrdump load', dump it with 'svnadmin @@ -177,7 +183,7 @@ test_list = [ None, revision_0_load, skeleton_load, copy_and_modify_load, - Wimp(Need to fix headers in RA layer, copy_and_modify_dump), + copy_and_modify_dump, ] if __name__ == '__main__': Index: subversion/tests/cmdline/svntest/verify.py === --- subversion/tests/cmdline/svntest/verify.py (revision 998490) +++ subversion/tests/cmdline/svntest/verify.py (working copy) @@ -108,7 +108,7 @@ class ExpectedOutput: def __cmp__(self, other): raise 'badness' - def matches(self, other): + def matches(self, other, except_re=None): Return whether SELF.output matches OTHER (which may be a list of newline-terminated lines, or a single string). Either value may be None. @@ -126,8 +126,32 @@ class ExpectedOutput: if not isinstance(expected, list): expected = [expected] -return self.is_equivalent_list(expected, actual) +if except_re: + return self.matches_except(expected, actual, except_re) +else: + return self.is_equivalent_list(expected, actual) + def matches_except(self, expected, actual, except_re): +Return whether EXPECTED and ACTUAL match except for except_re. +if not self.is_regex: + i_expected = 0 + i_actual = 0 + while i_expected len(expected) and i_actual len(actual): +if re.match(except_re, actual[i_actual]): + i_actual += 1 +elif re.match(except_re, expected[i_expected]): + i_expected += 1 +elif expected[i_expected] == actual[i_actual]: + i_expected += 1 + i_actual += 1 +else: + return False + if i_expected == len(expected) and i_actual == len(actual): +return True + return False +else: + raise self.is_regex and except_re are mutually exclusive + def is_equivalent_list(self, expected, actual): Return whether EXPECTED and ACTUAL are equivalent. if not self.is_regex: @@ -308,7 +332,7 @@ def display_lines(message, label, expected, actual sys.stdout.write(x) def compare_and_display_lines(message, label, expected, actual, - raisable=None): +
Re: [PATCH] Infrastructure to exclude comparing lines
Hi, I've committed this to r998661, but the Subversion bot reports some breakages. I'm not sure what to make of them though. -- Ram
Re: [PATCH] Infrastructure to exclude comparing lines
Hi Daniel, Daniel Shahaf writes: +# Mismatched headers during dumping operation +mismatched_headers_re = \ +Prop-delta: |Text-content-sha1: |Text-copy-source-md5: | \ +Text-copy-source-sha1: |Text-delta-base-sha1: .* + So: * ignore sha1 because RA doesn't provide it yet Yes, that's correct. * ignore Text-copy-source-md5... why? for the same reason? Yes. * ignore Prop-delta... why? This is because svnrdump outputs Prop-delta: true everytime, whether it's really a delta or a delta from /dev/null. This is quite harmless, and the logic for omitting it when it's a delta from /dev/null can be fairly non-trivial and unnecessary. Again, svnadmin dump does it right. (It would be good to have the answer in a comment, I think.) Ok. Couldn't you redefine one of these two functions (matches_except() and is_equivalent_list()) such that they share more code? For example: def matches_except(self, expected, actual, except_re): # TODO: possibly use a slightly different error message # if an exception is thrown return self.is_equivalent_list( filter(lambda x: not re.match(except_re, x), expected), filter(lambda x: not re.match(except_re, x), actual)) ... or add some sort of optional argument to is_equivalent_list(), such that the functionalities of both functions can be achieved by varying the value of the optional argument. In second thought: you could make except_re a lambda expression (taking a single argument, the line) instead of a regex; i.e., s/foo('Text-sha1:')/foo(lambda line: 'Text-sha1' not in line)/ ? Actually, this was sort of intentional on my part. I don't know if this except_re (the way I've written it) is such a great idea in the long run. We know that svnrdump needs it, but I'd like to wait until we have more usecases before restructuring it. We might figure out that some more general way of including/ excluding lines to match in actual/ expected: the lambda function example you've provided is quite illustrative. For now, let this except_re feature serve just the svnrdump case. -- Ram
Re: svn commit: r998502 - /subversion/trunk/subversion/svnrdump/load_editor.c
Hi Daniel, Daniel Shahaf writes: Ramkumar Ramachandra wrote on Sun, Sep 19, 2010 at 16:52:50 +0530: Daniel Shahaf writes: Stefan Sperling wrote on Sun, Sep 19, 2010 at 11:40:49 +0200: On Sun, Sep 19, 2010 at 10:29:58AM +0100, Daniel Shahaf wrote: 2. If you do duplicate code, then add big comments (in all instances of the duplication) pointing to the other instances. +1 I've mentioned it in the commit message, but alright- I'll add a comment in the file. Please add a comment to svnsync's get_lock() as well. +1 needed :) [[[ * subversion/svnsync/main.c (get_lock): Add a comment explaining what the function does along with a note about it being duplicated in svnrdump. * subversion/svnrdump/load_editor.c (get_lock): Add a comment explaining what the function does along with a note about it being duplicated in svnsync. ]]] Index: subversion/svnsync/main.c === --- subversion/svnsync/main.c (revision 998652) +++ subversion/svnsync/main.c (working copy) @@ -285,7 +285,13 @@ check_lib_versions(void) /* Acquire a lock (of sorts) on the repository associated with the - * given RA SESSION. + * given RA SESSION. This lock is just a revprop change attempt in a + * time-delay loop. This function is duplicated by svnrdump in + * load_editor.c. + * + * ## TODO: Make this function more generic and + * expose it through a header for use by other Subversion + * applications to avoid duplication. */ static svn_error_t * get_lock(svn_ra_session_t *session, apr_pool_t *pool) Index: subversion/svnrdump/load_editor.c === --- subversion/svnrdump/load_editor.c (revision 998772) +++ subversion/svnrdump/load_editor.c (working copy) @@ -56,6 +56,14 @@ commit_callback(const svn_commit_info_t *commit_in return SVN_NO_ERROR; } +/* Acquire a lock (of sorts) on the repository associated with the + * given RA SESSION. This lock is just a revprop change attempt in a + * time-delay loop. This function is duplicated by svnsync in main.c. + * + * ## TODO: Make this function more generic and + * expose it through a header for use by other Subversion + * applications to avoid duplication. + */ static svn_error_t * get_lock(svn_ra_session_t *session, apr_pool_t *pool) { Thanks for the offer. I can do that myself, though; svnrdump is a first-class citizen now, as is this duplication (per the direction this thread seems to be going), so AFAICT all cards say the branch maintainer should be adding that to svnrdump. Sure :) -- Ram
Re: [PATCH]svnrdump to dump single specified revision
Hi Vijayguru, Vijayaguru Guruchave writes: 'svnrdump dump -r n URL' dumps the revisions from 'n' to HEAD.The attached patch dumps the single specified revision alone. [[[ Log: svnrdump to dump the single specified revision. * subversion/svnrdump/svnrdump.c (main): Dump the single specified revision by assigning start_revision to end_revision while passing revision number to '-r' option. Patch by: Vijayaguru G vi...@collab.net ]]] Thanks for the patch. Although this was intended behavior, I figured that `svnadmin` follows the convention of your patch. Committed to r993102: Index: subversion/svnrdump/svnrdump.c === --- subversion/svnrdump/svnrdump.c (revision 992366) +++ subversion/svnrdump/svnrdump.c (working copy) @@ -58,8 +58,7 @@ static const svn_opt_subcommand_desc2_t svnrdump__ N_(usage: svnrdump dump URL [-r LOWER[:UPPER]]\n\n Dump revisions LOWER to UPPER of repository at remote URL to stdout in a 'dumpfile' portable format.\n - If omitted, LOWER defaults to zero and UPPER to the latest - latest revision.\n), + If only LOWER is given, dump that one revision.\n), { 0 } }, { load, load_cmd, { 0 }, N_(usage: svnrdump load URL\n\n @@ -75,7 +74,7 @@ static const svn_opt_subcommand_desc2_t svnrdump__ static const apr_getopt_option_t svnrdump__options[] = { -{revision, 'r', 1, N_(REV1[:REV2] range of revisions to dump)}, +{revision, 'r', 1, N_(specify revision number ARG (or X:Y range))}, {quiet, 'q', 0, N_(no progress (only errors) to stderr)}, {config-dir,opt_config_dir, 1, N_(read user configuration files from directory ARG) }, @@ -471,7 +470,10 @@ main(int argc, const char **argv) NULL, 10); } else - opt_baton-start_revision = (svn_revnum_t)strtoul(opt_arg, NULL, 10); + { +opt_baton-start_revision = (svn_revnum_t)strtoul(opt_arg, NULL, 10); +opt_baton-end_revision = opt_baton-start_revision; + } } break; case 'q':
Re: race condition
Hi Daniel, Daniel Shahaf writes: From IRC logs... 09:53 @danielsh wayita: tell artagnon http://colabti.org/irclogger/irclogger_log/svn-dev?date=2010-08-20#l83 --- has that been resolved? (can't find any record in mail/irc archives) So, do you remember what that race condition is? And whether it's been resolved? I've been a little preoccupied lately with some kernel and Git patches et al. I'll restart work on svnrdump this weekend. Also, I have some exams coming up next week, so I don't know how much time I'll be able to give after this weekend :| Thanks for the ping. -- Ram
Re: svn commit: r987513 - /subversion/trunk/subversion/svnrdump/svnrdump.c
Hi Daniel, Daniel Shahaf writes: Ping. Was the 'race condition' I mentioned in my other mail related to 'svnrdump load' setting revprop (which may fail if a hook hadn't been set up)? Yep. I have to try setting some dummy revprop and barf out quickly before getting caught up in between real work like svnsync does. -- Ram
Re: [RFC/PATCH] Create a fresh svn_repos_parse_fns3_t
Hi Julian, Julian Foad writes: On Thu, 2010-08-19 at 10:25 +0300, Daniel Shahaf wrote: Ramkumar Ramachandra wrote on Thu, Aug 19, 2010 at 11:12:14 +0530: Hi, I sent a patch a while ago for svn_repos_parse_dumpstream3. While I wait for approval, this is an RFC patch describing my future plan once that patch gets approved. It can be described tersely as While at it (adding the new callback), fix everything that's wrong with the struct. I have just read through Daniel's diff of this patch, and the change appears to be: Add a dumpstream_version_record() method. Add scratch pool arguments to all methods. Is that what you intended? If so, it sounds sane to me. (I would name the scratch pool arguments 'scratch_pool'.) Yep, and I'm changing the API to pass the scratch pool around. I'm using Git to stage all my changes and produce a coherent series with relevant changes, but it's taking some time- expect to see a series soon though. -- Ram
Re: [PATCH v2] New dumpstream parser to check version number
Hi Daniel, Daniel Shahaf writes: - SVN_ERR(svn_repos_parse_dumpstream2(stream, parser, parse_baton, - NULL, NULL, pool)); + SVN_ERR(svn_repos_parse_dumpstream3(stream, parser, parse_baton, + NULL, NULL, + SVN_REPOS_DUMPFILE_FORMAT_VERSION, Wrong macro: when we introduce dumpfile format 4, this will become 4, but I think you want it to remain 3 even then, right? Actually, I'd want to svnrdump tests to fail so someone (or me) immediately fixes it to use version 4: it'll probably be a performance boost as well. +#define SVN_REPOS_DUMPFILE_FORMAT_UNKNOWN_VERSION-1 #define SVN_REPOS_DUMPFILE_UUID UUID #define SVN_REPOS_DUMPFILE_CONTENT_LENGTHContent-length @@ -2646,6 +2647,11 @@ typedef svn_repos_parse_fns2_t svn_repos_parser_fn * @a cancel_baton as argument to see if the client wishes to cancel * the dump. * + * If @a exact_version is SVN_REPOS_DUMPFILE_FORMAT_UNKNOWN_VERSION, #define macroname SVN_REPOS_DUMPFILE_FORMAT_UNKNOWN_VERSION In doxygen, you can write #macroname or @c macroname. (IIRC the first makes a link and the second makes constant-width font; untested.) Ok, thanks for the pointer. Fixed. + * it is ignored and the dumpstream is parsed without this + * information. Else, the function checks the dumpstream's version + * number, and errors out if there's a mismatch. + * Sometimes, in situations like this we guarantee exactly which error code will be returned, for the benefit of API users who want to trap a specific (non-fatal) error condition. See svn_ra_open4() for an example. Oh. Is this necessary though? I'm going to move out this code into a callback soon- users can handle it there after that. +SVN_DEPRECATED +svn_error_t *svn_repos_parse_dumpstream2(svn_stream_t *stream, +const svn_repos_parse_fns2_t *parse_fns, +void *parse_baton, Style nitpick: The parameters aren't aligned properly. (the previous declaration does it correctly.) Fixed. I'm not sure what we gain by committing this patch when you already have a parse_fns3_t patch outstanding. (I haven't looked at that thread yet.) Wouldn't it make more sense to first commit that patch, than to commit this one, then that one, and then a revision of that one to use the new API? That patch is still an RFC, and it's unlikely to be approved soon I think. If I were able to send a series, it would roughly look like this: 1. Create parse_dumpstream3 to include the logic for checking equality in version. 2. Note the lack of flexibility in 1, and create a new struct (the other patch). 3. Modify parse_dumpstream3 to use the new struct, and move out the logic for checking the version into a fresh callback. Note: it's acceptable to post patches that depend on previous patches. (So you could write this patch in terms of parse_fns3_t directly.) There are a number of ways to manage the interdependencies... (you mentioned quilt on IRC; I know some folks here use Mercurial patch queues and similar tricks) Oh, ok. I'll learn to use Quilt then. Using Git to stage is a bit of an overkill. -- Ram -- 8 -- Index: subversion/include/svn_repos.h === --- subversion/include/svn_repos.h (revision 986884) +++ subversion/include/svn_repos.h (working copy) @@ -2286,6 +2286,7 @@ svn_repos_node_from_baton(void *edit_baton); /* The RFC822-style headers in our dumpfile format. */ #define SVN_REPOS_DUMPFILE_MAGIC_HEADERSVN-fs-dump-format-version #define SVN_REPOS_DUMPFILE_FORMAT_VERSION 3 +#define SVN_REPOS_DUMPFILE_FORMAT_UNKNOWN_VERSION-1 #define SVN_REPOS_DUMPFILE_UUID UUID #define SVN_REPOS_DUMPFILE_CONTENT_LENGTHContent-length @@ -2646,6 +2647,11 @@ typedef svn_repos_parse_fns2_t svn_repos_parser_fn * @a cancel_baton as argument to see if the client wishes to cancel * the dump. * + * If @a exact_version is #SVN_REPOS_DUMPFILE_FORMAT_UNKNOWN_VERSION, + * it is ignored and the dumpstream is parsed without this + * information. Else, the function checks the dumpstream's version + * number, and errors out if there's a mismatch. + * * This parser has built-in knowledge of the dumpfile format, but only * in a general sense: * @@ -2661,16 +2667,31 @@ typedef svn_repos_parse_fns2_t svn_repos_parser_fn * This is enough knowledge to make it easy on vtable implementors, * but still allow expansion of the format: most headers are ignored. * - * @since New in 1.1. + * @since New in 1.7. */ svn_error_t * -svn_repos_parse_dumpstream2(svn_stream_t *stream, +svn_repos_parse_dumpstream3(svn_stream_t *stream, const svn_repos_parse_fns2_t *parse_fns, void
Re: [PATCH v2] New dumpstream parser to check version number
Hi Daniel, Daniel Shahaf writes: That patch is still an RFC, and it's unlikely to be approved soon I think. If I were able to send a series, it would roughly look like this: 1. Create parse_dumpstream3 to include the logic for checking equality in version. 2. Note the lack of flexibility in 1, and create a new struct (the other patch). 3. Modify parse_dumpstream3 to use the new struct, and move out the logic for checking the version into a fresh callback. Note: it's acceptable to post patches that depend on previous patches. (So you could write this patch in terms of parse_fns3_t directly.) So we first create parse_dumpstream3() and then fix it a second later? I'd rather just revv the parse_fns3_t API (with the other patch) and then touch parse_dumpstream3() once. Does that make sense to you? Okay, got it. I'll post a series soon. -- Ram
Re: [PATCH] New dumpstream parser to check version number
Hi Hyrum, Hyrum K. Wright writes: + * If @a version_number is -1, it is ignored and the dumpstream is + * parsed without this information. If not -1, the function checks the We try to avoid magic numbers. Is there a #define or constant somewhere that you could use? If not, I'd add a #define for UNKNOWN_VERSION or something similar (look at other #define's in the header files). Right, done. - * @since New in 1.1. + * @since New in 2.0. Uh, we haven't (not do we have an immediate plans to) released 2.0 yet. :) Right. I seem to have forgotten about 1.7-1.9 :p +/** + * Similar to svn_repos_parse_dumpstream3(), but is dumpfile version + * agnostic. + * + * @deprecated Provided for backward compatibility with the 1.6 API. Should maintain the @since tag from before. Ok. + /* Error out if version doesn't match with the provided version_number */ + if (version_number != -1 version_number != version) Another magic number. Fixed. - SVN_ERR(svn_repos_parse_dumpstream2(stream, parser, parse_baton, - NULL, NULL, pool)); + SVN_ERR(svn_repos_parse_dumpstream3(stream, parser, parse_baton, + NULL, NULL, 3, pool)); Another magic number. Fixed. -- Ram
Re: [PATCH] New dumpstream parser to check version number
Hi Daniel, Daniel Shahaf writes: - * @since New in 1.1. + * @since New in 2.0. s/2.0/1.7/ Fixed. +/** + * Similar to svn_repos_parse_dumpstream3(), but is dumpfile version + * agnostic. + * Perhaps say: with @a version_number always set to -1 ? Fixed, along with Hyrum's suggestions. + /* Error out if version doesn't match with the provided version_number */ + if (version_number != -1 version_number != version) svnrdump needs an inequality check here. But 'svnadmin load', for example, needs an is at most check. So I don't think this is generic enough: it would be better to provide something that 'svnadmin load' can use too. So, two options: * Repeat the parse_format_version() trick (from load.c) in svnrdump. It's actually not possible. If I read one line of the dumpstream in load_editor.c and then call svn_repos_parse_dumpstream2, the function won't have the version number information because I've already read that out from the dumpstream, and I can't rewind the stream. Note that as you already pointed out on IRC, parse_format_version imposes a maximum already (load.c:647). * In the API, PARSE_FNS could grow a dumpfile_version_record() callback (like it has a uuid_record()) callback. The caller can implement whatever check they want in that callback. (possibly with a special error code that svn_repos_parse_dumpstreamN() recognizes; compare usage of SVN_ERR_CANCELLED) Excellent idea; too much work for one patch though- I'll add a TODO comment about this and re-post the patch. -- Ram
[RFC/PATCH] Create a fresh svn_repos_parse_fns3_t
Hi, I sent a patch a while ago for svn_repos_parse_dumpstream3. While I wait for approval, this is an RFC patch describing my future plan once that patch gets approved. It can be described tersely as While at it (adding the new callback), fix everything that's wrong with the struct. I'm planning to fix a few other things as well, but this is the basic sketch. See load_editor.c for the usecase- I actually stuffed a scratch pool into the parse_baton so I could use it everywhere. -- Ram Index: subversion/include/svn_repos.h === --- subversion/include/svn_repos.h (revision 986884) +++ subversion/include/svn_repos.h (working copy) @@ -2544,6 +2544,129 @@ svn_repos_load_fs(svn_repos_t *repos, /** + * A vtable that is driven by svn_repos_parse_dumpstream3(). + * + * @since New in 1.7. + */ +typedef struct svn_repos_parse_fns3_t +{ + /** The parser has parsed the version information from header of + * the dumpsteeam within the parsing session represented by + * @parse_baton. The version information is available in @a + * version, and a scratch pool @a pool is available for any + * temporary allocations. + */ + svn_error_t *(*dumpstream_version_record)(int version, +void *parse_baton, +apr_pool_t *pool); + + /** The parser has discovered a new revision record within the + * parsing session represented by @a parse_baton. All the headers are + * placed in @a headers (allocated in @a pool), which maps ttconst + * char */tt header-name == ttconst char */tt header-value. + * The @a revision_baton received back (also allocated in @a pool) + * represents the revision. + */ + svn_error_t *(*new_revision_record)(void **revision_baton, + apr_hash_t *headers, + void *parse_baton, + apr_pool_t *pool); + + /** The parser has discovered a new uuid record within the parsing + * session represented by @a parse_baton. The uuid's value is + * @a uuid, and it is allocated in @a pool. + */ + svn_error_t *(*uuid_record)(const char *uuid, + void *parse_baton, + apr_pool_t *pool); + + /** The parser has discovered a new node record within the current + * revision represented by @a revision_baton. All the headers are + * placed in @a headers (as with @c new_revision_record), allocated in + * @a pool. The @a node_baton received back is allocated in @a pool + * and represents the node. + */ + svn_error_t *(*new_node_record)(void **node_baton, + apr_hash_t *headers, + void *revision_baton, + apr_pool_t *pool); + + /** For a given @a revision_baton, set a property @a name to @a + * value. Scratch pool @a pool is available for any temporary + * allocations. + */ + svn_error_t *(*set_revision_property)(void *revision_baton, +const char *name, +const svn_string_t *value, +apr_pool_t *pool); + + /** For a given @a node_baton, set a property @a name to @a + * value. Scratch pool @a pool is available for any temporary + * allocations. + */ + svn_error_t *(*set_node_property)(void *node_baton, +const char *name, +const svn_string_t *value, +apr_pool_t *pool); + + /** For a given @a node_baton, delete property @a name. Scratch pool +* @a pool is available for any temporary allocations. +*/ + svn_error_t *(*delete_node_property)(void *node_baton, + const char *name, + apr_pool_t *pool); + + /** For a given @a node_baton, remove all properties. Scratch pool + @a pool is available for any temporary allocations. */ + svn_error_t *(*remove_node_props)(void *node_baton, apr_pool_t *pool); + + /** For a given @a node_baton, receive a writable @a stream capable of + * receiving the node's fulltext. After writing the fulltext, call + * the stream's close() function. + * + * If a @c NULL is returned instead of a stream, the vtable is + * indicating that no text is desired, and the parser will not + * attempt to send it. + * + * Scratch pool @a pool is available for any temporary allocations. + */ + svn_error_t *(*set_fulltext)(svn_stream_t **stream, + void *node_baton, + apr_pool_t *pool); + + /** For a given @a node_baton, set @a handler and @a handler_baton + * to a window handler and baton capable of receiving a delta + * against the node's previous
[PATCH v2] New dumpstream parser to check version number
[[[ * subversion/libsvn_repos/load.c (svn_repos_parse_dumpstream2, svn_repos_parse_dumpstream3): Rename the older function and add a version_number argument; error out if there's a version mismatch. * subversion/include/svn_repos.h (svn_repos_parse_dumpstream2, svn_repos_parse_dumpstream3): Add new function and mark svn_repos_parse_dumpstream2 as deprecated. * subversion/svnrdump/load_editor.c (drive_dumpstream_loader): Update to use the new API, and call it with version_number 3. ]]] Index: subversion/include/svn_repos.h === --- subversion/include/svn_repos.h (revision 986884) +++ subversion/include/svn_repos.h (working copy) @@ -2286,6 +2286,7 @@ svn_repos_node_from_baton(void *edit_baton); /* The RFC822-style headers in our dumpfile format. */ #define SVN_REPOS_DUMPFILE_MAGIC_HEADERSVN-fs-dump-format-version #define SVN_REPOS_DUMPFILE_FORMAT_VERSION 3 +#define SVN_REPOS_DUMPFILE_FORMAT_UNKNOWN_VERSION-1 #define SVN_REPOS_DUMPFILE_UUID UUID #define SVN_REPOS_DUMPFILE_CONTENT_LENGTHContent-length @@ -2646,6 +2647,11 @@ typedef svn_repos_parse_fns2_t svn_repos_parser_fn * @a cancel_baton as argument to see if the client wishes to cancel * the dump. * + * If @a exact_version is SVN_REPOS_DUMPFILE_FORMAT_UNKNOWN_VERSION, + * it is ignored and the dumpstream is parsed without this + * information. Else, the function checks the dumpstream's version + * number, and errors out if there's a mismatch. + * * This parser has built-in knowledge of the dumpfile format, but only * in a general sense: * @@ -2661,17 +2667,33 @@ typedef svn_repos_parse_fns2_t svn_repos_parser_fn * This is enough knowledge to make it easy on vtable implementors, * but still allow expansion of the format: most headers are ignored. * - * @since New in 1.1. + * @since New in 1.7. */ svn_error_t * -svn_repos_parse_dumpstream2(svn_stream_t *stream, +svn_repos_parse_dumpstream3(svn_stream_t *stream, const svn_repos_parse_fns2_t *parse_fns, void *parse_baton, svn_cancel_func_t cancel_func, void *cancel_baton, +int exact_version, apr_pool_t *pool); +/** + * Similar to svn_repos_parse_dumpstream3(), but with @a exact_version + * always set to SVN_REPOS_DUMPFILE_FORMAT_UNKNOWN_VERSION. + * + * @since New in 1.1. + * @deprecated Provided for backward compatibility with the 1.6 API. + */ +SVN_DEPRECATED +svn_error_t *svn_repos_parse_dumpstream2(svn_stream_t *stream, +const svn_repos_parse_fns2_t *parse_fns, +void *parse_baton, +svn_cancel_func_t cancel_func, +void *cancel_baton, +apr_pool_t *pool); + /** * Set @a *parser and @a *parse_baton to a vtable parser which commits new * revisions to the fs in @a repos. The constructed parser will treat Index: subversion/libsvn_repos/load.c === --- subversion/libsvn_repos/load.c (revision 986884) +++ subversion/libsvn_repos/load.c (working copy) @@ -654,14 +654,29 @@ parse_format_version(const char *versionstring, in } +svn_error_t * +svn_repos_parse_dumpstream2(svn_stream_t *stream, +const svn_repos_parse_fns2_t *parse_fns, +void *parse_baton, +svn_cancel_func_t cancel_func, +void *cancel_baton, +apr_pool_t *pool) +{ + return svn_repos_parse_dumpstream3(stream, parse_fns, parse_baton, + cancel_func, cancel_baton, + SVN_REPOS_DUMPFILE_FORMAT_UNKNOWN_VERSION, + pool); +} + /* The Main Parser Logic */ svn_error_t * -svn_repos_parse_dumpstream2(svn_stream_t *stream, +svn_repos_parse_dumpstream3(svn_stream_t *stream, const svn_repos_parse_fns2_t *parse_fns, void *parse_baton, svn_cancel_func_t cancel_func, void *cancel_baton, +int exact_version, apr_pool_t *pool) { svn_boolean_t eof; @@ -690,6 +705,15 @@ svn_error_t * return svn_error_createf(SVN_ERR_STREAM_MALFORMED_DATA, NULL, _(Unsupported dumpfile version: %d), version); + /* Error out if exact_version doesn't match version exactly */ + /* ### TODO: Extend svn_parse_fns2_t to provide another callback for + ### version number; that way we can impose arbitrary constraints + ### on it, not just check for exact version */ +
Re: svnrdump: The BIG update
Hi Daniel, Daniel Shahaf writes: Ramkumar Ramachandra wrote on Thu, Aug 12, 2010 at 12:17:34 +0530: The dump functionality is also complete- thanks to Stefan's review and MANY others for cleaning it up. It's however hit a brick wall now because of missing headers in the RA layer. Until I (or someone else) figures out how to fix the RA layer, we can't do better than the XFail copy-and-modify test I've committed. Part of the diff there is lack of SHA-1 headers --- which is unavoidable until editor is revved --- but part of it is a missing Text-copy-source-md5. Why don't you output that information --- doesn't the editor give it to you? Afaik, no. I don't see Text-copy-source-* anywhere in the RA layer. Maybe I'm not looking hard enough? Hmm. It seems you're right. So you might have to use two RA session in parallel... (and then, you might have to have the user authenticate twice?) Hm, I also have to find out if it's allowed. The commit_editor doesn't allow it for instance. Besides, it's a very inelegant solution- I'd rather fix the RA layer than do this. - Make dumpfile v3 the de-facto standard and improve it for optimized loading/ generation. The former part was suggested by Stefan. - Integrate it into svnadmin etc as appropriate. I think there's enough work here for a mini-GSoC project? How would it be integrated into svnadmin? Do you want to push the logic into the standard 'svnadmin dump' command? This is something I haven't given thought either. I brought it up because of an earlier discussion in which everyone seemed to be in favor of NOT having a new command. It feels like we're stuffing a lot of functionality into one tool though. Personally I also like having svnadmin operates only locally (so it doesn't even link against libsvn_ra), but that was hashed out already on that moderately-long thread a few weeks ago. Yeah. It looks like I'll have to ressurect this thread soon and reach a concrete conclusion. -- Ram
Re: Proposal: Change repository's UUID over RA layer
Hi Hyrum, Hyrum K. Wright writes: On Sun, Aug 1, 2010 at 1:17 AM, Ramkumar Ramachandra artag...@gmail.com wrote: Hi, I've found that I need the functionality to change a repository's UUID over the RA layer in 'svnrdump load' (see my recently committed tests to see why). I initially planned to put this off until functionality until someone thinks I'm capable of handling this and gets me commit access to libsvn_ra. However, Daniel asked me not to expect this and start working on a branch instead after notifying the list- I've therefore started a new ra-uuid branch for this (see r981164). Expect to see some commits soon :) It looks like this idea has been discussed, but probably won't go forward. It's probably time to remove the branch. Indeed. Thanks for the reminder. Branch removed in r986566. -- Ram
Re: svn commit: r986466 - /subversion/trunk/subversion/svnrdump/dump_editor.c
Hi Bert, Bert Huijben writes: copyfrom_path is probably relative from the repository or session root (not sure which in your case). You are now appending the basename of a/paths/basename and then joining the copyfrom_path after it. Shouldn't that be the other way around? Really silly mistake, and this is very apparent from the diff. Thanks for catching; fixed in r986567 :) -- Ram
Re: svnrdump: The BIG update
Hi Daniel, Daniel Shahaf writes: It's been a few weeks since I got partial committer access, and ~80 commits later, this is what we have: Firstly, thanks to Daniel for motivating me and driving me to submit the series to the list, and guiding me through everything. Without him, I'd probably not have finished svnrdump to begin with. The command line interface and argument parsing library is ready- thanks to Bert and lots of others for getting me started with this. The interface is solid and looks like the one used in the other SVN tools. The dump functionality is also complete- thanks to Stefan's review and MANY others for cleaning it up. It's however hit a brick wall now because of missing headers in the RA layer. Until I (or someone else) figures out how to fix the RA layer, we can't do better than the XFail copy-and-modify test I've committed. Part of the diff there is lack of SHA-1 headers --- which is unavoidable until editor is revved --- but part of it is a missing Text-copy-source-md5. Why don't you output that information --- doesn't the editor give it to you? Afaik, no. I don't see Text-copy-source-* anywhere in the RA layer. Maybe I'm not looking hard enough? Nitpick: svnrdump_tests 5 6 have the same textual description / docstring as each other, could you please change that? See other test files (e.g., ./commit_tests.py --list) for plenty of examples. Fixed. Thanks for noticing this. It's quite mature and dumps surprisingly fast though. I'm tempted to run benchmarks, but I haven't done it yet because I fear I might be biased towards the tool :p Just write all the benchmarks before running them? Hehe, yeah. Will do- I just have to make sure that no external factors affect the tests (example: variations of network speed, disk speed, cache with time). The load functionality is also quite complete, thanks to Bert et al for helping me debug all the cryptic errors. The code is mostly unreviewed though- there might be plenty of bugs and code cleanup opportunities. Not to say that I've stopped working on it- just that the work has become less challenging, now that all the tests pass :) Okay, good. Some field testing probably needed here? Yeah, lots. I've tested against 1000 revisions of the ASF successfully, but I'll need more time and patience to run more tests. TODO: - Write more tests and start using svnrdump for real! Advertise it, especially to developers of other versioning systems looking to communicate with SVN. Remember how this project started out? Don't forget to inform us...@subversion.apache.org :-) Oh, okay. I'll write another email for them. - More optimizations. Since svnrdump is already so fast compared to the other tools, I think we can squeeze some more speed out of it. - Huge documentation effort. svnrdump is a hack- I just did what I felt like and got it to work somehow. It's very unlike svnmucc, which does things by the book. - Build more infrastructure around svnrdump- I've mostly used existing SVN API. Although a lot of new functions were suggested, I never really got down to writing them. Yep. There was also talk of moving some of the logic into the libraries --- where does that stand? Yeah, I haven't started working on this yet. I'll need some guidance for this- I have to sketch out a roadmap and ask for access to the specified regions or branch; planning is something I'm not used to at all :p - Make dumpfile v3 the de-facto standard and improve it for optimized loading/ generation. The former part was suggested by Stefan. - Integrate it into svnadmin etc as appropriate. I think there's enough work here for a mini-GSoC project? How would it be integrated into svnadmin? Do you want to push the logic into the standard 'svnadmin dump' command? This is something I haven't given thought either. I brought it up because of an earlier discussion in which everyone seemed to be in favor of NOT having a new command. It feels like we're stuffing a lot of functionality into one tool though. - GitHub support (?) -- I saw this discussed on IRC somewhere, but I didn't understand this myself. Can someone clarify? Joke. GitHub implemented a mod_dav_svn interface to their repositories [1], so it's now possible (if their implementation is sound) to generate an svn dump of a GitHub git repository. Ah, yes. I'm aware. With the infrastructure I've written on the Git end (incomplete), the SVN - Git bidirectional bridge should be seamless and awesome :) Note: I'll be visiting home this weekend (that means: mostly travelling). I'll be back to hack next week. -- Ram
Re: svn commit: r983096 - in /subversion/trunk/subversion/svnrdump: load_editor.c load_editor.h
Hi Daniel, Daniel Shahaf writes: Ramkumar, I've noticed before in your patches that they tend to have multiple separable parts, and this commit is a good opportunity to explain: Thanks for the review. if (strcmp(hname, SVN_REPOS_DUMPFILE_NODE_COPYFROM_REV) == 0) nb-copyfrom_rev = atoi(hval); if (strcmp(hname, SVN_REPOS_DUMPFILE_NODE_COPYFROM_PATH) == 0) - nb-copyfrom_path = -svn_path_url_add_component2(rb-pb-root_url, -apr_pstrdup(rb-pool, hval), -rb-pool); +nb-copyfrom_path = + svn_path_url_add_component2(rb-pb-root_url, + apr_pstrdup(rb-pool, hval), + rb-pool); Whitespace-only change; should have been done in a separate commit. (Don't mix functional and non-functional change in the same commit.) Yeah, this is a stray change. } if (svn_path_compare_paths(svn_relpath_dirname(nb-path, pool), @@ -252,14 +254,14 @@ new_node_record(void **node_baton, nb-copyfrom_path, nb-copyfrom_rev, rb-pool, (nb-file_baton))); - LDR_DBG((Adding file %s to dir %p as %p\n, nb-path, rb-db-baton, nb-file_baton)); + LDR_DBG((Added file %s to dir %p as %p\n, nb-path, rb-db-baton, nb-file_baton)); Non-functional change; should have been in a separate commit (if at all). Right, I probably shouldn't do more than one thing at once. I'll try to commit more often then at the risk of making many trivial commits. break; case svn_node_action_replace: /* Absent in dumpstream; represented as a delete + add */ @@ -411,9 +401,9 @@ apply_textdelta(svn_txdelta_window_handl nb = node_baton; commit_editor = nb-rb-pb-commit_editor; pool = nb-rb-pool; - SVN_ERR(commit_editor-apply_textdelta(nb-file_baton, NULL /* base_checksum */, - pool, handler, handler_baton)); LDR_DBG((Applying textdelta to %p\n, nb-file_baton)); + SVN_ERR(commit_editor-apply_textdelta(nb-file_baton, nb-base_checksum, + pool, handler, handler_baton)); The line reordering makes it harder to spot the functional change done here: passing nb-base_checksum instead of NULL. Right. Should have probably been in two separate commits - I get the point now. While here, how is this change related to the svn_node_action_delete bug? (The log message doesn't say.) Are the checksum-related changes and the delete-related changes logically part of one patch/bugfix? Or could you have committed them as two separate patches? return SVN_NO_ERROR; } @@ -429,8 +419,8 @@ close_node(void *baton) if (nb-kind == svn_node_file) { - SVN_ERR(commit_editor-close_file(nb-file_baton, NULL, nb-rb-pool)); LDR_DBG((Closing file %p\n, nb-file_baton)); + SVN_ERR(commit_editor-close_file(nb-file_baton, NULL, nb-rb-pool)); This actually *is* a functional change (which is not mentioned in the log message), but should have been in a separate commit. Oh, I thought changed tense of LDR_DBG messages would suffice. Have to be more explicit/ careful. -- Ram
Re: svn commit: r983222 - /subversion/trunk/subversion/svnrdump/load_editor.c
Hi Daniel, Daniel Shahaf writes: artag...@apache.org wrote on Sat, Aug 07, 2010 at 12:31:50 -: Author: artagnon Date: Sat Aug 7 12:31:50 2010 New Revision: 983222 URL: http://svn.apache.org/viewvc?rev=983222view=rev Log: svnrdump: Fix a bug in the load_editor; it was unable to handle revisions without node information previously. * subversion/svnrdump/load_editor.c (close_revision): Add a new if-branch; if the commit_editor doesn't exist, create one, open_root and close_edit on it to indicate that we've finished processing the revision. While at it, also fix indentation. ^^ I take it you haven't seen my previous commit review yet? The trade-off is the creation of many trivial commits :) -- Ram
Re: Proposal: Change repository's UUID over RA layer
Hi Daniel, Daniel Shahaf writes: The implementation should be trivial (over ra_local and ra_svn; probably over ra_dav too but I don't know that one as well). But what about authz concerns? Do you want to allow any authenticated user to change the repository UUID? No, this doesn't make sense. Do you want UUID changing to be disallowed by default (unless the server admin took an explicit action)? Yeah. Maybe something like the pre-revprop-change hook? -- Ram
Re: Proposal: Change repository's UUID over RA layer
Hi Greg, Greg Stein writes: Why would an admin install a hook to allow changing a UUID? Why would a UUID be allowed to change over time? If a UUID is supposed to be changed, then why wouldn't that admin just do it himself? Why does this have to be allowed remotely? Agreed- this feature has a very narrow usecase. Anyway, since I started the thread, let me explain the motivation for this feature. It derives from `svnrdump` -- Anybody without administrative priviliges can actually mirror a complete Subversion repository using `svnrdump dump | svnrdump load` quite efficiently. Let's say some sponsor wants to provide a mirroring service- with the proposed feature the person will simply have to create a blank repository and enable pre-revprop-change/ pre-uuid-change on and wait for someone to load the content into that repository. After the intial mirroring is done, he can always turn off the pre-uuid-change. It's probably a dangerous feature, and I don't know if it's worth the trouble. -- Ram
Re: svn commit: r979282 - in /subversion/trunk/subversion/svnrdump: dump_editor.c dump_editor.h load_editor.c load_editor.h svnrdump.c
Hi Kamesh and Philip, Philip Martin writes: Kamesh Jayachandran kam...@collab.net writes: On 07/26/2010 07:25 PM, artag...@apache.org wrote: Author: artagnon Date: Mon Jul 26 13:55:25 2010 New Revision: 979282 URL: http://svn.apache.org/viewvc?rev=979282view=rev Log: svnrdump: Add an a no-op load editor * subversion/svnrdump/svnrdump.c (load_revisions): New function to fetch and drive the load editor. Currently just fetches the load editor. (load_cmd): Call load_revisions with the appropriate arguments. * subversion/svnrdump/load_editor.c (get_load_editor): Add a no-op load editor. Please name this function 'get_load_editor' as 'svnrdump__get_load_editor'. You also need to #include load_editor.h in load_editor.c to make the prototype visible (possibly instead of dump_editor.h?) Sorry about the late response- there was a huge amount of development and API changes over the last week. I'm sure you'll agree that it's quite unpleasant to type svnrdump___get_load_editor into gdb everytime during development. Ofcourse, when everything starts working and the API becomes more stable, I'll expose all re-usable functionality. Thank you for understanding. Also, please mark me on the TO or CC when specifically addressing my commits- I have to handle massive volumes of email everyday, and I sometimes miss email on which I'm not marked. Thanks again. -- Ram
Re: svn commit: r979282 - in /subversion/trunk/subversion/svnrdump: dump_editor.c dump_editor.h load_editor.c load_editor.h svnrdump.c
Hi Hyrum, Hyrum K. Wright writes: I don't think that easy of typing into the debugger should be a primary qualification for API naming conventions... Hehe, ofcourse. I just meant to say that I'll save the formal naming until I get past the point where the program can go a few seconds without segfaulting :) -- Ram
Re: Proposal: Change repository's UUID over RA layer
Hi Daniel, Daniel Shahaf writes: Ramkumar Ramachandra wrote on Sun, Aug 01, 2010 at 11:47:51 +0530: However, Daniel asked me not to expect this and start working on a branch instead after notifying the list- No, I didn't ask you to notify the list; I asked you to start a design discussion on the list. That's a different beast. You should not just tell us Hey, I started a branch for feature X; you should also discuss the plans --- the UI (both for users and admins) and the implementation --- and solicit feedback. Sounds good in theory, but I don't know enough about the infrastructure to sketch out any concrete design plans- the moment I start studying it in this detail, I'll have a mock implementation ready- when I'm there, I think I'll post the RFC patch to the list and wait for feedback. Thanks. -- Ram
Re: [PATCH] Fix a historical detail in the commit editor
Hi Bert, Bert Huijben writes: But I think you forget that before creating a commit editor that user had to setup an ra session. And every ra session allows retrieving the repository root (in most cases it is even cached before the first ra operation). So you can just store this in the initial commit editor baton. I think this is a compromise between keeping an option on allowing to record copies from different repositories (maybe compatibility with DeltaV?) and otherwise the option to hide the real repository root. (We now need it in several places so it is available everywhere, but around 1.0 the repository root was mostly optional information). But if we would rev this function today, I would recommend switching to a repos_relpath. (And we should certainly check how we are going to handle this case with the new function of automatic relocation). Thanks for the explanation. Fixed along with other minor things in r981876. -- Ram
Proposal: Change repository's UUID over RA layer
Hi, I've found that I need the functionality to change a repository's UUID over the RA layer in 'svnrdump load' (see my recently committed tests to see why). I initially planned to put this off until functionality until someone thinks I'm capable of handling this and gets me commit access to libsvn_ra. However, Daniel asked me not to expect this and start working on a branch instead after notifying the list- I've therefore started a new ra-uuid branch for this (see r981164). Expect to see some commits soon :) Thanks! -- Ram
[PATCH] Fix a historical detail in the commit editor
Hi, The patch should be pretty much self-explanatory. It makes no sense for the caller of the commit editor to find out the repository's URL. This information is actually accessible but hidden inside the `edit_baton` structure; but this structure is not public and the caller simply sees a void pointer. As long as the path relative to the repository's root is present in the repository, it should be enough for the commit editor. I assume that it's currently comparing against the full URL for some historical reasons. I realize that this patch will break too many things, and hence it is not intended for inclusion. However, I just thought it would be nice to point this out so anyone starting out to build a new commit editor will be able to use this information. Index: commit.c === --- commit.c(revision 981223) +++ commit.c(working copy) @@ -307,6 +307,7 @@ add_directory(const char *path, svn_fs_root_t *copy_root; svn_node_kind_t kind; size_t repos_url_len; + svn_boolean_t is_dir; /* Copy requires recursive write access to the destination path and write access to the parent path. */ @@ -324,16 +325,17 @@ add_directory(const char *path, if ((kind != svn_node_none) (! pb-was_copied)) return out_of_date(full_path, kind); - /* For now, require that the url come from the same repository - that this commit is operating on. */ + /* Check that copy_path is present in the repository */ copy_path = svn_path_uri_decode(copy_path, subpool); repos_url_len = strlen(eb-repos_url); - if (strncmp(copy_path, eb-repos_url, repos_url_len) != 0) + svn_fs_is_dir(is_dir, eb-txn_root, copy_path, subpool); + if (!is_dir) return svn_error_createf (SVN_ERR_FS_GENERAL, NULL, - _(Source url '%s' is from different repository), copy_path); + _(Directory '%s' cannot be found in this repository), + copy_path); - fs_path = apr_pstrdup(subpool, copy_path + repos_url_len); + fs_path = apr_pstrdup(subpool, copy_path); /* Now use the fs_path as an absolute path within the repository to make the copy from. */ @@ -452,6 +454,7 @@ add_file(const char *path, svn_fs_root_t *copy_root; svn_node_kind_t kind; size_t repos_url_len; + svn_boolean_t is_file; /* Copy requires recursive write to the destination path and parent path. */ @@ -468,16 +471,17 @@ add_file(const char *path, if ((kind != svn_node_none) (! pb-was_copied)) return out_of_date(full_path, kind); - /* For now, require that the url come from the same repository - that this commit is operating on. */ + /* Check that copy_path is present in the repository */ copy_path = svn_path_uri_decode(copy_path, subpool); repos_url_len = strlen(eb-repos_url); - if (strncmp(copy_path, eb-repos_url, repos_url_len) != 0) + svn_fs_is_file(is_file, eb-txn_root, copy_path, subpool); + if (!is_file) return svn_error_createf (SVN_ERR_FS_GENERAL, NULL, - _(Source url '%s' is from different repository), copy_path); + _(File '%s' cannot be found in this repository), + copy_path); - fs_path = apr_pstrdup(subpool, copy_path + repos_url_len); + fs_path = apr_pstrdup(subpool, copy_path); /* Now use the fs_path as an absolute path within the repository to make the copy from. */
Re: [PATCH] Fix a historical detail in the commit editor
Hi, Ramkumar Ramachandra writes: The patch should be pretty much self-explanatory. It makes no sense for the caller of the commit editor to find out the repository's URL. This information is actually accessible but hidden inside the `edit_baton` structure; but this structure is not public and the caller simply sees a void pointer. As long as the path relative to the repository's root is present in the repository, it should be enough for the commit editor. I assume that it's currently comparing against the full URL for some historical reasons. I realize that this patch will break too many things, and hence it is not intended for inclusion. However, I just thought it would be nice to point this out so anyone starting out to build a new commit editor will be able to use this information. [...] Related to this, see how I've worked around the problem in svnrdump in r981266. -- Ram
Re: [PATCH] Handle the expected_is_regexp case in display_lines
Hi, Ramkumar Ramachandra writes: [[[ Followup r979295 to handle the expected_is_regexp case in display_lines. * subversion/tests/cmdline/svntest/verify.py (display_lines): When expected is a string and not a list (in the expected_in_regexp case), put it in a one-member list, and don't output a diff. Found by: rhuijben ]]] Committed in r979975. In r979972, Bert granted me priviliges to svntest. -- Ram
Re: [PATCH] Add copy-and-modify test to svnrdump
Hi, Ramkumar Ramachandra writes: [[[ svnrdump: Add new copy-and-modify test. * subversion/tests/cmdline/svnrdump_tests_data/copy-and-modify.dump: Add a new testdata to be used by the copy_and_modify test in svnrdump. Taken originally from svnsync_test_data/ and converted to dumpfile v3 format. * subversion/tests/cmdline/svnrdump_tests_data/revision0.dump: Cosmetically rename revision0.dump to revision-0.dump. * subversion/tests/cmdline/svnrdump_tests_data/revision-0.dump: Add. * subversion/tests/cmdline/svnrdump_tests.py (revision0, revision_0): Cosmetically rename the revision 0 test. (copy_and_modify): Import a new test from svnsync. It is expected to fail due to mismatch in the headers produced by svnrdump and svnadmin. (test_list): Add copy_and_modify to the list of tests to be run, carefully noting that it is expected to fail. ]]] Committed in r979974. In r979972, Bert granted me priviliges to svntest. -- Ram
[PATCH] Add copy-and-modify test to svnrdump
[[[ svnrdump: Add new copy-and-modify test. * subversion/tests/cmdline/svnrdump_tests_data/copy-and-modify.dump: Add a new testdata to be used by the copy_and_modify test in svnrdump. Taken originally from svnsync_test_data/ and converted to dumpfile v3 format. * subversion/tests/cmdline/svnrdump_tests_data/revision0.dump: Cosmetically rename revision0.dump to revision-0.dump. * subversion/tests/cmdline/svnrdump_tests_data/revision-0.dump: Add. * subversion/tests/cmdline/svnrdump_tests.py (revision0, revision_0): Cosmetically rename the revision 0 test. (copy_and_modify): Import a new test from svnsync. It is expected to fail due to mismatch in the headers produced by svnrdump and svnadmin. (test_list): Add copy_and_modify to the list of tests to be run, carefully noting that it is expected to fail. ]]] Index: subversion/tests/cmdline/svnrdump_tests_data/revision0.dump === --- subversion/tests/cmdline/svnrdump_tests_data/revision0.dump (revision 979511) +++ subversion/tests/cmdline/svnrdump_tests_data/revision0.dump (working copy) @@ -1,30 +0,0 @@ -SVN-fs-dump-format-version: 3 - -UUID: 95f5730d-843b-4fe3-8879-f46da52f2123 - -Revision-number: 0 -Prop-content-length: 258 -Content-length: 258 - -K 8 -svn:date -V 27 -2003-01-08T10:33:40.549533Z -K 26 -svn:sync-currently-copying -V 2 -15 -K 17 -svn:sync-from-url -V 31 -http://svn.apache.org/repos/asf -K 18 -svn:sync-from-uuid -V 36 -13f79535-47bb-0310-9956-ffa450edef68 -K 24 -svn:sync-last-merged-rev -V 2 -14 -PROPS-END - Index: subversion/tests/cmdline/svnrdump_tests_data/copy-and-modify.dump === --- subversion/tests/cmdline/svnrdump_tests_data/copy-and-modify.dump (revision 0) +++ subversion/tests/cmdline/svnrdump_tests_data/copy-and-modify.dump (working copy) @@ -0,0 +1,82 @@ +SVN-fs-dump-format-version: 3 + +UUID: 6ad9f820-0205-0410-94a2-c8cf366bb2b3 + +Revision-number: 0 +Prop-content-length: 56 +Content-length: 56 + +K 8 +svn:date +V 27 +2005-11-07T23:36:48.095832Z +PROPS-END + +Revision-number: 1 +Prop-content-length: 112 +Content-length: 112 + +K 10 +svn:author +V 6 +rooneg +K 8 +svn:date +V 27 +2005-11-07T23:37:17.705159Z +K 7 +svn:log +V 11 +add foo.txt +PROPS-END + +Node-path: foo.txt +Node-kind: file +Node-action: add +Prop-content-length: 10 +Text-delta: true +Text-content-length: 4 +Text-content-md5: d41d8cd98f00b204e9800998ecf8427e +Text-content-sha1: da39a3ee5e6b4b0d3255bfef95601890afd80709 +Content-length: 14 + +PROPS-END +SVN + +Revision-number: 2 +Prop-content-length: 135 +Content-length: 135 + +K 10 +svn:author +V 6 +rooneg +K 8 +svn:date +V 27 +2005-11-07T23:37:44.549695Z +K 7 +svn:log +V 34 +copy and change at the same time. + +PROPS-END + +Node-path: bar.txt +Node-kind: file +Node-action: add +Node-copyfrom-rev: 1 +Node-copyfrom-path: foo.txt +Text-copy-source-md5: d41d8cd98f00b204e9800998ecf8427e +Text-copy-source-sha1: da39a3ee5e6b4b0d3255bfef95601890afd80709 +Text-delta: true +Text-delta-base-md5: d41d8cd98f00b204e9800998ecf8427e +Text-delta-base-sha1: da39a3ee5e6b4b0d3255bfef95601890afd80709 +Text-content-length: 19 +Text-content-md5: d2508118d0d39e198d1129d87d692d59 +Text-content-sha1: e2fb5f2139d086ded2cb600d5a91a196e76bf020 +Content-length: 19 + +SVN �modified + + Index: subversion/tests/cmdline/svnrdump_tests.py === --- subversion/tests/cmdline/svnrdump_tests.py (revision 979511) +++ subversion/tests/cmdline/svnrdump_tests.py (working copy) @@ -95,10 +95,14 @@ def basic_svnrdump(sbox): if not out[0].startswith('SVN-fs-dump-format-version:'): raise svntest.Failure('No valid output') -def revision0(sbox): +def revision_0(sbox): dump revision zero - run_test(sbox, dumpfile_name = revision0.dump) - + run_test(sbox, dumpfile_name = revision-0.dump) + +def copy_and_modify(sbox): + copy and modify + run_test(sbox, copy-and-modify.dump) + # Run the tests @@ -106,7 +110,8 @@ def basic_svnrdump(sbox): # list all tests here, starting with None: test_list = [ None, basic_svnrdump, - revision0, + revision_0, + XFail(copy_and_modify), ] if __name__ == '__main__':
[PATCH] Handle the expected_is_regexp case in display_lines
[[[ Followup r979295 to handle the expected_is_regexp case in display_lines. * subversion/tests/cmdline/svntest/verify.py (display_lines): When expected is a string and not a list (in the expected_in_regexp case), put it in a one-member list, and don't output a diff. Found by: rhuijben ]]] Index: subversion/tests/cmdline/svntest/verify.py === --- subversion/tests/cmdline/svntest/verify.py (revision 979710) +++ subversion/tests/cmdline/svntest/verify.py (working copy) @@ -280,25 +280,25 @@ def display_lines(message, label, expected, actual output = 'EXPECTED %s' % label if expected_is_regexp: output += ' (regexp)' + expected = [expected + '\n'] if expected_is_unordered: output += ' (unordered)' output += ':' print(output) for x in expected: sys.stdout.write(x) -if expected_is_regexp: - sys.stdout.write('\n') if actual is not None: print('ACTUAL %s:' % label) for x in actual: sys.stdout.write(x) # Additionally print unified diff - print('DIFF ' + ' '.join(output.split(' ')[1:])) - for x in unified_diff(expected, actual, -fromfile=EXPECTED %s % label, -tofile=ACTUAL %s % label): -sys.stdout.write(x) + if not expected_is_regexp: +print('DIFF ' + ' '.join(output.split(' ')[1:])) +for x in unified_diff(expected, actual, + fromfile=EXPECTED %s % label, + tofile=ACTUAL %s % label): + sys.stdout.write(x) def compare_and_display_lines(message, label, expected, actual, raisable=None):
Re: [PATCH v2] Make display_lines output a diff
Hi Daniel, Ramkumar Ramachandra writes: Daniel Shahaf writes: I suggest fromfile=EXPECTED %s % label, tofile=ACTUAL %s % label, Do I get a +1 for this? I just got one from Bert. Committed in r979710. Thanks. -- Ram
Re: [PATCH] Import svnsync tests into svnrdump
Hi, Ramkumar Ramachandra writes: The tests in svnsync_tests_data/ are in dumpfile v2 and these are unsuitable for testing svnrdump. Hence, load all of them into a repository and re-dump them in dumpfile v3 format before attempting to add them to svnrdump_tests_data/. I still have to figure out how to extend the test framework to ignore certain headers in the diff- currently, using run_test() in svnrdump to run tests corresponding to these files will fail all tests. I've discussed this with Daniel on the channel, and I've told him that I specifically want this because we don't want svnrdump tests to depend on svnadmin. Does anyone object to this patch? It was created by dumping the data in svn-test-data/ after running svnsync_tests.py. Thanks. -- Ram