Re: Inconsistent behavior in cat command
Noorul Islam K M noo...@collab.net writes: When I was trying to come up with a patch for issue 3713, I observed the following. For example I have two files 1.txt and 2.txt in a repository located at file:///tmp/testrepo svn cat behaves differently for local paths and URLs. See the illustration below. noo...@noorul:/tmp/wc/testrepo$ svn cat 1.txt 2.txt 1 2 A) Local non-existent target followed by existing target noo...@noorul:/tmp/wc/testrepo$ svn cat 3.txt 1.txt svn: warning: '/tmp/wc/testrepo/3.txt' is not under version control 1 B) Non-existent URL followed by existing URL noo...@noorul:/tmp/wc/testrepo$ svn cat ^/3.txt ^/1.txt svn: File not found: revision 1, path '/3.txt' In case A, even though the first target was non-existent it performs cat operation on the second target but in the case of B, svn errors out at the first failure itself. I am not sure about behavior of other svn commands which accepts multiple targets. When I discussed this Julian in IRC, he said a discussion is needed to come up with standardized behavior across svn commands. Any thoughts? Any updates? Thanks and Regard Noorul
Re: Can I add NOT NULL to PRISTINE table columns?
I (Julian Foad) wrote: Julian Foad julian.f...@wandisco.com writes: I imagine it should be possible to add 'NOT NULL' to these columns without performing a format bump or writing any upgrade code. Am I right? Hyrum Wright wrote: If we're already enforcing it in the C code, I see no problem with doing so in the sql schema. Additionally, I don't really see how you could add the 'NOT NULL' clause in a format bump without doing table duplication, which doesn't seem worth the trouble for something like this. Right. Philip Martin wrote: Existing working copies would not be upgraded, although I suppose the client would work. Perhaps make it a format 99 step? The software won't see any difference except it will provide a sanity check when we're developing and modifying Subversion code and testing against new WCs. I don't think there is any need to upgrade existing development WCs in this respect. I'll mention it in a comment in the format 99 step, but don't think we need to write any SQL for it. Committed r1041321. - Julian
Re: svn commit: r1041230 - /subversion/trunk/subversion/include/svn_checksum.h
On Wed, 2010-12-01, Blair Zajac wrote: On 12/1/10 4:38 PM, stef...@apache.org wrote: Author: stefan2 Date: Thu Dec 2 00:38:17 2010 New Revision: 1041230 URL: http://svn.apache.org/viewvc?rev=1041230view=rev Log: Fix the svn_checksum_to_cstring() docstring to actually say what was intended. Also, make clear that the behavior is new for 1.7 and trying to use it in 1.6 will cause segfaults. * subversion/include/svn_checksum.h (svn_checksum_to_cstring): fix docstring What happens if somebody makes a svn tool that is compiled and built against the new 1.7 behavior and then it is backported to 1.6, it may core dump. Any back-port effort of this kind needs to take account of all API changes, not just newly created APIs. If it doesn't, then yes it may crash. I believe that is an expected result of our compatibility rules, and not without precedent, although I can't quickly lay my hands on a good example. Should we add a svn_checksum_to_cstring2() instead with the new behavior or backport this change to 1.6? But even then we'll have 1.6 versions with different behavior. It seems making a new svn_checksum_to_cstring2() is better. We should not port this change to 1.6.x. I don't believe we need to rev the API for this reason; however, I haven't reviewed the change and so I don't know if there are other reasons. - Julian
Re: svn commit: r1040831 - in /subversion/trunk/subversion: include/svn_checksum.h libsvn_subr/checksum.c
Hi Stefan2. A good test for whether it's worth making an API accept NULL as an input is: what proportion of the callers would find that useful? I see there are about 40 callers in the code base. Would you mind scanning through them and letting us know? - Julian On Thu, 2010-12-02 at 07:51 +0200, Daniel Shahaf wrote: Stefan Fuhrmann wrote on Thu, Dec 02, 2010 at 01:39:34 +0100: On 01.12.2010 04:25, Daniel Shahaf wrote: stef...@apache.org wrote on Tue, Nov 30, 2010 at 23:56:41 -: Author: stefan2 Date: Tue Nov 30 23:56:40 2010 New Revision: 1040831 URL: http://svn.apache.org/viewvc?rev=1040831view=rev Log: Fix 'svnadmin verify' for format 5 repositories. During the checking process, they yield NULL for SHA1 checksums. The most robust way to fix that is to allow NULL for checksum in svn_checksum_to_cstring. This seems justified as NULL is already a valid return value instead of e.g. an empty string. So, callers may receive NULL as result and could therefore assume that NULL is a valid input, too Can you explain how to trigger the bug you are fixing? Just running 'svnadmin verify' on my r1040058-created Greek repository doesn't. Sure: $svnadmin-1.5.4 create /mnt/svnroot/test $add pre-revprop-change hook $svnsync-1.5.4 init file:///mnt/svnroot/test http://svn.apache.org/repos/asf $svnsync-1.5.4 sync file:///mnt/svnroot/test $cancel after a few revs; rev 1 will already trigger the error $svnadmin-trunk verify /mnt/svnroot/test And then the dump logic calls svn_fs_file_checksum(force=FALSE), which causes a NULL checksum to be returned. Thanks for the recipe; knowing it it's easier to review the fix. Daniel
Re: svn commit: r1041102 - /subversion/trunk/subversion/include/svn_io.h
On Thu, 2010-12-02, Daniel Shahaf wrote: julianf...@apache.org wrote on Wed, Dec 01, 2010 at 17:44:50 -: -/** Copy file @a file from location @a src_path to location @a dest_path. - * Use @a pool for memory allocations. +/** Copy the file whose basename or relative path is @a file within + * directory @a src_path to the same basename or relative path within + * directory @a dest_path. Overwrite the destination if it already If you allow relative paths as input for @a file, then shouldn't the docstring say whether or not this creates intermediate directories? Example: svn_io_dir_file_copy('A/B', 'A/C', 'E/alpha', NULL). r1041336. Thanks. - Julian + * exists. Set the destination file's permissions to match those of + * the source. Use @a pool for memory allocations. */ svn_error_t * svn_io_dir_file_copy(const char *src_path, const char *dest_path, const char *file, apr_pool_t *pool)
Re: svn commit: r1040832 - Port a fix for a FSFS packing race
On Wed, 2010-12-01, Daniel Shahaf wrote: Julian Foad wrote on Wed, Dec 01, 2010 at 12:32:45 +: On Wed, 2010-12-01, stef...@apache.org wrote: Port (not merge) a fix for a FSFS packing race condition from the performance branch to /trunk: There is a slight time window between finding the name of a rev file and actually open it. If the revision in question gets packed just within this window, we will find the new (and final) file name with just one retry. * subversion/libsvn_fs_fs/fs_fs.c (open_pack_or_rev_file): retry once upon missing file error Hi Stefan. (1) I think the doc string of svn_fs_fs__path_rev_absolute() should note that the returned path may no longer be valid by the time you come to use it, and the reason for that. Does my patch below sound right? Well, yes, and Stefan already added such a comment at the definition at my suggestion. But +1 to move that to the declaration. (2) Doesn't the exact same race exist in *all* uses of svn_fs_fs__path_rev_absolute()? There are five other calls to it, As far as I can tell, apart from open_pack_or_rev_file(), all callers always[1] run under the write lock. Since pack operation take the write lock too, it's not possible for the result of svn_fs_fs__path_rev_absolute() to become outdated for those callers. OK, that's good. No change needed then. [1] with the exception of set_revision_proplist(), which can be called either under the write lock or under write_revision_zero() --- but in the latter case it's called before the format file has been created. all of them using the result as a permissions reference to set the file permissions on some new file. It seems to me that those uses can fail in the same way. The root problem is the existence of a get the path of this file API in the presence of semantics whereby the file might be removed from that path at any time. Perhaps your fix is the best fix for this caller, but for the other callers I think creating and using a copy file permissions from rev-file of rev R API would be a better solution than adding re-tries there. That API can do the re-try internally. What do you think? Patch for (1) above: [[[ Index: subversion/libsvn_fs_fs/fs_fs.h === --- subversion/libsvn_fs_fs/fs_fs.h (revision 1040662) +++ subversion/libsvn_fs_fs/fs_fs.h (working copy) @@ -404,13 +404,16 @@ svn_error_t *svn_fs_fs__txn_changes_fetc PERMS_REFERENCE. Temporary allocations are from POOL. */ svn_error_t *svn_fs_fs__move_into_place(const char *old_filename, const char *new_filename, const char *perms_reference, apr_pool_t *pool); -/* Sets *PATH to the path of REV in FS, whether in a pack file or not. +/* Set *PATH to the path of REV in FS, whether in a pack file or not. + The path is not guaranteed to remain correct after the function returns, + because it is possible that the revision will become packed just after + this call, so the caller should re-try once if the path is not found. Allocate in POOL. */ Sounds right. Bordering on bikeshed, I would suggest some changes: * Separate describing what the function does (Sets *PATH to FOO and allocates in POOL) from describing the conditions the caller should beware of / check for (sometimes the return value will be out-of-date by the time you receive it). * Mention that the non-guarantee is not in effect if the caller has the write lock. OK, committed in r1041056 with these improvements. svn_error_t * svn_fs_fs__path_rev_absolute(const char **path, svn_fs_t *fs, svn_revnum_t rev, apr_pool_t *pool); Also, svn_fs_fs__path_rev_absolute() also does, internally, a single repeat loop. Theoretically this isn't needed any more now (since all callers either run under the write lock or retry)... Can you check the attached patch for this, please? - Julian Remove the re-try logic from svn_fs_fs__path_rev_absolute(). Since r1040832, all its callers correctly account for the possibility of an out-of-date result due to a concurrent packing operation. The re-try logic was introduced in r875097 and reduced but did not eliminate the window of opportunity for the caller to use an out-of-date result. See the email thread http://svn.haxx.se/dev/archive-2010-12/0019.shtml, subject Re: svn commit: r1040832 - Port a fix for a FSFS packing race. * subversion/libsvn_fs_fs/fs_fs.c (svn_fs_fs__path_rev_absolute): Remove the re-try logic. * subversion/libsvn_fs_fs/fs_fs.h (svn_fs_fs__path_rev_absolute): Update the doc string accordingly. --This line, and those below, will be ignored-- Index:
Re: Input validation observations
On Thu, 2010-12-02 at 14:05 +0530, Noorul Islam K M wrote: Julian Foad julian.f...@wandisco.com writes: I tried some potentially invalid inputs to svn a week or two ago and made notes on what I found. Just posting here in case anyone wants to do something about one or more of them. Noorul, I'm including you in the To addresses because you said you were looking for more small tasks to do, so feel free to pick one of these if you want to. Where I end with a question mark, it means I'm not sure if we want this change, it's just a suggestion. * svn checkout ^/ ^/y - A asf/cxf, A asf/cxf/utils, (Don't try this without being ready on the Ctrl-C or Ctrl-\!) It seems to ignore ^/y and create ./(basename(^/)); should fail: '^/y' is not a WC path. * svn checkout ^/subversion/trunk/build ^/y - Checked out revision 1040465. URL 'https://svn.apache.org/repos/asf/y' doesn't exist. Bleach - that's just crazy. Should fail: '^/y' is not a WC path. In this case code assumes that the user is trying to checkout from both source into current directory. i.e, If all source targets are URLs then the checkout target becomes . Oh yes, you're right - there's no problem there. I didn't notice that svn checkout URL URL was a supported syntax. Thanks. - Julian
Re: Input validation observations
On Thu, 2010-12-02 at 13:58 +0530, Noorul Islam K M wrote: Julian Foad julianf...@btopenworld.com writes: On Tue, 2010-11-30 at 18:42 +0530, Noorul Islam K M wrote: Julian Foad julian.f...@wandisco.com writes: I tried some potentially invalid inputs to svn a week or two ago and made notes on what I found. Just posting here in case anyone wants to do something about one or more of them. Noorul, I'm including you in the To addresses because you said you were looking for more small tasks to do, so feel free to pick one of these if you want to. Thank you! I will start working on these one by one. Great! Where I end with a question mark, it means I'm not sure if we want this change, it's just a suggestion. * svn checkout ^/ ^/y - A asf/cxf, A asf/cxf/utils, (Don't try this without being ready on the Ctrl-C or Ctrl-\!) It seems to ignore ^/y and create ./(basename(^/)); should fail: '^/y' is not a WC path. * svn checkout ^/subversion/trunk/build ^/y - Checked out revision 1040465. URL 'https://svn.apache.org/repos/asf/y' doesn't exist. Bleach - that's just crazy. Should fail: '^/y' is not a WC path. * svn copy a ^/b c doesn't detect the mixed source types in cl, only in lib; should reject them at CLI level? * svn info ^/b - Not a valid URL; should be path '/b' not found in revision REV? * svn mkdir ^/ a - Illegal repository URL 'a'; should say can't mix URL and local targets? * svn mkdir a ^/ - Can't create directory 'https:/svn.apache.org/repos/asf'; should not print the URL as if it's a local path. * svn mv ^/ ^/ - Cannot move path 'https:/svn.apache.org/repos/asf' into itself; should not print the URL as if it's a local path. * svn update ^/a - Skipped 'https://svn.apache.org/repos/asf/a' ...; should fail: '^/a' is not a WC path? svn help update says that the argument should be a PATH. I think it is right to force the user to enter local PATH. Good. Thanks for finding that. I agree. That change will make some of my recent patch (r1040232) redundant: it will no longer need to remove URL targets from the array of targets, since it will just return an error if it finds any, like you already did in some of the other subcomands. I further looked into the code, since update accepts multiple targets, the program skips URL targets assuming that there might one or more local paths as part of targets list. The skipping part is intentional and I am not sure whether we should change this behavior. I think we should change this behaviour and make svn update throw an error if any target is a URL. - Julian
Re: svn commit: r1040832 - Port a fix for a FSFS packing race
Julian Foad wrote on Thu, Dec 02, 2010 at 12:15:19 +: On Wed, 2010-12-01, Daniel Shahaf wrote: Julian Foad wrote on Wed, Dec 01, 2010 at 12:32:45 +: (2) Doesn't the exact same race exist in *all* uses of svn_fs_fs__path_rev_absolute()? There are five other calls to it, As far as I can tell, apart from open_pack_or_rev_file(), all callers always[1] run under the write lock. Since pack operation take the write lock too, it's not possible for the result of svn_fs_fs__path_rev_absolute() to become outdated for those callers. OK, that's good. No change needed then. -/* Sets *PATH to the path of REV in FS, whether in a pack file or not. +/* Set *PATH to the path of REV in FS, whether in a pack file or not. + The path is not guaranteed to remain correct after the function returns, + because it is possible that the revision will become packed just after + this call, so the caller should re-try once if the path is not found. Allocate in POOL. */ Sounds right. Bordering on bikeshed, I would suggest some changes: * Separate describing what the function does (Sets *PATH to FOO and allocates in POOL) from describing the conditions the caller should beware of / check for (sometimes the return value will be out-of-date by the time you receive it). * Mention that the non-guarantee is not in effect if the caller has the write lock. OK, committed in r1041056 with these improvements. +1, thanks. svn_error_t * svn_fs_fs__path_rev_absolute(const char **path, svn_fs_t *fs, svn_revnum_t rev, apr_pool_t *pool); Also, svn_fs_fs__path_rev_absolute() also does, internally, a single repeat loop. Theoretically this isn't needed any more now (since all callers either run under the write lock or retry)... Can you check the attached patch for this, please? - Julian Remove the re-try logic from svn_fs_fs__path_rev_absolute(). Since r1040832, all its callers correctly account for the possibility of an out-of-date result due to a concurrent packing operation. The re-try logic was introduced in r875097 and reduced but did not eliminate the window of opportunity for the caller to use an out-of-date result. See the email thread http://svn.haxx.se/dev/archive-2010-12/0019.shtml, subject Re: svn commit: r1040832 - Port a fix for a FSFS packing race. * subversion/libsvn_fs_fs/fs_fs.c (svn_fs_fs__path_rev_absolute): Remove the re-try logic. * subversion/libsvn_fs_fs/fs_fs.h (svn_fs_fs__path_rev_absolute): Update the doc string accordingly. --This line, and those below, will be ignored-- Index: subversion/libsvn_fs_fs/fs_fs.c === --- subversion/libsvn_fs_fs/fs_fs.c (revision 1041339) +++ subversion/libsvn_fs_fs/fs_fs.c (working copy) @@ -246,8 +246,6 @@ path_rev(svn_fs_t *fs, svn_revnum_t rev, apr_psprintf(pool, %ld, rev), NULL); } -/* Returns the path of REV in FS, whether in a pack file or not. - Allocate in POOL. */ svn_error_t * svn_fs_fs__path_rev_absolute(const char **path, svn_fs_t *fs, @@ -256,45 +254,16 @@ svn_fs_fs__path_rev_absolute(const char { fs_fs_data_t *ffd = fs-fsap_data; - if (ffd-format SVN_FS_FS__MIN_PACKED_FORMAT) + if (ffd-format SVN_FS_FS__MIN_PACKED_FORMAT + || ! is_packed_rev(fs, rev)) { *path = path_rev(fs, rev, pool); - return SVN_NO_ERROR; } - - if (! is_packed_rev(fs, rev)) + else { - svn_node_kind_t kind; - - /* Initialize the return variable. */ - *path = path_rev(fs, rev, pool); - - SVN_ERR(svn_io_check_path(*path, kind, pool)); - if (kind == svn_node_file) -{ - /* *path is already set correctly. */ - return SVN_NO_ERROR; -} - else -{ - /* Someone must have run 'svnadmin pack' while this fs object - * was open. */ - - SVN_ERR(update_min_unpacked_rev(fs, pool)); - - /* The rev really should be present now. */ - if (! is_packed_rev(fs, rev)) -return svn_error_createf(APR_ENOENT, NULL, - _(Revision file '%s' does not exist, - and r%ld is not packed), - svn_dirent_local_style(*path, pool), - rev); - /* Fall through. */ -} + *path = path_rev_packed(fs, rev, pack, pool); } - *path = path_rev_packed(fs, rev, pack, pool); - return SVN_NO_ERROR; } This part looks good. I'm more concerned about a bug in the All callers use the write lock, ... analysis than about a bug in the above hunk. In fact, doesn't the
Re: 1.5.8 up for signing/testing
Stefan Sperling s...@elego.de writes: On Wed, Dec 01, 2010 at 01:14:30PM -0600, Hyrum K. Wright wrote: 1.5.8 tarballs are up for testing and signing. The magic revision is r1041089: http://people.apache.org/~hwright/svn/1.5.8/ 1.5.8 is missing the critial blame -g server-side memory leak crash fix. The trunk revisions merge cleanly into 1.5.x (apart from the property conflicts mentioned in the STATUS entry), so I've nominated those revisions for 1.5.x. I am running tests on 1.5.8 anyway, but would prefer cutting 1.5.9 with the blame -g fix instead of releasing 1.5.8. The blame fix does look like something we should release. I've just finished running the tests on 1.5.8, so for the record: Platform: Linux (Debian/stable). Verified: signatures, MD5 checksums, SHA1 checksums only expected difference compared to tags/1.5.8 Tested: 1.5.8 tarball with local dependencies (not deps tarball) (local, svn, serf, neon) x (fsfs, bdb) + swig-pl/py/rb + javahl + svn+sasl Dependencies: subversion : 1.5.8 apache2-threaded-dev : 2.2.9-10+lenny8 libapr1-dev : 1.4.2-3~bpo50+2 libaprutil1-dev : 1.2.12+dfsg-8+lenny5 libdb4.6-dev : 4.6.21-11 libneon27-dev : 0.28.2-6.1 libsasl2-dev : 2.1.22.dfsg1-23+lenny1 libsqlite3-dev : 3.6.21-2~bpo50+1 swig : 1.3.36-1 perl : 5.10.0-19lenny2 python2.5-dev : 2.5.2-15+lenny1 ruby1.8-dev : 1.8.7.72-3lenny1 openjdk-6-jdk : 6b11-9.1+lenny2 serf : tags/0.3.1 Results: All tests pass. Signatures: subversion-1.5.8.tar.bz2 -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (GNU/Linux) iQEcBAABCAAGBQJM959UAAoJEHbXiOHtGlmcRi0H/3ulIEf/O0Gdo+IAJ71f9WbG FnNUzOt1AWRi9GXo5EqQRciuSy7atKF+N3gQOLcXLVEiox6B/xnv2kErKmSpFBOn O9RGbd1QUxW4IKo6XtvR7L5taVQqWLeXdbhaMF9xmoVkOeMKDy22/hUsFpoizmIN 3GYt/2Clu0PJRy4JOdirM/VtJyeEDJXlpgbANvXipFXvzdwFWLR8fJzuyNOSrxTT Pclj7czadJDOklxqmDO26DIbCp6PPqVPVpXZAaDKV+XSiqcG0uq5hXxYfxY0yY++ gGouXDEBG3JXkxPjXyQkmQnwtTfDoYqMty7hh4YIN5jVHqgdUv0mzb7fCUbiqjY= =YvJv -END PGP SIGNATURE- subversion-1.5.8.tar.gz -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (GNU/Linux) iQEcBAABCAAGBQJM959wAAoJEHbXiOHtGlmcotoH/j3nZ8+PzJcBGwrCG+6qhl3y PxM54qe8hscV6Um7UUND0xL0SdrcKqbW10LMcBbquAXrpIPY51nvT3DumzZGf1wO iTYNj0FiyOUM0OSMEqcG8xD3qf8B2WRivEqC+fhbHP4byLIYK6fZf7wGS7I/v2+B 86Ro/3TnQGJhvasKlyMU+HX850rvPt3mlsGzKmVBR0mBM+RCdslHYmtA5fPQoKxQ YyFCKrbRR6r69dEqHmv4Xa/zRO+IgvoF4FVCqObtK1EXg+ZHYbD+h2xLuwx8yIfz 9yGzeH/d4OImT1t6iHvSgD+CEfnF1/dLXhTI53BZmFF0rvvSXT32fxqtmUL9VgQ= =5mKA -END PGP SIGNATURE- -- Philip
Re: diff-optimizations-tokens branch: I think I'm going to abandon it
On Thu, Dec 2, 2010 at 6:43 AM, Daniel Shahaf d...@daniel.shahaf.name wrote: [ finally getting back to this mail; having slept on it, etc. ] Johan Corveleyn wrote on Wed, Dec 01, 2010 at 13:34:48 +0100: On Wed, Dec 1, 2010 at 11:44 AM, Daniel Shahaf d...@daniel.shahaf.name wrote: Johan Corveleyn wrote on Wed, Dec 01, 2010 at 10:05:29 +0100: On Wed, Dec 1, 2010 at 3:38 AM, Daniel Shahaf d...@daniel.shahaf.name wrote: Johan Corveleyn wrote on Wed, Dec 01, 2010 at 00:25:27 +0100: I am now considering to abandon the tokens-approach, for the following reasons: ... So, unless someone can convince me otherwise, I'm probably going to stop with the token approach. Because of 2), I don't think it's worth it to spend the effort needed for 1), especially because the byte-based approach already works. In other words, you're saying that the token-based approach: (b) won't be as fast as the bytes-based approach can be, and (a) requires much effort to be spent on implementing the reverse reading of tokens? (i.e., a backwards gets()) Yes. The reverse reading is quite hard (in the diff_file.c case) because of the chunked reading of files. A line may be split by a chunk boundary (it may even be split in the middle of an eol sequence (between \r and \n), and it all still needs to be canonicalized/normalized correctly (that's why we'll also need a reverse normalization function). The current forward get_next_token does this very well, and the reverse should be analogous, but I just find it hard to reason about, and to get it implemented correctly. It will take me a lot of time, and with a high probability of introducing subtle bugs for special edge cases. OK, so a reverse get_next_token() could be tricky to implement. But, worse, won't having it mean that implementors of svn_diff_fns_t can't have streamy access to their source? Since they would be required to provide sometimes a token from the start and sometimes a token from the end... Well, it's not streamy in our usual sense of the word, but it's double-streamy (no one requests the middle byte until either all bytes before or all bytes after it were transmitted already) Oooh, I hadn't considered other implementors besides the internal diff_memory.c and diff_file.c. Just to clarify: diff_file.c and diff_memory.c are the only implementors of svn_diff_fns_t in the core code, correct? Yes. You're right, that would impose additional constraints on other implementors. I don't know if being non-streamy (or less streamy anyway) would be problem ... We should have asked this before, but: Do we know who are the other implementors / typical use-cases of svn_diff_fns_t? Yeah, was wondering about this too. Are there in fact other implementors? Maybe plugins for IDE's or something? How could we find out? How public is this API in fact? For blame, I know all revisions are first converted to full-texts, which are written out to temp files. Then diff_file.c works on those two files. In my last commit on the -tokens branch, I added a flag open_at_end to the datasource_open function (function of svn_diff_fns_t), so the datasource could be opened at the end. Also, other supporting functions were needed: token_pushback_prefix, token_pushback_suffix (to push back tokens that were read too far when scanning for prefix/suffix) and the dreaded datasource_get_previous_token. Anyway, the high-level idea was: 1) Open datasources at the end. 2) Scan for identical suffix (getting tokens in reverse). 3) At first non-match: pushback suffix token, and note where suffix starts. 4) Open datasources at the beginning. 5) Scan for identical prefix (getting tokens normally, but without hash). 6) At first non-match: pushback prefix token. 7) Run the old algorithm: getting tokens forward, but with hash. The get_next_token function should stop (return NULL for token) when it arrives at the suffix. Sidestep: I just now realized that I probably don't need to have the reverse normalization algorithm for implementing get_previous_token. The call to the normalization function in get_next_token is (I think) only needed to be able to calculate the hash. But since get_previous_token doesn't need to calculate hashes, I may be able to get by without normalization there. I'd only need to normalize inside token_compare, and I *think* I can just to that forwardly, instead of backwards. Just thinking out loud here ... Is normalization function the thing that collapses newlines and whitespaces if -x-w or -x--ignore-eol-style is in effect? If so, another purpose of calling that might be to make suffixes that are not bytewise-identical, but only modulo-whitespace-identical, also be lumped into the identical suffix. (Haven't dived into the code yet; the above is based on my understanding of your high-level descriptions) Yes, it's svn_diff__normalize_buffer in
Re: 1.5.8 up for signing/testing
On 12/02/2010 01:11 AM, C. Michael Pilato wrote: On 12/01/2010 02:14 PM, Hyrum K. Wright wrote: 1.5.8 tarballs are up for testing and signing. The magic revision is r1041089: http://people.apache.org/~hwright/svn/1.5.8/ Summary: +0 to release, pending review of the ra_serf situation (see below). In light of discussion elsethread, I'm upgrading this to +1, but with the *recommendation* that we scuttle this release and retry with a version that includes the 'blame -g' fixes (and perhaps some 'configure'-time warning about building against a Serf newer than 0.3.1. -- C. Michael Pilato cmpil...@collab.net CollabNet www.collab.net Distributed Development On Demand
Re: svn commit: r1040832 - Port a fix for a FSFS packing race
Julian Foad wrote on Thu, Dec 02, 2010 at 14:33:19 +: On Thu, 2010-12-02 at 14:56 +0200, Daniel Shahaf wrote: Julian Foad wrote on Thu, Dec 02, 2010 at 12:15:19 +: Remove the re-try logic from svn_fs_fs__path_rev_absolute(). Since r1040832, all its callers correctly account for the possibility of an out-of-date result due to a concurrent packing operation. The re-try logic was introduced in r875097 and reduced but did not eliminate the window of opportunity for the caller to use an out-of-date result. See the email thread http://svn.haxx.se/dev/archive-2010-12/0019.shtml, subject Re: svn commit: r1040832 - Port a fix for a FSFS packing race. * subversion/libsvn_fs_fs/fs_fs.c (svn_fs_fs__path_rev_absolute): Remove the re-try logic. * subversion/libsvn_fs_fs/fs_fs.h (svn_fs_fs__path_rev_absolute): Update the doc string accordingly. In fact, doesn't the correctness of this change depend on the fact that svn_fs_fs__with_write_lock() calls update_min_unpacked_rev() before executing the body() callback? Yes, it does. Do you think it shouldn't? I'm not sure whether it should or shouldn't depend. The change IS correct, but it is only correct because the ffd-min_unpacked_rev is known to be up-to-date whenever the write lock is held. That's a bit of a spaghetti effect there. Now, is there another such effect that means the patch is wrong? (I haven't come up with one yet, but that doesn't mean there isn't any.) (Otherwise we don't know whether there is a concurrent packer, or just ffd-min_unpacked_rev is stale.) You know, I'm not too clear on the design intention here. I would have assumed it would be the job of all the fs_fs.c code to ensure that ffd-min_unpacked_rev is never stale (due to its own actions) at any point where it matters. But the way it does that seems to be to re-read the file in several places, instead of simply updating ffd-min_unpacked_rev when it writes the file. Maybe the idea was that that method would pick up both internal and concurrent (external) changes in the same way - I don't know. Seems odd. Yes, that was the idea. If two processes access the filesystem at the same time, and one of them calls svn_fs_pack(), then the other's ffd-min_unpacked_rev would be stale. It's the same story with ffd-youngest_rev, by the way; compare ensure_revision_exists(), which I believe predates the packing logic. Do you have an idea whether something should be done differently? I don't, not without studying it further. I'm not sure. Ultimately, ffd-min_unpacked_rev is just a cache, so we have to update it when we're about to do something that depends on its value. In the meantime, let me at least attempt to define the something which perhaps should be done differently: Given the way packing works today, it's possible for a revision file to stop existing where you expected it to exist. Today the code informs itself of that situation by re-reading min-packed-rev and retrying. (Further up the stack, the code calculating the offset to seek to also needs to know whether it's seeking a pack file or a revision file.) I note that the following comment in pack_shard() is not quite right: /* Update the min-unpacked-rev file to reflect our newly packed shard. * (ffd-min_unpacked_rev will be updated by open_pack_or_rev_file().) That may or may not be true, but it doesn't seem like this function has any right to make that assertion. We could remove the comment and have that function call update_min_unpacked_rev(). (This would also save us a failed disk access later.) But is it strictly *necessary* to do so? I think not: by not calling update_min_unpacked_rev(), we cause ffd-min_unpacked_rev to become stale --- a situation which the code has to account for anyway. And in pack_revprop_shard(), it's the same, verbatim: /* Update the min-unpacked-rev file to reflect our newly packed shard. should say 'the min-unpacked-revprop file'; * (ffd-min_unpacked_rev will be updated by open_pack_or_rev_file().) and that second line looks completely bogus in this context. Yes, whoever did the copy-paste forgot to update some places. Index: subversion/libsvn_fs_fs/fs_fs.h === --- subversion/libsvn_fs_fs/fs_fs.h (revision 1041339) +++ subversion/libsvn_fs_fs/fs_fs.h (working copy) @@ -411,8 +411,10 @@ svn_error_t *svn_fs_fs__move_into_place( Allocate *PATH in POOL. Note: If the caller does not have the write lock on FS, then the path is - not guaranteed to remain correct after the function returns, because the - revision might become packed just after this call. */ + not guaranteed to be correct or to remain correct after the function + returns, because the revision might become packed before or after this + call. If a file
Re: diff-optimizations-tokens branch: I think I'm going to abandon it
Hi Johan. I've just read the whole of this thread. I didn't quite understand your original point (2) that token-based suffix scanning will not be as fast as byte-based suffix scanning. Sure it won't, but is there any reason you mentioned suffix scanning there specifically? The same is true of prefix scanning, of course. And both of them could be fast enough, I assume, if you disable the hash calculation in the get token callbacks like you were talking about. But I don't think that necessarily affects the main point. It looks like you've thoroughly investigated using a token based approach. Thank you for doing so. My initial feeling that it was worth investigating was in the hope that you might find some fairly straightforward and self-contained modification to the existing token-handling layer. I think the result of this investigation, in which you needed to add token-fetch-backwards callbacks and so on, shows that this approach is too complex. I don't want to see a complex implementation. Therefore I support your inclination to abandon that approach and use the byte-wise approach instead. - Julian
Re: svn commit: r1040832 - Port a fix for a FSFS packing race
Daniel Shahaf wrote: Julian Foad wrote on Thu, Dec 02, 2010 at 14:33:19 +: I note that the following comment in pack_shard() is not quite right: /* Update the min-unpacked-rev file to reflect our newly packed shard. * (ffd-min_unpacked_rev will be updated by open_pack_or_rev_file().) That may or may not be true, but it doesn't seem like this function has any right to make that assertion. We could remove the comment and have that function call update_min_unpacked_rev(). (This would also save us a failed disk access later.) But is it strictly *necessary* to do so? I think not: by not calling update_min_unpacked_rev(), we cause ffd-min_unpacked_rev to become stale --- a situation which the code has to account for anyway. And in pack_revprop_shard(), it's the same, verbatim: /* Update the min-unpacked-rev file to reflect our newly packed shard. should say 'the min-unpacked-revprop file'; * (ffd-min_unpacked_rev will be updated by open_pack_or_rev_file().) and that second line looks completely bogus in this context. Yes, whoever did the copy-paste forgot to update some places. First step: this patch fixes the comments. Good to commit? [[[ Index: subversion/libsvn_fs_fs/fs_fs.c === --- subversion/libsvn_fs_fs/fs_fs.c (revision 1041350) +++ subversion/libsvn_fs_fs/fs_fs.c (working copy) @@ -7606,8 +7606,8 @@ pack_shard(const char *revs_dir, SVN_ERR(svn_io_set_file_read_only(manifest_file_path, FALSE, pool)); /* Update the min-unpacked-rev file to reflect our newly packed shard. - * (ffd-min_unpacked_rev will be updated by open_pack_or_rev_file().) - */ + * (This doesn't update ffd-min_unpacked_rev. That will be updated by + * open_pack_or_rev_file() when necessary.) */ final_path = svn_dirent_join(fs_path, PATH_MIN_UNPACKED_REV, iterpool); SVN_ERR(svn_stream_open_unique(tmp_stream, tmp_path, fs_path, svn_io_file_del_none, iterpool, iterpool)); @@ -7674,9 +7674,8 @@ pack_revprop_shard(svn_fs_t *fs, SVN_ERR(svn_sqlite__insert(NULL, stmt)); } - /* Update the min-unpacked-rev file to reflect our newly packed shard. - * (ffd-min_unpacked_rev will be updated by open_pack_or_rev_file().) - */ + /* Update the min-unpacked-revprop file to reflect our newly packed shard. + * (This doesn't update ffd-min_unpacked_revprop.) */ final_path = svn_dirent_join(fs_path, PATH_MIN_UNPACKED_REVPROP, iterpool); SVN_ERR(svn_stream_open_unique(tmp_stream, tmp_path, fs_path, svn_io_file_del_none, iterpool, iterpool)); ]]] - Julian
Re: svn commit: r1040832 - Port a fix for a FSFS packing race
Julian Foad wrote on Thu, Dec 02, 2010 at 15:34:48 +: First step: this patch fixes the comments. Good to commit? [[[ Index: subversion/libsvn_fs_fs/fs_fs.c === --- subversion/libsvn_fs_fs/fs_fs.c (revision 1041350) +++ subversion/libsvn_fs_fs/fs_fs.c (working copy) @@ -7606,8 +7606,8 @@ pack_shard(const char *revs_dir, SVN_ERR(svn_io_set_file_read_only(manifest_file_path, FALSE, pool)); /* Update the min-unpacked-rev file to reflect our newly packed shard. - * (ffd-min_unpacked_rev will be updated by open_pack_or_rev_file().) - */ + * (This doesn't update ffd-min_unpacked_rev. That will be updated by + * open_pack_or_rev_file() when necessary.) */ Didn't you mean s/open_pack_or_rev_file/update_min_unpacked_rev/? final_path = svn_dirent_join(fs_path, PATH_MIN_UNPACKED_REV, iterpool); SVN_ERR(svn_stream_open_unique(tmp_stream, tmp_path, fs_path, svn_io_file_del_none, iterpool, iterpool)); @@ -7674,9 +7674,8 @@ pack_revprop_shard(svn_fs_t *fs, SVN_ERR(svn_sqlite__insert(NULL, stmt)); } - /* Update the min-unpacked-rev file to reflect our newly packed shard. - * (ffd-min_unpacked_rev will be updated by open_pack_or_rev_file().) - */ + /* Update the min-unpacked-revprop file to reflect our newly packed shard. + * (This doesn't update ffd-min_unpacked_revprop.) */ final_path = svn_dirent_join(fs_path, PATH_MIN_UNPACKED_REVPROP, iterpool); SVN_ERR(svn_stream_open_unique(tmp_stream, tmp_path, fs_path, svn_io_file_del_none, iterpool, iterpool)); ]]] +1 - Julian
Re: svn commit: r1040832 - Port a fix for a FSFS packing race
On Thu, 2010-12-02 at 17:40 +0200, Daniel Shahaf wrote: Julian Foad wrote on Thu, Dec 02, 2010 at 15:34:48 +: First step: this patch fixes the comments. Good to commit? [[[ Index: subversion/libsvn_fs_fs/fs_fs.c === --- subversion/libsvn_fs_fs/fs_fs.c (revision 1041350) +++ subversion/libsvn_fs_fs/fs_fs.c (working copy) @@ -7606,8 +7606,8 @@ pack_shard(const char *revs_dir, SVN_ERR(svn_io_set_file_read_only(manifest_file_path, FALSE, pool)); /* Update the min-unpacked-rev file to reflect our newly packed shard. - * (ffd-min_unpacked_rev will be updated by open_pack_or_rev_file().) - */ + * (This doesn't update ffd-min_unpacked_rev. That will be updated by + * open_pack_or_rev_file() when necessary.) */ Didn't you mean s/open_pack_or_rev_file/update_min_unpacked_rev/? I was just modifying the existing comment and assuming it had some truth or meaning in mentioning open_pack_or_rev_file(). Not sure what, exactly. final_path = svn_dirent_join(fs_path, PATH_MIN_UNPACKED_REV, iterpool); SVN_ERR(svn_stream_open_unique(tmp_stream, tmp_path, fs_path, svn_io_file_del_none, iterpool, iterpool)); @@ -7674,9 +7674,8 @@ pack_revprop_shard(svn_fs_t *fs, SVN_ERR(svn_sqlite__insert(NULL, stmt)); } - /* Update the min-unpacked-rev file to reflect our newly packed shard. - * (ffd-min_unpacked_rev will be updated by open_pack_or_rev_file().) - */ + /* Update the min-unpacked-revprop file to reflect our newly packed shard. + * (This doesn't update ffd-min_unpacked_revprop.) */ final_path = svn_dirent_join(fs_path, PATH_MIN_UNPACKED_REVPROP, iterpool); SVN_ERR(svn_stream_open_unique(tmp_stream, tmp_path, fs_path, svn_io_file_del_none, iterpool, iterpool)); ]]] +1 r1041422. If you the point above still needs changing, please do so or let me know. - Julian
Re: 1.5.8 up for signing/testing
On Thu, Dec 02, 2010 at 08:54:35AM -0500, C. Michael Pilato wrote: On 12/02/2010 07:38 AM, Stefan Sperling wrote: You need to use serf 0.3.x with Subversion 1.5. Subversion 1.6.x includes changes to make it compatible with newer serf versions, but those haven't been merged to 1.5.x. 1.5.x works fine with serf 0.3.x. Great, thanks. Now, if we wind up rolling a new release anyway, then can we enforce this serf version cap at build time for 1.5.x? Makes sense to me.
Re: diff-optimizations-tokens branch: I think I'm going to abandon it
Note: This email only tangentially relates to svn diff and more about reverse token scanning in general: As someone who has implemented suffix reverse token scanning before: * It simply isn't possible in DBCS code pages. Stick to byte only here. SBCS and UTF-16 make reverse token stuff relatively straightforward. UTF-8 is a little trickier but still tractable. At least UTF-8 is tractable in a way that DBCS isn't. You always know which part of a Unicode code point you are in. (i.e. byte 4 vs. byte 3 vs. etc...) * I would recommend only supporting a subset of the diff options for reverse token scanning. i.e. ignore whitespace/ignore eol but skip ignore case (if svn has that, I forget...) If tokens include keyword expansion operations then stop once you hit one. The possible source of bugs outways the perf gain in my mind here. * Suffix scanning does really require a seekable stream, if it isn't seekable then don't perform the reverse scanning. It is only an optimization after all. Additional ignore whitespace related comment: * IIRC, Perforce had an interesting twist on ignoring whitespace. You could ignore just line leading/ending whitespace instead of all whitespace differences but pay attention to any whitespace change after the trim operation had completed. e.g.: * aaa bbb vs aaa bbb would compare as equal * aaa bbb vs aaa bbb would compare as equal * aaa bbb vs aaa bbb would compare as non-equal due to the white space change in the middle of the line Fyi, Bill
Re: diff-optimizations-tokens branch: I think I'm going to abandon it
On 12/02/2010 12:18 PM, Bill Tutt wrote: [...] . o O ( Who the heck is this Bill Tutt guy? ) Nice to read you again, Bill! -- C. Michael Pilato cmpil...@collab.net CollabNet www.collab.net Distributed Development On Demand signature.asc Description: OpenPGP digital signature
Re: diff-optimizations-tokens branch: I think I'm going to abandon it
On Thu, Dec 2, 2010 at 4:21 PM, Julian Foad julian.f...@wandisco.com wrote: Hi Johan. I've just read the whole of this thread. I didn't quite understand your original point (2) that token-based suffix scanning will not be as fast as byte-based suffix scanning. Sure it won't, but is there any reason you mentioned suffix scanning there specifically? The same is true of prefix scanning, of course. I mentioned suffix specifically, because for prefix scanning the bytes approach doesn't have that same advantage. It has to count the number of lines, to have a correct line offset for the rest of the normal algorithm. So the byte-based prefix scanning needs to check for eol-characters just like the token-based version. That means that theoretically (implementation details aside) both approaches should be equally fast for prefix scanning. But for suffix scanning, it seems I don't need to know the number of lines, so the bytes approach can just ignore line structure for the suffix. And both of them could be fast enough, I assume, if you disable the hash calculation in the get token callbacks like you were talking about. Granted, the difference could be minimal (I haven't actually tested). However, an additional two comparisons (against \r and \n), for every byte in the suffix of say 500Kb, for say 2000 diffs of a blame operation, it might cost another couple of seconds :-). But I don't think that necessarily affects the main point. It looks like you've thoroughly investigated using a token based approach. Thank you for doing so. My initial feeling that it was worth investigating was in the hope that you might find some fairly straightforward and self-contained modification to the existing token-handling layer. I think the result of this investigation, in which you needed to add token-fetch-backwards callbacks and so on, shows that this approach is too complex. I don't want to see a complex implementation. Therefore I support your inclination to abandon that approach and use the byte-wise approach instead. Yep, it just seems too complex (as also confirmed by Bill Tutt in the other mail). The pushback stuff (which is inevitable, because we will always read one token too far to discover the non-match, and that line needs to be reread to calculate the hash) is also rather dirty/annoying (adds additional svn_diff_fns_t-ness). It was definitely worth investigating, if only to give me some peace of mind that we have considered the options (and it was a great learning experience). Cheers, -- Johan
Re: diff-optimizations-tokens branch: I think I'm going to abandon it
On Thu, Dec 2, 2010 at 6:18 PM, Bill Tutt b...@tutts.org wrote: Note: This email only tangentially relates to svn diff and more about reverse token scanning in general: As someone who has implemented suffix reverse token scanning before: Thanks for the input. It's nice to see other people have also struggled with this :-). * It simply isn't possible in DBCS code pages. Stick to byte only here. SBCS and UTF-16 make reverse token stuff relatively straightforward. UTF-8 is a little trickier but still tractable. At least UTF-8 is tractable in a way that DBCS isn't. You always know which part of a Unicode code point you are in. (i.e. byte 4 vs. byte 3 vs. etc...) Ok, this further supports the decision to focus on the byte-based approach. We'll only consider stuff identical if all bytes are identical. That's the simplest route, and since it's only an optimization anyway ... * I would recommend only supporting a subset of the diff options for reverse token scanning. i.e. ignore whitespace/ignore eol but skip ignore case (if svn has that, I forget...) svn diff doesn't have an ignore-case option, so that's ok :-). If tokens include keyword expansion operations then stop once you hit one. The possible source of bugs outways the perf gain in my mind here. Haven't thought about keyword expansion yet. But as you suggest: I'm not going to bother doing special stuff for (expanded) keywords. If we find a mismatch, we'll stop with the optimized scanning, and fall back to the default algorithm. * Suffix scanning does really require a seekable stream, if it isn't seekable then don't perform the reverse scanning. It is only an optimization after all. Hm, yes, we'll need to be careful about that. I'll start another mail thread asking for known implementors of the svn_diff_fns_t functions, to find out whether seeking around like that for suffix would be supported. Additional ignore whitespace related comment: * IIRC, Perforce had an interesting twist on ignoring whitespace. You could ignore just line leading/ending whitespace instead of all whitespace differences but pay attention to any whitespace change after the trim operation had completed. e.g.: * aaa bbb vs aaa bbb would compare as equal * aaa bbb vs aaa bbb would compare as equal * aaa bbb vs aaa bbb would compare as non-equal due to the white space change in the middle of the line Cool (svn doesn't have that option). But I'm not sure what that would be useful for (as a user, I can't immediately imagine an important use case). Anyway, could still be a nice option... Cheers, -- Johan
Re: gpg-agent branch treats PGP passphrase as repository password?
On Wed, Dec 01, 2010 at 06:29:06PM -0500, Dan Engel wrote: On Wed, 2010-12-01 at 14:08 +0100, Stefan Sperling wrote: However, I still see a potential risk here because the name gpg-agent is very misleading. It violates the principle of least surprise. How can we prevent users misunderstanding what Subversion's gpg-agent feature does from entering their private pgp key passphrase (which will then be sent to the server)? Can we control the prompt printed by gpg-agent? (Enter your Subversion password, NOT your secret PGP passphrase!) Yes, the agent protocol provides for customized prompts, and the patch itself refers to the Subversion repository server (or something like that) in that prompt. If we can control the prompt, then let's just make the prompt clear enough so that people who read it don't accidentally enter their pgp passphrase. That will make the mistake much less likely. If people don't read the prompt, well, then we cannot help them either. Thanks for pointing out how gpg-agent really works. Stefan
Re: diff-optimizations-tokens branch: I think I'm going to abandon it
Johan Corveleyn wrote on Thu, Dec 02, 2010 at 21:21:20 +0100: On Thu, Dec 2, 2010 at 6:18 PM, Bill Tutt b...@tutts.org wrote: If tokens include keyword expansion operations then stop once you hit one. The possible source of bugs outways the perf gain in my mind here. Haven't thought about keyword expansion yet. But as you suggest: I'm not going to bother doing special stuff for (expanded) keywords. If we find a mismatch, we'll stop with the optimized scanning, and fall back to the default algorithm. Only for the suffix scanning? Or also for the prefix scanning? (I think you meant the former.) For prefix scanning, I think it's common to have $Id$ or similar near the top of the file, so being able to collect matching lines past it too would be beneficial.
Re: 1.7.x - merge now accesses all files in WC?
On Wed, Dec 1, 2010 at 7:34 PM, Philip Martin philip.mar...@wandisco.comwrote: Daniel Becroft djcbecr...@gmail.com writes: Under 1.7.x, the following file(s) are accessed (merging the same revision as above): - .svn\wc.db - Every versioned file in the working copy What does accessed mean? stat(), open(), read(), write()? I'm not sure. I'm pretty sure it's not read() or write(), but it's most likely stat(). There is a known issue that involves a stat() on files when accessing the metadata (svn_wc__db_pdh_parse_local_abspath). Is that what you are seeing? I've just managed to build/install trunk on my ubuntu box at home (first application I've ever compiled on it - yey!). What debugging tools would you recommend to investigate this further? I've seen output posted that lists function names, and time spent on each. I can't see any reason why all these files would need to be accessed. I seem to recall some discussion about preventing/warning merging into modified working copies, could this be the cause? The new check is for a single revision working copy, not an unmodified one. Ah, that makes more sense, I guess. Checking for an unmodified WC would mean that the ability to run consecutive 'svn merge -c' commands would be removed. Cheers, Daniel B.
Re: 1.7.x - merge now accesses all files in WC?
On Thu, Dec 2, 2010 at 3:43 PM, Daniel Becroft djcbecr...@gmail.com wrote: On Wed, Dec 1, 2010 at 7:34 PM, Philip Martin philip.mar...@wandisco.comwrote: ... I can't see any reason why all these files would need to be accessed. I seem to recall some discussion about preventing/warning merging into modified working copies, could this be the cause? The new check is for a single revision working copy, not an unmodified one. Ah, that makes more sense, I guess. Checking for an unmodified WC would mean that the ability to run consecutive 'svn merge -c' commands would be removed. You can run 'svn merge -c17,85,90,123' if you need to merge multiple revs in the same operation. -Hyrum
1.5.9 up for testing/signing
1.5.9 tarballs are up for testing and signing. The magic revision is r1041577: http://people.apache.org/~hwright/svn/1.5.9/ To sign the release, please input your signatures using the script here: http://work.hyrumwright.org/pub/svn/collect_sigs.py (The script worked pretty well for 1.6.15, but if you discover any bugs please let me know.) Testing by enthusiastic users is also welcomed (but remember that this is not yet a blessed release, with all that that implies). If you are a package maintainer, please do not included this release in your distribution until after it has been formally released. I'd like to collect all the signatures in time to do a release by December 6. -Hyrum
Re: 1.7.x - merge now accesses all files in WC?
On Fri, Dec 3, 2010 at 7:50 AM, Hyrum K. Wright hyrum_wri...@mail.utexas.edu wrote: On Thu, Dec 2, 2010 at 3:43 PM, Daniel Becroft djcbecr...@gmail.com wrote: On Wed, Dec 1, 2010 at 7:34 PM, Philip Martin philip.mar...@wandisco.comwrote: ... I can't see any reason why all these files would need to be accessed. I seem to recall some discussion about preventing/warning merging into modified working copies, could this be the cause? The new check is for a single revision working copy, not an unmodified one. Ah, that makes more sense, I guess. Checking for an unmodified WC would mean that the ability to run consecutive 'svn merge -c' commands would be removed. You can run 'svn merge -c17,85,90,123' if you need to merge multiple revs in the same operation. Very true, but I've had some instances where it's easier to do one merge -c (postponing conflicts), resolve, and then do the next one. Not all the time, but occasionally. Cheers, Daniel B.
Re: 1.7.x - merge now accesses all files in WC?
Daniel Becroft wrote on Fri, Dec 03, 2010 at 08:06:40 +1000: On Fri, Dec 3, 2010 at 7:50 AM, Hyrum K. Wright hyrum_wri...@mail.utexas.edu wrote: On Thu, Dec 2, 2010 at 3:43 PM, Daniel Becroft djcbecr...@gmail.com wrote: On Wed, Dec 1, 2010 at 7:34 PM, Philip Martin philip.mar...@wandisco.comwrote: ... I can't see any reason why all these files would need to be accessed. I seem to recall some discussion about preventing/warning merging into modified working copies, could this be the cause? The new check is for a single revision working copy, not an unmodified one. There is a flag that bypasses the new check: --allow-mixed-revisions : Allow merge into mixed-revision working copy. Use of this option is not recommended! Please run 'svn update' instead. Ah, that makes more sense, I guess. Checking for an unmodified WC would mean that the ability to run consecutive 'svn merge -c' commands would be removed. You can run 'svn merge -c17,85,90,123' if you need to merge multiple revs in the same operation. Very true, but I've had some instances where it's easier to do one merge -c (postponing conflicts), resolve, and then do the next one. Not all the time, but occasionally. Cheers, Daniel B.
Re: 1.5.9 up for testing/signing
On 12/02/2010 04:55 PM, Hyrum K. Wright wrote: 1.5.9 tarballs are up for testing and signing. The magic revision is r1041577: http://people.apache.org/~hwright/svn/1.5.9/ Summary: +1 to release. Platform: Ubuntu 10.04 (lucid) Linux. Tested: 1.5.9 tarball with local dependencies (not from deps tarball) ((neon,local,svn) x fsfs) + ((fs,fs_base) x bdb) + py + pl + rb + javahl Verified that 1.5.9 deps were logically equivalent to 1.5.8 deps. Results: All tests pass. MD5 Checksums: 973e87cd8aa64f44ed6b4e569c635abc subversion-1.5.9.tar.gz 1154485516cfb59db0f009b2d1a5c0cc subversion-deps-1.5.9.tar.gz d8de4f33decb9e608c8cfd43288ebe89 subversion-1.5.9.tar.bz2 50b4559ad6cef3caa810b80f1c18679c subversion-deps-1.5.9.tar.bz2 GPG Signatures (left-aligned because Hyrum likes it that way): ::: subversion-1.5.9.tar.gz ::: -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) iEYEABECAAYFAkz4F3MACgkQokEGqRcG/W7QjgCgko2Jj6TGMydTDCIavuZn0z8e QKEAnR/lbg2oyBW23c6r9q02qTng8zqB =lVxF -END PGP SIGNATURE- ::: subversion-1.5.9.tar.bz2 ::: -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) iEYEABECAAYFAkz4F3UACgkQokEGqRcG/W6s8QCeLKN0e8Y/1i+NW2QsPOIBaQeS 80AAoMCVeRcMGhxpN7kVqOBHICJsKiYu =df9z -END PGP SIGNATURE- ::: subversion-deps-1.5.9.tar.gz ::: -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) iEYEABECAAYFAkz4F3YACgkQokEGqRcG/W55bACePkiq3tLax9+mQ/2NifEe+Nzb dVcAnRPJTMfZgBCV/HbdBnJERR+FBD5o =A2u7 -END PGP SIGNATURE- ::: subversion-deps-1.5.9.tar.bz2 ::: -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) iEYEABECAAYFAkz4F3gACgkQokEGqRcG/W73awCg0IdiPcuvnPcyvpBreMScXVbz exEAoNDVkIX+b3J7uSuDTlXwvCTneH1a =17vb -END PGP SIGNATURE- -- C. Michael Pilato cmpil...@collab.net CollabNet www.collab.net Distributed Development On Demand
Re: 1.7.x - merge now accesses all files in WC?
[Daniel Becroft] I've just managed to build/install trunk on my ubuntu box at home (first application I've ever compiled on it - yey!). What debugging tools would you recommend to investigate this further? I've seen output posted that lists function names, and time spent on each. The obvious start is 'strace', as in 'strace svn merge ...'. It spits out every system call. There's a lot of noise up front as it's loading shared libraries and such, but it should still be obvious what we're doing when crawling the tree (stat / lstat, open, etc.). -- Peter Samuelson | org-tld!p12n!peter | http://p12n.org/
Implementations of svn_diff_fns_t?
Hi, This question came up during recent discussion about the diff-optimizations-tokens branch [1]: What are the known implementors of svn_diff_fns_t, the vtable of svn_diff callback functions in subversion/include/svn_diff.h? Besides the internal diff_memory.c and diff_file.c that is. Are there other implementations out there of these callbacks? IDE plugins, third-party tools, ...? Cheers, -- Johan [1] http://svn.haxx.se/dev/archive-2010-12/0057.shtml
Re: subversion cross compile (arm)
Hi! I continued the work on my issue. It seems to be a memory allocation or over-writing problem. There is the section (see between HEADER_TEXT and HEADER_TEXT OK) where it calling representation_string, which has to generate the 'text: ...' string. I printed out the input parameters. Later, there is a totally different string (at the parsing section, at the bottom of the log). If I'm changing the order of the debug print lines (rev, offset, size, etc) The value of these are changing too (everything else is the same). Ie: If there is only the rev: ... debug line, the result is: rev 4618626049922564096 My imagine is that, this is a memory corruption bug, but I'm unable to compile valgrind, or any other high level debug utility to my development board. Can anybody help me in debugging and fixing??? Regards, András / # svn mkdir file:///var/svn/testrepo/xxx -m aaa fs_fs: [LINE 2082] calling svn_fs_fs__read_noderev fs_fs: [LINE 2140] calling read_rep_offsets '0 0 4 4 2d2977d1c96f487abe4a1e202dd03b4e' read_rep_offsets: [LINE 1947] '0 0 4 4 2d2977d1c96f487abe4a1e202dd03b4e' read_rep_offsets: [LINE 1956] '0' read_rep_offsets: [LINE 1973] '0' read_rep_offsets: [LINE 1984] '4' read_rep_offsets: [LINE 1995] '4' read_rep_offsets: [LINE 2009] '2d2977d1c96f487abe4a1e202dd03b4e' apr_file_open: '/var/svn/testrepo/db/transactions/0-0.txn/node.0.0' Call svn_fsfs__write_noderev in svn_fs_fs__put_node_revision [LINE 2390] svn_fsfs__write_noderev HEADER_TEXT rev 0 offs 4618626049922564096 size 4 exp size 4 md5 2d2977d1c96f487abe4a1e202dd03b4e svn_fsfs__write_noderev HEADER_TEXT OK fs_fs: [LINE 2082] calling svn_fs_fs__read_noderev fs_fs: [LINE 2140] calling read_rep_offsets '0 4 4 531704 (null)' read_rep_offsets: [LINE 1947] '0 4 4 531704 (null)' read_rep_offsets: [LINE 1956] '0' read_rep_offsets: [LINE 1973] '4' read_rep_offsets: [LINE 1984] '4' read_rep_offsets: [LINE 1995] '531704' subversion/libsvn_fs_fs/fs_fs.c:2239: (apr_err=160004) svn: Corrupt node-revision '0.0.t0-0' subversion/libsvn_fs_fs/fs_fs.c:2006: (apr_err=160004) svn: Malformed text representation offset line in node-rev read_rep_offsets: apr_strtok #4 last_string '' string '0' str '(null)' strlen(str) 6 (APR_MD5_DIGESTS IZE*2) 32 revision 0 offset 4 size 0 expsize 4 / # -- Takács András Skype: wakoond GTalk: wakoond MSN: wako...@freestart.hu 2010/12/2 Takács András wako...@gmail.com: Hi All, Hi daniel, Daniel, thanks a lot your help, I'm getting more closer to the origin of the issue. (I hope.) I CC'ed the dev@ list as well, so I'm summarize the problem again. I'm cross compiling subversion for an ARM9 based development board (mini2440). (You'll find my configure options below.) This is a totaly clean environbent. There are just the libraries of toolchain (codesourcery). I built apr, apr-util, zlib, and sqlite from subversion-deps package, and the subversion itself. It is version 1.6.15. I'm createing a new repository with svn create, and after it I'm trying svn mkdir file:///... The result is: svn: Commit failed (details follow): svn: Corrupt node-revision '0.0.t0-0' svn: Malformed text representation offset line in node-rev Based on Daniel's help, I started to patch fs_fs.c, and adding debug messages to it: / # svnadmin create /var/svn/testrepo === I'm creating the repository / # cat /var/svn/testrepo/db/revs/0/0 === Checking the revision file. Seems to be OK. PLAIN END ENDREP id: 0.0.r0/17 type: dir count: 0 text: 0 0 4 4 2d2977d1c96f487abe4a1e202dd03b4e cpath: / 17 107 / # svn mkdir file:///var/svn/testrepo/xxx -m m fs_fs: [LINE 2082] calling svn_fs_fs__read_noderev === It is the bottom of the get_node_revision_body function fs_fs: [LINE 2140] calling read_rep_offsets '0 0 4 4 2d2977d1c96f487abe4a1e202dd03b4e' read_rep_offsets: [LINE 1947] '0 0 4 4 2d2977d1c96f487abe4a1e202dd03b4e' == The string parameter of read_rep_offsets read_rep_offsets: [LINE 1956] '0' === Return str of first apr_strtok read_rep_offsets: [LINE 1973] '0' === Return str of second apr_strtok read_rep_offsets: [LINE 1984] '4' === Return str of third apr_strtok read_rep_offsets: [LINE 1995] '4' === Return str of fourth apr_strtok read_rep_offsets: [LINE 2009] '2d2977d1c96f487abe4a1e202dd03b4e' === Return str of fifth apr_strtok fs_fs: [LINE 2082] calling svn_fs_fs__read_noderev apr_strtok: [LINE 35] '0.0.t0-0' apr_strtok: [LINE 58] '0' apr_strtok: [LINE 35] '0.t0-0' apr_strtok: [LINE 58] '0' apr_strtok: [LINE 35] 't0-0' apr_strtok: [LINE 58] 't0-0' fs_fs: [LINE 2140] calling read_rep_offsets '0 4 4 4621479420036127952 (null)' read_rep_offsets: [LINE 1947] '0 4 4 4621479420036127952 (null)' read_rep_offsets: [LINE 1956] '0' read_rep_offsets: [LINE 1973] '4' read_rep_offsets: [LINE 1984] '4' read_rep_offsets: [LINE 1995] '4621479420036127952' subversion/libsvn_fs_fs/fs_fs.c:2239: (apr_err=160004) svn: Corrupt node-revision '0.0.t0-0' subversion/libsvn_fs_fs/fs_fs.c:2006: (apr_err=160004) svn: