Re: [PATCH] Add --dry-run flag to "svn update" client command
On Tuesday 15 March 2011 08:34 PM, Julian Foad wrote: On Wed, 2011-03-02, Arwin Arni wrote: In my effort to understand the delta editor API, I took it upon myself to try and implement the --dry-run flag for "svn update". http://subversion.tigris.org/issues/show_bug.cgi?id=2491 Hi Arwin. Kamesh asked for technical feedback about this patch so here's a review. In my view, a "dry run" mode for update is useful but not very important. The complexity of this implementation looks about what I'd expect, not too bad, but I think right now is maybe not a good time to add this because it does add significant complexity to code that we are trying to work on. Currently, externals are handled inside subversion/libsvn_client/externals.c by running checkout/switch. For a dry-run update to mimic a real update, the notifications have to be the same. Since some of these notifications are generated by the above mentioned checkout/switch runs, I have to implement dry-run for them also. I'll take this up as a follow-up exercise. Now, the dry-run will simply ignore any externals in the working copy. It's fine to handle externals in a follow-up patch. I see that '--parents' and '--set-depth' are not allowed in dry-run mode. What is the reason behind that? Is it because they seem to be difficult to implement, or you think they are not needed, or you intend to implement them in a follow-up patch, or something else? Well, the reason is that both --parents and --set-depth make permanent changes to the WC which will not be reported in the output at all.. If the user is passing these parameters, he has a fair idea of what these "invisible" changes are, and so shouldn't be adding them to a dry-run command. (Maybe we could just make them FALSE during a dry-run update so it doesn't throw those errors that I coded. The user will be oblivious to this, as the output isn't going to change in any way) The reason I'm concerned about orthogonality is because if some parts of the code don't support one feature, and other parts of the code don't support another feature, that can make tasks such as refactoring the code much more difficult. Not quite sure I understand what you mean.. Index: subversion/include/svn_wc.h === --- subversion/include/svn_wc.h (revision 1075841) +++ subversion/include/svn_wc.h (working copy) @@ -4990,7 +4990,8 @@ * @a notify_baton and the path of the restored file. @a notify_func may * be @c NULL if this notification is not required. If @a * use_commit_times is TRUE, then set restored files' timestamps to - * their last-commit-times. + * their last-commit-times. If @a dry_run is TRUE, only notifications are + * done. Using @a dry_run is meaningless if @a restore_files is FALSE. The last sentence is not clear. What does it mean in terms of the promises and requirements of this API? This is in svn_wc_crawl_revisions5. The value of dry_run is never referenced unless restore_files is TRUE. restore_files dry_runoutcome TRUETRUE dry-run restore TRUEFALSE actually restore FALSE TRUE/FALSE no restore Index: subversion/svn/main.c === @@ -1733,6 +1733,13 @@ [...] case opt_dry_run: +if (opt_state.parents) + { +err = svn_error_create +(SVN_ERR_CL_ARG_PARSING_ERROR, NULL, + _("Can't specify --parents with --dry-run")); +return svn_cmdline_handle_exit_error(err, pool, "svn: "); + } opt_state.dry_run = TRUE; break; It looks to me like this check will only work if "--parents" is specified before "--dry-run" on the command line, and not if they are specified the other way around. Also, by putting the check here in "main.c", this restriction applies to all commands, whereas I think this restriction should be specific to the "update" command, even if there are currently no other commands that accept both options. To fix those two problems, please move the check into the "update" command itself. Hmm.. I guess my solution of allowing the user to pass these combination of options, and making them a no-op would solve all these problems.. @@ -1764,6 +1771,14 @@ case opt_set_depth: +if (opt_state.dry_run) + { [...] Similarly here. + } TEST SUITE * subversion/tests/cmdline/svntest/actions.py (_post_update_checks) : New function to do the checks after an update. (run_and_verify_update): Accept a dry_run parameter. If TRUE, check that the dry run hasn't affected the wc. The phrase "the dry run" makes the reader ask, "What dry run?". I would suggest: "If TRUE, first do a dry run and check that it hasn't affected the WC." Okay..
[Noorul Islam K M] Re: [PATCH] Fix for issue 3826
I wonder, why this patch was not considered for fixing issue 3826. I could see a fix 1081799 which I see very similar to what I have done. Did I miss anything here? Thanks and Regards Noorul --- Begin Message --- Julian Foad writes: > On Thu, 2011-03-03 at 22:48 +0530, Noorul Islam K M wrote: > >> Julian Foad writes: >> >> > On Wed, 2011-03-02, Noorul Islam K M wrote: >> > >> >> Please find attached patch for issue 3826. All tests pass using 'make >> >> check' >> > [...] >> >> Index: subversion/svn/diff-cmd.c >> >> === >> >> --- subversion/svn/diff-cmd.c (revision 1076214) >> >> +++ subversion/svn/diff-cmd.c (working copy) >> >> @@ -324,8 +324,11 @@ >> >> return svn_error_createf(SVN_ERR_CL_ARG_PARSING_ERROR, NULL, >> >> _("Path '%s' not relative to base >> >> URLs"), >> >> path); >> >> + if (! svn_dirent_is_absolute(path)) >> >> +path = svn_relpath_canonicalize(path, iterpool); >> >> + else >> >> +path = svn_dirent_canonicalize(path, iterpool); > > Your new code here expects that 'path' could be either an absolute local > path (know as a 'dirent'), or a 'relpath' ... > >> >> >> >> - path = svn_relpath_canonicalize(path, iterpool); >> >>if (svn_path_is_url(old_target)) >> >> target1 = svn_path_url_add_component2(old_target, path, >> >> iterpool); > > ... but here (if old_target is a URL) the code requires that 'path' is a > relpath. So what happens if 'path' is a 'dirent'? Does this > 'url_add_component' line get executed (and if so it will produce wrong > results), or does it not get executed (and if not, why not)? > >From 'svn help diff' 2. Display the differences between OLD-TGT as it was seen in OLDREV and NEW-TGT as it was seen in NEWREV. PATHs, if given, are relative to OLD-TGT and NEW-TGT and restrict the output to differences for those paths. OLD-TGT and NEW-TGT may be working copy paths or URL[@REV]. NEW-TGT defaults to OLD-TGT if not specified. -r N makes OLDREV default to N, -r N:M makes OLDREV default to N and NEWREV default to M. I modified the patch a bit so that conversion to relpath happens only when user specified new-tgt or old-tgt. In other cases I think it can be absolute path. Attached is the modified patch. Thanks and Regards Noorul Index: subversion/tests/cmdline/diff_tests.py === --- subversion/tests/cmdline/diff_tests.py (revision 1079662) +++ subversion/tests/cmdline/diff_tests.py (working copy) @@ -3761,7 +3761,6 @@ '-c2', '--git') os.chdir(was_cwd) -@XFail() @Issue(3826) def diff_abs_localpath_from_wc_folder(sbox): "diff absolute localpath from wc folder" @@ -3770,9 +3769,10 @@ A_path = os.path.join(wc_dir, 'A') B_path = os.path.join(wc_dir, 'A', 'B') + B_abspath = os.path.abspath(B_path) os.chdir(os.path.abspath(A_path)) svntest.actions.run_and_verify_svn(None, None, [], 'diff', - os.path.abspath(B_path)) + B_abspath) #Run the tests Index: subversion/svn/diff-cmd.c === --- subversion/svn/diff-cmd.c (revision 1079662) +++ subversion/svn/diff-cmd.c (working copy) @@ -325,7 +325,11 @@ _("Path '%s' not relative to base URLs"), path); - path = svn_relpath_canonicalize(path, iterpool); + if ((strcmp(old_target, "") != 0) || (strcmp(new_target, "") != 0)) +path = svn_relpath_canonicalize(path, iterpool); + else +path = svn_dirent_canonicalize(path, iterpool); + if (svn_path_is_url(old_target)) target1 = svn_path_url_add_component2(old_target, path, iterpool); else --- End Message ---
Re: wc_db performance (was: wc_db API discussion)
On Mon, Mar 14, 2011 at 2:36 PM, Hyrum K Wright wrote: > On Sat, Mar 12, 2011 at 6:47 AM, Stefan Sperling wrote: >> On Fri, Mar 11, 2011 at 10:43:46PM -0500, Greg Stein wrote: >>> 2011/3/11 Branko Čibej : >>> >... >>> > For the second task, I think the first order of business is to change >>> > the wc-db tree crawler to do one query instead of zillions, or at least, >>> > where several queries are required, to do them all in one transaction. >>> >>> stsp has been working this recently. Killing the node walker, and >>> moving to table scans. >> >> Yes. So far, I've been working in the revision status code. >> There are two problems left to fix before I'll move on to the next task: >> >> - There are API layering issues (wc_db.c calls into node.c). >> This is related to the API discussions in the other thread >> so I'll follow up there. > > Before reading this thread, I saw the call into node.c, and have > subsequently removed it. > >> - The revision status code issues about 5 separate queries, >> which aren't combined via a transaction and don't use temporary tables. >> This is no worse than the previous code using the node walker, >> obviously :) But I'll look at fixing this so that the results >> returned correspond to the state of the DB as of the time the >> svn_wc__db_revision_status() call was made. > > I wrapped this API in a txn in r1081510. > >> For others who want to jump in and help, here is a list of places >> where the node walker is still being used. I'm not sure if we can >> eliminate it everywhere before release, but each of these should >> be looked at to see whether we can use an alternative approach to >> increase performance: >> >> subversion/libsvn_client/changelist.c >> subversion/libsvn_client/commit_util.c >> subversion/libsvn_client/info.c >> subversion/libsvn_client/merge.c >> subversion/libsvn_client/mergeinfo.c >> subversion/libsvn_client/prop_commands.c >> (This should be propget and propset. Proplist is already using >> queries involving temporary tables. Rewriting propget on top >> of the proplist code would be easy. Is anyone working on propget? I'll take stab at that if not. It would be useful for dealing with merge.c:merge_reintegrate_locked's use of svn_wc__node_walk_children(). Paul >> Propset needs more work.) >> subversion/libsvn_client/ra.c >> subversion/libsvn_wc/update_editor.c > > I'll take a gander at some of these, too. (But I'm not entirely sure > what I' gandering at or for...) > > -Hyrum >
Re: Subversion trunk (r1078338) HTTP(/WC?) performance problems?
On Tue, Mar 15, 2011 at 22:38, John Beranek wrote: > On 14/03/2011 19:52, Ivan Zhakov wrote: >> On Mon, Mar 14, 2011 at 22:47, John Beranek wrote: >>> On 14/03/2011 19:45, Ivan Zhakov wrote: On Mon, Mar 14, 2011 at 20:46, John Beranek wrote: > On 14/03/11 17:14, Philip Martin wrote: >> Ivan Zhakov writes: >> >>> I investigated your problem with high CPU usage and fixed in serf in >>> r1456. Could you please test it and confirm that problem with high CPU >>> load is fixed in your case? >> > > Hmm, your timings still don't match mine. I still see ra_serf being > (significantly) slower than ra_neon. With serf r1456 and SVN r1081484 > > Export over a LAN from a trunk server: > > trunk ra_neon: 1.42 > trunk ra_serf: 2.44 > > Export over localhost from a trunk server: > > trunk ra_neon: 1.69 > trunk ra_serf: 3.00 > > [I imagine this is a little slower because one host (a dual-core > workstation) is now both client and server] > > Cheers, Hi John, Does your server requires authentication? What is MaxKeepAliveRequests settings? >>> >>> Yes, and: >>> >>> Timeout 30 >>> KeepAlive On >>> MaxKeepAliveRequests 0 >>> KeepAliveTimeout 50 >>> >> >> Btw, I forgot to ask you: is gzip compression enabled on your server? > > Compression is _off_ on my trunk server. > That's also reduce performance for ra_serf, unfortunately mod_deflate has memory leaks when used with mod_dav_svn [1] [1] http://svn.haxx.se/dev/archive-2009-08/0274.shtml -- Ivan Zhakov
Re: wc_db performance
Greg Stein writes: > Why can't we simply put restrictions on those callbacks? I'm unclear > on this part. We seem to be unsure if such a restriction is a good idea. Suppose a client calls proplist, what can it do when it sees a property of interest? Can it delete the property? Modify the value? Run diff on the file? Run proplist on some other path? -- Philip
Re: wc_db performance
Branko Čibej writes: > On 15.03.2011 15:34, Philip Martin wrote: >> So with the temporary table approach the callback really has to use a >> separate database connection to read/write the database from within the >> callback. >> >> However I think that is also the case if we were to avoid the table and >> simply lock the main database; if just one connection was reused it >> might be attempt to reuse an SQLite statement. > > There's a trick we could use if that's a problem, namely: instead of > simply creating temporary tables by using the connection's default temp > database; create another temp *database* per query. It's anonymous, so > only the code that holds its handle knows about it, and there's no way > for the callback to access it. > > This would require a bit more code, however, the concept is a reusable > pattern (and the code for the temp database handling would be reusable, > too.) Sounds plausible. Is it valid for an application to call proplist and then in the callback call proplist again, say on a different part of the working copy? To support it I think we need to avoid using a fixed name for the temporary database, perhaps we could use a sequence number in the database connection structure? I thought we would have this problem with the name of the temporary table, but we get a database locked error first. -- Philip
Re: wc_db performance
On Tue, Mar 15, 2011 at 10:34, Philip Martin wrote: >... > So with the temporary table approach the callback really has to use a > separate database connection to read/write the database from within the > callback. > > However I think that is also the case if we were to avoid the table and > simply lock the main database; if just one connection was reused it > might be attempt to reuse an SQLite statement. And note that we work really hard to use a single SDB connection. We re-use that as much as we can. I don't even know how we'd start to bring in more connections. Maybe under the covers, wc_db could open specialized connections. But, really? Why can't we simply put restrictions on those callbacks? I'm unclear on this part. Cheers, -g
Re: Subversion trunk (r1078338) HTTP(/WC?) performance problems?
On 14/03/2011 19:52, Ivan Zhakov wrote: > On Mon, Mar 14, 2011 at 22:47, John Beranek wrote: >> On 14/03/2011 19:45, Ivan Zhakov wrote: >>> On Mon, Mar 14, 2011 at 20:46, John Beranek wrote: On 14/03/11 17:14, Philip Martin wrote: > Ivan Zhakov writes: > >> I investigated your problem with high CPU usage and fixed in serf in >> r1456. Could you please test it and confirm that problem with high CPU >> load is fixed in your case? > Hmm, your timings still don't match mine. I still see ra_serf being (significantly) slower than ra_neon. With serf r1456 and SVN r1081484 Export over a LAN from a trunk server: trunk ra_neon: 1.42 trunk ra_serf: 2.44 Export over localhost from a trunk server: trunk ra_neon: 1.69 trunk ra_serf: 3.00 [I imagine this is a little slower because one host (a dual-core workstation) is now both client and server] Cheers, >>> Hi John, >>> >>> Does your server requires authentication? What is MaxKeepAliveRequests >>> settings? >> >> Yes, and: >> >> Timeout 30 >> KeepAlive On >> MaxKeepAliveRequests 0 >> KeepAliveTimeout 50 >> > > Btw, I forgot to ask you: is gzip compression enabled on your server? Compression is _off_ on my trunk server. John. -- John Beranek To generalise is to be an idiot. http://redux.org.uk/ -- William Blake smime.p7s Description: S/MIME Cryptographic Signature
Re: svn commit: r1081892 - in /subversion/trunk/subversion: include/private/svn_io_private.h libsvn_subr/io.c libsvn_wc/adm_ops.c libsvn_wc/questions.c libsvn_wc/wc.h tests/cmdline/revert_tests.py
phi...@apache.org writes: > Author: philip > Date: Tue Mar 15 18:33:52 2011 > New Revision: 1081892 > > URL: http://svn.apache.org/viewvc?rev=1081892&view=rev > Log: > Make revert change file permissions on the basis of the current magic > properties and the current permissions, rather than on any change in > the magic permissions. This means file permissions follow the same > rules as file content, and that permissions-only reverts occur. > > * subversion/include/private/svn_private_io.h: New. > > * subversion/libsvn_subr/io.c > (): Include svn_private_io.h. > (svn_io__is_finfo_read_only, svn_io__is_executable): New. > (svn_io_is_file_executable): Call svn_io__is_executable. > > * subversion/libsvn_wc/wc.h > (svn_wc__internal_file_modified_p): New. > > * subversion/libsvn_wc/questions.c > (): Include svn_private_io.h. > (svn_wc__internal_file_modified_p): New. > (svn_wc__internal_text_modified_p): Call svn_wc__internal_file_modified_p. > > * subversion/libsvn_wc/adm_ops.c > (new_revert_internal): Use svn_wc__internal_file_modified_p to >determine whether to reset contents or permissions. > > * subversion/tests/cmdline/revert_tests.py > (revert_permissions_only): New, marked XFail but passes with new revert. > (test_list): Add revert_permissions_only. The implementation is a little ugly, a private svn_io API. In the past a file with text and permission changes would have its permissions reset by revert, and a file with magic property changes would have its permissions reset by revert. However a file with just permission changes would not have its permission reset by revert. The new code I'm working on will now revert permissions in that last case. This could be viewed as fixing a bug, but I suppose some people might not like it. Permission-only changes don't show up in status. Thoughts? -- Philip
Re: wc_db performance
On 15.03.2011 15:34, Philip Martin wrote: > Philip Martin writes: > >> Branko Čibej writes: >> >>> The temporary table is: >>> >>> * in a different database, so the read lock on it does not affect >>> the main wc-db; >>> * per-connection, so even the same process using a different >>> database connection will not even see that temporary table. >> OK, so the process is: >> >> - read lock on main database >> - write lock on temporary database >> - populate temporary table >> - release write lock >> - release read lock >> - read lock on temporary database >> - make callbacks >> - release read lock >> >> without the temporary table it could be: >> >> - read lock on main database >> - make callbacks >> - release lock >> >> What we gain is that the callback can modify the main database, because >> there is no read lock. It still can't modify it using any functions >> that require write access to the temporary database. >> >> What we lose is that the callback cannot call any "read-only" functions >> that use temporary tables because the step that requires a write lock >> will fail. > So with the temporary table approach the callback really has to use a > separate database connection to read/write the database from within the > callback. > > However I think that is also the case if we were to avoid the table and > simply lock the main database; if just one connection was reused it > might be attempt to reuse an SQLite statement. There's a trick we could use if that's a problem, namely: instead of simply creating temporary tables by using the connection's default temp database; create another temp *database* per query. It's anonymous, so only the code that holds its handle knows about it, and there's no way for the callback to access it. This would require a bit more code, however, the concept is a reusable pattern (and the code for the temp database handling would be reusable, too.) -- Brane
Re: [PATCH] Add --dry-run flag to "svn update" client command
On Wed, 2011-03-02, Arwin Arni wrote: > In my effort to understand the delta editor API, I took it upon myself > to try and implement the --dry-run flag for "svn update". > http://subversion.tigris.org/issues/show_bug.cgi?id=2491 Hi Arwin. Kamesh asked for technical feedback about this patch so here's a review. In my view, a "dry run" mode for update is useful but not very important. The complexity of this implementation looks about what I'd expect, not too bad, but I think right now is maybe not a good time to add this because it does add significant complexity to code that we are trying to work on. > Currently, externals are handled inside > subversion/libsvn_client/externals.c by running checkout/switch. For a > dry-run update to mimic a real update, the notifications have to be the > same. Since some of these notifications are generated by the above > mentioned checkout/switch runs, I have to implement dry-run for them > also. I'll take this up as a follow-up exercise. Now, the dry-run will > simply ignore any externals in the working copy. It's fine to handle externals in a follow-up patch. I see that '--parents' and '--set-depth' are not allowed in dry-run mode. What is the reason behind that? Is it because they seem to be difficult to implement, or you think they are not needed, or you intend to implement them in a follow-up patch, or something else? The reason I'm concerned about orthogonality is because if some parts of the code don't support one feature, and other parts of the code don't support another feature, that can make tasks such as refactoring the code much more difficult. > Index: subversion/include/svn_wc.h > === > --- subversion/include/svn_wc.h (revision 1075841) > +++ subversion/include/svn_wc.h (working copy) > @@ -4990,7 +4990,8 @@ > * @a notify_baton and the path of the restored file. @a notify_func may > * be @c NULL if this notification is not required. If @a > * use_commit_times is TRUE, then set restored files' timestamps to > - * their last-commit-times. > + * their last-commit-times. If @a dry_run is TRUE, only notifications are > + * done. Using @a dry_run is meaningless if @a restore_files is FALSE. The last sentence is not clear. What does it mean in terms of the promises and requirements of this API? > Index: subversion/svn/main.c > === > @@ -1733,6 +1733,13 @@ > [...] >case opt_dry_run: > +if (opt_state.parents) > + { > +err = svn_error_create > +(SVN_ERR_CL_ARG_PARSING_ERROR, NULL, > + _("Can't specify --parents with --dry-run")); > +return svn_cmdline_handle_exit_error(err, pool, "svn: "); > + } > opt_state.dry_run = TRUE; > break; It looks to me like this check will only work if "--parents" is specified before "--dry-run" on the command line, and not if they are specified the other way around. Also, by putting the check here in "main.c", this restriction applies to all commands, whereas I think this restriction should be specific to the "update" command, even if there are currently no other commands that accept both options. To fix those two problems, please move the check into the "update" command itself. > @@ -1764,6 +1771,14 @@ >case opt_set_depth: > +if (opt_state.dry_run) > + { > [...] Similarly here. > + } > TEST SUITE > > * subversion/tests/cmdline/svntest/actions.py > (_post_update_checks) : New function to do the checks after an update. > (run_and_verify_update): Accept a dry_run parameter. If TRUE, check that >the dry run hasn't affected the wc. The phrase "the dry run" makes the reader ask, "What dry run?". I would suggest: "If TRUE, first do a dry run and check that it hasn't affected the WC." > * subversion/tests/cmdline/changelist_tests.py > subversion/tests/cmdline/externals_tests.py > subversion/tests/cmdline/prop_tests.py > subversion/tests/cmdline/update_tests.py > subversion/tests/cmdline/resolved_tests.py > subversion/tests/cmdline/trans_tests.py > subversion/tests/cmdline/switch_tests.py > subversion/tests/cmdline/stat_tests.py > subversion/tests/cmdline/depth_tests.py > > (Tests) : Fix callers of run_and_verify_update. Rather than "Fix callers", say what the change is: "Pass dry_run=True to some callers and dry_run=False to other callers ...". Why do some callers pass True and some pass False? I guess it's because some callers use externals or --parents or --set-depth, and none of those work properly with dry-run, so those are passing False. But we could take a different approach. Rather than the caller deciding whether a dry-run check is going to work, we could let run_and_verify_update() decide to perform the additional check automatically when it can. If any of the args
Re: svn commit: r1079400 - /subversion/trunk/subversion/tests/cmdline/log_tests.py
On Tue, Mar 15, 2011 at 9:56 AM, Daniel Shahaf wrote: > Paul Burba wrote on Tue, Mar 15, 2011 at 09:38:31 -0400: >> On Tue, Mar 15, 2011 at 12:24 AM, Daniel Shahaf >> wrote: >> > pbu...@apache.org wrote on Tue, Mar 08, 2011 at 15:46:10 -: >> >> +++ subversion/trunk/subversion/tests/cmdline/log_tests.py Tue Mar 8 >> >> 15:46:09 2011 >> >> @@ -1148,9 +1148,9 @@ def check_merge_results(log_chain, expec >> >> >> >> # Check to see if the number and values of the revisions is correct >> >> for log in log_chain: >> >> - if (log['revision'] not in expected_merges >> >> - and (expected_reverse_merges is not None >> >> - and log['revision'] not in expected_reverse_merges)): >> >> + if not ((expected_merges and log['revision'] in expected_merges) >> >> + or (expected_reverse_merges >> >> + and log['revision'] in expected_reverse_merges)): >> > >> > If EXPECTED_MERGES and EXPECTED_REVERSE_MERGES are both None, >> > then the if() would trigger --- and I don't think that's the >> > intention. >> >> Hi Daniel, >> >> It is the intention. If EXPECTED_MERGES and EXPECTED_REVERSE_MERGES >> are both None, then the caller believes that no merged revisions >> (normal or reverse) are present. However, there *is* something in the >> LOG_CHAIN, so there is an error. Admittedly, none of the present >> callers pass EXPECTED_MERGES=None and EXPECTED_REVERSE_MERGES=None, >> but we might have reason to do so in the future. >> >> Paul Possibly I'm abusing a Python convention, but... > I see; I assumed that passing None means "I don't care about this piece > of information; do not attempt to validate it", That makes sense for a function like svntest/actions.py:run_and_verify_svn2(), where if EXPECTED_STDOUT is None, we don't check stdout. That function's core purpose is to invoke main.run_svn(). The caller may or may not care about confirming the stdout, so EXPECTED_STDOUT is justifiably optional. On the other hand, log_tests.py:check_merge_results() has only one purpose, to check that the expected '(Reverse) Merged via' headers match the actual output. That's it. Why would we ever want to ignore either? All that would enable us to do is write a test that spuriously passes. Paul > and that callers who > believe there are no merged revisions would pass an empty dict/list. > > Thanks for the clarification, > > Daniel >
Re: wc_db performance
Philip Martin writes: > Branko Čibej writes: > >> The temporary table is: >> >> * in a different database, so the read lock on it does not affect >> the main wc-db; >> * per-connection, so even the same process using a different >> database connection will not even see that temporary table. > > OK, so the process is: > > - read lock on main database > - write lock on temporary database > - populate temporary table > - release write lock > - release read lock > - read lock on temporary database > - make callbacks > - release read lock > > without the temporary table it could be: > > - read lock on main database > - make callbacks > - release lock > > What we gain is that the callback can modify the main database, because > there is no read lock. It still can't modify it using any functions > that require write access to the temporary database. > > What we lose is that the callback cannot call any "read-only" functions > that use temporary tables because the step that requires a write lock > will fail. So with the temporary table approach the callback really has to use a separate database connection to read/write the database from within the callback. However I think that is also the case if we were to avoid the table and simply lock the main database; if just one connection was reused it might be attempt to reuse an SQLite statement. -- Philip
Re: wc_db performance
Branko Čibej writes: > The temporary table is: > > * in a different database, so the read lock on it does not affect > the main wc-db; > * per-connection, so even the same process using a different > database connection will not even see that temporary table. OK, so the process is: - read lock on main database - write lock on temporary database - populate temporary table - release write lock - release read lock - read lock on temporary database - make callbacks - release read lock without the temporary table it could be: - read lock on main database - make callbacks - release lock What we gain is that the callback can modify the main database, because there is no read lock. It still can't modify it using any functions that require write access to the temporary database. What we lose is that the callback cannot call any "read-only" functions that use temporary tables because the step that requires a write lock will fail. -- Philip
Re: wc_db performance
On Tue, Mar 15, 2011 at 12:52:42PM +, Philip Martin wrote: > Yes, by the time we invoke the callbacks we have a read-lock. If we > simply ran a single transaction and made callbacks within the > transaction then it would have the same effect. I don't see what the > temporary table achieves. The temp tables live in a separate database. Without them, lock contention is on a single database. With them, the contention situation is different -- the callback cannot read from or write to the temp table. At least that is how I understand this text: http://sqlite.org/lang_createtable.html: If the "TEMP" or "TEMPORARY" keyword occurs between the "CREATE" and "TABLE" then the new table is created in the temp database. It is an error to specify both a and the TEMP or TEMPORARY keyword, unless the is "temp". If no database name is specified and the TEMP keyword is not present then the table is created in the main database. So we use the temp table to create a separate database that contains the data we need to present to the callback, and which is unaffected by whatever else the callback might be doing. Does that make sense?
RE: Race condition in libsvn_client code?
> -Original Message- > From: Philip Martin [mailto:philip.mar...@wandisco.com] > Sent: dinsdag 15 maart 2011 14:52 > To: C. Michael Pilato > Cc: Subversion Development > Subject: Re: Race condition in libsvn_client code? > > "C. Michael Pilato" writes: > > > [Tweaking Subject: for (hopefully) additional visibility.] > > > > On 03/15/2011 09:15 AM, Daniel Shahaf wrote: > >> C. Michael Pilato wrote on Tue, Mar 15, 2011 at 09:05:56 -0400: > >>> On 03/15/2011 12:34 AM, Daniel Shahaf wrote: > cmpil...@apache.org wrote on Tue, Mar 08, 2011 at 20:05:51 -: > > Author: cmpilato > > Date: Tue Mar 8 20:05:50 2011 > > New Revision: 1079508 > >>> > >>> [...] > >>> > > svn_client_commit4(svn_commit_info_t **commit_info_p, > > const apr_array_header_t *targets, > >>> > >>> [...] > >>> > > + /* Ensure that the original notification system is in place. */ > > + ctx->notify_func2 = notify_baton.orig_notify_func2; > > + ctx->notify_baton2 = notify_baton.orig_notify_baton2; > > + > > This is actually a race condition, isn't it? (for clients that call the > deprecated API while using CTX->notify_func2 in another thread (in > the > same or another API)) > > (Okay, so maybe we'll just let it live on. Presumably N other > deprecated wrappers do this too.) > >>> > >>> I hadn't considered that. We do alot of pointer swaps like this up in the > >>> command-line client code itself, but that's a bit different than doing so > >>> down in the libsvn_client library as in this case. There is one prior > >>> instance of us doing this kind of swap inside the libsvn_client library: in > >>> libsvn_client/copy.c:repos_to_wc_copy_single(). But I'd prefer mere > >>> precedent not to be our reason for allowing badness to persist in the > codebase. > >>> > >>> Got suggestions? > >> > >> Would this work? --- > >> > >> svn_client_ctx_t ctx2 = *ctx; > >> ctx2.notify_func2 = notify_baton.orig_notify_func2; > >> ctx2.notify_baton2 = notify_baton.orig_notify_baton2; > >> svn_client_commit5(ctx=&ctx2); > > It has a different problem, if the client modifies the context in a > callback those modifications will not persist. I don't think any of our callbacks receives the current ctx instance. But see my other mail: We store a svn_wc_context_t in the client context. So svn_client_ctx_t can't be shared between functions running on different threads anyway. Bert
Re: svn commit: r1073366 - in /subversion/trunk: notes/wc-ng/pristine-store subversion/libsvn_wc/wc-queries.sql subversion/libsvn_wc/wc_db_pristine.c
On Tue, Mar 15, 2011 at 16:49, Julian Foad wrote: > Ivan Zhakov wrote: >> On Tue, Mar 1, 2011 at 21:13, Julian Foad wrote: >> > Bert Huijben wrote: >> >> > -Original Message- >> >> > From: Julian Foad [mailto:julian.f...@wandisco.com] >> >> > >> >> > I'm not clear exactly what problem we would avoid by eliminating the >> >> > "select a unique name" step of this process. Is it what I describe >> >> > below at the end of note [1] - that a scanner might be more likely to >> >> > re-scan the content, and therefore more likely to cause a delay? >> >> >> >> No, the problem I try to avoid is >> >> * you create a file >> >> >> >> * you delete the file (after the virusscanner releases the file) >> >> * you rename a file to be at the old location >> >> >> >> While we really need something like >> >> * rename to a unique name. >> >> > >> original location> >> > >> > >From discussing on IRC we agree that the main concern is that this >> > involves too many separate filesystem operations, rather than any >> > specific problem with a particular step. >> > >> > I have committed it as is in r1075942, and would like to improve it as a >> > follow-up. >> > >> > Options for improvement: >> > >> > (a) Don't open a pristine file with FILE_SHARE_DELETE. Instead, >> > accept that an attempt to delete it may fail, and if that happens, leave >> > the there as an orphan. When adding a pristine file, if it already >> > exists on disk then just keep the copy that already exists. When >> > cleaning up orphan files, if a delete fails, just leave the file there. >> > Consider implementing automatic clean-up to prevent orphan files from >> > accumulating indefinitely. >> > >> > (b) Find or write a "rename to a unique name" function that operates >> > in a single step instead of a creating a new file and then overwriting >> > it. >> > >> > (c) Don't rename the file before deleting it; instead, just delete it. >> > On Windows, when adding a new file, if a file with that name already >> > exists, *then* rename the existing file before moving the new file into >> > place. (We can't just keep the existing file because it is pending >> > deletion and will disappear when the reader finishes reading it.) >> > >> > Pros and cons: >> > >> > (a) This would reduce the number of file system operations to a >> > minimum. It would involve bypassing APR to avoid the FILE_SHARE_DELETE >> > flag, which is not ideal but possible. >> > >> > (b) This would remove one of the file system operations. It would >> > involve writing a function similar to APR's fall-back implementation of >> > apr_file_mktemp() that exists for systems that do not provide a >> > "mkstemp" system API. I'm not sure if there are any concrete problems >> > with doing this sort of thing in "user space". >> > >> > (c) This would remove two of the file system operations. It sounds >> > straightforward. >> > >> > Comments? >> > >> >> (b) Is best option for. > > For ... what? for me :) >> It should be big problem to generate random >> name using current time and then try rename file to this name. Repeat >> on error. > > I assume you mean "it should not be a big problem...". > Yes, I meant "should not be a big problem". Sorry. -- Ivan Zhakov
Re: svn commit: r1079400 - /subversion/trunk/subversion/tests/cmdline/log_tests.py
Paul Burba wrote on Tue, Mar 15, 2011 at 09:38:31 -0400: > On Tue, Mar 15, 2011 at 12:24 AM, Daniel Shahaf > wrote: > > pbu...@apache.org wrote on Tue, Mar 08, 2011 at 15:46:10 -: > >> +++ subversion/trunk/subversion/tests/cmdline/log_tests.py Tue Mar 8 > >> 15:46:09 2011 > >> @@ -1148,9 +1148,9 @@ def check_merge_results(log_chain, expec > >> > >> # Check to see if the number and values of the revisions is correct > >> for log in log_chain: > >> - if (log['revision'] not in expected_merges > >> - and (expected_reverse_merges is not None > >> - and log['revision'] not in expected_reverse_merges)): > >> + if not ((expected_merges and log['revision'] in expected_merges) > >> + or (expected_reverse_merges > >> + and log['revision'] in expected_reverse_merges)): > > > > If EXPECTED_MERGES and EXPECTED_REVERSE_MERGES are both None, > > then the if() would trigger --- and I don't think that's the > > intention. > > Hi Daniel, > > It is the intention. If EXPECTED_MERGES and EXPECTED_REVERSE_MERGES > are both None, then the caller believes that no merged revisions > (normal or reverse) are present. However, there *is* something in the > LOG_CHAIN, so there is an error. Admittedly, none of the present > callers pass EXPECTED_MERGES=None and EXPECTED_REVERSE_MERGES=None, > but we might have reason to do so in the future. > > Paul I see; I assumed that passing None means "I don't care about this piece of information; do not attempt to validate it", and that callers who believe there are no merged revisions would pass an empty dict/list. Thanks for the clarification, Daniel
RE: svn commit: r1079508 - in /subversion/trunk/subversion: include/svn_client.h include/svn_wc.h libsvn_client/commit_util.c libsvn_client/deprecated.c svn/notify.c
> -Original Message- > From: Daniel Shahaf [mailto:d...@daniel.shahaf.name] > Sent: dinsdag 15 maart 2011 5:34 > To: dev@subversion.apache.org > Subject: Re: svn commit: r1079508 - in /subversion/trunk/subversion: > include/svn_client.h include/svn_wc.h libsvn_client/commit_util.c > libsvn_client/deprecated.c svn/notify.c > > cmpil...@apache.org wrote on Tue, Mar 08, 2011 at 20:05:51 -: > > Author: cmpilato > > Date: Tue Mar 8 20:05:50 2011 > > New Revision: 1079508 > > > > URL: http://svn.apache.org/viewvc?rev=1079508&view=rev > > Log: > > For issue #3752 ("Warn about committing copied directory at depth > > 'empty'"), introduce new notification types for overwriting and > > non-overwriting copies. Current, the command-line client treats these > > just as it does their non-copy-ful equivalents. > > > > * subversion/include/svn_wc.h > > (svn_wc_notify_action_t): Add new 'svn_wc_notify_commit_copied' and > > 'svn_wc_notify_commit_copied_replaced' notification types. > > > > * subversion/include/svn_client.h > > (svn_client_commit5): Update docstring to indicate that this > > function will transmit the new notification types, too. > > (svn_client_commit4): Explicitly note that this interface does *not* > > use the new notification types. > > > > * subversion/libsvn_client/commit_util.c > > (do_item_commit): Check for a non-NULL copyfrom_url on the commit > > item and, if found, use the new notification types rather than > > mere "added" and "replaced" notifications. > > > > * subversion/libsvn_client/deprecated.c > > (struct downgrade_commit_copied_notify_baton, > >downgrade_commit_copied_notify_func): New notification wrapper > > function and baton. > > (svn_client_commit4): Use downgrade_commit_copied_notify_func() > > wrapper as necessary to avoid sending unrecognized notification > > types to the caller. > > > > * subversion/svn/notify.c > > (notify): Handle svn_wc_notify_commit_copied as > > svn_wc_notify_commit_added; > svn_wc_notify_commit_copied_replaced > > as svn_wc_notify_commit_replaced. > > > > Modified: > > subversion/trunk/subversion/include/svn_client.h > > subversion/trunk/subversion/include/svn_wc.h > > subversion/trunk/subversion/libsvn_client/commit_util.c > > subversion/trunk/subversion/libsvn_client/deprecated.c > > subversion/trunk/subversion/svn/notify.c > > svn_error_t * > > svn_client_commit4(svn_commit_info_t **commit_info_p, > > const apr_array_header_t *targets, > > @@ -437,14 +474,33 @@ svn_client_commit4(svn_commit_info_t **c > > apr_pool_t *pool) > > { > >struct capture_baton_t cb; > > + struct downgrade_commit_copied_notify_baton notify_baton; > > + svn_error_t *err; > > + > > + notify_baton.orig_notify_func2 = ctx->notify_func2; > > + notify_baton.orig_notify_baton2 = ctx->notify_baton2; > > > >*commit_info_p = NULL; > >cb.info = commit_info_p; > >cb.pool = pool; > > > > - SVN_ERR(svn_client_commit5(targets, depth, keep_locks, > keep_changelists, > > - changelists, revprop_table, > > - capture_commit_info, &cb, ctx, pool)); > > + /* Swap out the notification system (if any) with a thin filtering > > + wrapper. */ > > + if (ctx->notify_func2) > > +{ > > + ctx->notify_func2 = downgrade_commit_copied_notify_func; > > + ctx->notify_baton2 = ¬ify_baton; > > +} > > + > > + err = svn_client_commit5(targets, depth, keep_locks, keep_changelists, > > + changelists, revprop_table, > > + capture_commit_info, &cb, ctx, pool); > > + > > + /* Ensure that the original notification system is in place. */ > > + ctx->notify_func2 = notify_baton.orig_notify_func2; > > + ctx->notify_baton2 = notify_baton.orig_notify_baton2; > > + > > This is actually a race condition, isn't it? (for clients that call the > deprecated API while using CTX->notify_func2 in another thread (in the > same or another API)) > > (Okay, so maybe we'll just let it live on. Presumably N other > deprecated wrappers do this too.) Sharing svn_client_ctx_t between different threads at the same time is already broken in 1.7. The client context contains a svn_wc_context_t instance, which is not thread safe. (I never assumed that it was safe to share the context instance between threads in the first place) Bert
Re: Race condition in libsvn_client code?
"C. Michael Pilato" writes: > [Tweaking Subject: for (hopefully) additional visibility.] > > On 03/15/2011 09:15 AM, Daniel Shahaf wrote: >> C. Michael Pilato wrote on Tue, Mar 15, 2011 at 09:05:56 -0400: >>> On 03/15/2011 12:34 AM, Daniel Shahaf wrote: cmpil...@apache.org wrote on Tue, Mar 08, 2011 at 20:05:51 -: > Author: cmpilato > Date: Tue Mar 8 20:05:50 2011 > New Revision: 1079508 >>> >>> [...] >>> > svn_client_commit4(svn_commit_info_t **commit_info_p, > const apr_array_header_t *targets, >>> >>> [...] >>> > + /* Ensure that the original notification system is in place. */ > + ctx->notify_func2 = notify_baton.orig_notify_func2; > + ctx->notify_baton2 = notify_baton.orig_notify_baton2; > + This is actually a race condition, isn't it? (for clients that call the deprecated API while using CTX->notify_func2 in another thread (in the same or another API)) (Okay, so maybe we'll just let it live on. Presumably N other deprecated wrappers do this too.) >>> >>> I hadn't considered that. We do alot of pointer swaps like this up in the >>> command-line client code itself, but that's a bit different than doing so >>> down in the libsvn_client library as in this case. There is one prior >>> instance of us doing this kind of swap inside the libsvn_client library: in >>> libsvn_client/copy.c:repos_to_wc_copy_single(). But I'd prefer mere >>> precedent not to be our reason for allowing badness to persist in the >>> codebase. >>> >>> Got suggestions? >> >> Would this work? --- >> >> svn_client_ctx_t ctx2 = *ctx; >> ctx2.notify_func2 = notify_baton.orig_notify_func2; >> ctx2.notify_baton2 = notify_baton.orig_notify_baton2; >> svn_client_commit5(ctx=&ctx2); It has a different problem, if the client modifies the context in a callback those modifications will not persist. -- Philip
Re: wc_db performance
On 15.03.2011 11:26, Philip Martin wrote: > Branko Čibej writes: > >> The restriction that you may not invoke a callback from within a sqlite >> transaction remains. That's what the temporary results tables are for -- >> they live outside transactions on the main wc-db. > I don't understand the temporary table approach. Taking the > temp__node_props_cache table as an example: it gets populated by a > transaction and lives beyond that transaction. The table then gets > queried by a second transaction and the callbacks are made while that > second transaction is in progress. So callbacks are still being invoked > from within a transaction. > > As far as I can see using a temporary table has made the overall "read > properties" operation into one that modifies the database, to create the > temporary table. I guess that by the time the callbacks are made the > database write-lock has been dropped, but there would have been no > write-lock if the temporary table were not used. The temporary table is: * in a different database, so the read lock on it does not affect the main wc-db; * per-connection, so even the same process using a different database connection will not even see that temporary table. -- Brane
Re: svn commit: r1073366 - in /subversion/trunk: notes/wc-ng/pristine-store subversion/libsvn_wc/wc-queries.sql subversion/libsvn_wc/wc_db_pristine.c
Ivan Zhakov wrote: > On Tue, Mar 1, 2011 at 21:13, Julian Foad wrote: > > Bert Huijben wrote: > >> > -Original Message- > >> > From: Julian Foad [mailto:julian.f...@wandisco.com] > >> > > >> > I'm not clear exactly what problem we would avoid by eliminating the > >> > "select a unique name" step of this process. Is it what I describe > >> > below at the end of note [1] - that a scanner might be more likely to > >> > re-scan the content, and therefore more likely to cause a delay? > >> > >> No, the problem I try to avoid is > >> * you create a file > >> > >> * you delete the file (after the virusscanner releases the file) > >> * you rename a file to be at the old location > >> > >> While we really need something like > >> * rename to a unique name. > >> >> original location> > > > > >From discussing on IRC we agree that the main concern is that this > > involves too many separate filesystem operations, rather than any > > specific problem with a particular step. > > > > I have committed it as is in r1075942, and would like to improve it as a > > follow-up. > > > > Options for improvement: > > > > (a) Don't open a pristine file with FILE_SHARE_DELETE. Instead, > > accept that an attempt to delete it may fail, and if that happens, leave > > the there as an orphan. When adding a pristine file, if it already > > exists on disk then just keep the copy that already exists. When > > cleaning up orphan files, if a delete fails, just leave the file there. > > Consider implementing automatic clean-up to prevent orphan files from > > accumulating indefinitely. > > > > (b) Find or write a "rename to a unique name" function that operates > > in a single step instead of a creating a new file and then overwriting > > it. > > > > (c) Don't rename the file before deleting it; instead, just delete it. > > On Windows, when adding a new file, if a file with that name already > > exists, *then* rename the existing file before moving the new file into > > place. (We can't just keep the existing file because it is pending > > deletion and will disappear when the reader finishes reading it.) > > > > Pros and cons: > > > > (a) This would reduce the number of file system operations to a > > minimum. It would involve bypassing APR to avoid the FILE_SHARE_DELETE > > flag, which is not ideal but possible. > > > > (b) This would remove one of the file system operations. It would > > involve writing a function similar to APR's fall-back implementation of > > apr_file_mktemp() that exists for systems that do not provide a > > "mkstemp" system API. I'm not sure if there are any concrete problems > > with doing this sort of thing in "user space". > > > > (c) This would remove two of the file system operations. It sounds > > straightforward. > > > > Comments? > > > > (b) Is best option for. For ... what? > It should be big problem to generate random > name using current time and then try rename file to this name. Repeat > on error. I assume you mean "it should not be a big problem...". - Julian
Re: svn commit: r1079400 - /subversion/trunk/subversion/tests/cmdline/log_tests.py
On Tue, Mar 15, 2011 at 12:24 AM, Daniel Shahaf wrote: > pbu...@apache.org wrote on Tue, Mar 08, 2011 at 15:46:10 -: >> Author: pburba >> Date: Tue Mar 8 15:46:09 2011 >> New Revision: 1079400 >> >> URL: http://svn.apache.org/viewvc?rev=1079400&view=rev >> Log: >> Follow-up to rr1076726, fix a flawed log -g test helper. >> >> * subversion/tests/cmdline/log_tests.py >> (check_merge_results): Account for the fact that the EXPECTED_MERGES arg >> might be None. Fix the check of expected merges so it doesn't spuriously >> pass when EXPECTED_REVSERSE_MERGES is none. >> > > EXPECTED_REVERSE_MERGES > >> Suggested by: danielsh >> >> Modified: >> subversion/trunk/subversion/tests/cmdline/log_tests.py >> >> Modified: subversion/trunk/subversion/tests/cmdline/log_tests.py >> URL: >> http://svn.apache.org/viewvc/subversion/trunk/subversion/tests/cmdline/log_tests.py?rev=1079400&r1=1079399&r2=1079400&view=diff >> == >> --- subversion/trunk/subversion/tests/cmdline/log_tests.py (original) >> +++ subversion/trunk/subversion/tests/cmdline/log_tests.py Tue Mar 8 >> 15:46:09 2011 >> @@ -1148,9 +1148,9 @@ def check_merge_results(log_chain, expec >> >> # Check to see if the number and values of the revisions is correct >> for log in log_chain: >> - if (log['revision'] not in expected_merges >> - and (expected_reverse_merges is not None >> - and log['revision'] not in expected_reverse_merges)): >> + if not ((expected_merges and log['revision'] in expected_merges) >> + or (expected_reverse_merges >> + and log['revision'] in expected_reverse_merges)): > > If EXPECTED_MERGES and EXPECTED_REVERSE_MERGES are both None, > then the if() would trigger --- and I don't think that's the > intention. Hi Daniel, It is the intention. If EXPECTED_MERGES and EXPECTED_REVERSE_MERGES are both None, then the caller believes that no merged revisions (normal or reverse) are present. However, there *is* something in the LOG_CHAIN, so there is an error. Admittedly, none of the present callers pass EXPECTED_MERGES=None and EXPECTED_REVERSE_MERGES=None, but we might have reason to do so in the future. Paul >> raise SVNUnexpectedLogs("Found unexpected revision %d" % >> log['revision'], log_chain) >> >> >
Race condition in libsvn_client code? (Was: svn commit: r1079508 ...)
[Tweaking Subject: for (hopefully) additional visibility.] On 03/15/2011 09:15 AM, Daniel Shahaf wrote: > C. Michael Pilato wrote on Tue, Mar 15, 2011 at 09:05:56 -0400: >> On 03/15/2011 12:34 AM, Daniel Shahaf wrote: >>> cmpil...@apache.org wrote on Tue, Mar 08, 2011 at 20:05:51 -: Author: cmpilato Date: Tue Mar 8 20:05:50 2011 New Revision: 1079508 >> >> [...] >> svn_client_commit4(svn_commit_info_t **commit_info_p, const apr_array_header_t *targets, >> >> [...] >> + /* Ensure that the original notification system is in place. */ + ctx->notify_func2 = notify_baton.orig_notify_func2; + ctx->notify_baton2 = notify_baton.orig_notify_baton2; + >>> >>> This is actually a race condition, isn't it? (for clients that call the >>> deprecated API while using CTX->notify_func2 in another thread (in the >>> same or another API)) >>> >>> (Okay, so maybe we'll just let it live on. Presumably N other >>> deprecated wrappers do this too.) >> >> I hadn't considered that. We do alot of pointer swaps like this up in the >> command-line client code itself, but that's a bit different than doing so >> down in the libsvn_client library as in this case. There is one prior >> instance of us doing this kind of swap inside the libsvn_client library: in >> libsvn_client/copy.c:repos_to_wc_copy_single(). But I'd prefer mere >> precedent not to be our reason for allowing badness to persist in the >> codebase. >> >> Got suggestions? > > Would this work? --- > > svn_client_ctx_t ctx2 = *ctx; > ctx2.notify_func2 = notify_baton.orig_notify_func2; > ctx2.notify_baton2 = notify_baton.orig_notify_baton2; > svn_client_commit5(ctx=&ctx2); -- C. Michael Pilato CollabNet <> www.collab.net <> Distributed Development On Demand signature.asc Description: OpenPGP digital signature
Re: svn commit: r1079508 - in /subversion/trunk/subversion: include/svn_client.h include/svn_wc.h libsvn_client/commit_util.c libsvn_client/deprecated.c svn/notify.c
C. Michael Pilato wrote on Tue, Mar 15, 2011 at 09:05:56 -0400: > On 03/15/2011 12:34 AM, Daniel Shahaf wrote: > > cmpil...@apache.org wrote on Tue, Mar 08, 2011 at 20:05:51 -: > >> Author: cmpilato > >> Date: Tue Mar 8 20:05:50 2011 > >> New Revision: 1079508 > > [...] > > >> svn_client_commit4(svn_commit_info_t **commit_info_p, > >> const apr_array_header_t *targets, > > [...] > > >> + /* Ensure that the original notification system is in place. */ > >> + ctx->notify_func2 = notify_baton.orig_notify_func2; > >> + ctx->notify_baton2 = notify_baton.orig_notify_baton2; > >> + > > > > This is actually a race condition, isn't it? (for clients that call the > > deprecated API while using CTX->notify_func2 in another thread (in the > > same or another API)) > > > > (Okay, so maybe we'll just let it live on. Presumably N other > > deprecated wrappers do this too.) > > I hadn't considered that. We do alot of pointer swaps like this up in the > command-line client code itself, but that's a bit different than doing so > down in the libsvn_client library as in this case. There is one prior > instance of us doing this kind of swap inside the libsvn_client library: in > libsvn_client/copy.c:repos_to_wc_copy_single(). But I'd prefer mere > precedent not to be our reason for allowing badness to persist in the > codebase. > > Got suggestions? Would this work? --- svn_client_ctx_t ctx2 = *ctx; ctx2.notify_func2 = notify_baton.orig_notify_func2; ctx2.notify_baton2 = notify_baton.orig_notify_baton2; svn_client_commit5(ctx=&ctx2);
Re: svn commit: r1079508 - in /subversion/trunk/subversion: include/svn_client.h include/svn_wc.h libsvn_client/commit_util.c libsvn_client/deprecated.c svn/notify.c
On 03/15/2011 12:34 AM, Daniel Shahaf wrote: > cmpil...@apache.org wrote on Tue, Mar 08, 2011 at 20:05:51 -: >> Author: cmpilato >> Date: Tue Mar 8 20:05:50 2011 >> New Revision: 1079508 [...] >> svn_client_commit4(svn_commit_info_t **commit_info_p, >> const apr_array_header_t *targets, [...] >> + /* Ensure that the original notification system is in place. */ >> + ctx->notify_func2 = notify_baton.orig_notify_func2; >> + ctx->notify_baton2 = notify_baton.orig_notify_baton2; >> + > > This is actually a race condition, isn't it? (for clients that call the > deprecated API while using CTX->notify_func2 in another thread (in the > same or another API)) > > (Okay, so maybe we'll just let it live on. Presumably N other > deprecated wrappers do this too.) I hadn't considered that. We do alot of pointer swaps like this up in the command-line client code itself, but that's a bit different than doing so down in the libsvn_client library as in this case. There is one prior instance of us doing this kind of swap inside the libsvn_client library: in libsvn_client/copy.c:repos_to_wc_copy_single(). But I'd prefer mere precedent not to be our reason for allowing badness to persist in the codebase. Got suggestions? -- C. Michael Pilato CollabNet <> www.collab.net <> Distributed Development On Demand signature.asc Description: OpenPGP digital signature
Re: svn commit: r1079592 - /subversion/trunk/subversion/svn/main.c
On 03/15/2011 12:43 AM, Daniel Shahaf wrote: > s...@apache.org wrote on Tue, Mar 08, 2011 at 22:51:52 -: >> Author: stsp >> Date: Tue Mar 8 22:51:51 2011 >> New Revision: 1079592 >> >> URL: http://svn.apache.org/viewvc?rev=1079592&view=rev >> Log: >> * subversion/svn/main.c >> (svn_cl__cmd_table): Document foreign repos merges in merge help text. >> >> Modified: >> subversion/trunk/subversion/svn/main.c >> >> Modified: subversion/trunk/subversion/svn/main.c >> URL: >> http://svn.apache.org/viewvc/subversion/trunk/subversion/svn/main.c?rev=1079592&r1=1079591&r2=1079592&view=diff >> == >> --- subversion/trunk/subversion/svn/main.c (original) >> +++ subversion/trunk/subversion/svn/main.c Tue Mar 8 22:51:51 2011 >> @@ -893,6 +893,14 @@ const svn_opt_subcommand_desc2_t svn_cl_ >> "\n" >> " svn diff ^/trunk@500 ^/feature@HEAD\n" >> "\n" >> + " Note that a 2-URL merge can also merge from foreign >> repositories.\n" >> + " While SOURCE1 and SOURCE2 must both come from the same >> repository,\n" >> + " TARGET_WCPATH may come from a different repository than the >> sources.\n" >> + " However, there are some caveats. Most notably, copies made in >> the\n" >> + " merge source will be transformed into plain additions in the >> merge\n" >> + " target, since the copyfrom information is only valid within >> the\n" >> + " source repository.\n" > > s/within the source repository/within a single repository/ > (or other phrasing that captures the "Doesn't cross repository boundaries" > semantics) > > Also: is "copyfrom information" a user-facing phrase? I would say not. But I'd also say that we don't need to explain the technical reason that copies are downgraded to mere additions. Just state the fact and move on. -- C. Michael Pilato CollabNet <> www.collab.net <> Distributed Development On Demand signature.asc Description: OpenPGP digital signature
Re: wc_db performance
Stefan Sperling writes: > On Tue, Mar 15, 2011 at 10:26:15AM +, Philip Martin wrote: >> Branko Čibej writes: >> >> > The restriction that you may not invoke a callback from within a sqlite >> > transaction remains. That's what the temporary results tables are for -- >> > they live outside transactions on the main wc-db. >> >> I don't understand the temporary table approach. Taking the >> temp__node_props_cache table as an example: it gets populated by a >> transaction and lives beyond that transaction. The table then gets >> queried by a second transaction and the callbacks are made while that >> second transaction is in progress. So callbacks are still being invoked >> from within a transaction. > > It works around the infinite loop problem that happens when the > callback inserts rows that end up being picked up by the proplist query. > Which is silly for this callback to do, but then again we somehow agreed > that the callbacks are free to do virtually anything. (I still think > putting newly documented restrictions on them now is fine and would make > our life easier... *shrug*). I'm still confused. I thought the infinite loop problem affected the old multiple transaction code. If we simply ran a single transaction and made callbacks then that problem would not exist. We would have the same limitations that apply to the temporary table approach: the callback cannot write to the database because a transaction is in progress. >> As far as I can see using a temporary table has made the overall "read >> properties" operation into one that modifies the database, to create the >> temporary table. I guess that by the time the callbacks are made the >> database write-lock has been dropped, but there would have been no >> write-lock if the temporary table were not used. > > I think the temporary table gets put into a separate database file > (or memory, depending on the temp_store pragma or whatsitcalled). > If so, a read lock on the primary db should suffice, shouldn't it? Yes, by the time we invoke the callbacks we have a read-lock. If we simply ran a single transaction and made callbacks within the transaction then it would have the same effect. I don't see what the temporary table achieves. -- Philip
Re: svn commit: r1073366 - in /subversion/trunk: notes/wc-ng/pristine-store subversion/libsvn_wc/wc-queries.sql subversion/libsvn_wc/wc_db_pristine.c
On Tue, Mar 1, 2011 at 21:13, Julian Foad wrote: > Bert Huijben wrote: >> > -Original Message- >> > From: Julian Foad [mailto:julian.f...@wandisco.com] >> > >> > I'm not clear exactly what problem we would avoid by eliminating the >> > "select a unique name" step of this process. Is it what I describe >> > below at the end of note [1] - that a scanner might be more likely to >> > re-scan the content, and therefore more likely to cause a delay? >> >> No, the problem I try to avoid is >> * you create a file >> >> * you delete the file (after the virusscanner releases the file) >> * you rename a file to be at the old location >> >> While we really need something like >> * rename to a unique name. >> > original location> > > >From discussing on IRC we agree that the main concern is that this > involves too many separate filesystem operations, rather than any > specific problem with a particular step. > > I have committed it as is in r1075942, and would like to improve it as a > follow-up. > > Options for improvement: > > (a) Don't open a pristine file with FILE_SHARE_DELETE. Instead, > accept that an attempt to delete it may fail, and if that happens, leave > the there as an orphan. When adding a pristine file, if it already > exists on disk then just keep the copy that already exists. When > cleaning up orphan files, if a delete fails, just leave the file there. > Consider implementing automatic clean-up to prevent orphan files from > accumulating indefinitely. > > (b) Find or write a "rename to a unique name" function that operates > in a single step instead of a creating a new file and then overwriting > it. > > (c) Don't rename the file before deleting it; instead, just delete it. > On Windows, when adding a new file, if a file with that name already > exists, *then* rename the existing file before moving the new file into > place. (We can't just keep the existing file because it is pending > deletion and will disappear when the reader finishes reading it.) > > Pros and cons: > > (a) This would reduce the number of file system operations to a > minimum. It would involve bypassing APR to avoid the FILE_SHARE_DELETE > flag, which is not ideal but possible. > > (b) This would remove one of the file system operations. It would > involve writing a function similar to APR's fall-back implementation of > apr_file_mktemp() that exists for systems that do not provide a > "mkstemp" system API. I'm not sure if there are any concrete problems > with doing this sort of thing in "user space". > > (c) This would remove two of the file system operations. It sounds > straightforward. > > Comments? > (b) Is best option for. It should be big problem to generate random name using current time and then try rename file to this name. Repeat on error. -- Ivan Zhakov
Re: wc_db performance
On Tue, Mar 15, 2011 at 10:26:15AM +, Philip Martin wrote: > Branko Čibej writes: > > > The restriction that you may not invoke a callback from within a sqlite > > transaction remains. That's what the temporary results tables are for -- > > they live outside transactions on the main wc-db. > > I don't understand the temporary table approach. Taking the > temp__node_props_cache table as an example: it gets populated by a > transaction and lives beyond that transaction. The table then gets > queried by a second transaction and the callbacks are made while that > second transaction is in progress. So callbacks are still being invoked > from within a transaction. It works around the infinite loop problem that happens when the callback inserts rows that end up being picked up by the proplist query. Which is silly for this callback to do, but then again we somehow agreed that the callbacks are free to do virtually anything. (I still think putting newly documented restrictions on them now is fine and would make our life easier... *shrug*). > As far as I can see using a temporary table has made the overall "read > properties" operation into one that modifies the database, to create the > temporary table. I guess that by the time the callbacks are made the > database write-lock has been dropped, but there would have been no > write-lock if the temporary table were not used. I think the temporary table gets put into a separate database file (or memory, depending on the temp_store pragma or whatsitcalled). If so, a read lock on the primary db should suffice, shouldn't it?
Re: Another svnadmin checksum mismatch
Ignore this thread; as Philip says it works fine with SVN_DBG_OUTPUT=stderr. Thanks. Philip Martin wrote on Tue, Mar 15, 2011 at 09:35:02 +: > Daniel Shahaf writes: > > > % $svnversion `.svn` > > 1081669M > > % $svn di `.svn` > > Index: subversion/libsvn_fs_fs/fs_fs.c > > === > > --- subversion/libsvn_fs_fs/fs_fs.c (revision 1081669) > > +++ subversion/libsvn_fs_fs/fs_fs.c (working copy) > > @@ -3349,6 +3349,7 @@ > > * non-packed ones: keys for packed rev file content ends with a dot > > * for non-packed rev files they end with a digit. */ > >name = apr_pstrndup(pool, last_part + 1, name_last - last_part); > > + SVN_DBG(("name='%s'\n", name)); > >return svn_fs_fs__combine_number_and_string(offset, name, pool); > > } > > > > % make -C `.svn` -s svnadmin > > % $svnadmin create s > > % $svnmucc put s/README.txt file://$PWD/s/R -mm > > r1 committed by daniel at 2011-03-15T05:27:21.117322Z > > % $svnadmin create s2 > > % $svnadmin dump -q s | $svnadmin load -q s2 > > subversion/libsvn_repos/load.c:603: (apr_err=200014) > > subversion/libsvn_repos/load.c:366: (apr_err=200014) > > subversion/libsvn_subr/checksum.c:418: (apr_err=200014) > > svnadmin: E200014: Checksum mismatch for '/R': > >expected: 520f596e4ab0fb0c28271f554dc8fd10 > > actual: 1021a8e9589b8c3c87827bd6dc7ed5b0 > > > > zsh: done $svnadmin dump -q s | > > zsh: exit 1 $svnadmin load -q s2 > > % md5sum s/README.txt > > 520f596e4ab0fb0c28271f554dc8fd10 /tmp/s/README.txt > > That's a bit cryptic. If you modify the dump output so that the file > contents don't match the checksum then load gives an error. What am I > missing? > > -- > Philip
Re: wc_db performance
Branko Čibej writes: > The restriction that you may not invoke a callback from within a sqlite > transaction remains. That's what the temporary results tables are for -- > they live outside transactions on the main wc-db. I don't understand the temporary table approach. Taking the temp__node_props_cache table as an example: it gets populated by a transaction and lives beyond that transaction. The table then gets queried by a second transaction and the callbacks are made while that second transaction is in progress. So callbacks are still being invoked from within a transaction. As far as I can see using a temporary table has made the overall "read properties" operation into one that modifies the database, to create the temporary table. I guess that by the time the callbacks are made the database write-lock has been dropped, but there would have been no write-lock if the temporary table were not used. -- Philip
Re: wc_db API discussion
Philip Martin writes: > Update with no changes on NFS disk: > > 1.6: 2s > 1.7: 50s With the recent bump-post-update changes that 50s has become 23s. -- Philip
Re: svn commit: r1081390 - /subversion/trunk/subversion/tests/cmdline/merge_tests.py
On 03/14/2011 08:18 PM, Stefan Sperling wrote: On Mon, Mar 14, 2011 at 02:24:58PM -, kame...@apache.org wrote: Author: kameshj Date: Mon Mar 14 14:24:58 2011 New Revision: 1081390 URL: http://svn.apache.org/viewvc?rev=1081390&view=rev Log: Adds an XFail test to catch regression created by r1075802 * subversion/tests/cmdline/merge_tests.py (dry_run_merge_conflicting_binary): New XFail testcase. (test_list): Add dry_run_merge_conflicting_binary. Patch by: Arwin Arni Modified: subversion/trunk/subversion/tests/cmdline/merge_tests.py +@XFail() +def dry_run_merge_conflicting_binary(sbox): + "dry run shouldn't make conflict resoln files" ^ typo? Thanks Stefan. Fixed in r1081703. With regards Kamesh Jayachandran
Re: Another svnadmin checksum mismatch
Daniel Shahaf writes: > % $svnversion `.svn` > 1081669M > % $svn di `.svn` > Index: subversion/libsvn_fs_fs/fs_fs.c > === > --- subversion/libsvn_fs_fs/fs_fs.c (revision 1081669) > +++ subversion/libsvn_fs_fs/fs_fs.c (working copy) > @@ -3349,6 +3349,7 @@ > * non-packed ones: keys for packed rev file content ends with a dot > * for non-packed rev files they end with a digit. */ >name = apr_pstrndup(pool, last_part + 1, name_last - last_part); > + SVN_DBG(("name='%s'\n", name)); >return svn_fs_fs__combine_number_and_string(offset, name, pool); > } > > % make -C `.svn` -s svnadmin > % $svnadmin create s > % $svnmucc put s/README.txt file://$PWD/s/R -mm > r1 committed by daniel at 2011-03-15T05:27:21.117322Z > % $svnadmin create s2 > % $svnadmin dump -q s | $svnadmin load -q s2 > subversion/libsvn_repos/load.c:603: (apr_err=200014) > subversion/libsvn_repos/load.c:366: (apr_err=200014) > subversion/libsvn_subr/checksum.c:418: (apr_err=200014) > svnadmin: E200014: Checksum mismatch for '/R': >expected: 520f596e4ab0fb0c28271f554dc8fd10 > actual: 1021a8e9589b8c3c87827bd6dc7ed5b0 > > zsh: done $svnadmin dump -q s | > zsh: exit 1 $svnadmin load -q s2 > % md5sum s/README.txt > 520f596e4ab0fb0c28271f554dc8fd10 /tmp/s/README.txt That's a bit cryptic. If you modify the dump output so that the file contents don't match the checksum then load gives an error. What am I missing? -- Philip