Re: [PATCH] Add --dry-run flag to "svn update" client command

2011-03-15 Thread Arwin Arni

On Tuesday 15 March 2011 08:34 PM, Julian Foad wrote:

On Wed, 2011-03-02, Arwin Arni wrote:

In my effort to understand the delta editor API, I took it upon myself
to try and implement the --dry-run flag for "svn update".



http://subversion.tigris.org/issues/show_bug.cgi?id=2491


Hi Arwin.  Kamesh asked for technical feedback about this patch so
here's a review.

In my view, a "dry run" mode for update is useful but not very
important.  The complexity of this implementation looks about what I'd
expect, not too bad, but I think right now is maybe not a good time to
add this because it does add significant complexity to code that we are
trying to work on.


Currently, externals are handled inside
subversion/libsvn_client/externals.c by running checkout/switch. For a
dry-run update to mimic a real update, the notifications have to be the
same. Since some of these notifications are generated by the above
mentioned checkout/switch runs, I have to implement dry-run for them
also. I'll take this up as a follow-up exercise. Now, the dry-run will
simply ignore any externals in the working copy.


It's fine to handle externals in a follow-up patch.

I see that '--parents' and '--set-depth' are not allowed in dry-run
mode.  What is the reason behind that?  Is it because they seem to be
difficult to implement, or you think they are not needed, or you intend
to implement them in a follow-up patch, or something else?



Well, the reason is that both --parents and --set-depth make permanent 
changes to the WC which will not be reported in the output at all.. If 
the user is passing these parameters, he has a fair idea of what these 
"invisible" changes are, and so shouldn't be adding them to a dry-run 
command. (Maybe we could just make them FALSE during a dry-run update so 
it doesn't throw those errors that I coded. The user will be oblivious 
to this, as the output isn't going to change in any way)



The reason I'm concerned about orthogonality is because if some parts of
the code don't support one feature, and other parts of the code don't
support another feature, that can make tasks such as refactoring the
code much more difficult.


Not quite sure I understand what you mean..


Index: subversion/include/svn_wc.h
===
--- subversion/include/svn_wc.h (revision 1075841)
+++ subversion/include/svn_wc.h (working copy)
@@ -4990,7 +4990,8 @@
   * @a notify_baton and the path of the restored file. @a notify_func may
   * be @c NULL if this notification is not required.  If @a
   * use_commit_times is TRUE, then set restored files' timestamps to
- * their last-commit-times.
+ * their last-commit-times. If @a dry_run is TRUE, only notifications are
+ * done. Using @a dry_run is meaningless if @a restore_files is FALSE.


The last sentence is not clear.  What does it mean in terms of the
promises and requirements of this API?



This is in svn_wc_crawl_revisions5.

The value of dry_run is never referenced unless restore_files is TRUE.

restore_files   dry_runoutcome

TRUETRUE   dry-run restore
TRUEFALSE  actually restore
FALSE   TRUE/FALSE no restore


Index: subversion/svn/main.c
===
@@ -1733,6 +1733,13 @@
[...]
case opt_dry_run:
+if (opt_state.parents)
+  {
+err = svn_error_create
+(SVN_ERR_CL_ARG_PARSING_ERROR, NULL,
+ _("Can't specify --parents with --dry-run"));
+return svn_cmdline_handle_exit_error(err, pool, "svn: ");
+  }
  opt_state.dry_run = TRUE;
  break;


It looks to me like this check will only work if "--parents" is
specified before "--dry-run" on the command line, and not if they are
specified the other way around.

Also, by putting the check here in "main.c", this restriction applies to
all commands, whereas I think this restriction should be specific to the
"update" command, even if there are currently no other commands that
accept both options.

To fix those two problems, please move the check into the "update"
command itself.



Hmm..

I guess my solution of allowing the user to pass these combination of 
options, and making them a no-op would solve all these problems..



@@ -1764,6 +1771,14 @@
case opt_set_depth:
+if (opt_state.dry_run)
+  {
[...]


Similarly here.


+  }




TEST SUITE

* subversion/tests/cmdline/svntest/actions.py
   (_post_update_checks)  : New function to do the checks after an update.
   (run_and_verify_update): Accept a dry_run parameter. If TRUE, check that
the dry run hasn't affected the wc.


The phrase "the dry run" makes the reader ask, "What dry run?".  I would
suggest: "If TRUE, first do a dry run and check that it hasn't affected
the WC."



Okay..

[Noorul Islam K M] Re: [PATCH] Fix for issue 3826

2011-03-15 Thread Noorul Islam K M

I wonder, why this patch was not considered for fixing issue 3826. I
could see a fix 1081799 which I see very similar to what I have
done. Did I miss anything here?

Thanks and Regards
Noorul

--- Begin Message ---
Julian Foad  writes:

> On Thu, 2011-03-03 at 22:48 +0530, Noorul Islam K M wrote:
>
>> Julian Foad  writes:
>> 
>> > On Wed, 2011-03-02, Noorul Islam K M wrote:
>> >
>> >> Please find attached patch for issue 3826. All tests pass using 'make
>> >> check' 
>> > [...]
>> >> Index: subversion/svn/diff-cmd.c
>> >> ===
>> >> --- subversion/svn/diff-cmd.c (revision 1076214)
>> >> +++ subversion/svn/diff-cmd.c (working copy)
>> >> @@ -324,8 +324,11 @@
>> >>  return svn_error_createf(SVN_ERR_CL_ARG_PARSING_ERROR, NULL,
>> >>   _("Path '%s' not relative to base 
>> >> URLs"),
>> >>   path);
>> >> +  if (! svn_dirent_is_absolute(path))
>> >> +path = svn_relpath_canonicalize(path, iterpool);
>> >> +  else
>> >> +path = svn_dirent_canonicalize(path, iterpool);
>
> Your new code here expects that 'path' could be either an absolute local
> path (know as a 'dirent'), or a 'relpath' ...
>
>> >>  
>> >> -  path = svn_relpath_canonicalize(path, iterpool);
>> >>if (svn_path_is_url(old_target))
>> >>  target1 = svn_path_url_add_component2(old_target, path, 
>> >> iterpool);
>
> ... but here (if old_target is a URL) the code requires that 'path' is a
> relpath.  So what happens if 'path' is a 'dirent'?  Does this
> 'url_add_component' line get executed (and if so it will produce wrong
> results), or does it not get executed (and if not, why not)?
>

>From 'svn help diff'

  2. Display the differences between OLD-TGT as it was seen in OLDREV and
 NEW-TGT as it was seen in NEWREV.  PATHs, if given, are relative to
 OLD-TGT and NEW-TGT and restrict the output to differences for those
 paths.  OLD-TGT and NEW-TGT may be working copy paths or URL[@REV].
 NEW-TGT defaults to OLD-TGT if not specified.  -r N makes OLDREV default
 to N, -r N:M makes OLDREV default to N and NEWREV default to M.

I modified the patch a bit so that conversion to relpath happens only
when user specified new-tgt or old-tgt. In other cases I think it can be
absolute path.

Attached is the modified patch.

Thanks and Regards
Noorul

Index: subversion/tests/cmdline/diff_tests.py
===
--- subversion/tests/cmdline/diff_tests.py  (revision 1079662)
+++ subversion/tests/cmdline/diff_tests.py  (working copy)
@@ -3761,7 +3761,6 @@
  '-c2', '--git')
   os.chdir(was_cwd)
 
-@XFail()
 @Issue(3826)
 def diff_abs_localpath_from_wc_folder(sbox):
   "diff absolute localpath from wc folder"
@@ -3770,9 +3769,10 @@
 
   A_path = os.path.join(wc_dir, 'A')
   B_path = os.path.join(wc_dir, 'A', 'B')
+  B_abspath = os.path.abspath(B_path)
   os.chdir(os.path.abspath(A_path))
   svntest.actions.run_and_verify_svn(None, None, [], 'diff',
- os.path.abspath(B_path))
+ B_abspath)
   
 
 #Run the tests
Index: subversion/svn/diff-cmd.c
===
--- subversion/svn/diff-cmd.c   (revision 1079662)
+++ subversion/svn/diff-cmd.c   (working copy)
@@ -325,7 +325,11 @@
  _("Path '%s' not relative to base URLs"),
  path);
 
-  path = svn_relpath_canonicalize(path, iterpool);
+  if ((strcmp(old_target, "") != 0) || (strcmp(new_target, "") != 0))
+path = svn_relpath_canonicalize(path, iterpool);
+  else
+path = svn_dirent_canonicalize(path, iterpool);
+
   if (svn_path_is_url(old_target))
 target1 = svn_path_url_add_component2(old_target, path, iterpool);
   else
--- End Message ---


Re: wc_db performance (was: wc_db API discussion)

2011-03-15 Thread Paul Burba
On Mon, Mar 14, 2011 at 2:36 PM, Hyrum K Wright  wrote:
> On Sat, Mar 12, 2011 at 6:47 AM, Stefan Sperling  wrote:
>> On Fri, Mar 11, 2011 at 10:43:46PM -0500, Greg Stein wrote:
>>> 2011/3/11 Branko Čibej :
>>> >...
>>> > For the second task, I think the first order of business is to change
>>> > the wc-db tree crawler to do one query instead of zillions, or at least,
>>> > where several queries are required, to do them all in one transaction.
>>>
>>> stsp has been working this recently. Killing the node walker, and
>>> moving to table scans.
>>
>> Yes. So far, I've been working in the revision status code.
>> There are two problems left to fix before I'll move on to the next task:
>>
>>  - There are API layering issues (wc_db.c calls into node.c).
>>   This is related to the API discussions in the other thread
>>   so I'll follow up there.
>
> Before reading this thread, I saw the call into node.c, and have
> subsequently removed it.
>
>>  - The revision status code issues about 5 separate queries,
>>   which aren't combined via a transaction and don't use temporary tables.
>>   This is no worse than the previous code using the node walker,
>>   obviously :)  But I'll look at fixing this so that the results
>>   returned correspond to the state of the DB as of the time the
>>   svn_wc__db_revision_status() call was made.
>
> I wrapped this API in a txn in r1081510.
>
>> For others who want to jump in and help, here is a list of places
>> where the node walker is still being used. I'm not sure if we can
>> eliminate it everywhere before release, but each of these should
>> be looked at to see whether we can use an alternative approach to
>> increase performance:
>>
>>  subversion/libsvn_client/changelist.c
>>  subversion/libsvn_client/commit_util.c
>>  subversion/libsvn_client/info.c
>>  subversion/libsvn_client/merge.c
>>  subversion/libsvn_client/mergeinfo.c
>>  subversion/libsvn_client/prop_commands.c
>>   (This should be propget and propset. Proplist is already using
>>    queries involving temporary tables. Rewriting propget on top
>>    of the proplist code would be easy.

Is anyone working on propget?  I'll take stab at that if not.  It
would be useful for dealing with merge.c:merge_reintegrate_locked's
use of svn_wc__node_walk_children().

Paul

>> Propset needs more work.)
>>  subversion/libsvn_client/ra.c
>>  subversion/libsvn_wc/update_editor.c
>
> I'll take a gander at some of these, too.  (But I'm not entirely sure
> what I' gandering at or for...)
>
> -Hyrum
>


Re: Subversion trunk (r1078338) HTTP(/WC?) performance problems?

2011-03-15 Thread Ivan Zhakov
On Tue, Mar 15, 2011 at 22:38, John Beranek  wrote:
> On 14/03/2011 19:52, Ivan Zhakov wrote:
>> On Mon, Mar 14, 2011 at 22:47, John Beranek  wrote:
>>> On 14/03/2011 19:45, Ivan Zhakov wrote:
 On Mon, Mar 14, 2011 at 20:46, John Beranek  wrote:
> On 14/03/11 17:14, Philip Martin wrote:
>> Ivan Zhakov  writes:
>>
>>> I investigated your problem with high CPU usage and fixed in serf in
>>> r1456. Could you please test it and confirm that problem with high CPU
>>> load is fixed in your case?
>>
>
> Hmm, your timings still don't match mine. I still see ra_serf being
> (significantly) slower than ra_neon. With serf r1456 and SVN r1081484
>
> Export over a LAN from a trunk server:
>
> trunk ra_neon: 1.42
> trunk ra_serf: 2.44
>
> Export over localhost from a trunk server:
>
> trunk ra_neon: 1.69
> trunk ra_serf: 3.00
>
> [I imagine this is a little slower because one host (a dual-core
> workstation) is now both client and server]
>
> Cheers,
 Hi John,

 Does your server requires authentication? What is MaxKeepAliveRequests 
 settings?
>>>
>>> Yes, and:
>>>
>>> Timeout 30
>>> KeepAlive On
>>> MaxKeepAliveRequests 0
>>> KeepAliveTimeout 50
>>>
>>
>> Btw, I forgot to ask you: is gzip compression enabled on your server?
>
> Compression is _off_ on my trunk server.
>
That's also reduce performance for ra_serf, unfortunately mod_deflate
has memory leaks when used with mod_dav_svn [1]

[1] http://svn.haxx.se/dev/archive-2009-08/0274.shtml

-- 
Ivan Zhakov


Re: wc_db performance

2011-03-15 Thread Philip Martin
Greg Stein  writes:

> Why can't we simply put restrictions on those callbacks? I'm unclear
> on this part.

We seem to be unsure if such a restriction is a good idea.

Suppose a client calls proplist, what can it do when it sees a property
of interest?  Can it delete the property?  Modify the value?  Run diff
on the file?  Run proplist on some other path?

-- 
Philip


Re: wc_db performance

2011-03-15 Thread Philip Martin
Branko Čibej  writes:

> On 15.03.2011 15:34, Philip Martin wrote:
>> So with the temporary table approach the callback really has to use a
>> separate database connection to read/write the database from within the
>> callback.
>>
>> However I think that is also the case if we were to avoid the table and
>> simply lock the main database; if just one connection was reused it
>> might be attempt to reuse an SQLite statement.
>
> There's a trick we could use if that's a problem, namely: instead of
> simply creating temporary tables by using the connection's default temp
> database; create another temp *database* per query. It's anonymous, so
> only the code that holds its handle knows about it, and there's no way
> for the callback to access it.
>
> This would require a bit more code, however, the concept is a reusable
> pattern (and the code for the temp database handling would be reusable,
> too.)

Sounds plausible.

Is it valid for an application to call proplist and then in the callback
call proplist again, say on a different part of the working copy?  To
support it I think we need to avoid using a fixed name for the temporary
database, perhaps we could use a sequence number in the database
connection structure?

I thought we would have this problem with the name of the temporary
table, but we get a database locked error first.

-- 
Philip


Re: wc_db performance

2011-03-15 Thread Greg Stein
On Tue, Mar 15, 2011 at 10:34, Philip Martin  wrote:
>...
> So with the temporary table approach the callback really has to use a
> separate database connection to read/write the database from within the
> callback.
>
> However I think that is also the case if we were to avoid the table and
> simply lock the main database; if just one connection was reused it
> might be attempt to reuse an SQLite statement.

And note that we work really hard to use a single SDB connection. We
re-use that as much as we can. I don't even know how we'd start to
bring in more connections. Maybe under the covers, wc_db could open
specialized connections.

But, really?

Why can't we simply put restrictions on those callbacks? I'm unclear
on this part.

Cheers,
-g


Re: Subversion trunk (r1078338) HTTP(/WC?) performance problems?

2011-03-15 Thread John Beranek
On 14/03/2011 19:52, Ivan Zhakov wrote:
> On Mon, Mar 14, 2011 at 22:47, John Beranek  wrote:
>> On 14/03/2011 19:45, Ivan Zhakov wrote:
>>> On Mon, Mar 14, 2011 at 20:46, John Beranek  wrote:
 On 14/03/11 17:14, Philip Martin wrote:
> Ivan Zhakov  writes:
>
>> I investigated your problem with high CPU usage and fixed in serf in
>> r1456. Could you please test it and confirm that problem with high CPU
>> load is fixed in your case?
>

 Hmm, your timings still don't match mine. I still see ra_serf being
 (significantly) slower than ra_neon. With serf r1456 and SVN r1081484

 Export over a LAN from a trunk server:

 trunk ra_neon: 1.42
 trunk ra_serf: 2.44

 Export over localhost from a trunk server:

 trunk ra_neon: 1.69
 trunk ra_serf: 3.00

 [I imagine this is a little slower because one host (a dual-core
 workstation) is now both client and server]

 Cheers,
>>> Hi John,
>>>
>>> Does your server requires authentication? What is MaxKeepAliveRequests 
>>> settings?
>>
>> Yes, and:
>>
>> Timeout 30
>> KeepAlive On
>> MaxKeepAliveRequests 0
>> KeepAliveTimeout 50
>>
> 
> Btw, I forgot to ask you: is gzip compression enabled on your server?

Compression is _off_ on my trunk server.

John.

-- 
John Beranek To generalise is to be an idiot.
http://redux.org.uk/ -- William Blake



smime.p7s
Description: S/MIME Cryptographic Signature


Re: svn commit: r1081892 - in /subversion/trunk/subversion: include/private/svn_io_private.h libsvn_subr/io.c libsvn_wc/adm_ops.c libsvn_wc/questions.c libsvn_wc/wc.h tests/cmdline/revert_tests.py

2011-03-15 Thread Philip Martin
phi...@apache.org writes:

> Author: philip
> Date: Tue Mar 15 18:33:52 2011
> New Revision: 1081892
>
> URL: http://svn.apache.org/viewvc?rev=1081892&view=rev
> Log:
> Make revert change file permissions on the basis of the current magic
> properties and the current permissions, rather than on any change in
> the magic permissions.  This means file permissions follow the same
> rules as file content, and that permissions-only reverts occur.
>
> * subversion/include/private/svn_private_io.h: New.
>
> * subversion/libsvn_subr/io.c
>   (): Include svn_private_io.h.
>   (svn_io__is_finfo_read_only, svn_io__is_executable): New.
>   (svn_io_is_file_executable): Call svn_io__is_executable.
>
> * subversion/libsvn_wc/wc.h
>   (svn_wc__internal_file_modified_p): New.
>
> * subversion/libsvn_wc/questions.c
>   (): Include svn_private_io.h.
>   (svn_wc__internal_file_modified_p): New.
>   (svn_wc__internal_text_modified_p): Call svn_wc__internal_file_modified_p.
>
> * subversion/libsvn_wc/adm_ops.c
>   (new_revert_internal): Use svn_wc__internal_file_modified_p to
>determine whether to reset contents or permissions.
>
> * subversion/tests/cmdline/revert_tests.py
>   (revert_permissions_only): New, marked XFail but passes with new revert.
>   (test_list): Add revert_permissions_only.

The implementation is a little ugly, a private svn_io API.

In the past a file with text and permission changes would have its
permissions reset by revert, and a file with magic property changes
would have its permissions reset by revert.  However a file with just
permission changes would not have its permission reset by revert.

The new code I'm working on will now revert permissions in that last
case.  This could be viewed as fixing a bug, but I suppose some people
might not like it.  Permission-only changes don't show up in status.

Thoughts?

-- 
Philip


Re: wc_db performance

2011-03-15 Thread Branko Čibej
On 15.03.2011 15:34, Philip Martin wrote:
> Philip Martin  writes:
>
>> Branko Čibej  writes:
>>
>>> The temporary table is:
>>>
>>> * in a different database, so the read lock on it does not affect
>>>   the main wc-db;
>>> * per-connection, so even the same process using a different
>>>   database connection will not even see that temporary table.
>> OK, so the process is:
>>
>>   - read lock on main database
>>   - write lock on temporary database
>>   - populate temporary table
>>   - release write lock
>>   - release read lock
>>   - read lock on temporary database
>>   - make callbacks
>>   - release read lock
>>
>> without the temporary table it could be:
>>
>>   - read lock on main database
>>   - make callbacks
>>   - release lock
>>
>> What we gain is that the callback can modify the main database, because
>> there is no read lock.  It still can't modify it using any functions
>> that require write access to the temporary database.
>>
>> What we lose is that the callback cannot call any "read-only" functions
>> that use temporary tables because the step that requires a write lock
>> will fail.
> So with the temporary table approach the callback really has to use a
> separate database connection to read/write the database from within the
> callback.
>
> However I think that is also the case if we were to avoid the table and
> simply lock the main database; if just one connection was reused it
> might be attempt to reuse an SQLite statement.

There's a trick we could use if that's a problem, namely: instead of
simply creating temporary tables by using the connection's default temp
database; create another temp *database* per query. It's anonymous, so
only the code that holds its handle knows about it, and there's no way
for the callback to access it.

This would require a bit more code, however, the concept is a reusable
pattern (and the code for the temp database handling would be reusable,
too.)

-- Brane


Re: [PATCH] Add --dry-run flag to "svn update" client command

2011-03-15 Thread Julian Foad
On Wed, 2011-03-02, Arwin Arni wrote: 
> In my effort to understand the delta editor API, I took it upon myself 
> to try and implement the --dry-run flag for "svn update".

> http://subversion.tigris.org/issues/show_bug.cgi?id=2491

Hi Arwin.  Kamesh asked for technical feedback about this patch so
here's a review.

In my view, a "dry run" mode for update is useful but not very
important.  The complexity of this implementation looks about what I'd
expect, not too bad, but I think right now is maybe not a good time to
add this because it does add significant complexity to code that we are
trying to work on.

> Currently, externals are handled inside 
> subversion/libsvn_client/externals.c by running checkout/switch. For a 
> dry-run update to mimic a real update, the notifications have to be the 
> same. Since some of these notifications are generated by the above 
> mentioned checkout/switch runs, I have to implement dry-run for them 
> also. I'll take this up as a follow-up exercise. Now, the dry-run will 
> simply ignore any externals in the working copy.

It's fine to handle externals in a follow-up patch.

I see that '--parents' and '--set-depth' are not allowed in dry-run
mode.  What is the reason behind that?  Is it because they seem to be
difficult to implement, or you think they are not needed, or you intend
to implement them in a follow-up patch, or something else?

The reason I'm concerned about orthogonality is because if some parts of
the code don't support one feature, and other parts of the code don't
support another feature, that can make tasks such as refactoring the
code much more difficult.


> Index: subversion/include/svn_wc.h
> ===
> --- subversion/include/svn_wc.h (revision 1075841)
> +++ subversion/include/svn_wc.h (working copy)
> @@ -4990,7 +4990,8 @@
>   * @a notify_baton and the path of the restored file. @a notify_func may
>   * be @c NULL if this notification is not required.  If @a
>   * use_commit_times is TRUE, then set restored files' timestamps to
> - * their last-commit-times.
> + * their last-commit-times. If @a dry_run is TRUE, only notifications are
> + * done. Using @a dry_run is meaningless if @a restore_files is FALSE.

The last sentence is not clear.  What does it mean in terms of the
promises and requirements of this API?

> Index: subversion/svn/main.c
> ===
> @@ -1733,6 +1733,13 @@
> [...]
>case opt_dry_run:
> +if (opt_state.parents)
> +  {
> +err = svn_error_create
> +(SVN_ERR_CL_ARG_PARSING_ERROR, NULL,
> + _("Can't specify --parents with --dry-run"));
> +return svn_cmdline_handle_exit_error(err, pool, "svn: ");
> +  }
>  opt_state.dry_run = TRUE;
>  break;

It looks to me like this check will only work if "--parents" is
specified before "--dry-run" on the command line, and not if they are
specified the other way around.

Also, by putting the check here in "main.c", this restriction applies to
all commands, whereas I think this restriction should be specific to the
"update" command, even if there are currently no other commands that
accept both options.

To fix those two problems, please move the check into the "update"
command itself.

> @@ -1764,6 +1771,14 @@
>case opt_set_depth:
> +if (opt_state.dry_run)
> +  {
> [...]

Similarly here.

> +  }


> TEST SUITE
> 
> * subversion/tests/cmdline/svntest/actions.py
>   (_post_update_checks)  : New function to do the checks after an update.
>   (run_and_verify_update): Accept a dry_run parameter. If TRUE, check that 
>the dry run hasn't affected the wc.

The phrase "the dry run" makes the reader ask, "What dry run?".  I would
suggest: "If TRUE, first do a dry run and check that it hasn't affected
the WC."

> * subversion/tests/cmdline/changelist_tests.py
>   subversion/tests/cmdline/externals_tests.py
>   subversion/tests/cmdline/prop_tests.py
>   subversion/tests/cmdline/update_tests.py
>   subversion/tests/cmdline/resolved_tests.py
>   subversion/tests/cmdline/trans_tests.py
>   subversion/tests/cmdline/switch_tests.py
>   subversion/tests/cmdline/stat_tests.py
>   subversion/tests/cmdline/depth_tests.py
> 
>   (Tests) : Fix callers of run_and_verify_update.

Rather than "Fix callers", say what the change is: "Pass dry_run=True to
some callers and dry_run=False to other callers ...".

Why do some callers pass True and some pass False?  I guess it's because
some callers use externals or --parents or --set-depth, and none of
those work properly with dry-run, so those are passing False.

But we could take a different approach.  Rather than the caller deciding
whether a dry-run check is going to work, we could let
run_and_verify_update() decide to perform the additional check
automatically when it can.  If any of the args

Re: svn commit: r1079400 - /subversion/trunk/subversion/tests/cmdline/log_tests.py

2011-03-15 Thread Paul Burba
On Tue, Mar 15, 2011 at 9:56 AM, Daniel Shahaf  wrote:
> Paul Burba wrote on Tue, Mar 15, 2011 at 09:38:31 -0400:
>> On Tue, Mar 15, 2011 at 12:24 AM, Daniel Shahaf  
>> wrote:
>> > pbu...@apache.org wrote on Tue, Mar 08, 2011 at 15:46:10 -:
>> >> +++ subversion/trunk/subversion/tests/cmdline/log_tests.py Tue Mar  8 
>> >> 15:46:09 2011
>> >> @@ -1148,9 +1148,9 @@ def check_merge_results(log_chain, expec
>> >>
>> >>    # Check to see if the number and values of the revisions is correct
>> >>    for log in log_chain:
>> >> -    if (log['revision'] not in expected_merges
>> >> -        and (expected_reverse_merges is not None
>> >> -             and log['revision'] not in expected_reverse_merges)):
>> >> +    if not ((expected_merges and log['revision'] in expected_merges)
>> >> +            or (expected_reverse_merges
>> >> +                and log['revision'] in expected_reverse_merges)):
>> >
>> > If EXPECTED_MERGES and EXPECTED_REVERSE_MERGES are both None,
>> > then the if() would trigger --- and I don't think that's the
>> > intention.
>>
>> Hi Daniel,
>>
>> It is the intention.  If EXPECTED_MERGES and EXPECTED_REVERSE_MERGES
>> are both None, then the caller believes that no merged revisions
>> (normal or reverse) are present.  However, there *is* something in the
>> LOG_CHAIN, so there is an error.  Admittedly, none of the present
>> callers pass  EXPECTED_MERGES=None and EXPECTED_REVERSE_MERGES=None,
>> but we might have reason to do so in the future.
>>
>> Paul

Possibly I'm abusing a Python convention, but...

> I see; I assumed that passing None means "I don't care about this piece
> of information; do not attempt to validate it",

That makes sense for a function like
svntest/actions.py:run_and_verify_svn2(), where if EXPECTED_STDOUT is
None, we don't check stdout.  That function's core purpose is to
invoke main.run_svn().  The caller may or may not care about
confirming the stdout, so EXPECTED_STDOUT is justifiably optional.

On the other hand, log_tests.py:check_merge_results() has only one
purpose, to check that the expected '(Reverse) Merged via' headers
match the actual output.  That's it.  Why would we ever want to ignore
either?  All that would enable us to do is write a test that
spuriously passes.

Paul

> and that callers who
> believe there are no merged revisions would pass an empty dict/list.
>
> Thanks for the clarification,
>
> Daniel
>


Re: wc_db performance

2011-03-15 Thread Philip Martin
Philip Martin  writes:

> Branko Čibej  writes:
>
>> The temporary table is:
>>
>> * in a different database, so the read lock on it does not affect
>>   the main wc-db;
>> * per-connection, so even the same process using a different
>>   database connection will not even see that temporary table.
>
> OK, so the process is:
>
>   - read lock on main database
>   - write lock on temporary database
>   - populate temporary table
>   - release write lock
>   - release read lock
>   - read lock on temporary database
>   - make callbacks
>   - release read lock
>
> without the temporary table it could be:
>
>   - read lock on main database
>   - make callbacks
>   - release lock
>
> What we gain is that the callback can modify the main database, because
> there is no read lock.  It still can't modify it using any functions
> that require write access to the temporary database.
>
> What we lose is that the callback cannot call any "read-only" functions
> that use temporary tables because the step that requires a write lock
> will fail.

So with the temporary table approach the callback really has to use a
separate database connection to read/write the database from within the
callback.

However I think that is also the case if we were to avoid the table and
simply lock the main database; if just one connection was reused it
might be attempt to reuse an SQLite statement.

-- 
Philip


Re: wc_db performance

2011-03-15 Thread Philip Martin
Branko Čibej  writes:

> The temporary table is:
>
> * in a different database, so the read lock on it does not affect
>   the main wc-db;
> * per-connection, so even the same process using a different
>   database connection will not even see that temporary table.

OK, so the process is:

  - read lock on main database
  - write lock on temporary database
  - populate temporary table
  - release write lock
  - release read lock
  - read lock on temporary database
  - make callbacks
  - release read lock

without the temporary table it could be:

  - read lock on main database
  - make callbacks
  - release lock

What we gain is that the callback can modify the main database, because
there is no read lock.  It still can't modify it using any functions
that require write access to the temporary database.

What we lose is that the callback cannot call any "read-only" functions
that use temporary tables because the step that requires a write lock
will fail.

-- 
Philip


Re: wc_db performance

2011-03-15 Thread Stefan Sperling
On Tue, Mar 15, 2011 at 12:52:42PM +, Philip Martin wrote:
> Yes, by the time we invoke the callbacks we have a read-lock.  If we
> simply ran a single transaction and made callbacks within the
> transaction then it would have the same effect.  I don't see what the
> temporary table achieves.

The temp tables live in a separate database.
Without them, lock contention is on a single database.
With them, the contention situation is different -- the callback cannot
read from or write to the temp table.

At least that is how I understand this text:
http://sqlite.org/lang_createtable.html: If the "TEMP" or "TEMPORARY" keyword
occurs between the "CREATE" and "TABLE" then the new table is created in the
temp database. It is an error to specify both a  and the TEMP or
TEMPORARY keyword, unless the  is "temp". If no database name is
specified and the TEMP keyword is not present then the table is created in the
main database. 

So we use the temp table to create a separate database that contains
the data we need to present to the callback, and which is unaffected
by whatever else the callback might be doing.

Does that make sense?


RE: Race condition in libsvn_client code?

2011-03-15 Thread Bert Huijben


> -Original Message-
> From: Philip Martin [mailto:philip.mar...@wandisco.com]
> Sent: dinsdag 15 maart 2011 14:52
> To: C. Michael Pilato
> Cc: Subversion Development
> Subject: Re: Race condition in libsvn_client code?
> 
> "C. Michael Pilato"  writes:
> 
> > [Tweaking Subject: for (hopefully) additional visibility.]
> >
> > On 03/15/2011 09:15 AM, Daniel Shahaf wrote:
> >> C. Michael Pilato wrote on Tue, Mar 15, 2011 at 09:05:56 -0400:
> >>> On 03/15/2011 12:34 AM, Daniel Shahaf wrote:
>  cmpil...@apache.org wrote on Tue, Mar 08, 2011 at 20:05:51 -:
> > Author: cmpilato
> > Date: Tue Mar  8 20:05:50 2011
> > New Revision: 1079508
> >>>
> >>> [...]
> >>>
> >  svn_client_commit4(svn_commit_info_t **commit_info_p,
> > const apr_array_header_t *targets,
> >>>
> >>> [...]
> >>>
> > +  /* Ensure that the original notification system is in place. */
> > +  ctx->notify_func2 = notify_baton.orig_notify_func2;
> > +  ctx->notify_baton2 = notify_baton.orig_notify_baton2;
> > +
> 
>  This is actually a race condition, isn't it? (for clients that call
the
>  deprecated API while using CTX->notify_func2 in another thread (in
> the
>  same or another API))
> 
>  (Okay, so maybe we'll just let it live on.  Presumably N other
>  deprecated wrappers do this too.)
> >>>
> >>> I hadn't considered that.  We do alot of pointer swaps like this up in
the
> >>> command-line client code itself, but that's a bit different than doing
so
> >>> down in the libsvn_client library as in this case.  There is one prior
> >>> instance of us doing this kind of swap inside the libsvn_client
library:  in
> >>> libsvn_client/copy.c:repos_to_wc_copy_single().  But I'd prefer mere
> >>> precedent not to be our reason for allowing badness to persist in the
> codebase.
> >>>
> >>> Got suggestions?
> >>
> >> Would this work? ---
> >>
> >> svn_client_ctx_t ctx2 = *ctx;
> >> ctx2.notify_func2 = notify_baton.orig_notify_func2;
> >> ctx2.notify_baton2 = notify_baton.orig_notify_baton2;
> >> svn_client_commit5(ctx=&ctx2);
> 
> It has a different problem, if the client modifies the context in a
> callback those modifications will not persist.

I don't think any of our callbacks receives the current ctx instance.

But see my other mail: We store a svn_wc_context_t in the client context. So
svn_client_ctx_t can't be shared between functions running on different
threads anyway.

Bert



Re: svn commit: r1073366 - in /subversion/trunk: notes/wc-ng/pristine-store subversion/libsvn_wc/wc-queries.sql subversion/libsvn_wc/wc_db_pristine.c

2011-03-15 Thread Ivan Zhakov
On Tue, Mar 15, 2011 at 16:49, Julian Foad  wrote:
> Ivan Zhakov wrote:
>> On Tue, Mar 1, 2011 at 21:13, Julian Foad  wrote:
>> > Bert Huijben wrote:
>> >> > -Original Message-
>> >> > From: Julian Foad [mailto:julian.f...@wandisco.com]
>> >> >
>> >> > I'm not clear exactly what problem we would avoid by eliminating the
>> >> > "select a unique name" step of this process.  Is it what I describe
>> >> > below at the end of note [1] - that a scanner might be more likely to
>> >> > re-scan the content, and therefore more likely to cause a delay?
>> >>
>> >> No, the problem I try to avoid is
>> >> * you create a file
>> >> 
>> >> * you delete the file (after the virusscanner releases the file)
>> >> * you rename a file to be at the old location
>> >>
>> >> While we really need something like
>> >> * rename to a unique name.
>> >> > >> original location>
>> >
>> > >From discussing on IRC we agree that the main concern is that this
>> > involves too many separate filesystem operations, rather than any
>> > specific problem with a particular step.
>> >
>> > I have committed it as is in r1075942, and would like to improve it as a
>> > follow-up.
>> >
>> > Options for improvement:
>> >
>> >  (a) Don't open a pristine file with FILE_SHARE_DELETE.  Instead,
>> > accept that an attempt to delete it may fail, and if that happens, leave
>> > the there as an orphan.  When adding a pristine file, if it already
>> > exists on disk then just keep the copy that already exists.  When
>> > cleaning up orphan files, if a delete fails, just leave the file there.
>> > Consider implementing automatic clean-up to prevent orphan files from
>> > accumulating indefinitely.
>> >
>> >  (b) Find or write a "rename to a unique name" function that operates
>> > in a single step instead of a creating a new file and then overwriting
>> > it.
>> >
>> >  (c) Don't rename the file before deleting it; instead, just delete it.
>> > On Windows, when adding a new file, if a file with that name already
>> > exists, *then* rename the existing file before moving the new file into
>> > place. (We can't just keep the existing file because it is pending
>> > deletion and will disappear when the reader finishes reading it.)
>> >
>> > Pros and cons:
>> >
>> >  (a) This would reduce the number of file system operations to a
>> > minimum.  It would involve bypassing APR to avoid the FILE_SHARE_DELETE
>> > flag, which is not ideal but possible.
>> >
>> >  (b) This would remove one of the file system operations.  It would
>> > involve writing a function similar to APR's fall-back implementation of
>> > apr_file_mktemp() that exists for systems that do not provide a
>> > "mkstemp" system API.  I'm not sure if there are any concrete problems
>> > with doing this sort of thing in "user space".
>> >
>> >  (c) This would remove two of the file system operations.  It sounds
>> > straightforward.
>> >
>> > Comments?
>> >
>>
>> (b) Is best option for.
>
> For ... what?
for me :)

>>  It should be big problem to generate random
>> name using current time and then try rename file to this name. Repeat
>> on error.
>
> I assume you mean "it should not be a big problem...".
>
Yes, I meant "should not be a big problem". Sorry.


-- 
Ivan Zhakov


Re: svn commit: r1079400 - /subversion/trunk/subversion/tests/cmdline/log_tests.py

2011-03-15 Thread Daniel Shahaf
Paul Burba wrote on Tue, Mar 15, 2011 at 09:38:31 -0400:
> On Tue, Mar 15, 2011 at 12:24 AM, Daniel Shahaf  
> wrote:
> > pbu...@apache.org wrote on Tue, Mar 08, 2011 at 15:46:10 -:
> >> +++ subversion/trunk/subversion/tests/cmdline/log_tests.py Tue Mar  8 
> >> 15:46:09 2011
> >> @@ -1148,9 +1148,9 @@ def check_merge_results(log_chain, expec
> >>
> >>    # Check to see if the number and values of the revisions is correct
> >>    for log in log_chain:
> >> -    if (log['revision'] not in expected_merges
> >> -        and (expected_reverse_merges is not None
> >> -             and log['revision'] not in expected_reverse_merges)):
> >> +    if not ((expected_merges and log['revision'] in expected_merges)
> >> +            or (expected_reverse_merges
> >> +                and log['revision'] in expected_reverse_merges)):
> >
> > If EXPECTED_MERGES and EXPECTED_REVERSE_MERGES are both None,
> > then the if() would trigger --- and I don't think that's the
> > intention.
> 
> Hi Daniel,
> 
> It is the intention.  If EXPECTED_MERGES and EXPECTED_REVERSE_MERGES
> are both None, then the caller believes that no merged revisions
> (normal or reverse) are present.  However, there *is* something in the
> LOG_CHAIN, so there is an error.  Admittedly, none of the present
> callers pass  EXPECTED_MERGES=None and EXPECTED_REVERSE_MERGES=None,
> but we might have reason to do so in the future.
> 
> Paul

I see; I assumed that passing None means "I don't care about this piece
of information; do not attempt to validate it", and that callers who
believe there are no merged revisions would pass an empty dict/list.

Thanks for the clarification,

Daniel


RE: svn commit: r1079508 - in /subversion/trunk/subversion: include/svn_client.h include/svn_wc.h libsvn_client/commit_util.c libsvn_client/deprecated.c svn/notify.c

2011-03-15 Thread Bert Huijben


> -Original Message-
> From: Daniel Shahaf [mailto:d...@daniel.shahaf.name]
> Sent: dinsdag 15 maart 2011 5:34
> To: dev@subversion.apache.org
> Subject: Re: svn commit: r1079508 - in /subversion/trunk/subversion:
> include/svn_client.h include/svn_wc.h libsvn_client/commit_util.c
> libsvn_client/deprecated.c svn/notify.c
> 
> cmpil...@apache.org wrote on Tue, Mar 08, 2011 at 20:05:51 -:
> > Author: cmpilato
> > Date: Tue Mar  8 20:05:50 2011
> > New Revision: 1079508
> >
> > URL: http://svn.apache.org/viewvc?rev=1079508&view=rev
> > Log:
> > For issue #3752 ("Warn about committing copied directory at depth
> > 'empty'"), introduce new notification types for overwriting and
> > non-overwriting copies.  Current, the command-line client treats these
> > just as it does their non-copy-ful equivalents.
> >
> > * subversion/include/svn_wc.h
> >   (svn_wc_notify_action_t): Add new 'svn_wc_notify_commit_copied' and
> > 'svn_wc_notify_commit_copied_replaced' notification types.
> >
> > * subversion/include/svn_client.h
> >   (svn_client_commit5): Update docstring to indicate that this
> > function will transmit the new notification types, too.
> >   (svn_client_commit4): Explicitly note that this interface does *not*
> > use the new notification types.
> >
> > * subversion/libsvn_client/commit_util.c
> >   (do_item_commit): Check for a non-NULL copyfrom_url on the commit
> > item and, if found, use the new notification types rather than
> > mere "added" and "replaced" notifications.
> >
> > * subversion/libsvn_client/deprecated.c
> >   (struct downgrade_commit_copied_notify_baton,
> >downgrade_commit_copied_notify_func): New notification wrapper
> > function and baton.
> >   (svn_client_commit4): Use downgrade_commit_copied_notify_func()
> > wrapper as necessary to avoid sending unrecognized notification
> > types to the caller.
> >
> > * subversion/svn/notify.c
> >   (notify): Handle svn_wc_notify_commit_copied as
> > svn_wc_notify_commit_added;
> svn_wc_notify_commit_copied_replaced
> > as svn_wc_notify_commit_replaced.
> >
> > Modified:
> > subversion/trunk/subversion/include/svn_client.h
> > subversion/trunk/subversion/include/svn_wc.h
> > subversion/trunk/subversion/libsvn_client/commit_util.c
> > subversion/trunk/subversion/libsvn_client/deprecated.c
> > subversion/trunk/subversion/svn/notify.c

> >  svn_error_t *
> >  svn_client_commit4(svn_commit_info_t **commit_info_p,
> > const apr_array_header_t *targets,
> > @@ -437,14 +474,33 @@ svn_client_commit4(svn_commit_info_t **c
> > apr_pool_t *pool)
> >  {
> >struct capture_baton_t cb;
> > +  struct downgrade_commit_copied_notify_baton notify_baton;
> > +  svn_error_t *err;
> > +
> > +  notify_baton.orig_notify_func2 = ctx->notify_func2;
> > +  notify_baton.orig_notify_baton2 = ctx->notify_baton2;
> >
> >*commit_info_p = NULL;
> >cb.info = commit_info_p;
> >cb.pool = pool;
> >
> > -  SVN_ERR(svn_client_commit5(targets, depth, keep_locks,
> keep_changelists,
> > - changelists, revprop_table,
> > - capture_commit_info, &cb, ctx, pool));
> > +  /* Swap out the notification system (if any) with a thin filtering
> > + wrapper. */
> > +  if (ctx->notify_func2)
> > +{
> > +  ctx->notify_func2 = downgrade_commit_copied_notify_func;
> > +  ctx->notify_baton2 = ¬ify_baton;
> > +}
> > +
> > +  err = svn_client_commit5(targets, depth, keep_locks,
keep_changelists,
> > +   changelists, revprop_table,
> > +   capture_commit_info, &cb, ctx, pool);
> > +
> > +  /* Ensure that the original notification system is in place. */
> > +  ctx->notify_func2 = notify_baton.orig_notify_func2;
> > +  ctx->notify_baton2 = notify_baton.orig_notify_baton2;
> > +
> 
> This is actually a race condition, isn't it? (for clients that call the
> deprecated API while using CTX->notify_func2 in another thread (in the
> same or another API))
> 
> (Okay, so maybe we'll just let it live on.  Presumably N other
> deprecated wrappers do this too.)

Sharing svn_client_ctx_t between different threads at the same time is
already broken in 1.7. 
The client context contains a svn_wc_context_t instance, which is not thread
safe.

(I never assumed that it was safe to share the context instance between
threads in the first place)

Bert



Re: Race condition in libsvn_client code?

2011-03-15 Thread Philip Martin
"C. Michael Pilato"  writes:

> [Tweaking Subject: for (hopefully) additional visibility.]
>
> On 03/15/2011 09:15 AM, Daniel Shahaf wrote:
>> C. Michael Pilato wrote on Tue, Mar 15, 2011 at 09:05:56 -0400:
>>> On 03/15/2011 12:34 AM, Daniel Shahaf wrote:
 cmpil...@apache.org wrote on Tue, Mar 08, 2011 at 20:05:51 -:
> Author: cmpilato
> Date: Tue Mar  8 20:05:50 2011
> New Revision: 1079508
>>>
>>> [...]
>>>
>  svn_client_commit4(svn_commit_info_t **commit_info_p,
> const apr_array_header_t *targets,
>>>
>>> [...]
>>>
> +  /* Ensure that the original notification system is in place. */
> +  ctx->notify_func2 = notify_baton.orig_notify_func2;
> +  ctx->notify_baton2 = notify_baton.orig_notify_baton2;
> +

 This is actually a race condition, isn't it? (for clients that call the
 deprecated API while using CTX->notify_func2 in another thread (in the
 same or another API))

 (Okay, so maybe we'll just let it live on.  Presumably N other
 deprecated wrappers do this too.)
>>>
>>> I hadn't considered that.  We do alot of pointer swaps like this up in the
>>> command-line client code itself, but that's a bit different than doing so
>>> down in the libsvn_client library as in this case.  There is one prior
>>> instance of us doing this kind of swap inside the libsvn_client library:  in
>>> libsvn_client/copy.c:repos_to_wc_copy_single().  But I'd prefer mere
>>> precedent not to be our reason for allowing badness to persist in the 
>>> codebase.
>>>
>>> Got suggestions?
>> 
>> Would this work? ---
>> 
>> svn_client_ctx_t ctx2 = *ctx;
>> ctx2.notify_func2 = notify_baton.orig_notify_func2;
>> ctx2.notify_baton2 = notify_baton.orig_notify_baton2;
>> svn_client_commit5(ctx=&ctx2);

It has a different problem, if the client modifies the context in a
callback those modifications will not persist.

-- 
Philip


Re: wc_db performance

2011-03-15 Thread Branko Čibej
On 15.03.2011 11:26, Philip Martin wrote:
> Branko Čibej  writes:
>
>> The restriction that you may not invoke a callback from within a sqlite
>> transaction remains. That's what the temporary results tables are for --
>> they live outside transactions on the main wc-db.
> I don't understand the temporary table approach.  Taking the
> temp__node_props_cache table as an example: it gets populated by a
> transaction and lives beyond that transaction.  The table then gets
> queried by a second transaction and the callbacks are made while that
> second transaction is in progress.  So callbacks are still being invoked
> from within a transaction.
>
> As far as I can see using a temporary table has made the overall "read
> properties" operation into one that modifies the database, to create the
> temporary table. I guess that by the time the callbacks are made the
> database write-lock has been dropped, but there would have been no
> write-lock if the temporary table were not used.

The temporary table is:

* in a different database, so the read lock on it does not affect
  the main wc-db;
* per-connection, so even the same process using a different
  database connection will not even see that temporary table.

-- Brane


Re: svn commit: r1073366 - in /subversion/trunk: notes/wc-ng/pristine-store subversion/libsvn_wc/wc-queries.sql subversion/libsvn_wc/wc_db_pristine.c

2011-03-15 Thread Julian Foad
Ivan Zhakov wrote:
> On Tue, Mar 1, 2011 at 21:13, Julian Foad  wrote:
> > Bert Huijben wrote:
> >> > -Original Message-
> >> > From: Julian Foad [mailto:julian.f...@wandisco.com]
> >> >
> >> > I'm not clear exactly what problem we would avoid by eliminating the
> >> > "select a unique name" step of this process.  Is it what I describe
> >> > below at the end of note [1] - that a scanner might be more likely to
> >> > re-scan the content, and therefore more likely to cause a delay?
> >>
> >> No, the problem I try to avoid is
> >> * you create a file
> >> 
> >> * you delete the file (after the virusscanner releases the file)
> >> * you rename a file to be at the old location
> >>
> >> While we really need something like
> >> * rename to a unique name.
> >>  >> original location>
> >
> > >From discussing on IRC we agree that the main concern is that this
> > involves too many separate filesystem operations, rather than any
> > specific problem with a particular step.
> >
> > I have committed it as is in r1075942, and would like to improve it as a
> > follow-up.
> >
> > Options for improvement:
> >
> >  (a) Don't open a pristine file with FILE_SHARE_DELETE.  Instead,
> > accept that an attempt to delete it may fail, and if that happens, leave
> > the there as an orphan.  When adding a pristine file, if it already
> > exists on disk then just keep the copy that already exists.  When
> > cleaning up orphan files, if a delete fails, just leave the file there.
> > Consider implementing automatic clean-up to prevent orphan files from
> > accumulating indefinitely.
> >
> >  (b) Find or write a "rename to a unique name" function that operates
> > in a single step instead of a creating a new file and then overwriting
> > it.
> >
> >  (c) Don't rename the file before deleting it; instead, just delete it.
> > On Windows, when adding a new file, if a file with that name already
> > exists, *then* rename the existing file before moving the new file into
> > place. (We can't just keep the existing file because it is pending
> > deletion and will disappear when the reader finishes reading it.)
> >
> > Pros and cons:
> >
> >  (a) This would reduce the number of file system operations to a
> > minimum.  It would involve bypassing APR to avoid the FILE_SHARE_DELETE
> > flag, which is not ideal but possible.
> >
> >  (b) This would remove one of the file system operations.  It would
> > involve writing a function similar to APR's fall-back implementation of
> > apr_file_mktemp() that exists for systems that do not provide a
> > "mkstemp" system API.  I'm not sure if there are any concrete problems
> > with doing this sort of thing in "user space".
> >
> >  (c) This would remove two of the file system operations.  It sounds
> > straightforward.
> >
> > Comments?
> >
> 
> (b) Is best option for.

For ... what?

>  It should be big problem to generate random
> name using current time and then try rename file to this name. Repeat
> on error.

I assume you mean "it should not be a big problem...".

- Julian




Re: svn commit: r1079400 - /subversion/trunk/subversion/tests/cmdline/log_tests.py

2011-03-15 Thread Paul Burba
On Tue, Mar 15, 2011 at 12:24 AM, Daniel Shahaf  wrote:
> pbu...@apache.org wrote on Tue, Mar 08, 2011 at 15:46:10 -:
>> Author: pburba
>> Date: Tue Mar  8 15:46:09 2011
>> New Revision: 1079400
>>
>> URL: http://svn.apache.org/viewvc?rev=1079400&view=rev
>> Log:
>> Follow-up to rr1076726, fix a flawed log -g test helper.
>>
>> * subversion/tests/cmdline/log_tests.py
>>   (check_merge_results): Account for the fact that the EXPECTED_MERGES arg
>>    might be None.  Fix the check of expected merges so it doesn't spuriously
>>    pass when EXPECTED_REVSERSE_MERGES is none.
>>
>
> EXPECTED_REVERSE_MERGES
>
>> Suggested by: danielsh
>>
>> Modified:
>>     subversion/trunk/subversion/tests/cmdline/log_tests.py
>>
>> Modified: subversion/trunk/subversion/tests/cmdline/log_tests.py
>> URL: 
>> http://svn.apache.org/viewvc/subversion/trunk/subversion/tests/cmdline/log_tests.py?rev=1079400&r1=1079399&r2=1079400&view=diff
>> ==
>> --- subversion/trunk/subversion/tests/cmdline/log_tests.py (original)
>> +++ subversion/trunk/subversion/tests/cmdline/log_tests.py Tue Mar  8 
>> 15:46:09 2011
>> @@ -1148,9 +1148,9 @@ def check_merge_results(log_chain, expec
>>
>>    # Check to see if the number and values of the revisions is correct
>>    for log in log_chain:
>> -    if (log['revision'] not in expected_merges
>> -        and (expected_reverse_merges is not None
>> -             and log['revision'] not in expected_reverse_merges)):
>> +    if not ((expected_merges and log['revision'] in expected_merges)
>> +            or (expected_reverse_merges
>> +                and log['revision'] in expected_reverse_merges)):
>
> If EXPECTED_MERGES and EXPECTED_REVERSE_MERGES are both None,
> then the if() would trigger --- and I don't think that's the
> intention.

Hi Daniel,

It is the intention.  If EXPECTED_MERGES and EXPECTED_REVERSE_MERGES
are both None, then the caller believes that no merged revisions
(normal or reverse) are present.  However, there *is* something in the
LOG_CHAIN, so there is an error.  Admittedly, none of the present
callers pass  EXPECTED_MERGES=None and EXPECTED_REVERSE_MERGES=None,
but we might have reason to do so in the future.

Paul

>>        raise SVNUnexpectedLogs("Found unexpected revision %d" %
>>                                log['revision'], log_chain)
>>
>>
>


Race condition in libsvn_client code? (Was: svn commit: r1079508 ...)

2011-03-15 Thread C. Michael Pilato
[Tweaking Subject: for (hopefully) additional visibility.]

On 03/15/2011 09:15 AM, Daniel Shahaf wrote:
> C. Michael Pilato wrote on Tue, Mar 15, 2011 at 09:05:56 -0400:
>> On 03/15/2011 12:34 AM, Daniel Shahaf wrote:
>>> cmpil...@apache.org wrote on Tue, Mar 08, 2011 at 20:05:51 -:
 Author: cmpilato
 Date: Tue Mar  8 20:05:50 2011
 New Revision: 1079508
>>
>> [...]
>>
  svn_client_commit4(svn_commit_info_t **commit_info_p,
 const apr_array_header_t *targets,
>>
>> [...]
>>
 +  /* Ensure that the original notification system is in place. */
 +  ctx->notify_func2 = notify_baton.orig_notify_func2;
 +  ctx->notify_baton2 = notify_baton.orig_notify_baton2;
 +
>>>
>>> This is actually a race condition, isn't it? (for clients that call the
>>> deprecated API while using CTX->notify_func2 in another thread (in the
>>> same or another API))
>>>
>>> (Okay, so maybe we'll just let it live on.  Presumably N other
>>> deprecated wrappers do this too.)
>>
>> I hadn't considered that.  We do alot of pointer swaps like this up in the
>> command-line client code itself, but that's a bit different than doing so
>> down in the libsvn_client library as in this case.  There is one prior
>> instance of us doing this kind of swap inside the libsvn_client library:  in
>> libsvn_client/copy.c:repos_to_wc_copy_single().  But I'd prefer mere
>> precedent not to be our reason for allowing badness to persist in the 
>> codebase.
>>
>> Got suggestions?
> 
> Would this work? ---
> 
> svn_client_ctx_t ctx2 = *ctx;
> ctx2.notify_func2 = notify_baton.orig_notify_func2;
> ctx2.notify_baton2 = notify_baton.orig_notify_baton2;
> svn_client_commit5(ctx=&ctx2);


-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand



signature.asc
Description: OpenPGP digital signature


Re: svn commit: r1079508 - in /subversion/trunk/subversion: include/svn_client.h include/svn_wc.h libsvn_client/commit_util.c libsvn_client/deprecated.c svn/notify.c

2011-03-15 Thread Daniel Shahaf
C. Michael Pilato wrote on Tue, Mar 15, 2011 at 09:05:56 -0400:
> On 03/15/2011 12:34 AM, Daniel Shahaf wrote:
> > cmpil...@apache.org wrote on Tue, Mar 08, 2011 at 20:05:51 -:
> >> Author: cmpilato
> >> Date: Tue Mar  8 20:05:50 2011
> >> New Revision: 1079508
> 
> [...]
> 
> >>  svn_client_commit4(svn_commit_info_t **commit_info_p,
> >> const apr_array_header_t *targets,
> 
> [...]
> 
> >> +  /* Ensure that the original notification system is in place. */
> >> +  ctx->notify_func2 = notify_baton.orig_notify_func2;
> >> +  ctx->notify_baton2 = notify_baton.orig_notify_baton2;
> >> +
> > 
> > This is actually a race condition, isn't it? (for clients that call the
> > deprecated API while using CTX->notify_func2 in another thread (in the
> > same or another API))
> > 
> > (Okay, so maybe we'll just let it live on.  Presumably N other
> > deprecated wrappers do this too.)
> 
> I hadn't considered that.  We do alot of pointer swaps like this up in the
> command-line client code itself, but that's a bit different than doing so
> down in the libsvn_client library as in this case.  There is one prior
> instance of us doing this kind of swap inside the libsvn_client library:  in
> libsvn_client/copy.c:repos_to_wc_copy_single().  But I'd prefer mere
> precedent not to be our reason for allowing badness to persist in the 
> codebase.
> 
> Got suggestions?

Would this work? ---

svn_client_ctx_t ctx2 = *ctx;
ctx2.notify_func2 = notify_baton.orig_notify_func2;
ctx2.notify_baton2 = notify_baton.orig_notify_baton2;
svn_client_commit5(ctx=&ctx2);


Re: svn commit: r1079508 - in /subversion/trunk/subversion: include/svn_client.h include/svn_wc.h libsvn_client/commit_util.c libsvn_client/deprecated.c svn/notify.c

2011-03-15 Thread C. Michael Pilato
On 03/15/2011 12:34 AM, Daniel Shahaf wrote:
> cmpil...@apache.org wrote on Tue, Mar 08, 2011 at 20:05:51 -:
>> Author: cmpilato
>> Date: Tue Mar  8 20:05:50 2011
>> New Revision: 1079508

[...]

>>  svn_client_commit4(svn_commit_info_t **commit_info_p,
>> const apr_array_header_t *targets,

[...]

>> +  /* Ensure that the original notification system is in place. */
>> +  ctx->notify_func2 = notify_baton.orig_notify_func2;
>> +  ctx->notify_baton2 = notify_baton.orig_notify_baton2;
>> +
> 
> This is actually a race condition, isn't it? (for clients that call the
> deprecated API while using CTX->notify_func2 in another thread (in the
> same or another API))
> 
> (Okay, so maybe we'll just let it live on.  Presumably N other
> deprecated wrappers do this too.)

I hadn't considered that.  We do alot of pointer swaps like this up in the
command-line client code itself, but that's a bit different than doing so
down in the libsvn_client library as in this case.  There is one prior
instance of us doing this kind of swap inside the libsvn_client library:  in
libsvn_client/copy.c:repos_to_wc_copy_single().  But I'd prefer mere
precedent not to be our reason for allowing badness to persist in the codebase.

Got suggestions?

-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand



signature.asc
Description: OpenPGP digital signature


Re: svn commit: r1079592 - /subversion/trunk/subversion/svn/main.c

2011-03-15 Thread C. Michael Pilato
On 03/15/2011 12:43 AM, Daniel Shahaf wrote:
> s...@apache.org wrote on Tue, Mar 08, 2011 at 22:51:52 -:
>> Author: stsp
>> Date: Tue Mar  8 22:51:51 2011
>> New Revision: 1079592
>>
>> URL: http://svn.apache.org/viewvc?rev=1079592&view=rev
>> Log:
>> * subversion/svn/main.c
>>   (svn_cl__cmd_table): Document foreign repos merges in merge help text.
>>
>> Modified:
>> subversion/trunk/subversion/svn/main.c
>>
>> Modified: subversion/trunk/subversion/svn/main.c
>> URL: 
>> http://svn.apache.org/viewvc/subversion/trunk/subversion/svn/main.c?rev=1079592&r1=1079591&r2=1079592&view=diff
>> ==
>> --- subversion/trunk/subversion/svn/main.c (original)
>> +++ subversion/trunk/subversion/svn/main.c Tue Mar  8 22:51:51 2011
>> @@ -893,6 +893,14 @@ const svn_opt_subcommand_desc2_t svn_cl_
>>   "\n"
>>   " svn diff ^/trunk@500 ^/feature@HEAD\n"
>>   "\n"
>> + " Note that a 2-URL merge can also merge from foreign 
>> repositories.\n"
>> + " While SOURCE1 and SOURCE2 must both come from the same 
>> repository,\n"
>> + " TARGET_WCPATH may come from a different repository than the 
>> sources.\n"
>> + " However, there are some caveats. Most notably, copies made in 
>> the\n"
>> + " merge source will be transformed into plain additions in the 
>> merge\n"
>> + " target, since the copyfrom information is only valid within 
>> the\n"
>> + " source repository.\n"
> 
> s/within the source repository/within a single repository/
> (or other phrasing that captures the "Doesn't cross repository boundaries" 
> semantics)
> 
> Also: is "copyfrom information" a user-facing phrase?

I would say not.  But I'd also say that we don't need to explain the
technical reason that copies are downgraded to mere additions.  Just state
the fact and move on.

-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand



signature.asc
Description: OpenPGP digital signature


Re: wc_db performance

2011-03-15 Thread Philip Martin
Stefan Sperling  writes:

> On Tue, Mar 15, 2011 at 10:26:15AM +, Philip Martin wrote:
>> Branko Čibej  writes:
>> 
>> > The restriction that you may not invoke a callback from within a sqlite
>> > transaction remains. That's what the temporary results tables are for --
>> > they live outside transactions on the main wc-db.
>> 
>> I don't understand the temporary table approach.  Taking the
>> temp__node_props_cache table as an example: it gets populated by a
>> transaction and lives beyond that transaction.  The table then gets
>> queried by a second transaction and the callbacks are made while that
>> second transaction is in progress.  So callbacks are still being invoked
>> from within a transaction.
>
> It works around the infinite loop problem that happens when the
> callback inserts rows that end up being picked up by the proplist query.
> Which is silly for this callback to do, but then again we somehow agreed
> that the callbacks are free to do virtually anything. (I still think
> putting newly documented restrictions on them now is fine and would make
> our life easier... *shrug*).

I'm still confused.  I thought the infinite loop problem affected the
old multiple transaction code.  If we simply ran a single transaction
and made callbacks then that problem would not exist.  We would have the
same limitations that apply to the temporary table approach: the
callback cannot write to the database because a transaction is in
progress.

>> As far as I can see using a temporary table has made the overall "read
>> properties" operation into one that modifies the database, to create the
>> temporary table. I guess that by the time the callbacks are made the
>> database write-lock has been dropped, but there would have been no
>> write-lock if the temporary table were not used.
>
> I think the temporary table gets put into a separate database file
> (or memory, depending on the temp_store pragma or whatsitcalled).
> If so, a read lock on the primary db should suffice, shouldn't it?

Yes, by the time we invoke the callbacks we have a read-lock.  If we
simply ran a single transaction and made callbacks within the
transaction then it would have the same effect.  I don't see what the
temporary table achieves.

-- 
Philip


Re: svn commit: r1073366 - in /subversion/trunk: notes/wc-ng/pristine-store subversion/libsvn_wc/wc-queries.sql subversion/libsvn_wc/wc_db_pristine.c

2011-03-15 Thread Ivan Zhakov
On Tue, Mar 1, 2011 at 21:13, Julian Foad  wrote:
> Bert Huijben wrote:
>> > -Original Message-
>> > From: Julian Foad [mailto:julian.f...@wandisco.com]
>> >
>> > I'm not clear exactly what problem we would avoid by eliminating the
>> > "select a unique name" step of this process.  Is it what I describe
>> > below at the end of note [1] - that a scanner might be more likely to
>> > re-scan the content, and therefore more likely to cause a delay?
>>
>> No, the problem I try to avoid is
>> * you create a file
>> 
>> * you delete the file (after the virusscanner releases the file)
>> * you rename a file to be at the old location
>>
>> While we really need something like
>> * rename to a unique name.
>> > original location>
>
> >From discussing on IRC we agree that the main concern is that this
> involves too many separate filesystem operations, rather than any
> specific problem with a particular step.
>
> I have committed it as is in r1075942, and would like to improve it as a
> follow-up.
>
> Options for improvement:
>
>  (a) Don't open a pristine file with FILE_SHARE_DELETE.  Instead,
> accept that an attempt to delete it may fail, and if that happens, leave
> the there as an orphan.  When adding a pristine file, if it already
> exists on disk then just keep the copy that already exists.  When
> cleaning up orphan files, if a delete fails, just leave the file there.
> Consider implementing automatic clean-up to prevent orphan files from
> accumulating indefinitely.
>
>  (b) Find or write a "rename to a unique name" function that operates
> in a single step instead of a creating a new file and then overwriting
> it.
>
>  (c) Don't rename the file before deleting it; instead, just delete it.
> On Windows, when adding a new file, if a file with that name already
> exists, *then* rename the existing file before moving the new file into
> place. (We can't just keep the existing file because it is pending
> deletion and will disappear when the reader finishes reading it.)
>
> Pros and cons:
>
>  (a) This would reduce the number of file system operations to a
> minimum.  It would involve bypassing APR to avoid the FILE_SHARE_DELETE
> flag, which is not ideal but possible.
>
>  (b) This would remove one of the file system operations.  It would
> involve writing a function similar to APR's fall-back implementation of
> apr_file_mktemp() that exists for systems that do not provide a
> "mkstemp" system API.  I'm not sure if there are any concrete problems
> with doing this sort of thing in "user space".
>
>  (c) This would remove two of the file system operations.  It sounds
> straightforward.
>
> Comments?
>

(b) Is best option for. It should be big problem to generate random
name using current time and then try rename file to this name. Repeat
on error.

-- 
Ivan Zhakov


Re: wc_db performance

2011-03-15 Thread Stefan Sperling
On Tue, Mar 15, 2011 at 10:26:15AM +, Philip Martin wrote:
> Branko Čibej  writes:
> 
> > The restriction that you may not invoke a callback from within a sqlite
> > transaction remains. That's what the temporary results tables are for --
> > they live outside transactions on the main wc-db.
> 
> I don't understand the temporary table approach.  Taking the
> temp__node_props_cache table as an example: it gets populated by a
> transaction and lives beyond that transaction.  The table then gets
> queried by a second transaction and the callbacks are made while that
> second transaction is in progress.  So callbacks are still being invoked
> from within a transaction.

It works around the infinite loop problem that happens when the
callback inserts rows that end up being picked up by the proplist query.
Which is silly for this callback to do, but then again we somehow agreed
that the callbacks are free to do virtually anything. (I still think
putting newly documented restrictions on them now is fine and would make
our life easier... *shrug*).

> As far as I can see using a temporary table has made the overall "read
> properties" operation into one that modifies the database, to create the
> temporary table. I guess that by the time the callbacks are made the
> database write-lock has been dropped, but there would have been no
> write-lock if the temporary table were not used.

I think the temporary table gets put into a separate database file
(or memory, depending on the temp_store pragma or whatsitcalled).
If so, a read lock on the primary db should suffice, shouldn't it?


Re: Another svnadmin checksum mismatch

2011-03-15 Thread Daniel Shahaf
Ignore this thread; as Philip says it works fine with SVN_DBG_OUTPUT=stderr.

Thanks.

Philip Martin wrote on Tue, Mar 15, 2011 at 09:35:02 +:
> Daniel Shahaf  writes:
> 
> > % $svnversion `.svn`
> > 1081669M
> > % $svn di `.svn`
> > Index: subversion/libsvn_fs_fs/fs_fs.c
> > ===
> > --- subversion/libsvn_fs_fs/fs_fs.c (revision 1081669)
> > +++ subversion/libsvn_fs_fs/fs_fs.c (working copy)
> > @@ -3349,6 +3349,7 @@
> > * non-packed ones: keys for packed rev file content ends with a dot
> > * for non-packed rev files they end with a digit. */
> >name = apr_pstrndup(pool, last_part + 1, name_last - last_part);
> > +  SVN_DBG(("name='%s'\n", name));
> >return svn_fs_fs__combine_number_and_string(offset, name, pool);
> >  }
> >  
> > % make -C `.svn` -s svnadmin
> > % $svnadmin create s
> > % $svnmucc put s/README.txt file://$PWD/s/R -mm
> > r1 committed by daniel at 2011-03-15T05:27:21.117322Z
> > % $svnadmin create s2
> > % $svnadmin dump -q s | $svnadmin load -q s2
> > subversion/libsvn_repos/load.c:603: (apr_err=200014)
> > subversion/libsvn_repos/load.c:366: (apr_err=200014)
> > subversion/libsvn_subr/checksum.c:418: (apr_err=200014)
> > svnadmin: E200014: Checksum mismatch for '/R':
> >expected:  520f596e4ab0fb0c28271f554dc8fd10
> >  actual:  1021a8e9589b8c3c87827bd6dc7ed5b0
> >
> > zsh: done   $svnadmin dump -q s | 
> > zsh: exit 1 $svnadmin load -q s2
> > % md5sum s/README.txt 
> > 520f596e4ab0fb0c28271f554dc8fd10  /tmp/s/README.txt
> 
> That's a bit cryptic.  If you modify the dump output so that the file
> contents don't match the checksum then load gives an error.  What am I
> missing?
> 
> -- 
> Philip


Re: wc_db performance

2011-03-15 Thread Philip Martin
Branko Čibej  writes:

> The restriction that you may not invoke a callback from within a sqlite
> transaction remains. That's what the temporary results tables are for --
> they live outside transactions on the main wc-db.

I don't understand the temporary table approach.  Taking the
temp__node_props_cache table as an example: it gets populated by a
transaction and lives beyond that transaction.  The table then gets
queried by a second transaction and the callbacks are made while that
second transaction is in progress.  So callbacks are still being invoked
from within a transaction.

As far as I can see using a temporary table has made the overall "read
properties" operation into one that modifies the database, to create the
temporary table. I guess that by the time the callbacks are made the
database write-lock has been dropped, but there would have been no
write-lock if the temporary table were not used.

-- 
Philip


Re: wc_db API discussion

2011-03-15 Thread Philip Martin
Philip Martin  writes:

> Update with no changes on NFS disk:
>
> 1.6:  2s
> 1.7: 50s

With the recent bump-post-update changes that 50s has become 23s.

-- 
Philip


Re: svn commit: r1081390 - /subversion/trunk/subversion/tests/cmdline/merge_tests.py

2011-03-15 Thread Kamesh Jayachandran

On 03/14/2011 08:18 PM, Stefan Sperling wrote:

On Mon, Mar 14, 2011 at 02:24:58PM -, kame...@apache.org wrote:

Author: kameshj
Date: Mon Mar 14 14:24:58 2011
New Revision: 1081390

URL: http://svn.apache.org/viewvc?rev=1081390&view=rev
Log:
Adds an XFail test to catch regression created by r1075802

* subversion/tests/cmdline/merge_tests.py
   (dry_run_merge_conflicting_binary): New XFail testcase.
   (test_list): Add dry_run_merge_conflicting_binary.

Patch by: Arwin Arni

Modified:
 subversion/trunk/subversion/tests/cmdline/merge_tests.py
+@XFail()
+def dry_run_merge_conflicting_binary(sbox):
+  "dry run shouldn't make conflict resoln files"

^
typo?


Thanks Stefan. Fixed in r1081703.

With regards
Kamesh Jayachandran


Re: Another svnadmin checksum mismatch

2011-03-15 Thread Philip Martin
Daniel Shahaf  writes:

> % $svnversion `.svn`
> 1081669M
> % $svn di `.svn`
> Index: subversion/libsvn_fs_fs/fs_fs.c
> ===
> --- subversion/libsvn_fs_fs/fs_fs.c   (revision 1081669)
> +++ subversion/libsvn_fs_fs/fs_fs.c   (working copy)
> @@ -3349,6 +3349,7 @@
> * non-packed ones: keys for packed rev file content ends with a dot
> * for non-packed rev files they end with a digit. */
>name = apr_pstrndup(pool, last_part + 1, name_last - last_part);
> +  SVN_DBG(("name='%s'\n", name));
>return svn_fs_fs__combine_number_and_string(offset, name, pool);
>  }
>  
> % make -C `.svn` -s svnadmin
> % $svnadmin create s
> % $svnmucc put s/README.txt file://$PWD/s/R -mm
> r1 committed by daniel at 2011-03-15T05:27:21.117322Z
> % $svnadmin create s2
> % $svnadmin dump -q s | $svnadmin load -q s2
> subversion/libsvn_repos/load.c:603: (apr_err=200014)
> subversion/libsvn_repos/load.c:366: (apr_err=200014)
> subversion/libsvn_subr/checksum.c:418: (apr_err=200014)
> svnadmin: E200014: Checksum mismatch for '/R':
>expected:  520f596e4ab0fb0c28271f554dc8fd10
>  actual:  1021a8e9589b8c3c87827bd6dc7ed5b0
>
> zsh: done   $svnadmin dump -q s | 
> zsh: exit 1 $svnadmin load -q s2
> % md5sum s/README.txt 
> 520f596e4ab0fb0c28271f554dc8fd10  /tmp/s/README.txt

That's a bit cryptic.  If you modify the dump output so that the file
contents don't match the checksum then load gives an error.  What am I
missing?

-- 
Philip