Hi Julian, I've examined your patch and did add something to wc_wc_diff_summarize, though it's not as plain as I first thought.
I'll give you the elephant in the room first:
Currently, 'svn diff --summarize <wc-path>' does *nothing else* than what a
plain 'svn status' does. There is no way to get other wc-wc-diffs than the
diff between BASE and WORKING/ACTUAL. So at this point in time, we could
redirect diff_wc_wc_summarize to client_status and basically no-one would
notice anything.
This whole local --summarize only even *begins* to make sense when we can
compare one subpath of a WC to another subpath, or even one checkout with a
different checkout (i.e. locally compare between completely different WCs).
Now back to pretending diff --summarize <wc-path> *is* in fact useful in
itself:
So I thought, during wc-wc diff, it would make sense to also have a
text_deltas flag, so we don't need to do the keywords/nl translations. So I
coded that. While I did, I hit this comment:
/* Note that this might be the _second_ time we translate
the file, as svn_wc__internal_file_modified_p() might have used a
tmp translated copy too. But what the heck, diff is
already expensive, translating twice for the sake of code
modularity is liveable. */
So there's this thing where diff translates pristine&working *anyway* to
determine if they are different. That's just the check whether to skip the file.
Then it translates them *again* to do the diff. And I think the "is it
modified?" test uses streamy translation and the diff-producing part uses
translation to a tempfile.
So there's room for optimization there. Translate them only once, for
comparison, if that's needed, and keep the result for producing the diff.
So all my part of the patch does is skip the second translation if
text_deltas is false. The first translation can't generally be skipped since
it determines the text-modified status. It has its own skip semantics.
Looking at svn_wc__internal_file_modified_p()'s (the first translation's)
heuristics to shortcut the decision whether a file was modified:
- shortcut to "yes, modified" when
- there is no pristine
- the WC status gives away a modification
- shortcut to "no, not modified" when
- filesize and timestamp unchanged
- else translate and compare
So if it shortcuts to "yes, modified", the later diff-translation is the
only translation. (1)
If it shortcuts to "no, not modified", the diff-translation is always
skipped as there is no diff.
Else, basically if there are only text mods (which can only be verified
after translation) there are always two translations. (2)
So this patch makes local-diff translation-free in (1), and removes one of
the two translations, i.e. removes the tempfile-based diff-producing
translation, in (2).
Considering that actually, in most default cases (with the lack of
keywords/nl props), there is *no* translation at all and the callbacks get
the direct pristine and working copy paths, my patch *is* a bit overkill.
But when a user is coming from the other side, say if translating their
files takes a really really long time, well, this might be a noticeable
speedup seen on --summarize. Same of which is obtained by running 'status'.
...If I haven't confused you too much by now, do you think it's worth
touching wc_wc diff like my patch does?
If you said yes, you do probably also think that we shall support diffing
separate wc-paths at some point?
[[[
$ svn diff --summarize --old=wc/foo/ --new=wc/bar/
svn: E200004: Sorry, svn_client_diff5 was called in a way that is not yet
supported
svn: E200004: Only diffs between a path's text-base and its working files
are supported at this time
]]]
(The attached file is your patch and my questionable one combined. And I did
add two printf()s next to DBG()s to show the file_mod paths (whether and how
they're empty), sorry about that)
~Neels
On 08/22/2011 02:21 PM, Julian Foad wrote:
> Hi Neels.
>
> The new diff --summarize code presently still downloads the full text of
> each deleted file to supply to the diff callbacks. I have a patch in
> progress (attached) to eliminate this, but I shan't be working on it for
> a few days. With this patch, 'diff --summarize' still works but I
> haven't checked carefully (either by testing or analysis) that this is
> actually eliminating all the unnecessary downloads, and it could do with
> some documentation updates in the diff callbacks to say that this usage
> is allowed.
>
> Feel free to progress this if you wish, otherwise I'll get back to it
> later.
>
> - Julian
>
In 'diff --summarize', don't download full texts for deleted files, because
that takes time and we don't use them.
Also try to not translate things during wc-wc diff.
### Patch in progress.
* subversion/libsvn_client/repos_diff.c
(edit_baton): Attempt to clarify the documentation of 'target'.
(diff_deleted_dir, delete_entry): Don't fetch the old file if we don't
need it.
(apply_textdelta): If we're not expecting to receive text deltas, use the
'noop' window handler and indicate that there was a text change even
though we don't have the text.
* subversion/libsvn_client/repos_diff_summarize.c
(dbg, DBG): ### Debugging.
(cb_dir_deleted, cb_file_deleted, cb_dir_added, cb_dir_opened,
cb_dir_closed, cb_file_added, cb_file_opened, cb_file_changed,
cb_dir_props_changed): ### Debugging.
neels: sorry, I added plain printf()s
* subversion/include/svn_wc.h,
(svn_wc_diff7): New function, adds TEXT_DELTAS over svn_wc_diff6().
(svn_wc_diff6): Comment. Call svn_wc_diff7() with TEXT_DELTAS as TRUE.
* subversion/libsvn_client/diff.c
(diff_summarize_wc_wc): Call svn_wc_diff7() with TEXT_DELTAS as FALSE.
* subversion/libsvn_wc/diff_local.c
(diff_baton): Add TEXT_DELTAS.
(file_diff):
Skip translating the working file and return neither a pristine path nor a
working path when TEXT_DELTAS is FALSE. Fix a comment.
(svn_wc_diff7): New function (s.a.), place TEXT_DELTAS into EB.
(svn_wc_diff6): Call svn_wc_diff7() with TEXT_DELTAS as TRUE.
--This line, and those below, will be ignored--
+* subversion/include/svn_wc.h
+ (svn_wc_get_diff_editor):
+ (diff_summarize_wc_wc):
conf: # dub:/home/neels/pat/.pat-base/config-default
conf: http://archive.apache.org/dist/apr/apr-1.4.5.tar.bz2
conf: http://archive.apache.org/dist/apr/apr-util-1.3.12.tar.bz2
conf: http://www.sqlite.org/sqlite-autoconf-3070701.tar.gz
conf: http://archive.apache.org/dist/httpd/httpd-2.2.19.tar.bz2
conf: http://www.webdav.org/neon/neon-0.29.6.tar.gz
conf: http://ftp2.de.freebsd.org/pub/FreeBSD/distfiles/bdb/db-5.1.25.tar.gz
conf: ftp://ftp.fu-berlin.de/unix/gnu/libiconv/libiconv-1.14.tar.gz
conf: fsfs
conf: local
Index: subversion/include/svn_wc.h
===================================================================
--- subversion/include/svn_wc.h (revision 1160859)
+++ subversion/include/svn_wc.h (working copy)
@@ -6341,6 +6341,25 @@ svn_wc_get_diff_editor(svn_wc_adm_access
* points during the operation. If it returns an error (typically
* #SVN_ERR_CANCELLED), return that error immediately.
*
+ * @since New in 1.8.
+ */
+svn_error_t *
+svn_wc_diff7(svn_wc_context_t *wc_ctx,
+ const char *target_abspath,
+ const svn_wc_diff_callbacks4_t *callbacks,
+ void *callback_baton,
+ svn_depth_t depth,
+ svn_boolean_t ignore_ancestry,
+ svn_boolean_t show_copies_as_adds,
+ svn_boolean_t use_git_diff_format,
+ svn_boolean_t text_deltas,
+ const apr_array_header_t *changelist_filter,
+ svn_cancel_func_t cancel_func,
+ void *cancel_baton,
+ apr_pool_t *scratch_pool);
+
+/**
+ * Similar to svn_wc_diff6() but with @a text_deltas passed as TRUE.
* @since New in 1.7.
*/
svn_error_t *
Index: subversion/libsvn_client/diff.c
===================================================================
--- subversion/libsvn_client/diff.c (revision 1160859)
+++ subversion/libsvn_client/diff.c (working copy)
@@ -2054,12 +2054,14 @@ diff_summarize_wc_wc(svn_client_diff_sum
&callbacks, &callback_baton, target1,
summarize_func, summarize_baton, pool));
- SVN_ERR(svn_wc_diff6(ctx->wc_ctx,
+ SVN_ERR(svn_wc_diff7(ctx->wc_ctx,
abspath1,
callbacks, callback_baton,
depth,
ignore_ancestry, FALSE /* show_copies_as_adds */,
- FALSE /* use_git_diff_format */, changelists,
+ FALSE /* use_git_diff_format */,
+ FALSE /* text deltas */,
+ changelists,
ctx->cancel_func, ctx->cancel_baton,
pool));
return SVN_NO_ERROR;
Index: subversion/libsvn_client/repos_diff.c
===================================================================
--- subversion/libsvn_client/repos_diff.c (revision 1160859)
+++ subversion/libsvn_client/repos_diff.c (working copy)
@@ -51,8 +51,9 @@
/* Overall crawler editor baton. */
struct edit_baton {
- /* TARGET is a working-copy directory which corresponds to the base
- URL open in RA_SESSION below. */
+ /* TARGET is the 'anchor' directory path, (supposedly in the WC?), which
corresponds
+ to the base URL open in RA_SESSION below. Paths passed into the
+ editor functions are considered to be relative to this path. */
const char *target;
/* A working copy context for TARGET, NULL if this is purely a
@@ -545,11 +546,14 @@ diff_deleted_dir(const char *dir,
/* Compare a file being deleted against an empty file */
b = make_file_baton(path, FALSE, eb, iterpool);
- SVN_ERR(get_file_from_ra(b, FALSE, iterpool));
+ if (eb->text_deltas)
+ {
+ SVN_ERR(get_file_from_ra(b, FALSE, iterpool));
- SVN_ERR(get_empty_file(b->edit_baton, &(b->path_end_revision)));
+ SVN_ERR(get_empty_file(b->edit_baton, &(b->path_end_revision)));
- get_file_mime_types(&mimetype1, &mimetype2, b);
+ get_file_mime_types(&mimetype1, &mimetype2, b);
+ }
SVN_ERR(eb->diff_callbacks->file_deleted(
NULL, NULL, b->wcpath,
@@ -617,10 +621,13 @@ delete_entry(const char *path,
/* Compare a file being deleted against an empty file */
b = make_file_baton(path, FALSE, eb, scratch_pool);
- SVN_ERR(get_file_from_ra(b, FALSE, scratch_pool));
- SVN_ERR(get_empty_file(b->edit_baton, &(b->path_end_revision)));
+ if (eb->text_deltas)
+ {
+ SVN_ERR(get_file_from_ra(b, FALSE, scratch_pool));
+ SVN_ERR(get_empty_file(b->edit_baton, &(b->path_end_revision)));
- get_file_mime_types(&mimetype1, &mimetype2, b);
+ get_file_mime_types(&mimetype1, &mimetype2, b);
+ }
SVN_ERR(eb->diff_callbacks->file_deleted(
&state, &tree_conflicted, b->wcpath,
@@ -908,6 +915,19 @@ apply_textdelta(void *file_baton,
return SVN_NO_ERROR;
}
+ /* If we're not sending file text, then ignore any that we receive. */
+ if (! b->edit_baton->text_deltas)
+ {
+ /* Set temp file paths to non-null to indicate there is a text change. */
+ b->path_start_revision = "";
+ b->path_end_revision = "";
+
+ *handler = svn_delta_noop_window_handler;
+ *handler_baton = NULL;
+
+ return SVN_NO_ERROR;
+ }
+
/* We need the expected pristine file, so go get it */
if (!b->added)
SVN_ERR(get_file_from_ra(b, FALSE, scratch_pool));
Index: subversion/libsvn_client/repos_diff_summarize.c
===================================================================
--- subversion/libsvn_client/repos_diff_summarize.c (revision 1160859)
+++ subversion/libsvn_client/repos_diff_summarize.c (working copy)
@@ -30,6 +30,14 @@
#include "client.h"
+static void
+dbg(const char *func, const char *word, const char *path)
+{
+ SVN_DBG(("%10s '%s'\n", word, path));
+}
+#define DBG(w, path) dbg(__func__, #w, path)
+
+
/* Diff callbacks baton. */
struct summarize_baton_t {
/* The target path of the diff, relative to the anchor; "" if target ==
anchor. */
@@ -96,6 +104,7 @@ cb_dir_deleted(svn_wc_notify_state_t *st
{
struct summarize_baton_t *b = diff_baton;
+ DBG(dir_del, path);
SVN_ERR(send_summary(b, path, svn_client_diff_summarize_kind_deleted,
FALSE, svn_node_dir, scratch_pool));
@@ -120,6 +129,7 @@ cb_file_deleted(svn_wc_notify_state_t *s
{
struct summarize_baton_t *b = diff_baton;
+ DBG(file_del, path);
SVN_ERR(send_summary(b, path, svn_client_diff_summarize_kind_deleted,
FALSE, svn_node_file, scratch_pool));
@@ -142,6 +152,7 @@ cb_dir_added(svn_wc_notify_state_t *stat
void *diff_baton,
apr_pool_t *scratch_pool)
{
+ DBG(dir_add, path);
if (tree_conflicted)
*tree_conflicted = FALSE;
if (skip)
@@ -160,6 +171,7 @@ cb_dir_opened(svn_boolean_t *tree_confli
void *diff_baton,
apr_pool_t *scratch_pool)
{
+ DBG(dir_open, path);
if (tree_conflicted)
*tree_conflicted = FALSE;
if (skip)
@@ -181,6 +193,7 @@ cb_dir_closed(svn_wc_notify_state_t *con
struct summarize_baton_t *b = diff_baton;
svn_boolean_t prop_change;
+ DBG(dir_close, path);
prop_change = apr_hash_get(b->prop_changes, path, APR_HASH_KEY_STRING) !=
NULL;
if (dir_was_added || prop_change)
SVN_ERR(send_summary(b, path,
@@ -217,6 +230,7 @@ cb_file_added(svn_wc_notify_state_t *con
{
struct summarize_baton_t *b = diff_baton;
+ DBG(file_add, path);
SVN_ERR(send_summary(b, path, svn_client_diff_summarize_kind_added,
props_changed(propchanges, scratch_pool),
svn_node_file, scratch_pool));
@@ -238,6 +252,7 @@ cb_file_opened(svn_boolean_t *tree_confl
void *diff_baton,
apr_pool_t *scratch_pool)
{
+ DBG(file_open, path);
if (tree_conflicted)
*tree_conflicted = FALSE;
if (skip)
@@ -264,6 +279,8 @@ cb_file_changed(svn_wc_notify_state_t *c
struct summarize_baton_t *b = diff_baton;
svn_boolean_t text_change = (tmpfile2 != NULL);
+ printf("--- %s %s\n", tmpfile1?tmpfile1:"-", tmpfile2?tmpfile2:"-");
+ DBG(file_mod, path);
SVN_ERR(send_summary(b, path,
text_change ? svn_client_diff_summarize_kind_modified
: svn_client_diff_summarize_kind_normal,
@@ -291,6 +308,7 @@ cb_dir_props_changed(svn_wc_notify_state
{
struct summarize_baton_t *b = diff_baton;
+ DBG(dir_prop, path);
if (props_changed(propchanges, scratch_pool))
apr_hash_set(b->prop_changes, path, APR_HASH_KEY_STRING, path);
Index: subversion/libsvn_wc/diff_local.c
===================================================================
--- subversion/libsvn_wc/diff_local.c (revision 1160859)
+++ subversion/libsvn_wc/diff_local.c (working copy)
@@ -77,6 +77,9 @@ struct diff_baton
/* Hash whose keys are const char * changelist names. */
apr_hash_t *changelist_hash;
+ /* Whether to report text deltas */
+ svn_boolean_t text_deltas;
+
/* Cancel function/baton */
svn_cancel_func_t cancel_func;
void *cancel_baton;
@@ -318,11 +321,16 @@ file_diff(struct diff_baton *eb,
SVN_ERR(svn_prop_diffs(&propchanges, actual_props, pristine_props,
scratch_pool));
- SVN_ERR(svn_wc__internal_translated_file(
- &translated, local_abspath, db, local_abspath,
- SVN_WC_TRANSLATE_TO_NF | SVN_WC_TRANSLATE_USE_GLOBAL_TMP,
- eb->cancel_func, eb->cancel_baton,
- scratch_pool, scratch_pool));
+ if (eb->text_deltas)
+ {
+ SVN_ERR(svn_wc__internal_translated_file(
+ &translated, local_abspath, db, local_abspath,
+ SVN_WC_TRANSLATE_TO_NF | SVN_WC_TRANSLATE_USE_GLOBAL_TMP,
+ eb->cancel_func, eb->cancel_baton,
+ scratch_pool, scratch_pool));
+ }
+ else
+ translated = "";
SVN_ERR(eb->callbacks->file_added(NULL, NULL, NULL, path,
(! eb->show_copies_as_adds &&
@@ -354,15 +362,20 @@ file_diff(struct diff_baton *eb,
if (modified)
{
/* Note that this might be the _second_ time we translate
- the file, as svn_wc__text_modified_internal_p() might have used a
+ the file, as svn_wc__internal_file_modified_p() might have used a
tmp translated copy too. But what the heck, diff is
already expensive, translating twice for the sake of code
modularity is liveable. */
- SVN_ERR(svn_wc__internal_translated_file(
- &translated, local_abspath, db, local_abspath,
- SVN_WC_TRANSLATE_TO_NF | SVN_WC_TRANSLATE_USE_GLOBAL_TMP,
- eb->cancel_func, eb->cancel_baton,
- scratch_pool, scratch_pool));
+ if (eb->text_deltas)
+ {
+ SVN_ERR(svn_wc__internal_translated_file(
+ &translated, local_abspath, db, local_abspath,
+ SVN_WC_TRANSLATE_TO_NF |
+ SVN_WC_TRANSLATE_USE_GLOBAL_TMP, eb->cancel_func,
+ eb->cancel_baton, scratch_pool, scratch_pool));
+ }
+ else
+ translated = "";
}
/* Get the properties, the svn:mime-type values, and compute the
@@ -406,8 +419,11 @@ file_diff(struct diff_baton *eb,
{
SVN_ERR(eb->callbacks->file_changed(NULL, NULL, NULL,
path,
- modified ? pristine_abspath
- : NULL,
+ modified ?
+ (eb->text_deltas ?
+ pristine_abspath
+ : "")
+ : NULL,
translated,
revision,
SVN_INVALID_REVNUM,
@@ -546,7 +562,7 @@ diff_status_callback(void *baton,
/* Public Interface */
svn_error_t *
-svn_wc_diff6(svn_wc_context_t *wc_ctx,
+svn_wc_diff7(svn_wc_context_t *wc_ctx,
const char *local_abspath,
const svn_wc_diff_callbacks4_t *callbacks,
void *callback_baton,
@@ -554,6 +570,7 @@ svn_wc_diff6(svn_wc_context_t *wc_ctx,
svn_boolean_t ignore_ancestry,
svn_boolean_t show_copies_as_adds,
svn_boolean_t use_git_diff_format,
+ svn_boolean_t text_deltas,
const apr_array_header_t *changelist_filter,
svn_cancel_func_t cancel_func,
void *cancel_baton,
@@ -578,6 +595,7 @@ svn_wc_diff6(svn_wc_context_t *wc_ctx,
eb.ignore_ancestry = ignore_ancestry;
eb.show_copies_as_adds = show_copies_as_adds;
eb.use_git_diff_format = use_git_diff_format;
+ eb.text_deltas = text_deltas;
eb.empty_file = NULL;
eb.pool = scratch_pool;
@@ -602,3 +620,23 @@ svn_wc_diff6(svn_wc_context_t *wc_ctx,
return SVN_NO_ERROR;
}
+
+svn_error_t *
+svn_wc_diff6(svn_wc_context_t *wc_ctx,
+ const char *local_abspath,
+ const svn_wc_diff_callbacks4_t *callbacks,
+ void *callback_baton,
+ svn_depth_t depth,
+ svn_boolean_t ignore_ancestry,
+ svn_boolean_t show_copies_as_adds,
+ svn_boolean_t use_git_diff_format,
+ const apr_array_header_t *changelist_filter,
+ svn_cancel_func_t cancel_func,
+ void *cancel_baton,
+ apr_pool_t *scratch_pool)
+{
+ return svn_wc_diff7(wc_ctx, local_abspath, callbacks, callback_baton, depth,
+ ignore_ancestry, show_copies_as_adds,
+ use_git_diff_format, TRUE, changelist_filter,
+ cancel_func, callback_baton, scratch_pool);
+}
signature.asc
Description: OpenPGP digital signature

