Re: [PATCH v4 3/4] cache-tree: fix writing cache-tree when CE_REMOVE is present

2012-12-16 Thread Nguyen Thai Ngoc Duy
On Sun, Dec 16, 2012 at 2:20 PM, Junio C Hamano <gits...@pobox.com> wrote:
> Nguyễn Thái Ngọc Duy  <pclo...@gmail.com> writes:
>
>> entry_count is used in update_one() for two purposes:
>>
>> 1. to skip through the number of processed entries in in-memory index
>> 2. to record the number of entries this cache-tree covers on disk
>>
>> Unfortunately when CE_REMOVE is present these numbers are not the same
>> because CE_REMOVE entries are automatically removed before writing to
>> disk but entry_count is not adjusted and still counts CE_REMOVE
>> entries.
>
> Nicely explained.  I wonder if we can also add a piece of test to
> the patch 4/4 to demonstrate the issue with CE_REMOVE entries,
> though.

A hand crafted one, maybe. I did not attempt to recreate it with git
commands (and I don't think we update cache-tree after unpack_trees).
So I wrote something like this instead:

int main(int ac, char **av)
{
	unsigned char sha1[20];
	setup_git_directory();
	read_cache();
	active_cache[1]->ce_flags |= CE_REMOVE;
	write_cache_as_tree(sha1, 0, NULL);
	return 0;
}

I can polish it a bit and write new tests based on it and
test-dump-cache-tree if you want.
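
To make the mismatch from the quoted commit message concrete without any
git internals, here is a toy model. It is only a sketch: toy_entry,
toy_counts, toy_count() and TOY_REMOVE are hypothetical names invented for
illustration and are not part of git. It shows that the number of entries
walked in memory (purpose #1) and the number that survive on disk
(purpose #2) diverge as soon as one entry is flagged for removal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for git's CE_REMOVE flag; not the real value. */
#define TOY_REMOVE 0x1

struct toy_entry {
	const char *name;
	unsigned flags;
};

struct toy_counts {
	int consumed;	/* entries walked in the in-memory index (purpose #1) */
	int on_disk;	/* entries that survive the write to disk (purpose #2) */
};

/* Walk n entries and keep the two counts in separate variables. */
static struct toy_counts toy_count(const struct toy_entry *e, size_t n)
{
	struct toy_counts c = { 0, 0 };
	size_t i;

	assert(e != NULL || n == 0);
	for (i = 0; i < n; i++) {
		c.consumed++;
		if (!(e[i].flags & TOY_REMOVE))
			c.on_disk++;
	}
	return c;
}
```

With three entries of which the second carries TOY_REMOVE (mirroring
active_cache[1] in the program above), consumed is 3 while on_disk is 2.
Recording the first number where the second is expected is the bug the
patch addresses.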
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 3/4] cache-tree: fix writing cache-tree when CE_REMOVE is present

2012-12-16 Thread Junio C Hamano


Nguyen Thai Ngoc Duy <pclo...@gmail.com> wrote:

> On Sun, Dec 16, 2012 at 2:20 PM, Junio C Hamano <gits...@pobox.com>
> wrote:
>> Nicely explained.  I wonder if we can also add a piece of test to
>> the patch 4/4 to demonstrate the issue with CE_REMOVE entries,
>> though.
>
> A hand crafted one, maybe. I did not attempt to recreate it with git
> commands (and I don't think we update cache-tree after unpack_trees).

Yeah, that's what I thought. No need to bother creating a bug that won't appear
in real life :-)

Thanks for sanity checking.
-- 
Pardon terseness, typo and HTML from a tablet.


[PATCH v4 3/4] cache-tree: fix writing cache-tree when CE_REMOVE is present

2012-12-15 Thread Nguyễn Thái Ngọc Duy
entry_count is used in update_one() for two purposes:

1. to skip through the number of processed entries in in-memory index
2. to record the number of entries this cache-tree covers on disk

Unfortunately when CE_REMOVE is present these numbers are not the same
because CE_REMOVE entries are automatically removed before writing to
disk but entry_count is not adjusted and still counts CE_REMOVE
entries.

Separate the two use cases into two different variables. #1 is taken
care of by the new field count in struct cache_tree_sub, and entry_count
is reserved for #2.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclo...@gmail.com>
---
 cache-tree.c | 30 +++---
 cache-tree.h |  1 +
 2 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/cache-tree.c b/cache-tree.c
index 44eed28..2c10b2e 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -238,6 +238,7 @@ static int update_one(struct cache_tree *it,
 		      int entries,
 		      const char *base,
 		      int baselen,
+		      int *skip_count,
 		      int flags)
 {
 	struct strbuf buffer;
@@ -245,6 +246,8 @@ static int update_one(struct cache_tree *it,
 	int dryrun = flags & WRITE_TREE_DRY_RUN;
 	int i;
 
+	*skip_count = 0;
+
 	if (0 <= it->entry_count && has_sha1_file(it->sha1))
 		return it->entry_count;
 
@@ -264,7 +267,7 @@ static int update_one(struct cache_tree *it,
 		struct cache_entry *ce = cache[i];
 		struct cache_tree_sub *sub;
 		const char *path, *slash;
-		int pathlen, sublen, subcnt;
+		int pathlen, sublen, subcnt, subskip;
 
 		path = ce->name;
 		pathlen = ce_namelen(ce);
@@ -289,10 +292,13 @@ static int update_one(struct cache_tree *it,
 				    cache + i, entries - i,
 				    path,
 				    baselen + sublen + 1,
+				    &subskip,
 				    flags);
 		if (subcnt < 0)
 			return subcnt;
 		i += subcnt;
+		sub->count = subcnt; /* to be used in the next loop */
+		*skip_count += subskip;
 		sub->used = 1;
 	}
 
@@ -324,7 +330,7 @@ static int update_one(struct cache_tree *it,
 			if (!sub)
 				die("cache-tree.c: '%.*s' in '%s' not found",
 				    entlen, path + baselen, path);
-			i += sub->cache_tree->entry_count;
+			i += sub->count;
 			sha1 = sub->cache_tree->sha1;
 			mode = S_IFDIR;
 		}
@@ -340,8 +346,18 @@ static int update_one(struct cache_tree *it,
 				mode, sha1_to_hex(sha1), entlen+baselen, path);
 		}
 
-		if (ce->ce_flags & (CE_REMOVE | CE_INTENT_TO_ADD))
-			continue; /* entry being removed or placeholder */
+		/*
+		 * CE_REMOVE entries are removed before the index is
+		 * written to disk. Skip them to remain consistent
+		 * with the future on-disk index.
+		 */
+		if (ce->ce_flags & CE_REMOVE) {
+			*skip_count = *skip_count + 1;
+			continue;
+		}
+
+		if (ce->ce_flags & CE_INTENT_TO_ADD)
+			continue;
 
 		strbuf_grow(&buffer, entlen + 100);
 		strbuf_addf(&buffer, "%o %.*s%c", mode, entlen, path + baselen, '\0');
@@ -361,7 +377,7 @@ static int update_one(struct cache_tree *it,
 	}
 
 	strbuf_release(&buffer);
-	it->entry_count = i;
+	it->entry_count = i - *skip_count;
 #if DEBUG
 	fprintf(stderr, "cache-tree update-one (%d ent, %d subtree) %s\n",
 		it->entry_count, it->subtree_nr,
@@ -375,11 +391,11 @@ int cache_tree_update(struct cache_tree *it,
 		      int entries,
 		      int flags)
 {
-	int i;
+	int i, skip;
 	i = verify_cache(cache, entries, flags);
 	if (i)
 		return i;
-	i = update_one(it, cache, entries, "", 0, flags);
+	i = update_one(it, cache, entries, "", 0, &skip, flags);
 	if (i < 0)
 		return i;
 	return 0;
diff --git a/cache-tree.h b/cache-tree.h
index d8cb2e9..55d0f59 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -7,6 +7,7 @@
 struct cache_tree;
 struct cache_tree_sub {
 	struct cache_tree *cache_tree;
+	int count;		/* internally used by update_one() */
 	int namelen;
 	int used;
 	char name[FLEX_ARRAY];
-- 
1.8.0.rc2.23.g1fb49df
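
The bookkeeping in this patch can be hard to follow from the hunks alone,
so here is a much simplified, self-contained sketch of the same pattern. It
is not git's code: toy_entry, toy_update_one() and TOY_REMOVE are
hypothetical names, and the real update_one() does far more (hashing,
subtree lookup, a second loop, an early return for valid cache-trees). The
sketch only shows a recursive walk that returns how many in-memory entries
it consumed, reports removed entries through an out-parameter, and derives
the on-disk count as the difference.

```c
#include <assert.h>
#include <string.h>

#define TOY_REMOVE 0x1	/* hypothetical stand-in for CE_REMOVE */

struct toy_entry {
	const char *path;	/* e.g. "dir/file"; entries are sorted by path */
	unsigned flags;
};

/*
 * Walk the entries that live under the directory prefix base/baselen.
 * Returns the number of in-memory entries consumed (used by the caller
 * to advance its index), stores the number of TOY_REMOVE entries seen in
 * *skip_count, and stores consumed minus skipped in *covered (what a
 * cache-tree would record as its on-disk entry count).
 */
static int toy_update_one(const struct toy_entry *e, int n,
			  const char *base, size_t baselen,
			  int *skip_count, int *covered)
{
	int i = 0;

	assert(skip_count && covered);
	*skip_count = 0;
	while (i < n) {
		const char *path = e[i].path;
		const char *slash;

		if (strncmp(path, base, baselen))
			break;	/* entry is outside this directory */

		slash = strchr(path + baselen, '/');
		if (slash) {
			/* Subdirectory: recurse, then jump past what it consumed. */
			int subskip, subcovered;
			int subcnt = toy_update_one(e + i, n - i, path,
						    (size_t)(slash - path) + 1,
						    &subskip, &subcovered);
			i += subcnt;		/* advance by in-memory count */
			*skip_count += subskip;	/* propagate removals upward */
			continue;
		}
		if (e[i].flags & TOY_REMOVE)
			(*skip_count)++;
		i++;
	}
	*covered = i - *skip_count;
	return i;
}
```

Calling it on the whole list with base "" and baselen 0 walks every entry.
The return value plays the role of sub->count in the patch, and *covered
plays the role of it->entry_count after the fix.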


Re: [PATCH v4 3/4] cache-tree: fix writing cache-tree when CE_REMOVE is present

2012-12-15 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy  <pclo...@gmail.com> writes:

> entry_count is used in update_one() for two purposes:
>
> 1. to skip through the number of processed entries in in-memory index
> 2. to record the number of entries this cache-tree covers on disk
>
> Unfortunately when CE_REMOVE is present these numbers are not the same
> because CE_REMOVE entries are automatically removed before writing to
> disk but entry_count is not adjusted and still counts CE_REMOVE
> entries.

Nicely explained.  I wonder if we can also add a piece of test to
the patch 4/4 to demonstrate the issue with CE_REMOVE entries,
though.

Thanks.