Race condition with git-status and git-rebase

2014-04-08 Thread Yiannis Marangos
Hi,

Since I don't know how to fix it, this is my bug report:

git-status reads the index file then holds the `index.lock', it writes
the updated index in `index.lock' and renames the `index.lock' to
`index', but if it used with git-rebase then there is a (rare) race
condition.

I reproduce this at my work under the following conditions:
1) The repository is very big
2) I can reproduce it only in Windows. I can not reproduce it in Linux
   (my Linux machine has faster CPU and SSD drive.)

I saw this race condition at my coworkers when they use two instances
(in the same repository) of Git-Extensions GUI, and when they use one
instance of Git-Extensions and at the same time they rebase from
terminal. This race condition happens because Git-Extensions monitors
the directory of the repository and when a file is accessed it calls
git-status.

Since I can not share the repository, I decided to trace down the
problem. I came up with the following:

process A calls git-rebase
process A applies the 1st commit
process B calls git-status
process B calls read_cache() (status_init_config() -> gitmodules_config() -> 
read_cache())
process A applies the 2nd commit
process B holds the index.lock
process B writes back the old index (the one that was read from read_cache())
process A applies the 3rd commit (now the 3rd commit contains also a revert of 
the 2nd)

The content of the files of the working directory is correct (i.e. as
it should be if the race condition didn't happen). But git-status
shows that the files of the 2nd commit are modified, the git-diff show
the exact patch of the 2nd commit.

So, in this case process B must avoid calling write_index(). Maybe
something is missing from update_index_if_able()? Maybe git should
check the state before proceeding to write_index() (for example that
we are in rebase)?

Something else that I noticed is that when the race condition happens
the istate->cache_changed is 0 and has_racy_timestamp(istate) is 1, so
the if statement in update_index_if_able() continues to write_index().

I didn't check if there are other similar race conditions.

Regards,
Yiannis.

PS: I tried to move hold_locked_index() above status_init_config() but
this cause other problems.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Race condition with git-status and git-rebase

2014-04-08 Thread Yiannis Marangos
Maybe you will find it useful: this is one way that I use to manually
debug the race condition (and it works in Linux):

diff --git a/builtin/commit.c b/builtin/commit.c
index d9550c5..c0511f6 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1283,6 +1283,8 @@ int cmd_status(int argc, const char **argv, const char 
*prefix)
usage_with_options(builtin_status_usage, 
builtin_status_options);
 
status_init_config(&s, git_status_config);
+   printf("read_cache() finished, press enter to continue.\n");
+   getchar();
argc = parse_options(argc, argv, prefix,
 builtin_status_options,
 builtin_status_usage, 0);
diff --git a/git-rebase--am.sh b/git-rebase--am.sh
index df46f4c..6ffd986 100644
--- a/git-rebase--am.sh
+++ b/git-rebase--am.sh
@@ -61,8 +61,8 @@ else
return $?
fi
 
-   git am $git_am_opt --rebasing --resolvemsg="$resolvemsg" \
-   ${gpg_sign_opt:+"$gpg_sign_opt"} <"$GIT_DIR/rebased-patches"
+   git am $git_am_opt -i --rebasing --resolvemsg="$resolvemsg" \
+   ${gpg_sign_opt:+"$gpg_sign_opt"} "$GIT_DIR/rebased-patches"
ret=$?
 
rm -f "$GIT_DIR/rebased-patches"
diff --git a/read-cache.c b/read-cache.c
index ba13353..a6f73a7 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1778,6 +1778,8 @@ static int has_racy_timestamp(struct index_state *istate)
  */
 void update_index_if_able(struct index_state *istate, struct lock_file 
*lockfile)
 {
+   printf("cache_changed: %d  has_racy_timestamp: %d\n",
+  istate->cache_changed, has_racy_timestamp(istate));
if ((istate->cache_changed || has_racy_timestamp(istate)) &&
!write_index(istate, lockfile->fd))
commit_locked_index(lockfile);
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Verify index file before we opportunistically update it

2014-04-09 Thread Yiannis Marangos
This is a fix for the following bug:
http://thread.gmane.org/gmane.comp.version-control.git/245946/focus=245965

I added 2 functions: verify_index_from and verify_index. They return 1
if the sha1 is correct, otherwise 0. I choose to not die if any errors
are occurred because we just want to not proceed to "opportunistic
update".

Some questions:
1) Is it better to have these functions as static?
2) If the answer of (1) is no, should I define verify_cache*() also?
3) If something goes wrong in verify_hdr(), it will print an error
   message, should I make a "quietly" version of it?

Yiannis Marangos (1):
  Verify index file before we opportunistically update it

 cache.h  |  3 +++
 read-cache.c | 79 +++-
 2 files changed, 71 insertions(+), 11 deletions(-)

-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Verify index file before we opportunistically update it

2014-04-09 Thread Yiannis Marangos
Before we proceed to "opportunistic update" we must verify that the
current index file is the same as the one that we read before. There
is a possible race if we don't do this:

  1. process A calls git-rebase

  2. process A applies 1st commit

  3. process B calls git-status

  4. process B reads index

  5. process A applies 2nd commit

  6. process B takes the lock, then overwrites process A's changes.

  7. process A applies 3rd commit

As an end result the 3rd commit will have a revert of the 2nd commit.

Signed-off-by: Yiannis Marangos 
---
 cache.h  |  3 +++
 read-cache.c | 79 +++-
 2 files changed, 71 insertions(+), 11 deletions(-)

diff --git a/cache.h b/cache.h
index 107ac61..b0eedad 100644
--- a/cache.h
+++ b/cache.h
@@ -279,6 +279,7 @@ struct index_state {
 initialized : 1;
struct hashmap name_hash;
struct hashmap dir_hash;
+   unsigned char sha1[20];
 };
 
 extern struct index_state the_index;
@@ -456,6 +457,8 @@ extern int daemonize(void);
} while (0)
 
 /* Initialize and use the cache information */
+extern int verify_index_from(const struct index_state *, const char *path);
+extern int verify_index(struct index_state *);
 extern int read_index(struct index_state *);
 extern int read_index_preload(struct index_state *, const struct pathspec 
*pathspec);
 extern int read_index_from(struct index_state *, const char *path);
diff --git a/read-cache.c b/read-cache.c
index ba13353..fac8655 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -14,6 +14,7 @@
 #include "resolve-undo.h"
 #include "strbuf.h"
 #include "varint.h"
+#include "git-compat-util.h"
 
 static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
   unsigned int options);
@@ -1321,6 +1322,59 @@ static int verify_hdr(struct cache_header *hdr, unsigned 
long size)
return 0;
 }
 
+/* This function verifies if index_state has the correct sha1 of an index file.
+ * Don't die if we have any other failure, just return 0. */
+int verify_index_from(const struct index_state *istate, const char *path)
+{
+   int fd;
+   struct stat st;
+   struct cache_header *hdr;
+   void *mmap_addr;
+   size_t mmap_size;
+
+   if (!istate->initialized)
+   return 0;
+
+   fd = open(path, O_RDONLY);
+   if (fd < 0)
+   return 0;
+
+   if (fstat(fd, &st))
+   return 0;
+
+   /* file is too big */
+   if (st.st_size > (size_t)st.st_size)
+   return 0;
+
+   mmap_size = (size_t)st.st_size;
+   if (mmap_size < sizeof(struct cache_header) + 20)
+   return 0;
+
+   mmap_addr = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, 
fd, 0);
+   close(fd);
+   if (mmap_addr == MAP_FAILED)
+   return 0;
+
+   hdr = mmap_addr;
+   if (verify_hdr(hdr, mmap_size) < 0)
+   goto unmap;
+
+   if (hashcmp(istate->sha1, (unsigned char *)hdr + mmap_size - 20))
+   goto unmap;
+
+   munmap(mmap_addr, mmap_size);
+   return 1;
+
+unmap:
+   munmap(mmap_addr, mmap_size);
+   return 0;
+}
+
+int verify_index(struct index_state *istate)
+{
+   return verify_index_from(istate, get_index_file());
+}
+
 static int read_index_extension(struct index_state *istate,
const char *ext, void *data, unsigned long sz)
 {
@@ -1445,7 +1499,7 @@ int read_index_from(struct index_state *istate, const 
char *path)
struct stat st;
unsigned long src_offset;
struct cache_header *hdr;
-   void *mmap;
+   void *mmap_addr;
size_t mmap_size;
struct strbuf previous_name_buf = STRBUF_INIT, *previous_name;
 
@@ -1468,15 +1522,16 @@ int read_index_from(struct index_state *istate, const 
char *path)
if (mmap_size < sizeof(struct cache_header) + 20)
die("index file smaller than expected");
 
-   mmap = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 
0);
-   if (mmap == MAP_FAILED)
+   mmap_addr = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, 
fd, 0);
+   if (mmap_addr == MAP_FAILED)
die_errno("unable to map index file");
close(fd);
 
-   hdr = mmap;
+   hdr = mmap_addr;
if (verify_hdr(hdr, mmap_size) < 0)
goto unmap;
 
+   hashcpy(istate->sha1, (unsigned char *)hdr + mmap_size - 20);
istate->version = ntohl(hdr->hdr_version);
istate->cache_nr = ntohl(hdr->hdr_entries);
istate->cache_alloc = alloc_nr(istate->cache_nr);
@@ -1494,7 +1549,7 @@ int read_index_from(struct index_state *istate, const 
char *path)
struct cache_entry *ce;
unsigned long consumed;
 
-  

[PATCH v2] Verify index file before we opportunistically update it

2014-04-09 Thread Yiannis Marangos
Before we proceed to "opportunistic update" we must verify that the
current index file is the same as the one that we read before. There
is a possible race if we don't do this:

  1. process A calls git-rebase

  2. process A applies 1st commit

  3. process B calls git-status

  4. process B reads index

  5. process A applies 2nd commit

  6. process B takes the lock, then overwrites process A's changes.

  7. process A applies 3rd commit

As an end result the 3rd commit will have a revert of the 2nd commit.

Signed-off-by: Yiannis Marangos 
---
 cache.h  |  3 +++
 read-cache.c | 77 +++-
 2 files changed, 69 insertions(+), 11 deletions(-)

diff --git a/cache.h b/cache.h
index 107ac61..b0eedad 100644
--- a/cache.h
+++ b/cache.h
@@ -279,6 +279,7 @@ struct index_state {
 initialized : 1;
struct hashmap name_hash;
struct hashmap dir_hash;
+   unsigned char sha1[20];
 };
 
 extern struct index_state the_index;
@@ -456,6 +457,8 @@ extern int daemonize(void);
} while (0)
 
 /* Initialize and use the cache information */
+extern int verify_index_from(const struct index_state *, const char *path);
+extern int verify_index(struct index_state *);
 extern int read_index(struct index_state *);
 extern int read_index_preload(struct index_state *, const struct pathspec 
*pathspec);
 extern int read_index_from(struct index_state *, const char *path);
diff --git a/read-cache.c b/read-cache.c
index ba13353..a49a441 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -14,6 +14,7 @@
 #include "resolve-undo.h"
 #include "strbuf.h"
 #include "varint.h"
+#include "git-compat-util.h"
 
 static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
   unsigned int options);
@@ -1321,6 +1322,59 @@ static int verify_hdr(struct cache_header *hdr, unsigned 
long size)
return 0;
 }
 
+/* This function verifies if index_state has the correct sha1 of an index file.
+ * Don't die if we have any other failure, just return 0. */
+int verify_index_from(const struct index_state *istate, const char *path)
+{
+   int fd;
+   struct stat st;
+   struct cache_header *hdr;
+   void *mmap_addr;
+   size_t mmap_size;
+
+   if (!istate->initialized)
+   return 0;
+
+   fd = open(path, O_RDONLY);
+   if (fd < 0)
+   return 0;
+
+   if (fstat(fd, &st))
+   return 0;
+
+   /* file is too big */
+   if (st.st_size > (size_t)st.st_size)
+   return 0;
+
+   mmap_size = (size_t)st.st_size;
+   if (mmap_size < sizeof(struct cache_header) + 20)
+   return 0;
+
+   mmap_addr = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, 
fd, 0);
+   close(fd);
+   if (mmap_addr == MAP_FAILED)
+   return 0;
+
+   hdr = mmap_addr;
+   if (verify_hdr(hdr, mmap_size) < 0)
+   goto unmap;
+
+   if (hashcmp(istate->sha1, (unsigned char *)hdr + mmap_size - 20))
+   goto unmap;
+
+   munmap(mmap_addr, mmap_size);
+   return 1;
+
+unmap:
+   munmap(mmap_addr, mmap_size);
+   return 0;
+}
+
+int verify_index(struct index_state *istate)
+{
+   return verify_index_from(istate, get_index_file());
+}
+
 static int read_index_extension(struct index_state *istate,
const char *ext, void *data, unsigned long sz)
 {
@@ -1445,7 +1499,7 @@ int read_index_from(struct index_state *istate, const 
char *path)
struct stat st;
unsigned long src_offset;
struct cache_header *hdr;
-   void *mmap;
+   void *mmap_addr;
size_t mmap_size;
struct strbuf previous_name_buf = STRBUF_INIT, *previous_name;
 
@@ -1468,15 +1522,16 @@ int read_index_from(struct index_state *istate, const 
char *path)
if (mmap_size < sizeof(struct cache_header) + 20)
die("index file smaller than expected");
 
-   mmap = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 
0);
-   if (mmap == MAP_FAILED)
+   mmap_addr = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, 
fd, 0);
+   if (mmap_addr == MAP_FAILED)
die_errno("unable to map index file");
close(fd);
 
-   hdr = mmap;
+   hdr = mmap_addr;
if (verify_hdr(hdr, mmap_size) < 0)
goto unmap;
 
+   hashcpy(istate->sha1, (unsigned char *)hdr + mmap_size - 20);
istate->version = ntohl(hdr->hdr_version);
istate->cache_nr = ntohl(hdr->hdr_entries);
istate->cache_alloc = alloc_nr(istate->cache_nr);
@@ -1494,7 +1549,7 @@ int read_index_from(struct index_state *istate, const 
char *path)
struct cache_entry *ce;
unsigned long consumed;
 
-  

[PATCH v3] Verify index file before we opportunistically update it

2014-04-09 Thread Yiannis Marangos
Before we proceed to "opportunistic update" we must verify that the
current index file is the same as the one that we read before. There
is a possible race if we don't do this:

  1. process A calls git-rebase

  2. process A applies 1st commit

  3. process B calls git-status

  4. process B reads index

  5. process A applies 2nd commit

  6. process B takes the lock, then overwrites process A's changes.

  7. process A applies 3rd commit

As an end result the 3rd commit will have a revert of the 2nd commit.

Signed-off-by: Yiannis Marangos 
---
 cache.h  |  3 +++
 read-cache.c | 77 +++-
 2 files changed, 69 insertions(+), 11 deletions(-)

diff --git a/cache.h b/cache.h
index 107ac61..76ce3d9 100644
--- a/cache.h
+++ b/cache.h
@@ -279,6 +279,7 @@ struct index_state {
 initialized : 1;
struct hashmap name_hash;
struct hashmap dir_hash;
+   unsigned char sha1[20];
 };
 
 extern struct index_state the_index;
@@ -456,6 +457,8 @@ extern int daemonize(void);
} while (0)
 
 /* Initialize and use the cache information */
+extern int verify_index_from(const struct index_state *, const char *path);
+extern int verify_index(const struct index_state *);
 extern int read_index(struct index_state *);
 extern int read_index_preload(struct index_state *, const struct pathspec 
*pathspec);
 extern int read_index_from(struct index_state *, const char *path);
diff --git a/read-cache.c b/read-cache.c
index ba13353..3604d8c 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -14,6 +14,7 @@
 #include "resolve-undo.h"
 #include "strbuf.h"
 #include "varint.h"
+#include "git-compat-util.h"
 
 static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
   unsigned int options);
@@ -1321,6 +1322,59 @@ static int verify_hdr(struct cache_header *hdr, unsigned 
long size)
return 0;
 }
 
+/* This function verifies if index_state has the correct sha1 of an index file.
+ * Don't die if we have any other failure, just return 0. */
+int verify_index_from(const struct index_state *istate, const char *path)
+{
+   int fd;
+   struct stat st;
+   struct cache_header *hdr;
+   void *mmap_addr;
+   size_t mmap_size;
+
+   if (!istate->initialized)
+   return 0;
+
+   fd = open(path, O_RDONLY);
+   if (fd < 0)
+   return 0;
+
+   if (fstat(fd, &st))
+   return 0;
+
+   /* file is too big */
+   if (st.st_size > (size_t)st.st_size)
+   return 0;
+
+   mmap_size = (size_t)st.st_size;
+   if (mmap_size < sizeof(struct cache_header) + 20)
+   return 0;
+
+   mmap_addr = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, 
fd, 0);
+   close(fd);
+   if (mmap_addr == MAP_FAILED)
+   return 0;
+
+   hdr = mmap_addr;
+   if (verify_hdr(hdr, mmap_size) < 0)
+   goto unmap;
+
+   if (hashcmp(istate->sha1, (unsigned char *)hdr + mmap_size - 20))
+   goto unmap;
+
+   munmap(mmap_addr, mmap_size);
+   return 1;
+
+unmap:
+   munmap(mmap_addr, mmap_size);
+   return 0;
+}
+
+int verify_index(const struct index_state *istate)
+{
+   return verify_index_from(istate, get_index_file());
+}
+
 static int read_index_extension(struct index_state *istate,
const char *ext, void *data, unsigned long sz)
 {
@@ -1445,7 +1499,7 @@ int read_index_from(struct index_state *istate, const 
char *path)
struct stat st;
unsigned long src_offset;
struct cache_header *hdr;
-   void *mmap;
+   void *mmap_addr;
size_t mmap_size;
struct strbuf previous_name_buf = STRBUF_INIT, *previous_name;
 
@@ -1468,15 +1522,16 @@ int read_index_from(struct index_state *istate, const 
char *path)
if (mmap_size < sizeof(struct cache_header) + 20)
die("index file smaller than expected");
 
-   mmap = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 
0);
-   if (mmap == MAP_FAILED)
+   mmap_addr = xmmap(NULL, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, 
fd, 0);
+   if (mmap_addr == MAP_FAILED)
die_errno("unable to map index file");
close(fd);
 
-   hdr = mmap;
+   hdr = mmap_addr;
if (verify_hdr(hdr, mmap_size) < 0)
goto unmap;
 
+   hashcpy(istate->sha1, (unsigned char *)hdr + mmap_size - 20);
istate->version = ntohl(hdr->hdr_version);
istate->cache_nr = ntohl(hdr->hdr_entries);
istate->cache_alloc = alloc_nr(istate->cache_nr);
@@ -1494,7 +1549,7 @@ int read_index_from(struct index_state *istate, const 
char *path)
struct cache_entry *ce;
unsigned l

[PATCH v4] Verify index file before we opportunistically update it

2014-04-09 Thread Yiannis Marangos
Before we proceed to "opportunistic update" we must verify that the
current index file is the same as the one that we read before. There
is a possible race if we don't do this. In the example below git-show
does "opportunistic update" and git-rebase updates the index, but the
race can happen in general.

  1. process A calls git-rebase (or does anything that uses the index)

  2. process A applies 1st commit

  3. process B calls git-status (or does anything that updates the index)

  4. process B reads index

  5. process A applies 2nd commit

  6. process B takes the lock, then overwrites process A's changes.

  7. process A applies 3rd commit

As an end result the 3rd commit will have a revert of the 2nd commit.
When process B takes the lock, it needs to make sure that the index
hasn't changed since step 4.

Signed-off-by: Yiannis Marangos 
---

Version 4 contains fixes based on Junio's comments.

 cache.h  |  1 +
 read-cache.c | 58 +-
 2 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index 107ac61..0460f06 100644
--- a/cache.h
+++ b/cache.h
@@ -279,6 +279,7 @@ struct index_state {
 initialized : 1;
struct hashmap name_hash;
struct hashmap dir_hash;
+   unsigned char sha1[20];
 };
 
 extern struct index_state the_index;
diff --git a/read-cache.c b/read-cache.c
index ba13353..73a15b1 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1477,6 +1477,7 @@ int read_index_from(struct index_state *istate, const 
char *path)
if (verify_hdr(hdr, mmap_size) < 0)
goto unmap;
 
+   hashcpy(istate->sha1, (unsigned char *)hdr + mmap_size - 20);
istate->version = ntohl(hdr->hdr_version);
istate->cache_nr = ntohl(hdr->hdr_entries);
istate->cache_alloc = alloc_nr(istate->cache_nr);
@@ -1760,6 +1761,61 @@ static int ce_write_entry(git_SHA_CTX *c, int fd, struct 
cache_entry *ce,
return result;
 }
 
+/*
+ * This function verifies if index_state has the correct sha1 of an index file.
+ * Don't die if we have any other failure, just return 0.
+ */
+static int verify_index_from(const struct index_state *istate, const char 
*path)
+{
+   int fd;
+   struct stat st;
+   struct cache_header *hdr;
+   void *mmap_addr;
+   size_t mmap_size;
+
+   if (!istate->initialized)
+   return 0;
+
+   fd = open(path, O_RDONLY);
+   if (fd < 0)
+   return 0;
+
+   if (fstat(fd, &st))
+   return 0;
+
+   /* file is too big */
+   if (st.st_size > (size_t)st.st_size)
+   return 0;
+
+   mmap_size = (size_t)st.st_size;
+   if (mmap_size < sizeof(struct cache_header) + 20)
+   return 0;
+
+   mmap_addr = mmap(NULL, mmap_size, PROT_READ, MAP_PRIVATE, fd, 0);
+   close(fd);
+   if (mmap_addr == MAP_FAILED)
+   return 0;
+
+   hdr = mmap_addr;
+   if (verify_hdr(hdr, mmap_size) < 0)
+   goto unmap;
+
+   if (hashcmp(istate->sha1, (unsigned char *)hdr + mmap_size - 20))
+   goto unmap;
+
+   munmap(mmap_addr, mmap_size);
+   return 1;
+
+unmap:
+   munmap(mmap_addr, mmap_size);
+   return 0;
+}
+
+static int verify_index(const struct index_state *istate)
+{
+   return verify_index_from(istate, get_index_file());
+}
+
 static int has_racy_timestamp(struct index_state *istate)
 {
int entries = istate->cache_nr;
@@ -1779,7 +1835,7 @@ static int has_racy_timestamp(struct index_state *istate)
 void update_index_if_able(struct index_state *istate, struct lock_file 
*lockfile)
 {
if ((istate->cache_changed || has_racy_timestamp(istate)) &&
-   !write_index(istate, lockfile->fd))
+   verify_index(istate) && !write_index(istate, lockfile->fd))
commit_locked_index(lockfile);
else
rollback_lock_file(lockfile);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5] Verify index file before we opportunistically update it

2014-04-09 Thread Yiannis Marangos
Before we proceed to "opportunistic update" we must verify that the
current index file is the same as the one that we read before. There
is a possible race if we don't do this. In the example below git-status
does "opportunistic update" and git-rebase updates the index, but the
race can happen in general.

  1. process A calls git-rebase (or does anything that uses the index)

  2. process A applies 1st commit

  3. process B calls git-status (or does anything that updates the index)

  4. process B reads index

  5. process A applies 2nd commit

  6. process B takes the lock, then overwrites process A's changes.

  7. process A applies 3rd commit

As an end result the 3rd commit will have a revert of the 2nd commit.
When process B takes the lock, it needs to make sure that the index
hasn't changed since step 4.

Signed-off-by: Yiannis Marangos 
---

Version 4 contains fixes based on Junio's comments.
Version 5 fixs a typo in commit message (git-show -> git-status)

 cache.h  |  1 +
 read-cache.c | 58 +-
 2 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index 107ac61..0460f06 100644
--- a/cache.h
+++ b/cache.h
@@ -279,6 +279,7 @@ struct index_state {
 initialized : 1;
struct hashmap name_hash;
struct hashmap dir_hash;
+   unsigned char sha1[20];
 };
 
 extern struct index_state the_index;
diff --git a/read-cache.c b/read-cache.c
index ba13353..73a15b1 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1477,6 +1477,7 @@ int read_index_from(struct index_state *istate, const 
char *path)
if (verify_hdr(hdr, mmap_size) < 0)
goto unmap;
 
+   hashcpy(istate->sha1, (unsigned char *)hdr + mmap_size - 20);
istate->version = ntohl(hdr->hdr_version);
istate->cache_nr = ntohl(hdr->hdr_entries);
istate->cache_alloc = alloc_nr(istate->cache_nr);
@@ -1760,6 +1761,61 @@ static int ce_write_entry(git_SHA_CTX *c, int fd, struct 
cache_entry *ce,
return result;
 }
 
+/*
+ * This function verifies if index_state has the correct sha1 of an index file.
+ * Don't die if we have any other failure, just return 0.
+ */
+static int verify_index_from(const struct index_state *istate, const char 
*path)
+{
+   int fd;
+   struct stat st;
+   struct cache_header *hdr;
+   void *mmap_addr;
+   size_t mmap_size;
+
+   if (!istate->initialized)
+   return 0;
+
+   fd = open(path, O_RDONLY);
+   if (fd < 0)
+   return 0;
+
+   if (fstat(fd, &st))
+   return 0;
+
+   /* file is too big */
+   if (st.st_size > (size_t)st.st_size)
+   return 0;
+
+   mmap_size = (size_t)st.st_size;
+   if (mmap_size < sizeof(struct cache_header) + 20)
+   return 0;
+
+   mmap_addr = mmap(NULL, mmap_size, PROT_READ, MAP_PRIVATE, fd, 0);
+   close(fd);
+   if (mmap_addr == MAP_FAILED)
+   return 0;
+
+   hdr = mmap_addr;
+   if (verify_hdr(hdr, mmap_size) < 0)
+   goto unmap;
+
+   if (hashcmp(istate->sha1, (unsigned char *)hdr + mmap_size - 20))
+   goto unmap;
+
+   munmap(mmap_addr, mmap_size);
+   return 1;
+
+unmap:
+   munmap(mmap_addr, mmap_size);
+   return 0;
+}
+
+static int verify_index(const struct index_state *istate)
+{
+   return verify_index_from(istate, get_index_file());
+}
+
 static int has_racy_timestamp(struct index_state *istate)
 {
int entries = istate->cache_nr;
@@ -1779,7 +1835,7 @@ static int has_racy_timestamp(struct index_state *istate)
 void update_index_if_able(struct index_state *istate, struct lock_file 
*lockfile)
 {
if ((istate->cache_changed || has_racy_timestamp(istate)) &&
-   !write_index(istate, lockfile->fd))
+   verify_index(istate) && !write_index(istate, lockfile->fd))
commit_locked_index(lockfile);
else
rollback_lock_file(lockfile);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5] Verify index file before we opportunistically update it

2014-04-10 Thread Yiannis Marangos
On Thu, Apr 10, 2014 at 05:40:55PM +0700, Duy Nguyen wrote:
> verify_hdr() is a bit expensive because you need to digest the whole
> index file (could big as big as 14MB on webkit). Could we get away
> without it? I mean, is it enough that we pick the last 20 bytes and
> compare it with istate->sha1? If we only need 20 bytes, pread() may be
> better than mmap().
> 
> The chance of SHA-1 collision is small enough for us to ignore, I
> think. And if a client updates the index without updating the trailing
> sha-1, the index is broken and we don't have to worry about
> overwriting it.

That's true, I will commit version 6 in a bit.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6] Verify index file before we opportunistically update it

2014-04-10 Thread Yiannis Marangos
Before we proceed to "opportunistic update" we must verify that the
current index file is the same as the one that we read before. There
is a possible race if we don't do this. In the example below git-status
does "opportunistic update" and git-rebase updates the index, but the
race can happen in general.

  1. process A calls git-rebase (or does anything that uses the index)

  2. process A applies 1st commit

  3. process B calls git-status (or does anything that updates the index)

  4. process B reads index

  5. process A applies 2nd commit

  6. process B takes the lock, then overwrites process A's changes.

  7. process A applies 3rd commit

As an end result the 3rd commit will have a revert of the 2nd commit.
When process B takes the lock, it needs to make sure that the index
hasn't changed since step 4.

Signed-off-by: Yiannis Marangos 
---

Version 4 contains fixes based on Junio's comments.
Version 5 fixs a typo in commit message (git-show -> git-status)
Version 6 removes verify_hdr() and use pread() instead of mmap().

 cache.h  |  1 +
 read-cache.c | 47 ++-
 2 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index 107ac61..0460f06 100644
--- a/cache.h
+++ b/cache.h
@@ -279,6 +279,7 @@ struct index_state {
 initialized : 1;
struct hashmap name_hash;
struct hashmap dir_hash;
+   unsigned char sha1[20];
 };
 
 extern struct index_state the_index;
diff --git a/read-cache.c b/read-cache.c
index ba13353..034865d 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1477,6 +1477,7 @@ int read_index_from(struct index_state *istate, const 
char *path)
if (verify_hdr(hdr, mmap_size) < 0)
goto unmap;
 
+   hashcpy(istate->sha1, (unsigned char *)hdr + mmap_size - 20);
istate->version = ntohl(hdr->hdr_version);
istate->cache_nr = ntohl(hdr->hdr_entries);
istate->cache_alloc = alloc_nr(istate->cache_nr);
@@ -1760,6 +1761,50 @@ static int ce_write_entry(git_SHA_CTX *c, int fd, struct 
cache_entry *ce,
return result;
 }
 
+/*
+ * This function verifies if index_state has the correct sha1 of an index file.
+ * Don't die if we have any other failure, just return 0.
+ */
+static int verify_index_from(const struct index_state *istate, const char 
*path)
+{
+   int fd;
+   ssize_t n;
+   struct stat st;
+   unsigned char sha1[20];
+
+   if (!istate->initialized)
+   return 0;
+
+   fd = open(path, O_RDONLY);
+   if (fd < 0)
+   return 0;
+
+   if (fstat(fd, &st))
+   goto out;
+
+   if (st.st_size < sizeof(struct cache_header) + 20)
+   goto out;
+
+   n = pread(fd, sha1, 20, st.st_size - 20);
+   if (n != 20)
+   goto out;
+
+   if (hashcmp(istate->sha1, sha1))
+   goto out;
+
+   close(fd);
+   return 1;
+
+out:
+   close(fd);
+   return 0;
+}
+
+static int verify_index(const struct index_state *istate)
+{
+   return verify_index_from(istate, get_index_file());
+}
+
 static int has_racy_timestamp(struct index_state *istate)
 {
int entries = istate->cache_nr;
@@ -1779,7 +1824,7 @@ static int has_racy_timestamp(struct index_state *istate)
 void update_index_if_able(struct index_state *istate, struct lock_file 
*lockfile)
 {
if ((istate->cache_changed || has_racy_timestamp(istate)) &&
-   !write_index(istate, lockfile->fd))
+   verify_index(istate) && !write_index(istate, lockfile->fd))
commit_locked_index(lockfile);
else
rollback_lock_file(lockfile);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v7 1/2] Add xpread() and xpwrite()

2014-04-10 Thread Yiannis Marangos
xpread() and xpwrite() pay attention to EAGAIN/EINTR, so they will resume
automatically on interrupted call.

Signed-off-by: Yiannis Marangos 
---
 builtin/index-pack.c |  2 +-
 compat/mmap.c|  4 +---
 git-compat-util.h|  2 ++
 wrapper.c| 36 
 4 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index b9f6e12..1bac0f5 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -542,7 +542,7 @@ static void *unpack_data(struct object_entry *obj,
 
do {
ssize_t n = (len < 64*1024) ? len : 64*1024;
-   n = pread(pack_fd, inbuf, n, from);
+   n = xpread(pack_fd, inbuf, n, from);
if (n < 0)
die_errno(_("cannot pread pack file"));
if (!n)
diff --git a/compat/mmap.c b/compat/mmap.c
index c9d46d1..7f662fe 100644
--- a/compat/mmap.c
+++ b/compat/mmap.c
@@ -14,7 +14,7 @@ void *git_mmap(void *start, size_t length, int prot, int 
flags, int fd, off_t of
}
 
while (n < length) {
-   ssize_t count = pread(fd, (char *)start + n, length - n, offset 
+ n);
+   ssize_t count = xpread(fd, (char *)start + n, length - n, 
offset + n);
 
if (count == 0) {
memset((char *)start+n, 0, length-n);
@@ -22,8 +22,6 @@ void *git_mmap(void *start, size_t length, int prot, int 
flags, int fd, off_t of
}
 
if (count < 0) {
-   if (errno == EAGAIN || errno == EINTR)
-   continue;
free(start);
errno = EACCES;
return MAP_FAILED;
diff --git a/git-compat-util.h b/git-compat-util.h
index f6d3a46..4da04c6 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -531,6 +531,8 @@ extern void *xcalloc(size_t nmemb, size_t size);
 extern void *xmmap(void *start, size_t length, int prot, int flags, int fd, 
off_t offset);
 extern ssize_t xread(int fd, void *buf, size_t len);
 extern ssize_t xwrite(int fd, const void *buf, size_t len);
+extern ssize_t xpread(int fd, void *buf, size_t len, off_t offset);
+extern ssize_t xpwrite(int fd, const void *buf, size_t len, off_t offset);
 extern int xdup(int fd);
 extern FILE *xfdopen(int fd, const char *mode);
 extern int xmkstemp(char *template);
diff --git a/wrapper.c b/wrapper.c
index 0cc5636..25b7419 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -174,6 +174,42 @@ ssize_t xwrite(int fd, const void *buf, size_t len)
}
 }
 
+/*
+ * xpread() is the same as pread(), but it automatically restarts pread()
+ * operations with a recoverable error (EAGAIN and EINTR). xpread() DOES
+ * NOT GUARANTEE that "len" bytes is read even if the data is available.
+ */
+ssize_t xpread(int fd, void *buf, size_t len, off_t offset)
+{
+   ssize_t nr;
+   if (len > MAX_IO_SIZE)
+   len = MAX_IO_SIZE;
+   while (1) {
+   nr = pread(fd, buf, len, offset);
+   if ((nr < 0) && (errno == EAGAIN || errno == EINTR))
+   continue;
+   return nr;
+   }
+}
+
+/*
+ * xpwrite() is the same as pwrite(), but it automatically restarts pwrite()
+ * operations with a recoverable error (EAGAIN and EINTR). xpwrite() DOES NOT
+ * GUARANTEE that "len" bytes is written even if the operation is successful.
+ */
+ssize_t xpwrite(int fd, const void *buf, size_t len, off_t offset)
+{
+   ssize_t nr;
+   if (len > MAX_IO_SIZE)
+   len = MAX_IO_SIZE;
+   while (1) {
+   nr = pwrite(fd, buf, len, offset);
+   if ((nr < 0) && (errno == EAGAIN || errno == EINTR))
+   continue;
+   return nr;
+   }
+}
+
 ssize_t read_in_full(int fd, void *buf, size_t count)
 {
char *p = buf;
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v7 2/2] Verify index file before we opportunistically update it

2014-04-10 Thread Yiannis Marangos
Before we proceed to "opportunistic update" we must verify that the
current index file is the same as the one that we read before. There
is a possible race if we don't do this. In the example below git-status
does "opportunistic update" and git-rebase updates the index, but the
race can happen in general.

  1. process A calls git-rebase (or does anything that uses the index)

  2. process A applies 1st commit

  3. process B calls git-status (or does anything that updates the index)

  4. process B reads index

  5. process A applies 2nd commit

  6. process B takes the lock, then overwrites process A's changes.

  7. process A applies 3rd commit

As an end result the 3rd commit will have a revert of the 2nd commit.
When process B takes the lock, it needs to make sure that the index
hasn't changed since step 4.

Signed-off-by: Yiannis Marangos 
---

Version 4 contains fixes based on Junio's comments.
Version 5 fixs a typo in commit message (git-show -> git-status).
Version 6 removes verify_hdr() and use pread() instead of mmap().
Version 7 use xpread() (added with [PATCH v7 1/2]) instead of pread().

 cache.h  |  1 +
 read-cache.c | 47 ++-
 2 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index 107ac61..0460f06 100644
--- a/cache.h
+++ b/cache.h
@@ -279,6 +279,7 @@ struct index_state {
 initialized : 1;
struct hashmap name_hash;
struct hashmap dir_hash;
+   unsigned char sha1[20];
 };
 
 extern struct index_state the_index;
diff --git a/read-cache.c b/read-cache.c
index ba13353..28de1a6 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1477,6 +1477,7 @@ int read_index_from(struct index_state *istate, const 
char *path)
if (verify_hdr(hdr, mmap_size) < 0)
goto unmap;
 
+   hashcpy(istate->sha1, (unsigned char *)hdr + mmap_size - 20);
istate->version = ntohl(hdr->hdr_version);
istate->cache_nr = ntohl(hdr->hdr_entries);
istate->cache_alloc = alloc_nr(istate->cache_nr);
@@ -1760,6 +1761,50 @@ static int ce_write_entry(git_SHA_CTX *c, int fd, struct 
cache_entry *ce,
return result;
 }
 
+/*
+ * This function verifies if index_state has the correct sha1 of an index file.
+ * Don't die if we have any other failure, just return 0.
+ */
+static int verify_index_from(const struct index_state *istate, const char 
*path)
+{
+   int fd;
+   ssize_t n;
+   struct stat st;
+   unsigned char sha1[20];
+
+   if (!istate->initialized)
+   return 0;
+
+   fd = open(path, O_RDONLY);
+   if (fd < 0)
+   return 0;
+
+   if (fstat(fd, &st))
+   goto out;
+
+   if (st.st_size < sizeof(struct cache_header) + 20)
+   goto out;
+
+   n = xpread(fd, sha1, 20, st.st_size - 20);
+   if (n != 20)
+   goto out;
+
+   if (hashcmp(istate->sha1, sha1))
+   goto out;
+
+   close(fd);
+   return 1;
+
+out:
+   close(fd);
+   return 0;
+}
+
+static int verify_index(const struct index_state *istate)
+{
+   return verify_index_from(istate, get_index_file());
+}
+
 static int has_racy_timestamp(struct index_state *istate)
 {
int entries = istate->cache_nr;
@@ -1779,7 +1824,7 @@ static int has_racy_timestamp(struct index_state *istate)
 void update_index_if_able(struct index_state *istate, struct lock_file 
*lockfile)
 {
if ((istate->cache_changed || has_racy_timestamp(istate)) &&
-   !write_index(istate, lockfile->fd))
+   verify_index(istate) && !write_index(istate, lockfile->fd))
commit_locked_index(lockfile);
else
rollback_lock_file(lockfile);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 1/2] Add xpread() and xpwrite()

2014-04-10 Thread Yiannis Marangos
On Thu, Apr 10, 2014 at 11:35:42AM -0700, Junio C Hamano wrote:
> We do not even use pwrite(); please don't add anything unnecessary
> and unexercised, like xpwrite(), as potential bugs in it will go
> unnoticed long after its introduction until it first gets used.

Correct, my mistake.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v8 1/2] Add xpread()

2014-04-10 Thread Yiannis Marangos
xpread() pay attention to EAGAIN/EINTR, so it will resume
automatically on interrupted call.

Signed-off-by: Yiannis Marangos 
---

This is actually the 2nd version of this commit but before I emailed
it as version 7 because it's a dependency of [PATCH v7 2/2]. Is this
the correct way to introduce a new descendant commit? Sorry, It's my
first time that I use mailing lists for contribution.

Version 8 removes xpwrite()

 builtin/index-pack.c |  2 +-
 compat/mmap.c|  4 +---
 git-compat-util.h|  1 +
 wrapper.c| 18 ++
 4 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index b9f6e12..1bac0f5 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -542,7 +542,7 @@ static void *unpack_data(struct object_entry *obj,
 
do {
ssize_t n = (len < 64*1024) ? len : 64*1024;
-   n = pread(pack_fd, inbuf, n, from);
+   n = xpread(pack_fd, inbuf, n, from);
if (n < 0)
die_errno(_("cannot pread pack file"));
if (!n)
diff --git a/compat/mmap.c b/compat/mmap.c
index c9d46d1..7f662fe 100644
--- a/compat/mmap.c
+++ b/compat/mmap.c
@@ -14,7 +14,7 @@ void *git_mmap(void *start, size_t length, int prot, int 
flags, int fd, off_t of
}
 
while (n < length) {
-   ssize_t count = pread(fd, (char *)start + n, length - n, offset 
+ n);
+   ssize_t count = xpread(fd, (char *)start + n, length - n, 
offset + n);
 
if (count == 0) {
memset((char *)start+n, 0, length-n);
@@ -22,8 +22,6 @@ void *git_mmap(void *start, size_t length, int prot, int 
flags, int fd, off_t of
}
 
if (count < 0) {
-   if (errno == EAGAIN || errno == EINTR)
-   continue;
free(start);
errno = EACCES;
return MAP_FAILED;
diff --git a/git-compat-util.h b/git-compat-util.h
index f6d3a46..e6a4159 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -531,6 +531,7 @@ extern void *xcalloc(size_t nmemb, size_t size);
 extern void *xmmap(void *start, size_t length, int prot, int flags, int fd, 
off_t offset);
 extern ssize_t xread(int fd, void *buf, size_t len);
 extern ssize_t xwrite(int fd, const void *buf, size_t len);
+extern ssize_t xpread(int fd, void *buf, size_t len, off_t offset);
 extern int xdup(int fd);
 extern FILE *xfdopen(int fd, const char *mode);
 extern int xmkstemp(char *template);
diff --git a/wrapper.c b/wrapper.c
index 0cc5636..cf71817 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -174,6 +174,24 @@ ssize_t xwrite(int fd, const void *buf, size_t len)
}
 }
 
+/*
+ * xpread() is the same as pread(), but it automatically restarts pread()
+ * operations with a recoverable error (EAGAIN and EINTR). xpread() DOES
+ * NOT GUARANTEE that "len" bytes is read even if the data is available.
+ */
+ssize_t xpread(int fd, void *buf, size_t len, off_t offset)
+{
+   ssize_t nr;
+   if (len > MAX_IO_SIZE)
+   len = MAX_IO_SIZE;
+   while (1) {
+   nr = pread(fd, buf, len, offset);
+   if ((nr < 0) && (errno == EAGAIN || errno == EINTR))
+   continue;
+   return nr;
+   }
+}
+
 ssize_t read_in_full(int fd, void *buf, size_t count)
 {
char *p = buf;
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 2/2] Verify index file before we opportunistically update it

2014-04-11 Thread Yiannis Marangos
On Fri, Apr 11, 2014 at 09:47:09AM +0200, Torsten Bögershausen wrote:
>   6. process A applies a commit:
>   - read the index into memory
>   - take the lock
>   - update the index file on disc
>   - release the lock

So here we can have race condition. Maybe we should implement Duy's
idea of verifying the sha1 in write_index()?


> The new code works like this:
>
>   8. process B applies a commit:
>   - take the lock
>   - verifies tha the index file on disc has the same sha as the one read 
> before
>   # And if not: What do we do? die() or retry() ?
>   - update the index file on disc
>   - release the lock

IMO, we must never die() if we want to do opportunistic update and we
should never retry because it's a bit expensive if we do opportunistic update.


--
Yiannis
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 2/2] Verify index file before we opportunistically update it

2014-04-11 Thread Yiannis Marangos
On Fri, Apr 11, 2014 at 01:43:47PM -0700, Junio C Hamano wrote:
> Having said that, nobody sane would be running two simultaneous
> operations that are clearly write-oriented competing with each other
> against the same index file.  So in that sense that can be done as a
> less urgent follow-up for this topic.

I'm willing to take this task, I decide to spend some of my free time
for git development. But since I'm a newbie with it's architecture I
don't know when it will be ready because I want to explore the code a
bit.

--
Yiannis
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html