Re: [PATCH v2 2/2] git_path(): handle `.lock` files correctly

2019-10-22 Thread Junio C Hamano
Johannes Schindelin  writes:

> However, I think this is _really_ ugly and intrusive. The opposite of
> what my goals were.
>
> So I think I'll just bite the bullet and use a temporary copy if the
> argument ends in `.lock`.

Sounds like a quite sensible design decision to me.


Re: [PATCH v2 2/2] git_path(): handle `.lock` files correctly

2019-10-21 Thread Johannes Schindelin
Hi Junio,

On Fri, 18 Oct 2019, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" 
> writes:
>
> > From: Johannes Schindelin 
> >
> > Ever since worktrees were introduced, the `git_path()` function _really_
> > needed to be called e.g. to get at the path to `logs/HEAD` (`HEAD` is
> > specific to the worktree). However, the wrong path is returned for
> > `logs/HEAD.lock`.
> >
> > This does not matter as long as the Git executable is doing the asking,
> > as the path for that `index.lock` file is constructed from
> > `git_path("index")` by appending the `.lock` suffix.
>
> Is this still git_path("index") or is it now HEAD?

Darn. It is "logs/HEAD", of course.

> > Side note: Git GUI _does_ ask for `index.lock`, but that is already
> > resolved correctly.
>
> Is that s/but/and/?

It sounds better with an "and", you're right.

> > diff --git a/path.c b/path.c
> > index e3da1f3c4e..ff85692b45 100644
> > --- a/path.c
> > +++ b/path.c
> > @@ -268,7 +268,7 @@ static int trie_find(struct trie *root, const char 
> > *key, match_fn fn,
> > int result;
> > struct trie *child;
> >
> > -   if (!*key) {
> > +   if (!*key || !strcmp(key, ".lock")) {
>
> We only do strcmp for the tail part at the end of the path, so this
> should probably OK from performance point of view but semantically
> it is not very satisfying to see a special case for a single .suffix
> this deep in the callchain.  I wonder if it is nicer to have the
> higher level callers notice ".lock" or whatever other suffixes they
> care about and ask the lower layer for a key with the suffix
> stripped?

Hmm. The parameter is provided as a `const char *`, so I cannot just
edit it. My first attempt at fixing this resulted in this commit:

-- snip --
path: switch `trie_find()` to a counted string

Before this patch, the `trie_find()` function would be fed a regular,
NUL-terminated string.

However, in the next commit, we want to strip any `.lock` suffix from
the argument, and the argument is provided as a `const char *` (i.e. we
do not want to modify it).

Let's use a string + length pair of parameters instead of a single
NUL-terminated string to open the door for this kind of manipulation.

Signed-off-by: Johannes Schindelin 

diff --git a/path.c b/path.c
index e3da1f3c4e2..b18d552c890 100644
--- a/path.c
+++ b/path.c
@@ -236,7 +236,8 @@ static void *add_to_trie(struct trie *root, const char 
*key, void *value)
return old;
 }

-typedef int (*match_fn)(const char *unmatched, void *data, void *baton);
+typedef int (*match_fn)(const char *unmatched, size_t unmatched_len,
+   void *data, void *baton);

 /*
  * Search a trie for some key.  Find the longest /-or-\0-terminated
@@ -261,22 +262,22 @@ typedef int (*match_fn)(const char *unmatched, void 
*data, void *baton);
  * |-||---|--|
  *
  */
-static int trie_find(struct trie *root, const char *key, match_fn fn,
-void *baton)
+static int trie_find(struct trie *root, const char *key, size_t key_len,
+match_fn fn, void *baton)
 {
int i;
int result;
struct trie *child;

-   if (!*key) {
+   if (!key_len) {
/* we have reached the end of the key */
if (root->value && !root->len)
-   return fn(key, root->value, baton);
+   return fn(key, key_len, root->value, baton);
else
return -1;
}

-   for (i = 0; i < root->len; i++) {
+   for (i = 0; i < key_len && i < root->len; i++) {
/* Partial path normalization: skip consecutive slashes. */
if (key[i] == '/' && key[i+1] == '/') {
key++;
@@ -288,24 +289,26 @@ static int trie_find(struct trie *root, const char *key, 
match_fn fn,

/* Matched the entire compressed section */
key += i;
-   if (!*key)
+   key_len -= i;
+
+   if (!key_len)
/* End of key */
-   return fn(key, root->value, baton);
+   return fn(key, key_len, root->value, baton);

/* Partial path normalization: skip consecutive slashes */
-   while (key[0] == '/' && key[1] == '/')
-   key++;
+   while (key_len > 0 && key[0] == '/' && key[1] == '/')
+   key++, key_len--;

-   child = root->children[(unsigned char)*key];
+   child = root->children[!key_len ? '0' : (unsigned char)*key];
if (child)
-   result = trie_find(child, key + 1, fn, baton);
+   result = trie_find(child, key + 1, key_len - 1, fn, baton);
else
result = -1;

-   if (result >= 0 || (*key != '/' && *key != 0))
+   if (result >= 0 || (key_len && *key != '/'))
return result;
if (root->value)
-   return fn(key, root->value, baton);
+   return fn(k

Re: [PATCH v2 2/2] git_path(): handle `.lock` files correctly

2019-10-18 Thread SZEDER Gábor
On Fri, Oct 18, 2019 at 10:23:00AM +0900, Junio C Hamano wrote:
> > diff --git a/path.c b/path.c
> > index e3da1f3c4e..ff85692b45 100644
> > --- a/path.c
> > +++ b/path.c
> > @@ -268,7 +268,7 @@ static int trie_find(struct trie *root, const char 
> > *key, match_fn fn,
> > int result;
> > struct trie *child;
> >  
> > -   if (!*key) {
> > +   if (!*key || !strcmp(key, ".lock")) {
> 
> We only do strcmp for the tail part at the end of the path, so this
> should probably OK from performance point of view but semantically
> it is not very satisfying to see a special case for a single .suffix
> this deep in the callchain.  I wonder if it is nicer to have the
> higher level callers notice ".lock" or whatever other suffixes they
> care about and ask the lower layer for a key with the suffix
> stripped?

I was wondering the same.  Despite all of its functions being file
static, this is a fairly general trie implementation for paths, which
could potentially be useful in other parts of Git as well.  Special
casing any suffix within its implementation makes it less (re)usable
in other cases.



Re: [PATCH v2 2/2] git_path(): handle `.lock` files correctly

2019-10-17 Thread Junio C Hamano
"Johannes Schindelin via GitGitGadget" 
writes:

> From: Johannes Schindelin 
>
> Ever since worktrees were introduced, the `git_path()` function _really_
> needed to be called e.g. to get at the path to `logs/HEAD` (`HEAD` is
> specific to the worktree). However, the wrong path is returned for
> `logs/HEAD.lock`.
>
> This does not matter as long as the Git executable is doing the asking,
> as the path for that `index.lock` file is constructed from
> `git_path("index")` by appending the `.lock` suffix.

Is this still git_path("index") or is it now HEAD?

> Side note: Git GUI _does_ ask for `index.lock`, but that is already
> resolved correctly.

Is that s/but/and/?

> diff --git a/path.c b/path.c
> index e3da1f3c4e..ff85692b45 100644
> --- a/path.c
> +++ b/path.c
> @@ -268,7 +268,7 @@ static int trie_find(struct trie *root, const char *key, 
> match_fn fn,
>   int result;
>   struct trie *child;
>  
> - if (!*key) {
> + if (!*key || !strcmp(key, ".lock")) {

We only do strcmp for the tail part at the end of the path, so this
should probably OK from performance point of view but semantically
it is not very satisfying to see a special case for a single .suffix
this deep in the callchain.  I wonder if it is nicer to have the
higher level callers notice ".lock" or whatever other suffixes they
care about and ask the lower layer for a key with the suffix
stripped?

Will queue.

Thanks.