> On Fri, 2016-02-19 at 09:54 +0700, Duy Nguyen wrote:
>> On Fri, Feb 19, 2016 at 3:23 AM, David Turner <
>> dtur...@twopensource.com> wrote:
>> > > > +static int read_per_worktree_ref(const char *submodule, const
>> > > > char
>> > > > *refname,
>> > > > +                            struct MDB_val *val, int
>> > > > *needs_free)
>> > >
>> > > From what I read, I suspect these _per_worktree functions will be
>> > > identical for the next backend. Should we just hand over the job
>> > > for
>> > > files backend? For all entry points that may deal with per
>> > > -worktree
>> > > refs, e.g. lmdb_resolve_ref_unsafe, can we check ref_type() first
>> > > thing, if it's per-worktree we call
>> > > refs_be_files.resolve_ref_unsafe()
>> > > instead?  It could even be done at frontend level,
>> > > e.g. refs.c:resolve_ref_unsafe().
>> > >
>> > > Though I may be talking rubbish here because I don't know how
>> > > whether
>> > > it has anything to do with transactions.
>> >
>> > The reason I did it this way is that some ref chains cross backend
>> > boundaries (e.g. HEAD -> refs/heads/master).  But if we have other
>> > backends later, we could generalize.
>>
>> Crossing backends should go through frontend again, imo. But I don't
>> really know if it's efficient.
>
> It's pretty tricky to maintain state (e.g. count of symref redirects)
> across that barrier.  So I'm not sure how to do it cleanly.

I notice files backend does pretty much the same thing. "files"
backend looks more like two backends combined in one, one is files,
the other is packed-refs. And it looks like we could solve it by
providing a lower level api, read_raw_ref() or something, that
retrieves the ref without any validation or link following. More on
this later.

>> > > I'm not sure I get this comment. D/F conflicts are no longer a
>> > > thing
>> > > for lmdb backend, right?
>> >
>> > I'm trying to avoid the lmdb backend creating a set of refs that
>> > the
>> > files backend can't handle.  This would make collaboration with
>> > other
>> > versions of git more difficult.
>>
>> It already is. If you create refs "foo" and "FOO" on case sensitive
>> file system and clone it on a case-insensitive one, you face the same
>> problem. We may have an optional configuration knob to prevent
>> incompatibilities with files backend, but I think that should be done
>> (and enforced if necessary) outside backends.
>
> Sure, the current state isn't perfect, but why make it worse?

I see it from a different angle. The current state isn't perfect, but
we will be moving to a better future where "files" backend may
eventually be deprecated. Why hold back?

But this line of reasoning only works if we have a new backend capable
of replacing "files" without regressions or introducing new
dependency. Which is why I suggest a new backend [1] (or implement
Shawn's RefTree if it's proven as good with small repos)

I have no problem if you want to stay strictly compatible with "files"
though.

[1] http://thread.gmane.org/gmane.comp.version-control.git/285893/focus=286654

On Fri, Feb 19, 2016 at 05:49:46PM -0500, David Turner wrote:
> > 
> > This code looks a lot like near the end of resolve_ref_1(). Maybe we
> > could share the code in refs/backend-common.c or something and call
> > here instead?
> 
> Something like the following?
> 
> commit aad6b84fd1869f6e1cf6ed15bcece0c2f6429e9d
> Author: David Turner <dtur...@twopensource.com>
> Date:   Thu Feb 18 17:09:29 2016 -0500
> 
>     refs: break out some functions from resolve_ref_1
>     
>     A bunch of resolve_ref_1 is not backend-specific, so we can
>     break it out into separate internal functions that other
>     backends can use.
...
> 
> I'm not sure I like it, because it breaks out these weird tiny
> functions that take a lot of arguments.  But maybe it's worth it?  What
> do you think?

OK how about we keep resolve_ref_1() whole and split real backend code
out? Something like these three patches (only built, did not test). A
bit ugly with continue_symlink, but it's just demonstration.

commit ef46fcdc62ef89fd5260ca054cd1d98f9f2d7c2b
Author: Nguyễn Thái Ngọc Duy <pclo...@gmail.com>
Date:   Sat Feb 20 09:18:45 2016 +0700

    refs/files: move ref I/O code out of resolve_refs_1()

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 4bddfb3..f54f2ae 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1407,6 +1407,95 @@ static int resolve_missing_loose_ref(const char *refname,
        }
 }
 
+static const char *continue_normal_ref = "read_ref returns a normal ref";
+static const char *continue_symlink = "read_ref returns a symlink";
+
+/*
+ * Read a ref from backend. Returning any values except
+ * continue_normal_ref or continue_symlink ends resolve_ref_1()
+ * execution. If the return value is not NULL, sha1 and flags must be
+ * updated correctly. except REF_ISBROKEN which is set by
+ * resolve_ref_1().
+ *
+ * If continue_* is returned, sb_contents must contain the ref data.
+ */
+static const char *parse_ref(const char *refname,
+                            int resolve_flags,
+                            unsigned char *sha1,
+                            int *flags,
+                            struct strbuf *sb_path,
+                            struct strbuf *sb_contents)
+{
+       const char *path;
+       struct stat st;
+       int fd;
+
+       strbuf_reset(sb_path);
+       strbuf_git_path(sb_path, "%s", refname);
+       path = sb_path->buf;
+
+       /*
+        * We might have to loop back here to avoid a race
+        * condition: first we lstat() the file, then we try
+        * to read it as a link or as a file.  But if somebody
+        * changes the type of the file (file <-> directory
+        * <-> symlink) between the lstat() and reading, then
+        * we don't want to report that as an error but rather
+        * try again starting with the lstat().
+        */
+stat_ref:
+       if (lstat(path, &st) < 0) {
+               if (errno != ENOENT)
+                       return NULL;
+               if (resolve_missing_loose_ref(refname, resolve_flags,
+                                             sha1, flags))
+                       return NULL;
+               return refname;
+       }
+
+       /* Follow "normalized" - ie "refs/.." symlinks by hand */
+       if (S_ISLNK(st.st_mode)) {
+               strbuf_reset(sb_contents);
+               if (strbuf_readlink(sb_contents, path, 0) < 0) {
+                       if (errno == ENOENT || errno == EINVAL)
+                               /* inconsistent with lstat; retry */
+                               goto stat_ref;
+                       else
+                               return NULL;
+               }
+               return continue_symlink;
+       }
+
+       /* Is it a directory? */
+       if (S_ISDIR(st.st_mode)) {
+               errno = EISDIR;
+               return NULL;
+       }
+
+       /*
+        * Anything else, just open it and try to use it as
+        * a ref
+        */
+       fd = open(path, O_RDONLY);
+       if (fd < 0) {
+               if (errno == ENOENT)
+                       /* inconsistent with lstat; retry */
+                       goto stat_ref;
+               else
+                       return NULL;
+       }
+       strbuf_reset(sb_contents);
+       if (strbuf_read(sb_contents, fd, 256) < 0) {
+               int save_errno = errno;
+               close(fd);
+               errno = save_errno;
+               return NULL;
+       }
+       close(fd);
+
+       return continue_normal_ref;
+}
+
 /* This function needs to return a meaningful errno on failure */
 static const char *resolve_ref_1(const char *refname,
                                 int resolve_flags,
@@ -1442,54 +1531,18 @@ static const char *resolve_ref_1(const char *refname,
                bad_name = 1;
        }
        for (;;) {
-               const char *path;
-               struct stat st;
+               const char *ret;
                char *buf;
-               int fd;
 
                if (--depth < 0) {
                        errno = ELOOP;
                        return NULL;
                }
 
-               strbuf_reset(sb_path);
-               strbuf_git_path(sb_path, "%s", refname);
-               path = sb_path->buf;
+               ret = parse_ref(refname, resolve_flags, sha1,
+                               flags, sb_path, sb_contents);
 
-               /*
-                * We might have to loop back here to avoid a race
-                * condition: first we lstat() the file, then we try
-                * to read it as a link or as a file.  But if somebody
-                * changes the type of the file (file <-> directory
-                * <-> symlink) between the lstat() and reading, then
-                * we don't want to report that as an error but rather
-                * try again starting with the lstat().
-                */
-       stat_ref:
-               if (lstat(path, &st) < 0) {
-                       if (errno != ENOENT)
-                               return NULL;
-                       if (resolve_missing_loose_ref(refname, resolve_flags,
-                                                     sha1, flags))
-                               return NULL;
-                       if (bad_name) {
-                               hashclr(sha1);
-                               if (flags)
-                                       *flags |= REF_ISBROKEN;
-                       }
-                       return refname;
-               }
-
-               /* Follow "normalized" - ie "refs/.." symlinks by hand */
-               if (S_ISLNK(st.st_mode)) {
-                       strbuf_reset(sb_contents);
-                       if (strbuf_readlink(sb_contents, path, 0) < 0) {
-                               if (errno == ENOENT || errno == EINVAL)
-                                       /* inconsistent with lstat; retry */
-                                       goto stat_ref;
-                               else
-                                       return NULL;
-                       }
+               if (ret == continue_symlink) {
                        if (starts_with(sb_contents->buf, "refs/") &&
                            !check_refname_format(sb_contents->buf, 0)) {
                                strbuf_swap(sb_refname, sb_contents);
@@ -1502,35 +1555,10 @@ static const char *resolve_ref_1(const char *refname,
                                }
                                continue;
                        }
-               }
-
-               /* Is it a directory? */
-               if (S_ISDIR(st.st_mode)) {
-                       errno = EISDIR;
-                       return NULL;
-               }
-
-               /*
-                * Anything else, just open it and try to use it as
-                * a ref
-                */
-               fd = open(path, O_RDONLY);
-               if (fd < 0) {
-                       if (errno == ENOENT)
-                               /* inconsistent with lstat; retry */
-                               goto stat_ref;
-                       else
-                               return NULL;
-               }
-               strbuf_reset(sb_contents);
-               if (strbuf_read(sb_contents, fd, 256) < 0) {
-                       int save_errno = errno;
-                       close(fd);
-                       errno = save_errno;
-                       return NULL;
-               }
-               close(fd);
-               strbuf_rtrim(sb_contents);
+               } else if (ret == refname)
+                       break;
+               else if (ret != continue_normal_ref)
+                       return ret;
 
                /*
                 * Is it a symbolic ref?
@@ -1547,12 +1575,7 @@ static const char *resolve_ref_1(const char *refname,
                                errno = EINVAL;
                                return NULL;
                        }
-                       if (bad_name) {
-                               hashclr(sha1);
-                               if (flags)
-                                       *flags |= REF_ISBROKEN;
-                       }
-                       return refname;
+                       break;
                }
                if (flags)
                        *flags |= REF_ISSYMREF;
@@ -1578,6 +1601,13 @@ static const char *resolve_ref_1(const char *refname,
                        bad_name = 1;
                }
        }
+
+       if (bad_name) {
+               hashclr(sha1);
+               if (flags)
+                       *flags |= REF_ISBROKEN;
+       }
+       return refname;
 }
 
 static const char *files_resolve_ref_unsafe(const char *refname,

After this resolve_ref_1() is backend independent. So we can make it
take parse_ref() as a function pointer instead.

commit 50d96b6f79b30b5ba17fa00ec3ee42845546a123
Author: Nguyễn Thái Ngọc Duy <pclo...@gmail.com>
Date:   Sat Feb 20 09:22:03 2016 +0700

    refs/files-backend.c: let resolve_refs_1() accept parse_ref as callback

diff --git a/refs/files-backend.c b/refs/files-backend.c
index f54f2ae..9b4de9f 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1424,7 +1424,8 @@ static const char *parse_ref(const char *refname,
                             unsigned char *sha1,
                             int *flags,
                             struct strbuf *sb_path,
-                            struct strbuf *sb_contents)
+                            struct strbuf *sb_contents,
+                            void *cb_data)
 {
        const char *path;
        struct stat st;
@@ -1497,13 +1498,21 @@ stat_ref:
 }
 
 /* This function needs to return a meaningful errno on failure */
-static const char *resolve_ref_1(const char *refname,
-                                int resolve_flags,
-                                unsigned char *sha1,
-                                int *flags,
-                                struct strbuf *sb_refname,
-                                struct strbuf *sb_path,
-                                struct strbuf *sb_contents)
+const char *resolve_ref_1(const char *refname,
+                         int resolve_flags,
+                         unsigned char *sha1,
+                         int *flags,
+                         struct strbuf *sb_refname,
+                         struct strbuf *sb_path,
+                         struct strbuf *sb_contents,
+                         const char *(*parse_ref)(const char *refname,
+                                                  int resolve_flags,
+                                                  unsigned char *sha1,
+                                                  int *flags,
+                                                  struct strbuf *sb_path,
+                                                  struct strbuf *sb_contents,
+                                                  void *cb_data),
+                         void *cb_data)
 {
        int depth = MAXDEPTH;
        int bad_name = 0;
@@ -1540,7 +1549,8 @@ static const char *resolve_ref_1(const char *refname,
                }
 
                ret = parse_ref(refname, resolve_flags, sha1,
-                               flags, sb_path, sb_contents);
+                               flags, sb_path, sb_contents,
+                               cb_data);
 
                if (ret == continue_symlink) {
                        if (starts_with(sb_contents->buf, "refs/") &&
@@ -1621,7 +1631,8 @@ static const char *files_resolve_ref_unsafe(const char 
*refname,
        const char *ret;
 
        ret = resolve_ref_1(refname, resolve_flags, sha1, flags,
-                           &sb_refname, &sb_path, &sb_contents);
+                           &sb_refname, &sb_path, &sb_contents,
+                           parse_ref, NULL);
        strbuf_release(&sb_path);
        strbuf_release(&sb_contents);
        return ret;

And finally we can make lmdb use resolve_ref_1(). lmdb-specific code
is in the new retrieve_ref() function.

commit 62a5df3117c0f825bc26fd09dda29e713f94d743 (HEAD -> lmdb)
Author: Nguyễn Thái Ngọc Duy <pclo...@gmail.com>
Date:   Sat Feb 20 09:33:01 2016 +0700

    refs/lmdb-backend: use resolve_ref_1()

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 9b4de9f..44b7136 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1407,7 +1407,7 @@ static int resolve_missing_loose_ref(const char *refname,
        }
 }
 
-static const char *continue_normal_ref = "read_ref returns a normal ref";
+const char *continue_normal_ref = "read_ref returns a normal ref";
 static const char *continue_symlink = "read_ref returns a symlink";
 
 /*
diff --git a/refs/lmdb-backend.c b/refs/lmdb-backend.c
index 6c0d7fb..7f169bd 100644
--- a/refs/lmdb-backend.c
+++ b/refs/lmdb-backend.c
@@ -544,74 +544,64 @@ done:
        return ret;
 }
 
+extern const char *continue_normal_ref;
+
+static const char *retrieve_ref(const char *refname,
+                               int resolve_flags,
+                               unsigned char *sha1,
+                               int *flags,
+                               struct strbuf *sb_path,
+                               struct strbuf *sb_contents,
+                               void *cb_data)
+{
+       struct lmdb_transaction *transaction = cb_data;
+       MDB_val key, val;
+       int needs_free;         /* dont care, leak */
+
+       key.mv_data = (void *)refname;
+       key.mv_size = strlen(refname) + 1;
+
+       val.mv_data = NULL;
+       val.mv_size = 0;
+
+       if (mdb_get_or_die(transaction, &key, &val, &needs_free)) {
+               struct strbuf err = STRBUF_INIT;
+
+               if (resolve_flags & RESOLVE_REF_READING)
+                       return NULL;
+
+               if (verify_refname_available_txn(transaction,
+                                                refname, NULL, NULL, &err)) {
+                       error("%s", err.buf);
+                       strbuf_release(&err);
+                       return NULL;
+               }
+
+               hashclr(sha1);
+               return refname;
+       }
+
+       strbuf_reset(sb_contents);
+       strbuf_add(sb_contents, val.mv_data, val.mv_size);
+       return continue_normal_ref;
+}
+
 static const char *resolve_ref_unsafe_txn(struct lmdb_transaction *transaction,
                                          const char *refname,
                                          int resolve_flags,
                                          unsigned char *sha1,
                                          int *flags)
 {
-       int bad_name = 0;
-       char *ref_data;
-       struct MDB_val key, val;
-       struct strbuf err = STRBUF_INIT;
-       int needs_free = 0;
+       static struct strbuf sb_refname = STRBUF_INIT;
+       struct strbuf sb_contents = STRBUF_INIT;
+       struct strbuf sb_path = STRBUF_INIT;
        const char *ret;
 
-       val.mv_size = 0;
-       val.mv_data = NULL;
-
-       if (flags)
-               *flags = 0;
-
-       if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL)) {
-               if (flags)
-                       *flags |= REF_BAD_NAME;
-
-               if (!(resolve_flags & RESOLVE_REF_ALLOW_BAD_NAME) ||
-                   !refname_is_safe(refname)) {
-                       errno = EINVAL;
-                       return NULL;
-               }
-               /*
-                * dwim_ref() uses REF_ISBROKEN to distinguish between
-                * missing refs and refs that were present but invalid,
-                * to complain about the latter to stderr.
-                *
-                * We don't know whether the ref exists, so don't set
-                * REF_ISBROKEN yet.
-                */
-               bad_name = 1;
-       }
-
-       key.mv_data = (void *)refname;
-       key.mv_size = strlen(refname) + 1;
-       if (mdb_get_or_die(transaction, &key, &val, &needs_free)) {
-               if (bad_name) {
-                       hashclr(sha1);
-                       if (flags)
-                               *flags |= REF_ISBROKEN;
-               }
-
-               if (resolve_flags & RESOLVE_REF_READING)
-                       return NULL;
-
-               if (verify_refname_available_txn(transaction, refname, NULL, 
NULL, &err)) {
-                       error("%s", err.buf);
-                       strbuf_release(&err);
-                       return NULL;
-               }
-
-               hashclr(sha1);
-               return refname;
-       }
-
-       ref_data = val.mv_data;
-       assert(ref_data[val.mv_size - 1] == 0);
-
-       ret = parse_ref_data(transaction, refname, ref_data, sha1,
-                            resolve_flags, flags, bad_name);
-       if (needs_free)
-               free(ref_data);
+       ret = resolve_ref_1(refname, resolve_flags, sha1, flags,
+                           &sb_refname, &sb_path, &sb_contents,
+                           retrieve_ref, transaction);
+       strbuf_release(&sb_path);
+       strbuf_release(&sb_contents);
        return ret;
 } 

lmdb-backend.c:retrieve_ref(), files-backend.c:parse_ref() can be made
part of ref api that, given a ref name, returns the ref raw data and
type. The frontend can decide what backend callback to use based on
refname, so retrieve_ref() in the end does not have to call
read_per_worktree_ref() internally anymore.

Hmm?
--
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to