Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-03 Thread Andreas Schwab
David Kastrup d...@gnu.org writes:

 Are there some measures one can take/configure in the parent repository
 such that (named or all) additional directories inside of $GITDIR/refs
 would get cloned along with the rest?

$ git config --add remote.origin.fetch '+refs/notes/*:refs/notes/*'

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread Duy Nguyen
On Sun, Feb 2, 2014 at 5:37 PM, David Kastrup d...@gnu.org wrote:
 in the context of an ongoing discussion on the Emacs developer list of
 converting the Bzr repository of Emacs, one question (with different
 approaches) is where to put the information regarding preexisting Bazaar
 revision numbers and bug tracker ids: those are not present in the
 current Git mirror.

 Putting them in the commit messages would require a full history
 rewrite, and if some are missed in the process, this cannot be fixed
 afterwards.

What do you need them for? Perhaps putting everything in a file, maybe
sorted by SHA-1, would suffice? It should not be too hard to write a
script to map bug tracker id to a commit id. The file is for past
commits only. New commits can contain this info in their messages.
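
A minimal sketch of such a lookup, assuming a plain-text map with one
"<commit-id> <key>" pair per line, sorted by commit id. The file layout,
the key format (bzr:NNN, bug:NNN) and the ids below are invented for
illustration; nothing in this thread defines them.

```shell
# Hypothetical map file: "<commit-id> <key>" per line, sorted by commit id.
# All ids and keys here are made up for the example.
map=$(mktemp)
cat >"$map" <<'EOF'
1111111111111111111111111111111111111111 bzr:39004
2222222222222222222222222222222222222222 bug:1234
3333333333333333333333333333333333333333 bzr:39005
EOF

# Print the commit id(s) recorded for a given key.
lookup() {
    awk -v key="$1" '$2 == key { print $1 }' "$map"
}

commit=$(lookup bzr:39005)
echo "$commit"
```

Because the file is sorted by commit id rather than by key, the lookup is
a linear scan; for a one-off history map that is probably acceptable.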
-- 
Duy


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread John Keeping
On Sun, Feb 02, 2014 at 12:19:43PM +0100, David Kastrup wrote:
 Duy Nguyen pclo...@gmail.com writes:
 
  The file is for past commits only.
 
  New commits can contain this info in their messages.
 
 If it's not forgotten.  Experience shows that things like issue numbers
 have a tendency to be omitted, and then they stay missing.
 
 At any rate, this is exactly the kind of stuff that tags are useful for,
 except that using them for all that would render the tag space
 overcrowded.

Actually, I would say this is exactly the sort of thing notes are for.

git.git uses them to map commits back to mailing list discussions:

git fetch git://github.com/gitster/git +refs/notes/amlog:refs/notes/amlog 
git log --notes=amlog

See also notes.displayRef in git-config(1).

Notes aren't fetched by default, but it's not hard for those interested to
add a remote.*.fetch line to their config.
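
As a sketch of what that per-clone setup might look like, the commands
below add the notes refspec and set notes.displayRef in a throwaway
repository, so nothing is fetched over the network. The temporary
directory is an assumption of the example; the remote URL and the ref
name are the ones used above.

```shell
# Throwaway repository standing in for a user's existing clone.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" remote add origin git://github.com/gitster/git

# Fetch the amlog notes on every "git fetch origin" from now on ...
git -C "$repo" config --add remote.origin.fetch \
    '+refs/notes/amlog:refs/notes/amlog'
# ... and show them in "git log" without needing --notes=amlog.
git -C "$repo" config --add notes.displayRef refs/notes/amlog

refspecs=$(git -C "$repo" config --get-all remote.origin.fetch)
echo "$refspecs"
```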


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread David Kastrup
John Keeping j...@keeping.me.uk writes:

 On Sun, Feb 02, 2014 at 12:19:43PM +0100, David Kastrup wrote:
 Duy Nguyen pclo...@gmail.com writes:
 
  The file is for past commits only.
 
  New commits can contain this info in their messages.
 
 If it's not forgotten.  Experience shows that things like issue numbers
 have a tendency to be omitted, and then they stay missing.
 
 At any rate, this is exactly the kind of stuff that tags are useful for,
 except that using them for all that would render the tag space
 overcrowded.

 Actually, I would say this is exactly the sort of thing notes are for.

 git.git uses them to map commits back to mailing list discussions:

But that's the wrong direction.  What is needed in the Emacs case is
mapping the Bazaar reference numbers (and bug numbers) to commits.

While it is true that the history rewriting approach would not deliver
this either (short of git log --grep with suitable patterns), I was
looking for something less of a crutch here.

 Notes aren't fetched by default, but it's not hard for those interested
 to add a remote.*.fetch line to their config.

If we are talking about measures everybody has to actively take before
getting access to functionality, this does not cross the convenience
threshold making it a solution preferred over others.  But it's probably
feasible to configure a fetch line doing this that will get cloned when
first cloning a repository.  That's not too hot for people with existing
repositories, but since we are talking about a migration from Bazaar
anyway, Git users currently are so by choice and so might be more
willing to update their configuration if it helps with avoiding a fully
new clone.

-- 
David Kastrup


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread Duy Nguyen
On Sun, Feb 2, 2014 at 6:19 PM, David Kastrup d...@gnu.org wrote:
 Since Git has a working facility for references that is catered to do
 exactly this kind of mapping and already _does_, it seems like a
 convenient path to explore.

It will not scale. If you make those refs available for
cloning/fetching, all of them will be advertised first thing when git
starts negotiating. Imagine thousands of refs (and the number keeps
increasing) sent to the receiver at the beginning of every connection.
Something like reverse git-notes may transfer more efficiently. Or we
need to improve git protocol to handle massive refs better, something
that's been discussed for a while without any outcome.
-- 
Duy


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread David Kastrup
Duy Nguyen pclo...@gmail.com writes:

 On Sun, Feb 2, 2014 at 6:19 PM, David Kastrup d...@gnu.org wrote:
 Since Git has a working facility for references that is catered to do
 exactly this kind of mapping and already _does_, it seems like a
 convenient path to explore.

 It will not scale. If you make those refs available for
 cloning/fetching, all of them will be advertised first thing when git
 starts negotiating. Imagine thousands of refs (and the number keeps
 increasing) sent to the receiver at the beginning of every connection.

In current LilyPond repository:
git tag|wc
    969     969   15161

In current Emacs mirror:
git tag|wc
   1202    1202   15729

In current Git repository:
git tag|wc
    498     498    4820

 Something like reverse git-notes may transfer more efficiently. Or
 we need to improve git protocol to handle massive refs better,
 something that's been discussed for a while without any outcome.

I think that even disregarding special use of references, _existing_
practice would already appear to warrant being able to deal with
thousands of refs in a reasonable manner.

It's a reasonable expectation to have a tag per (potentially
intermediate) release or release candidate.  For any project publishing
reproducible daily snapshots, the threshold of 1000 will get reached
within a few years.

Of course, it is relevant information to know that right _now_
references will not scale.  But that does not seem like a defensible
long-term perspective.

-- 
David Kastrup


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread John Keeping
On Sun, Feb 02, 2014 at 12:42:52PM +0100, David Kastrup wrote:
 John Keeping j...@keeping.me.uk writes:
 
  On Sun, Feb 02, 2014 at 12:19:43PM +0100, David Kastrup wrote:
  Duy Nguyen pclo...@gmail.com writes:
  
   The file is for past commits only.
  
   New commits can contain this info in their messages.
  
  If it's not forgotten.  Experience shows that things like issue numbers
  have a tendency to be omitted, and then they stay missing.
  
  At any rate, this is exactly the kind of stuff that tags are useful for,
  except that using them for all that would render the tag space
  overcrowded.
 
  Actually, I would say this is exactly the sort of thing notes are for.
 
  git.git uses them to map commits back to mailing list discussions:
 
 But that's the wrong direction.  What is needed in the Emacs case is
 mapping the Bazaar reference numbers (and bug numbers) to commits.

Ah, OK.  I hadn't quite read carefully enough.

I actually wonder if you could do this with notes and git-grep; for
example:

git grep -l keeping.me.uk refs/notes/amlog |
sed -e 's/.*://' -e 's!/!!g'

That should be relatively efficient since you're only looking at the
current notes tree.
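
A self-contained sketch of that reverse lookup, under the assumption that
each note holds a line such as "bzr-revno: 39005". The notes ref name
(refs/notes/bzr), the note format and the revision numbers are invented
for the example; only the git grep / sed pipeline is taken from above.

```shell
# Throwaway repository with two commits, each annotated with a made-up
# Bazaar revision number in the (hypothetical) refs/notes/bzr namespace.
repo=$(mktemp -d)
git init -q "$repo"
ident='-c user.name=Example -c user.email=example@example.invalid'
git -C "$repo" $ident commit -q --allow-empty -m 'first'
git -C "$repo" $ident commit -q --allow-empty -m 'second'
git -C "$repo" $ident notes --ref=bzr add -m 'bzr-revno: 39004' HEAD~1
git -C "$repo" $ident notes --ref=bzr add -m 'bzr-revno: 39005' HEAD

# Reverse lookup: search the notes tree for the revno, then turn the
# matching path (possibly split into fan-out directories) into a commit id.
found=$(git -C "$repo" grep -l -e '^bzr-revno: 39005$' refs/notes/bzr |
        sed -e 's/.*://' -e 's!/!!g')
echo "$found"
```

The paths in a notes tree are the annotated commit ids, which is why
stripping the "ref:" prefix and the slashes is enough to get the commit.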

 While it is true that the history rewriting approach would not deliver
 this either (short of git log --grep with suitable patterns), I was
 looking for something less of a crutch here.
 
  Notes aren't fetched by default, but it's not hard for those interested
  to add a remote.*.fetch line to their config.
 
 If we are talking about measures everybody has to actively take before
 getting access to functionality, this does not cross the convenience
 threshold making it a solution preferred over others.  But it's probably
 feasible to configure a fetch line doing this that will get cloned when
 first cloning a repository.

I'm assuming you'll need some form of tool (at least a script) to
manipulate this feature; it wouldn't be too hard for that to set this up
the first time it's run.


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread Jeff King
On Sun, Feb 02, 2014 at 11:37:39AM +0100, David Kastrup wrote:

 So I mused: refs/heads contains branches, refs/tags contains tags.  The
 respective information would likely easily enough be stored in refs/bzr
 and refs/bugs and in that manner would not pollute the ordinary tag
 and branch spaces, rendering git tag and/or git branch output mostly
 unusable.  I tested creating such a directory and entries and indeed
 references like bzr/39005 then worked.

Yes. The names refs/tags and refs/heads are special by convention,
and there is no reason you cannot have other hierarchies (and indeed, we
already have refs/notes and refs/remotes as common hierarchies).

 However, cloning from the repository did not copy those directories and
 references, so without modification, this scheme would not work for
 cloned repositories.

Correct. Anyone who wants them will have to ask for them manually, like:

  git config --add remote.origin.fetch '+refs/bzr/*:refs/bzr/*'

after which any git fetch will retrieve them.
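
To illustrate the whole round trip, here is a sketch using two throwaway
local repositories. The directory names and the single refs/bzr/39005
entry are assumptions of the example, not part of any real conversion.

```shell
# "Server" side: one commit plus a ref in a custom hierarchy.
src=$(mktemp -d)
git init -q "$src"
git -C "$src" -c user.name=Example -c user.email=example@example.invalid \
    commit -q --allow-empty -m 'initial commit'
git -C "$src" update-ref refs/bzr/39005 HEAD

# A plain clone does not bring refs/bzr/* along.
dst=$(mktemp -d)/clone
git clone -q "$src" "$dst"
before=$(git -C "$dst" for-each-ref refs/bzr | wc -l | tr -d ' ')

# After adding the refspec, an ordinary fetch retrieves the hierarchy.
git -C "$dst" config --add remote.origin.fetch '+refs/bzr/*:refs/bzr/*'
git -C "$dst" fetch -q origin
after=$(git -C "$dst" rev-parse --verify -q refs/bzr/39005)
echo "refs/bzr entries before: $before, refs/bzr/39005 now: $after"
```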

 Are there some measures one can take/configure in the parent repository
 such that (named or all) additional directories inside of $GITDIR/refs
 would get cloned along with the rest?

No. It is up to the client to decide which parts of the ref namespace
they want to fetch. The server only advertises what it has, and the
client selects from that.


Others mentioned that refs were never really intended to scale to
one-per-commit. We serve some repositories with tens of thousands of
refs from GitHub, and it does work. On the backend, we even have some
repos in the hundreds of thousands (but these are not client facing).
Most of the pain points (like O(n^2) loops) have been ironed out, but
the two big ones are still:

  - server ref advertisement lists _all_ refs at the start of the
    conversation. So, e.g.,

      git fetch git://github.com/Homebrew/homebrew.git

    sends 2MB of advertisement just so a client can find out "nope,
    nothing to fetch".

  - the packed-refs storage is rather monolithic. Reading a value from
    it currently requires parsing the whole file. Likewise, deleting a
    ref requires rewriting the whole file.

So what you are proposing will work, but do note that there is a cost.
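
One rough way to get a feel for that cost is to count what a repository
offers via git ls-remote. The sketch below does so for a throwaway
repository; the refs/bzr/<n> names and the figure of 200 refs are made up,
and the size of the ls-remote listing is only an approximation of what
actually goes over the wire.

```shell
# Throwaway repository with a single commit.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" -c user.name=Example -c user.email=example@example.invalid \
    commit -q --allow-empty -m 'initial commit'

# One ref per (pretend) Bazaar revision, all pointing at the same commit.
i=1
while [ "$i" -le 200 ]; do
    git -C "$repo" update-ref "refs/bzr/$i" HEAD
    i=$((i + 1))
done

# Every one of these refs shows up in the listing a client is offered.
bzr_refs=$(git ls-remote "$repo" | grep -c 'refs/bzr/')
bytes=$(git ls-remote "$repo" | wc -c | tr -d ' ')
echo "$bzr_refs refs under refs/bzr, about $bytes bytes of listing"
```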

-Peff


Re: Creating own hierarchies under $GITDIR/refs ?

2014-02-02 Thread Jed Brown
John Keeping j...@keeping.me.uk writes:
 I actually wonder if you could do this with notes and git-grep; for
 example:

 git grep -l keeping.me.uk refs/notes/amlog |
 sed -e 's/.*://' -e 's!/!!g'

 That should be relatively efficient since you're only looking at the
 current notes tree.

I added notes handling to gitifyhg and would search it similar to this.
Since gitifyhg is two-way, I could not modify the commits.  Later, when
we converted several repositories (up to 50k commits/80 MB), I appended

  Hg-commit: $Hg_commit_hash

to all the commit messages.  This way it shows up on the web interface,
users don't have to obtain the notes specially, and git log --grep
works naturally.  I think it's worth considering this simple solution;
existing Git users won't mind recloning once.
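
A small sketch of how such a trailer can be searched afterwards, using a
throwaway repository. The commit subjects and the Mercurial hash are
invented for the example; only the "Hg-commit:" line format comes from
the description above.

```shell
# Throwaway repository: one commit carrying the trailer, one without.
repo=$(mktemp -d)
git init -q "$repo"
ident='-c user.name=Example -c user.email=example@example.invalid'
git -C "$repo" $ident commit -q --allow-empty -m 'Fix frobnication' \
    -m 'Hg-commit: 0123456789abcdef0123456789abcdef01234567'
git -C "$repo" $ident commit -q --allow-empty -m 'Unrelated change'

# Map a (possibly abbreviated) Mercurial hash back to the Git commit.
found=$(git -C "$repo" log --format=%H --grep='^Hg-commit: 0123456789abcdef')
echo "$found"
```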

