Re: git-note -C changes commit type?

2014-02-12 Thread Joachim Breitner
Dear Johan,

Am Mittwoch, den 12.02.2014, 00:52 +0100 schrieb Johan Herland:
 On Tue, Feb 11, 2014 at 6:23 PM, Joachim Breitner
 m...@joachim-breitner.de wrote:
  judging from the documentation I got the impression that I can pass any
  git object has to git note -C hash and it would stored as-is. But it
  seems the objects gets mangled somehow...
 
 ...well... the documentation does not say any object, it actually
 explicitly says blob object... ;)

ok, my bad; guess I’m not fully versed with gits terminology.

 You would have a notes ref refs/notes/history whose tree would
 contain an entry named e1bfac434ebd3135a3784f6fc802f235098eebd0
 pointing to a _commit_ (3d7de37...). Obviously, it would not make
 sense to use refs/notes/history while displaying the commit log (git
 log --notes=history), as the raw commit object would be shown in the
 log. However, more fundamentally: a tree referring to a _commit_ is
 usually how Git stores _submodule_ links (i.e. which revision of the
 named submodule is to be used by this super-repo tree), and I'm (off
 the top of my head) not at all sure that such a submodule link in a
 notes tree is handled sanely by Git - or even that it makes sense at
 all. For one, I'm not sure that Git requires (or even expects) the
 commit object referenced by a tree to be present in the same object
 DB. So if you share your notes, I don't know whether or not the
 fetch/push machinery will include the commit object in the shared
 notes... These are questions that should be answered before we decide
 whether using commits directly as notes makes sense.

If that is the case, then my approach is indeed flawed. The main point
of the exercise is to have a tree that follows another commit (or, as a
next-best approximation, a note attached to that commit) around.

 In that case, you might be better off using an explicit
 ref to keep that history alive; e.g. you could create
 refs/history/e1bfac4 to point to 3d7de37 (git update-ref
 refs/history/e1bfac4 3d7de37), and keep everything
 alive/reachable/shareable that way...

That’s an interesting idea; instead of relying on the notes feature
putting the hash in the ref name. But I wonder how that scales – imagine
every second feature merged into Linux¹ also having such a history ref? 

I guess having a way for a tree to reference commits in a way that is
followed by git gc, i.e. separate from submodules, would allow a less
noisy implementation, and possibly create the opportunity for many other
strange uses of git :-)

Greetings,
Joachim

¹ I’m not proposing for anyone else but me to use this, at the moment,
don’t worry :-). But I am considering to use it in the context of GHC,
which isn’t a small project either.

-- 
Joachim “nomeata” Breitner
  m...@joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nome...@joachim-breitner.de  • GPG-Key: 0x4743206C
  Debian Developer: nome...@debian.org


signature.asc
Description: This is a digitally signed message part


Re: [PATCH 00/11] More preparatory work for multiparent tree-walker

2014-02-12 Thread Kirill Smelkov
On Tue, Feb 11, 2014 at 11:59:02AM -0800, Junio C Hamano wrote:
 Kirill Smelkov k...@mns.spb.ru writes:
 
  Sorry for the confusion. Could you please do the following:
 
  Patches should be applied over to ks/tree-diff-walk
  (74aa4a18). Before applying the patches, please cherry-pick
 
  c90483d9(tree-diff: no need to manually verify that there is no
   mode change for a path)
 
  ef4f0928(tree-diff: no need to pass match to
   skip_uninteresting())
 
  into that branch, and drop them from ks/combine-diff - we'll have one
  branch reworking the diff tree-walker, and the other taking advantage of
  it for combine-diff.
 
 As long as that does not lose changes to tests and clean-ups, I'm
 fine with that direction.  For example, I do not know if you want to
 lose e3f62d12 (diffcore-order: export generic ordering interface,
 2014-01-20), which is part of the combine-diff topic.

Sorry for the confusion again, and please don't worry: we are not going
to lose anything - my only plea here was to transfer two of the patches
to more appropriate topic.

That couple touches tree-diff.c - they were some initial cleanups I've
noticed while working on separate combine-diff tree-walker, which we
decided to drop instead of generalizing diff tree-walker to handle all
cases. Only the cleanups are still relevant and needed as a base for
what I've sent you here.

And as to e3f62d12 (diffcore-order: export generic ordering interface,
2014-01-20) and other patches on ks/diff-c-with-diff-order topic - they
stay as they are - we do not need to rework them as ks/combine-diff
builds on top of the topic and generalizing diff tree-walker is
orthogonal work.


So in short: could you please transform the two tree-diff patches from
ks/combine-diff to ks/tree-diff-walk, then apply sent-here patches to
ks/tree-diff-walk; thats all.


Thanks,
Kirill
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git-note -C changes commit type?

2014-02-12 Thread Johan Herland
On Wed, Feb 12, 2014 at 1:06 AM, Junio C Hamano gits...@pobox.com wrote:
 Johan Herland jo...@herland.net writes:
 There is currently no way the git notes commands will allow you to
 store the 3d7de37 commit object directly as a note. There is also
 (AFAICS) no easy workaround (git fast-import could've been a
 workaround if it did not already require the first N/notemodify
 argument to be a blob object). The best alternative, off the top of my
 head, would be to write your own program using the notes.h API to
 manipulate the notes tree directly (or - suboptimally - use other
 low-level Git operations to do the same).

 Even worse. I do not think such a non-blob object in the notes tree
 does not participate in the reachability at all, so you won't be
 able to fetch refs/notes/whatever and expect to get a useful
 result.

s/non-blob/non-(blob-or-tree)/

Any object type that is deemed reachable by reference from a regular
git tree object will also be usable (as far as reachability goes) in a
notes tree.

 I do not think storing the raw bits of commit object as a
 blob in the notes tree is useful behaviour, either.  The command
 probably should refuse to get anything non-blob via that option.

Patch coming up...

 Perhaps the notes entry should just note the object name of whatever
 commit it wants to refer to in a *blob*?

Agreed.

...Johan

-- 
Johan Herland, jo...@herland.net
www.herland.net
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] notes: Disallow reusing non-blob as a note object

2014-02-12 Thread Johan Herland
Currently git notes add -C $object will read the raw bytes from $object,
and then copy those bytes into the note object, which is hardcoded to be
of type blob. This means that if the given $object is a non-blob (e.g.
tree or commit), the raw bytes from that object is copied into a blob
object. This is probably not useful, and certainly not what any sane
user would expect. So disallow it, by erroring out if the $object passed
to the -C option is not a blob.

The fix also applies to the -c option (in which the user is prompted to
edit/verify the note contents in a text editor), and also when -c/-C is
passed to git notes append (which appends the $object contents to an
existing note object). In both cases, passing a non-blob $object does not
make sense.

Also add a couple of tests demonstrating expected behavior.

Suggested-by: Junio C Hamano gits...@pobox.com
Signed-off-by: Johan Herland jo...@herland.net
---
 builtin/notes.c  |  6 +-
 t/t3301-notes.sh | 27 +++
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/builtin/notes.c b/builtin/notes.c
index 2b24d05..bb89930 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -269,7 +269,11 @@ static int parse_reuse_arg(const struct option *opt, const 
char *arg, int unset)
die(_(Failed to resolve '%s' as a valid ref.), arg);
if (!(buf = read_sha1_file(object, type, len)) || !len) {
free(buf);
-   die(_(Failed to read object '%s'.), arg);;
+   die(_(Failed to read object '%s'.), arg);
+   }
+   if (type != OBJ_BLOB) {
+   free(buf);
+   die(_(Cannot read note data from non-blob object '%s'.), arg);
}
strbuf_add((msg-buf), buf, len);
free(buf);
diff --git a/t/t3301-notes.sh b/t/t3301-notes.sh
index 16de05a..3bb79a4 100755
--- a/t/t3301-notes.sh
+++ b/t/t3301-notes.sh
@@ -812,6 +812,33 @@ test_expect_success 'create note from non-existing note 
with git notes add -C
test_must_fail git notes list HEAD
 '
 
+test_expect_success 'create note from non-blob with git notes add -C fails' '
+   commit=$(git rev-parse --verify HEAD) 
+   tree=$(git rev-parse --verify HEAD:) 
+   test_must_fail git notes add -C $commit 
+   test_must_fail git notes add -C $tree 
+   test_must_fail git notes list HEAD
+'
+
+cat  expect  EOF
+commit 80d796defacd5db327b7a4e50099663902fbdc5c
+Author: A U Thor aut...@example.com
+Date:   Thu Apr 7 15:20:13 2005 -0700
+
+8th
+
+Notes (other):
+This is a blob object
+EOF
+
+test_expect_success 'create note from blob with git notes add -C reuses blob 
id' '
+   blob=$(echo This is a blob object | git hash-object -w --stdin) 
+   git notes add -C $blob 
+   git log -1  actual 
+   test_cmp expect actual 
+   test $(git notes list HEAD) = $blob
+'
+
 cat  expect  EOF
 commit 016e982bad97eacdbda0fcbd7ce5b0ba87c81f1b
 Author: A U Thor aut...@example.com
-- 
1.8.4.653.g2df02b3

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git-note -C changes commit type?

2014-02-12 Thread Johan Herland
On Wed, Feb 12, 2014 at 9:53 AM, Joachim Breitner
m...@joachim-breitner.de wrote:
 Am Mittwoch, den 12.02.2014, 00:52 +0100 schrieb Johan Herland:
 You would have a notes ref refs/notes/history whose tree would
 contain an entry named e1bfac434ebd3135a3784f6fc802f235098eebd0
 pointing to a _commit_ (3d7de37...). Obviously, it would not make
 sense to use refs/notes/history while displaying the commit log (git
 log --notes=history), as the raw commit object would be shown in the
 log. However, more fundamentally: a tree referring to a _commit_ is
 usually how Git stores _submodule_ links (i.e. which revision of the
 named submodule is to be used by this super-repo tree), and I'm (off
 the top of my head) not at all sure that such a submodule link in a
 notes tree is handled sanely by Git - or even that it makes sense at
 all. For one, I'm not sure that Git requires (or even expects) the
 commit object referenced by a tree to be present in the same object
 DB. So if you share your notes, I don't know whether or not the
 fetch/push machinery will include the commit object in the shared
 notes... These are questions that should be answered before we decide
 whether using commits directly as notes makes sense.

 If that is the case, then my approach is indeed flawed. The main point
 of the exercise is to have a tree that follows another commit (or, as a
 next-best approximation, a note attached to that commit) around.

 In that case, you might be better off using an explicit
 ref to keep that history alive; e.g. you could create
 refs/history/e1bfac4 to point to 3d7de37 (git update-ref
 refs/history/e1bfac4 3d7de37), and keep everything
 alive/reachable/shareable that way...

 That’s an interesting idea; instead of relying on the notes feature
 putting the hash in the ref name. But I wonder how that scales – imagine
 every second feature merged into Linux¹ also having such a history ref?

Ah, that will probably not scale very well.

 I guess having a way for a tree to reference commits in a way that is
 followed by git gc, i.e. separate from submodules, would allow a less
 noisy implementation, and possibly create the opportunity for many other
 strange uses of git :-)

Here's another way to solve your problem, which should be fairly
transparent and performant:

Whenever you want to reference history of a commit (I'm using quotes
here, because we're not talking about the regular git sense of
history, i.e. the commit graph), you perform the following two steps:

1. Append the historical commit SHA-1 (3d7de37 in your example) to a
note on the current commit (e1bfac4). E.g.:

git notes --ref history append -m 3d7de37... e1bfac4...

2. Perform some automated merge into a history-tracking ref (e.g.
refs/history), to keep the historical commits reachable.

(You can easily wrap both steps into a script to automate things.)

Step #1 encodes the history of a commit in a note, but does not keep
the history reachable.

Step #2 keeps all historical commits reachable by making them part
of the history (in the git sense - without quotes) of a proper ref
(refs/history). The actual result/outcome of the merge is not
interesting. It only exists to insert the historical commit
(3d7de37) into the ancestry for refs/history. Since the actual merge
itself is uninteresting, you should probably use a merge strategy that
never yields conflicts, e.g. -s ours

You can now share the history by pushing/fetching the two refs
refs/notes/history and refs/history.

(In theory, you might even be able to combine the two refs, by
performing the merge directly into refs/notes/history, always taking
care to retain the notes tree contents as the result of the merge. In
other words, after you do step #1 (append the note), you manually
rewrite the just-created tip of refs/notes/history to include 3d7de37
as a second parent. This keeps 3d7de37 reachable (and it will be
shared when you share refs/notes/history), and it should not interfere
with the notes infrastructure, as they only look at the current state
of the notes tree.)


Hope this helps,

...Johan

-- 
Johan Herland, jo...@herland.net
www.herland.net
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git-note -C changes commit type?

2014-02-12 Thread Joachim Breitner
Dear Johan,

thanks for the patch!

Am Mittwoch, den 12.02.2014, 11:26 +0100 schrieb Johan Herland:
 Here's another way to solve your problem, which should be fairly
 transparent and performant:
 
 Whenever you want to reference history of a commit (I'm using quotes
 here, because we're not talking about the regular git sense of
 history, i.e. the commit graph), you perform the following two steps:
 
 1. Append the historical commit SHA-1 (3d7de37 in your example) to a
 note on the current commit (e1bfac4). E.g.:
 
 git notes --ref history append -m 3d7de37... e1bfac4...
 
 2. Perform some automated merge into a history-tracking ref (e.g.
 refs/history), to keep the historical commits reachable.
 
 (You can easily wrap both steps into a script to automate things.)
 
 Step #1 encodes the history of a commit in a note, but does not keep
 the history reachable.
 
 Step #2 keeps all historical commits reachable by making them part
 of the history (in the git sense - without quotes) of a proper ref
 (refs/history). The actual result/outcome of the merge is not
 interesting. It only exists to insert the historical commit
 (3d7de37) into the ancestry for refs/history. Since the actual merge
 itself is uninteresting, you should probably use a merge strategy that
 never yields conflicts, e.g. -s ours
 
 You can now share the history by pushing/fetching the two refs
 refs/notes/history and refs/history.

 (In theory, you might even be able to combine the two refs, by
 performing the merge directly into refs/notes/history, always taking
 care to retain the notes tree contents as the result of the merge. In
 other words, after you do step #1 (append the note), you manually
 rewrite the just-created tip of refs/notes/history to include 3d7de37
 as a second parent. This keeps 3d7de37 reachable (and it will be
 shared when you share refs/notes/history), and it should not interfere
 with the notes infrastructure, as they only look at the current state
 of the notes tree.)

That is quite a good approximation. What it doesn’t do is dropping
history (in the refs/history sense) of commits that disappear, but the
same problem exists with notes. Thanks!


I guess there are no plans to make the commit object format itself
extensible, are they? Extensible in the sense that I can add a custom
field to it (e.g. history:). Git would not have to know anything about
the field besides its type, i.e. that it contains refs that it has to
follow. Very much like parent:, just without the semantics of it wrt.
git log and the like.


Greetings,
Joachim
-- 
Joachim “nomeata” Breitner
  m...@joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nome...@joachim-breitner.de  • GPG-Key: 0x4743206C
  Debian Developer: nome...@debian.org


signature.asc
Description: This is a digitally signed message part


Re: Make the git codebase thread-safe

2014-02-12 Thread Karsten Blees
Am 12.02.2014 04:43, schrieb Duy Nguyen:
 On Wed, Feb 12, 2014 at 9:02 AM, Robin H. Johnson robb...@gentoo.org wrote:
 On Tue, Feb 11, 2014 at 05:54:51PM -0800,  Stefan Zager wrote:
 We in the chromium project have a keen interest in adding threading to
 git in the pursuit of performance for lengthy operations (checkout,
 status, blame, ...).  Our motivation comes from hitting some
 performance walls when working with repositories the size of chromium
 and blink:
 +1 from Gentoo on performance improvements for large repos.

 The main repository in the ongoing Git migration project looks to be in
 the 1.5GB range (and for those that want to propose splitting it up, we
 have explored that option and found it lacking), with very deep history
 (but no branches of note, and very few tags).
 
 From v1.9 shallow clone should work for all push/pull/clone... so
 history depth does not matter (on the client side). As for
 gentoo-x86's large worktree, using index v4 and avoid full-tree
 operations (e.g. status ., not status..) should make all
 operations reasonably fast. I plan to make status fast even without
 path limiting with the help of inotify, but that's not going to be
 finished soon. Did I miss anything else?
 

Regarding git-status on msysgit, enable core.preloadindex and core.fscache (as 
of 1.8.5.2).

There's no inotify on Windows, and I gave up using ReadDirectoryChangesW to 
keep fscache up to date, as it _may_ report DOS file names (e.g. C:\PROGRA~1 
instead of C:\Program Files).
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pack bitmap woes on Windows

2014-02-12 Thread Erik Faye-Lund
On Wed, Feb 12, 2014 at 8:27 AM, Johannes Sixt j.s...@viscovery.net wrote:
 Running test suite of 'next' on Windows fails in t5310-pack-bitmaps with
 the following symptoms. I haven't followed the topic. Have there been
 patches floating that addressed the problem in one way or another?

 (gdb) run
 Starting program: D:\Src\mingw-git\t\trash 
 directory.t5310-pack-bitmaps/..\..\git.exe rev-list --test-bitmap HEAD
 [New thread 3528.0x8d4]
 Bitmap v1 test (20 entries loaded)
 Found bitmap for 537ea4d3eb79c95f602873b1167c480006d2ac2d. 64 bits / 15873b36 
 checksum

 Breakpoint 1, die (err=0x5939e9 Out of memory, realloc failed) at usage.c:97
 97  if (die_is_recursing()) {
 (gdb) bt
 #0  die (err=0x5939e9 Out of memory, realloc failed) at usage.c:97
 #1  0x00487c4c in xrealloc (ptr=0x12b10008, size=837107040) at wrapper.c:109

~800 megs is a pretty large allocation for 32-bit systems. What gives?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Failed to find remote helpers after install from source

2014-02-12 Thread Antonio Pérez Barrero
Hi,

I installed git 1.8.5.4 from source under $HOME/bin. My system is SUSE
Linux Enterprise Server 11 SP1 (x86_64). I followed instructions and
just run make  make install.

After installation I cannot clone repos from https urls, getting the
following error:

$ git clone https://github.com/apbarrero/pyrad.git
Cloning into 'pyrad'...
fatal: Unable to find remote helper for 'https'

I checked that git is properly compiled against libcurl-devel and
libexpat-devel packages. Then I tryed using git-remote-https installed
under $HOME/bin/libexec/git-core/ and it worked fine to connect to the
same remote:

$ $HOME/libexec/git-core/git-remote-https https://github.com/apbarrero/pyrad.git
capabilities
fetch
option
push
check-connectivity

list
@refs/heads/master HEAD
9599cf833354793b81d2a47504826332473bcb12 refs/heads/master
1f8f2b995bb5ab55e6c6f1051ccb44875ab1e60d refs/tags/0.6
68552227901d377b513884c70d9582da0329a270 refs/tags/0.6^{}
e0cd958edc5b3aad7e31435990674e2cff4e3b7e refs/tags/0.7
c50213b2d4213f3574c1a6b454e6887a529de340 refs/tags/0.7^{}
6085deb4ee37862d65f4a26f472fa2d1894a4331 refs/tags/0.8
33902c5b3da1272a4f5930815f561b8068315ba3 refs/tags/0.8^{}
5a45639faaf1cbf7622fe47e2795d6f5a0ee6658 refs/tags/0.9
edd69b9014d7e5bbf9da203d7db9a26587756aa4 refs/tags/0.9^{}
6e3b16ed19b329be944bd1b10aa17d02eb473009 refs/tags/1.0
30beedc5c4e56a15f4025d25331515aa2a917234 refs/tags/1.0^{}
dacf4bd37aaddd3872faeb8a77c801fe3c8550cb refs/tags/1.1
4c3e2d6700947ca6ea7b3319ff52abb7029bf3be refs/tags/1.1^{}
9a7f5a4e9fe19ef9f45db4e28a7d4648a011cc9b refs/tags/1.2
ee7ec8f2b37da5e84bf0fbb83e214a8bd3cfdf70 refs/tags/1.2^{}
e16af24d814e8d8c83b172ca6103fd3ab93b08db refs/tags/2.0
ce4a625caf5c6d892e020ec150373043a203366e refs/tags/2.0^{}

I looked at this post in stackoverflow and use strace to find out git
clone is trying to look for remote helpers under
$HOME/bin/libexec/git-core, but make install left the libexec/git-core
directory under $HOME, instead of$HOME/bin.

...
[pid 675896] execve($HOME/bin/libexec/git-core/git-remote-https,
[git-remote-https, origin, https://github.com/apbarrero/pyr;...],
[/* 88 vars */]) = -1 ENOENT (No such file or directory)
[pid 675896] execve($HOME/bin/git-remote-https, [git-remote-https,
origin, https://github.com/apbarrero/pyr;...], [/* 88 vars */]) =
-1 ENOENT (No such file or directory)
[pid 675896] execve($HOME/bin/git-remote-https, [git-remote-https,
origin, https://github.com/apbarrero/pyr;...], [/* 88 vars */]) =
-1 ENOENT (No such file or directory)
[pid 675896] execve($HOME/bin/git-remote-https, [git-remote-https,
origin, https://github.com/apbarrero/pyr;...], [/* 88 vars */]) =
-1 ENOENT (No such file or directory)
[pid 675896] execve($HOME/bin/git-remote-https, [git-remote-https,
origin, https://github.com/apbarrero/pyr;...], [/* 88 vars */]) =
-1 ENOENT (No such file or directory)
[pid 675896] execve(/usr/local/bin/git-remote-https,
[git-remote-https, origin, https://github.com/apbarrero/pyr;...],
[/* 88 vars */]) = -1 ENOENT (No such file or directory)
[pid 675896] execve(/usr/bin/git-remote-https, [git-remote-https,
origin, https://github.com/apbarrero/pyr;...], [/* 88 vars */]) =
-1 ENOENT (No such file or directory)
[pid 675896] execve(/bin/git-remote-https, [git-remote-https,
origin, https://github.com/apbarrero/pyr;...], [/* 88 vars */]) =
-1 ENOENT (No such file or directory)
[pid 675896] execve(/usr/bin/X11/git-remote-https,
[git-remote-https, origin, https://github.com/apbarrero/pyr;...],
[/* 88 vars */]) = -1 ENOENT (No such file or directory)
[pid 675896] execve(/usr/X11R6/bin/git-remote-https,
[git-remote-https, origin, https://github.com/apbarrero/pyr;...],
[/* 88 vars */]) = -1 ENOENT (No such file or directory)
[pid 675896] execve(/usr/games/git-remote-https,
[git-remote-https, origin, https://github.com/apbarrero/pyr;...],
[/* 88 vars */]) = -1 ENOENT (No such file or directory)
[pid 675896] execve(/usr/lib64/jvm/jre/bin/git-remote-https,
[git-remote-https, origin, https://github.com/apbarrero/pyr;...],
[/* 88 vars */]) = -1 ENOENT (No such file or directory)
[pid 675896] execve(/usr/lib/mit/bin/git-remote-https,
[git-remote-https, origin, https://github.com/apbarrero/pyr;...],
[/* 88 vars */]) = -1 ENOENT (No such file or directory)
[pid 675896] execve(/usr/lib/mit/sbin/git-remote-https,
[git-remote-https, origin, https://github.com/apbarrero/pyr;...],
[/* 88 vars */]) = -1 ENOENT (No such file or directory)
...

Creating a soft link under $HOME/bin/ to ../libexec got it working.

So it look like it is rather an installation issue or a bug in the
directory list where git looks for binaries under libexec or I missed
some parameter to pass to make.

Regards,

-- 
   Antonio Pérez Barrero
   apbarr...@gmail.com
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Erik Faye-Lund
On Wed, Feb 12, 2014 at 2:54 AM, Stefan Zager sza...@chromium.org wrote:
 We in the chromium project have a keen interest in adding threading to
 git in the pursuit of performance for lengthy operations (checkout,
 status, blame, ...).  Our motivation comes from hitting some
 performance walls when working with repositories the size of chromium
 and blink:

 https://chromium.googlesource.com/chromium/src
 https://chromium.googlesource.com/chromium/blink

 We are particularly concerned with the performance of msysgit, and we
 have already chalked up a significant performance gain by turning on
 the threading code in pack-objects (which was already enabled for
 posix platforms, but not on msysgit, owing to the lack of a correct
 pread implementation).

How did you manage to do this? I'm not aware of any way to implement
pread on Windows (without going down the insanity-path of wrapping and
potentially locking inside every IO operation)...
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pack bitmap woes on Windows

2014-02-12 Thread Johannes Sixt
Am 2/12/2014 12:56, schrieb Erik Faye-Lund:
 On Wed, Feb 12, 2014 at 8:27 AM, Johannes Sixt j.s...@viscovery.net wrote:
 Running test suite of 'next' on Windows fails in t5310-pack-bitmaps with
 the following symptoms. I haven't followed the topic. Have there been
 patches floating that addressed the problem in one way or another?

 (gdb) run
 Starting program: D:\Src\mingw-git\t\trash 
 directory.t5310-pack-bitmaps/..\..\git.exe rev-list --test-bitmap HEAD
 [New thread 3528.0x8d4]
 Bitmap v1 test (20 entries loaded)
 Found bitmap for 537ea4d3eb79c95f602873b1167c480006d2ac2d. 64 bits / 
 15873b36 checksum

 Breakpoint 1, die (err=0x5939e9 Out of memory, realloc failed) at 
 usage.c:97
 97  if (die_is_recursing()) {
 (gdb) bt
 #0  die (err=0x5939e9 Out of memory, realloc failed) at usage.c:97
 #1  0x00487c4c in xrealloc (ptr=0x12b10008, size=837107040) at wrapper.c:109
 
 ~800 megs is a pretty large allocation for 32-bit systems. What gives?

That's exactly the problem: why would a tiny repository from the test
suite require such a large allocation? (Not to mention that the allocation
ultimately fails in my case.)
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pack bitmap woes on Windows

2014-02-12 Thread David Kastrup
Johannes Sixt j.s...@viscovery.net writes:

 Running test suite of 'next' on Windows fails in t5310-pack-bitmaps with
 the following symptoms. I haven't followed the topic. Have there been
 patches floating that addressed the problem in one way or another?

 (gdb) run
 Starting program: D:\Src\mingw-git\t\trash 
 directory.t5310-pack-bitmaps/..\..\git.exe rev-list --test-bitmap HEAD
 [New thread 3528.0x8d4]
 Bitmap v1 test (20 entries loaded)
 Found bitmap for 537ea4d3eb79c95f602873b1167c480006d2ac2d. 64 bits / 15873b36 
 checksum

Does reverting a201c20b41a2f0725977bcb89a2a66135d776ba2 help?

-- 
David Kastrup
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pack bitmap woes on Windows

2014-02-12 Thread Johannes Sixt
Am 2/12/2014 13:55, schrieb David Kastrup:
 Johannes Sixt j.s...@viscovery.net writes:
 
 Running test suite of 'next' on Windows fails in t5310-pack-bitmaps with
 the following symptoms. I haven't followed the topic. Have there been
 patches floating that addressed the problem in one way or another?

 (gdb) run
 Starting program: D:\Src\mingw-git\t\trash 
 directory.t5310-pack-bitmaps/..\..\git.exe rev-list --test-bitmap HEAD
 [New thread 3528.0x8d4]
 Bitmap v1 test (20 entries loaded)
 Found bitmap for 537ea4d3eb79c95f602873b1167c480006d2ac2d. 64 bits / 
 15873b36 checksum
 
 Does reverting a201c20b41a2f0725977bcb89a2a66135d776ba2 help?

YES! t5310 passes after reverting this commit.

-- Hannes
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pack bitmap woes on Windows

2014-02-12 Thread David Kastrup
Johannes Sixt j.s...@viscovery.net writes:

 Am 2/12/2014 13:55, schrieb David Kastrup:
 Johannes Sixt j.s...@viscovery.net writes:
 
 Running test suite of 'next' on Windows fails in t5310-pack-bitmaps with
 the following symptoms. I haven't followed the topic. Have there been
 patches floating that addressed the problem in one way or another?

 (gdb) run
 Starting program: D:\Src\mingw-git\t\trash
 directory.t5310-pack-bitmaps/..\..\git.exe rev-list --test-bitmap
 HEAD
 [New thread 3528.0x8d4]
 Bitmap v1 test (20 entries loaded)
 Found bitmap for 537ea4d3eb79c95f602873b1167c480006d2ac2d. 64 bits
 / 15873b36 checksum
 
 Does reverting a201c20b41a2f0725977bcb89a2a66135d776ba2 help?

 YES! t5310 passes after reverting this commit.

Oh.  I just looked through the backtrace until finding a routine
reasonably related with the regtest and checked for the last commit
changing it, then posted my question.

Then I looked through the diff of the patch and considered it
unconspicuous.  So I commenced reading through earlier commits.

I actually don't have a good idea what might be wrong here.  The code is
somewhat distasteful as it basically uses eword_t and uint64_t
interchangeably, but then this does match its current definition.

-- 
David Kastrup
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pack bitmap woes on Windows

2014-02-12 Thread Duy Nguyen
On Wed, Feb 12, 2014 at 8:23 PM, David Kastrup d...@gnu.org wrote:
 Johannes Sixt j.s...@viscovery.net writes:

 Am 2/12/2014 13:55, schrieb David Kastrup:
 Johannes Sixt j.s...@viscovery.net writes:

 Running test suite of 'next' on Windows fails in t5310-pack-bitmaps with
 the following symptoms. I haven't followed the topic. Have there been
 patches floating that addressed the problem in one way or another?

 (gdb) run
 Starting program: D:\Src\mingw-git\t\trash
 directory.t5310-pack-bitmaps/..\..\git.exe rev-list --test-bitmap
 HEAD
 [New thread 3528.0x8d4]
 Bitmap v1 test (20 entries loaded)
 Found bitmap for 537ea4d3eb79c95f602873b1167c480006d2ac2d. 64 bits
 / 15873b36 checksum

 Does reverting a201c20b41a2f0725977bcb89a2a66135d776ba2 help?

 YES! t5310 passes after reverting this commit.

 Oh.  I just looked through the backtrace until finding a routine
 reasonably related with the regtest and checked for the last commit
 changing it, then posted my question.

 Then I looked through the diff of the patch and considered it
 unconspicuous.  So I commenced reading through earlier commits.

 I actually don't have a good idea what might be wrong here.  The code is
 somewhat distasteful as it basically uses eword_t and uint64_t
 interchangeably, but then this does match its current definition.

Perhaps __BYTE_ORDER or __BIG_ENDIAN is misdefined and the ntohll() is skipped?
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pack bitmap woes on Windows

2014-02-12 Thread Erik Faye-Lund
On Wed, Feb 12, 2014 at 3:09 PM, Duy Nguyen pclo...@gmail.com wrote:
 On Wed, Feb 12, 2014 at 8:23 PM, David Kastrup d...@gnu.org wrote:
 Johannes Sixt j.s...@viscovery.net writes:

 Am 2/12/2014 13:55, schrieb David Kastrup:
 Johannes Sixt j.s...@viscovery.net writes:

 Running test suite of 'next' on Windows fails in t5310-pack-bitmaps with
 the following symptoms. I haven't followed the topic. Have there been
 patches floating that addressed the problem in one way or another?

 (gdb) run
 Starting program: D:\Src\mingw-git\t\trash
 directory.t5310-pack-bitmaps/..\..\git.exe rev-list --test-bitmap
 HEAD
 [New thread 3528.0x8d4]
 Bitmap v1 test (20 entries loaded)
 Found bitmap for 537ea4d3eb79c95f602873b1167c480006d2ac2d. 64 bits
 / 15873b36 checksum

 Does reverting a201c20b41a2f0725977bcb89a2a66135d776ba2 help?

 YES! t5310 passes after reverting this commit.

 Oh.  I just looked through the backtrace until finding a routine
 reasonably related with the regtest and checked for the last commit
 changing it, then posted my question.

 Then I looked through the diff of the patch and considered it
 unconspicuous.  So I commenced reading through earlier commits.

 I actually don't have a good idea what might be wrong here.  The code is
 somewhat distasteful as it basically uses eword_t and uint64_t
 interchangeably, but then this does match its current definition.

 Perhaps __BYTE_ORDER or __BIG_ENDIAN is misdefined and the ntohll() is 
 skipped?

That is indeed the case.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] blame.c: prepare_lines should not call xrealloc for every line

2014-02-12 Thread David Kastrup
Making a single preparation run for counting the lines will avoid memory
fragmentation.  Also, fix the allocated memory size which was wrong
when sizeof(int *) != sizeof(int), and would have been too small
for sizeof(int *)  sizeof(int), admittedly unlikely.

Signed-off-by: David Kastrup d...@gnu.org
---

Since there was no feedback after the last defense/explanation of the
coding choices, the code rewritten by this patch was much more awful,
and the kind of style requests (fixed already in the last iteration)
are not actually heeded by the core developers themselves, I have no
idea whether this patch will be dropped just like the last one.

As opposed to the last try, this incorporates a suggestion from Jeff
to change sizeof(type) to sizeof(expression) which is not helping much
since the types of lineno and sb-lineno still need to be changed in
sync.

It also fiddles cosmetically with the code layout of the loops.

builtin/blame.c | 46 +++---
 1 file changed, 31 insertions(+), 15 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index e44a6bb..1aefedf 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1772,25 +1772,41 @@ static int prepare_lines(struct scoreboard *sb)
 {
const char *buf = sb-final_buf;
unsigned long len = sb-final_buf_size;
-   int num = 0, incomplete = 0, bol = 1;
+   const char *end = buf + len;
+   const char *p;
+   int *lineno;
+   int num = 0, incomplete = 0;
 
-   if (len  buf[len-1] != '\n')
-   incomplete++; /* incomplete line at the end */
-   while (len--) {
-   if (bol) {
-   sb-lineno = xrealloc(sb-lineno,
- sizeof(int *) * (num + 1));
-   sb-lineno[num] = buf - sb-final_buf;
-   bol = 0;
-   }
-   if (*buf++ == '\n') {
+   for (p = buf;;) {
+   p = memchr(p, '\n', end - p);
+   if (p) {
+   p++;
num++;
-   bol = 1;
+   continue;
}
+   break;
}
-   sb-lineno = xrealloc(sb-lineno,
- sizeof(int *) * (num + incomplete + 1));
-   sb-lineno[num + incomplete] = buf - sb-final_buf;
+
+   if (len  end[-1] != '\n')
+   incomplete++; /* incomplete line at the end */
+
+   sb-lineno = xmalloc(sizeof(*sb-lineno) * (num + incomplete + 1));
+   lineno = sb-lineno;
+
+   *lineno++ = 0;
+   for (p = buf;;) {
+   p = memchr(p, '\n', end - p);
+   if (p) {
+   p++;
+   *lineno++ = p - buf;
+   continue;
+   }
+   break;
+   }
+
+   if (incomplete)
+   *lineno++ = len;
+
sb-num_lines = num + incomplete;
return sb-num_lines;
 }
-- 
1.8.3.2

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pack bitmap woes on Windows

2014-02-12 Thread Erik Faye-Lund
On Wed, Feb 12, 2014 at 3:22 PM, Erik Faye-Lund kusmab...@gmail.com wrote:
 On Wed, Feb 12, 2014 at 3:09 PM, Duy Nguyen pclo...@gmail.com wrote:
 On Wed, Feb 12, 2014 at 8:23 PM, David Kastrup d...@gnu.org wrote:
 Johannes Sixt j.s...@viscovery.net writes:

 Am 2/12/2014 13:55, schrieb David Kastrup:
 Johannes Sixt j.s...@viscovery.net writes:

 Running test suite of 'next' on Windows fails in t5310-pack-bitmaps with
 the following symptoms. I haven't followed the topic. Have there been
 patches floating that addressed the problem in one way or another?

 (gdb) run
 Starting program: D:\Src\mingw-git\t\trash
 directory.t5310-pack-bitmaps/..\..\git.exe rev-list --test-bitmap
 HEAD
 [New thread 3528.0x8d4]
 Bitmap v1 test (20 entries loaded)
 Found bitmap for 537ea4d3eb79c95f602873b1167c480006d2ac2d. 64 bits
 / 15873b36 checksum

 Does reverting a201c20b41a2f0725977bcb89a2a66135d776ba2 help?

 YES! t5310 passes after reverting this commit.

 Oh.  I just looked through the backtrace until finding a routine
 reasonably related with the regtest and checked for the last commit
 changing it, then posted my question.

 Then I looked through the diff of the patch and considered it
 unconspicuous.  So I commenced reading through earlier commits.

 I actually don't have a good idea what might be wrong here.  The code is
 somewhat distasteful as it basically uses eword_t and uint64_t
 interchangeably, but then this does match its current definition.

 Perhaps __BYTE_ORDER or __BIG_ENDIAN is misdefined and the ntohll() is 
 skipped?

 That is indeed the case.

Looking a bit at it, the whole byte-order detection mess seems insane to me.

MinGW falls inside the defined(__GNUC__)  (defined(__i386__) ||
defined(__x86_64__))-block, but does not define __BYTE_ORDER.

It seems the author of a201c20b41a2f0725977bcb89a2a66135d776ba2
assumes __BYTE_ORDER was guaranteed to always be defined, probably
fooled by the following check:

#if !defined(__BYTE_ORDER)
# error Cannot determine endianness
#endif

However, that check is only performed if we're NOT building with GCC
on x86/x64 nor MSVC (which I don't think defined __BYTE_ORDER either)

The following would make __BYTE_ORDER a guarantee, and show that MinGW
does not define it:

---8---

diff --git a/compat/bswap.h b/compat/bswap.h
index 120c6c1..c73bf0e 100644
--- a/compat/bswap.h
+++ b/compat/bswap.h
@@ -109,10 +109,6 @@ static inline uint64_t git_bswap64(uint64_t x)
 # endif
 #endif

-#if !defined(__BYTE_ORDER)
-# error Cannot determine endianness
-#endif
-
 #if __BYTE_ORDER == __BIG_ENDIAN
 # define ntohll(n) (n)
 # define htonll(n) (n)
@@ -123,6 +119,10 @@ static inline uint64_t git_bswap64(uint64_t x)

 #endif

+#if !defined(__BYTE_ORDER)
+# error Cannot determine endianness
+#endif
+
 /*
  * Performance might be improved if the CPU architecture is OK with
  * unaligned 32-bit loads and a fast ntohl() is available.

---8---

But another level of insanity (IMO) is defining double-underscore
macros. These symbols are reserved for the implementation. Sure,
knowing we're on a given implementation and defining something *else*
dependent on them is fine. But defining them is just yuckiness, and
not very standard-kosher.

IMO, we should rather do the definition the other way around:

#if !defined(BYTE_ORDER)
# if defined(__BYTE_ORDER)  defined(__LITTLE_ENDIAN)  defined(__BIG_ENDIAN)
#  define BYTE_ORDER __BYTE_ORDER
#  define LITTLE_ENDIAN __LITTLE_ENDIAN
#  define BIG_ENDIAN __BIG_ENDIAN
# endif
#endif

...and only referred to BYTE_ORDER, LITTLE_ENDIAN and BIG_ENDIAN.


That way we'd follow closer to where opengroup are heading:

http://www.opengroup.org/austin/docs/austin_514.txt
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: feature request: text=input option in .gitattributes

2014-02-12 Thread Torsten Bögershausen
On 2014-02-11 15.57, Cameron Taggart wrote:
 After requesting this as
 https://github.com/msysgit/msysgit/issues/164, I was told to take it
 upstream, so here I am.
 
 I would like a text=input feature added that has the same behavior as
 text=auto, except that it defaults to core.autocrlf=input behavior
 instead of core.autocrlf=true. 
If you want to normailze your repo, and keep it normalized,
I can  recommend to use .gitattributes.

Please see the excellent page here:
https://www.kernel.org/pub/software/scm/git/docs/gitattributes.html
And especially this part:

$ echo * text=auto .gitattributes
$ rm .git/index # Remove the index to force git to
$ git reset # re-scan the working directory
$ git status# Show files that will be normalized
$ git add -u
$ git add .gitattributes
$ git commit -m Introduce end-of-line normalization



/Torsten

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Turning a test script into something usable in t/perf

2014-02-12 Thread David Kastrup

Hi,

I find that I have little clue about how to convert the following brief
test script into some test to place in t/perf:

#!/bin/sh
rm -rf /tmp/git-test
mkdir /tmp/git-test
cd /tmp/git-test
git init
LIMIT=20
yes a|head -$LIMIT data
yes b|head -$LIMIT data2
git add data data2
git commit -m split
git rm data2
yes 'a
b' | head -$(($LIMIT*2)) data
git add data
git commit -m combined
time git blame data /dev/null

The variable LIMIT is the deciding factor for determining performance
which, with the code in current master, is rather measurably O(LIMIT^2).
I think that the current test takes about 15 minutes to complete on my
computer.

Obviously, that's sort of excessive: there is little point in choosing
sizes that show off more than two orders of magnitude in improvement.

Now the pathological cases are lots of small but attributable fragments
in the blamed source files.  One real-world project that is hit rather
hard is an alphabetically sorted large list of words that tends to get
insertions/deletions of few scattered lines at a time.

Should one aim for an actually pathological case like in this script?
Should one try benchmarking with one of the stock repositories instead
that don't really demonstrate well just how bad the behavior might
become and which code passages are dominant regarding the quadratic
behavior?

-- 
David Kastrup


Re: pack bitmap woes on Windows

2014-02-12 Thread Jeff King
On Wed, Feb 12, 2014 at 03:49:24PM +0100, Erik Faye-Lund wrote:

 It seems the author of a201c20b41a2f0725977bcb89a2a66135d776ba2
 assumes __BYTE_ORDER was guaranteed to always be defined, probably
 fooled by the following check:
 
 #if !defined(__BYTE_ORDER)
 # error Cannot determine endianness
 #endif
 
 However, that check is only performed if we're NOT building with GCC
 on x86/x64 nor MSVC (which I don't think defined __BYTE_ORDER either)
 
 The following would make __BYTE_ORDER a guarantee, and show that MinGW
 does not define it:

Yes, that is the problem. Sorry, this got brought up earlier[1], but the
discussion went on and I did not notice that Brian's patch did not get
applied.

After re-reading the discussion, I think the patch below is the best
solution. Can you confirm that it fixes the problem for you?

-Peff

[1] http://thread.gmane.org/gmane.comp.version-control.git/241278

-- 8 --
Subject: ewah: unconditionally ntohll ewah data

Commit a201c20 tried to optimize out a loop like:

  for (i = 0; i  len; i++)
  data[i] = ntohll(data[i]);

in the big-endian case, because we know that ntohll is a
noop, and we do not need to pay the cost of the loop at all.
However, it mistakenly assumed that __BYTE_ORDER was always
defined, whereas it may not be on systems which do not
define it by default, and where we did not need to define it
to set up the ntohll macro. This includes OS X and Windows.

We could muck with the ordering in compat/bswap.h to make
sure it is defined unconditionally, but it is simpler to
still to just execute the loop unconditionally. That avoids
the application code knowing anything about these magic
macros, and lets it depend only on having ntohll defined.

And since the resulting loop looks like (on a big-endian
system):

  for (i = 0; i  len; i++)
  data[i] = data[i];

any decent compiler can probably optimize it out.

Original report and analysis by Brian Gernhardt.

Signed-off-by: Jeff King p...@peff.net
---
 ewah/ewah_io.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c
index 4a7fae6..f7f700e 100644
--- a/ewah/ewah_io.c
+++ b/ewah/ewah_io.c
@@ -113,6 +113,7 @@ int ewah_serialize(struct ewah_bitmap *self, int fd)
 int ewah_read_mmap(struct ewah_bitmap *self, void *map, size_t len)
 {
uint8_t *ptr = map;
+   size_t i;
 
self-bit_size = get_be32(ptr);
ptr += sizeof(uint32_t);
@@ -135,13 +136,8 @@ int ewah_read_mmap(struct ewah_bitmap *self, void *map, 
size_t len)
memcpy(self-buffer, ptr, self-buffer_size * sizeof(uint64_t));
ptr += self-buffer_size * sizeof(uint64_t);
 
-#if __BYTE_ORDER != __BIG_ENDIAN
-   {
-   size_t i;
-   for (i = 0; i  self-buffer_size; ++i)
-   self-buffer[i] = ntohll(self-buffer[i]);
-   }
-#endif
+   for (i = 0; i  self-buffer_size; ++i)
+   self-buffer[i] = ntohll(self-buffer[i]);
 
self-rlw = self-buffer + get_be32(ptr);
 
-- 
1.8.5.2.500.g8060133

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/11] More preparatory work for multiparent tree-walker

2014-02-12 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 Kirill Smelkov k...@mns.spb.ru writes:

 Sorry for the confusion. Could you please do the following:

 Patches should be applied over to ks/tree-diff-walk
 (74aa4a18). Before applying the patches, please cherry-pick

 c90483d9(tree-diff: no need to manually verify that there is no
  mode change for a path)

 ef4f0928(tree-diff: no need to pass match to
  skip_uninteresting())

 into that branch, and drop them from ks/combine-diff - we'll have one
 branch reworking the diff tree-walker, and the other taking advantage of
 it for combine-diff.

 As long as that does not lose changes to tests and clean-ups, I'm
 fine with that direction.  For example, I do not know if you want to
 lose e3f62d12 (diffcore-order: export generic ordering interface,
 2014-01-20), which is part of the combine-diff topic.

Ahh, sorry, I misread the drop as salvage these two and drop the
rest.  The new series does apply cleanly on a commit in master..pu
that has both ks/tree-diff-walk and ks/combine-diff, and the latter
is not yet in 'next' so we are free to reorganize.

Let me flip the latter topic around, also queue these updates and
push the result out on 'pu'.

Thanks.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] gc: config option for running --auto in background

2014-02-12 Thread Junio C Hamano
Duy Nguyen pclo...@gmail.com writes:

 On Tue, Feb 11, 2014 at 2:11 AM, Junio C Hamano gits...@pobox.com wrote:
 On Mon, Feb 10, 2014 at 10:43 AM, Junio C Hamano gits...@pobox.com wrote:
 If --quiet is set, we should not be printing anyway. If not, I thinkg
 we could only print auto packing in background.. when we actually
 can do that, else just print the old message. It means an #ifdef
 NO_POSIX_GOODIES here again though..

 Didn't you change it not to die but return nosys or something?

 Ah, the problem is that it is too late to take back ... will do so in
 the background when you noticed that daemonize() did not succeed, so
 you would need a way to see if we can daemonize() before actually
 doing so if you want to give different messages.

 int can_daemonize(void) could be an answer that is nicer than
 NO_POSIX_GOODIES, but I am not sure if it is worth it.

 Or we could pass the quiet flag to daemonize() and let it print
 something in the #ifdef NO_POSIX_GOODIES part.

Hmph...  What would that something say?  I was asked to gc in the
background but I can't here is not suitable for daemonize() that is
not specific to gc.

The flow I had in mind was something along the lines of this

if (!quiet) {
if (detach_auto  can_daemonize())
say auto packing in the background;
else
say auto packing
}
if (detach_auto  can_daemonize())
daemonize();

If we had daemonize(noisy=1) and coded it this way:

if (!quiet)
say auto packing;
if (detach_auto)
daemonize(!quiet);

we could do something like:

daemonize(int noisy) {
if (noisy  !defined(NO_POSIX_GOODIES))
say , and doing so in the background;
... do the actual daemonizing ...
}

but that feels ugly.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Stefan Zager
On Tue, Feb 11, 2014 at 6:11 PM, Duy Nguyen pclo...@gmail.com wrote:

 I have no comments about thread safety improvements (well, not yet).
 If you have investigated about git performance on chromium
 repositories, could you please sum it up? Threading may be an option
 to improve performance, but it's probably not the only option.

Well, the painful operations that we use frequently are pack-objects,
checkout, status, and blame.  Anything on Windows that touches a lot
of files is miserable due to the usual file system slowness on
Windows, and luafv.sys (the UAC file virtualization driver) seems to
make it much worse.

With threading turned on, pack-objects on Windows now takes about
twice as long as on Linux, which is still more than a 2x improvement
over the non-threaded operation.

Checkout is really bad on Windows.  The blink repository is ~200K
files, and a full clean checkout from the index takes about 10 seconds
on Linux, and about 3:30 on Windows.  I used the Very Sleepy profiler
to see where all the time was spent on Windows: 55% of the time was
spent in OpenFile, and 25% in CloseFile (both in win32).  My immediate
goal is to add threading to checkout, so those file system calls can
be done in parallel.

Enabling the fscache speeds up status quite a bit.  I'm optimistic
that parallelizing the stat calls will yield a further improvement.
Beyond that, it may not be possible to do much more without using a
file system watcher daemon, like facebook does with mercurial.
(https://code.facebook.com/posts/218678814984400/scaling-mercurial-at-facebook/)

Blame is something that chromium and blink developers use heavily, and
it is not unusual for a blame invocation on the blink repository to
run for 30 seconds.  It seems like it should be possible to
parallelize blame, but it requires pack file operations to be
thread-safe.

Stefan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Stefan Zager
On Tue, Feb 11, 2014 at 7:43 PM, Duy Nguyen pclo...@gmail.com wrote:

 From v1.9 shallow clone should work for all push/pull/clone... so
 history depth does not matter (on the client side). As for
 gentoo-x86's large worktree, using index v4 and avoid full-tree
 operations (e.g. status ., not status..) should make all
 operations reasonably fast. I plan to make status fast even without
 path limiting with the help of inotify, but that's not going to be
 finished soon. Did I miss anything else?

Chromium developers frequently want to run status over their entire
checkout, and a lot of them run 'git commit -a'.  We want to do
everything possible to speed this up.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Stefan Zager
On Wed, Feb 12, 2014 at 3:59 AM, Erik Faye-Lund kusmab...@gmail.com wrote:
 On Wed, Feb 12, 2014 at 2:54 AM, Stefan Zager sza...@chromium.org wrote:

 We are particularly concerned with the performance of msysgit, and we
 have already chalked up a significant performance gain by turning on
 the threading code in pack-objects (which was already enabled for
 posix platforms, but not on msysgit, owing to the lack of a correct
 pread implementation).

 How did you manage to do this? I'm not aware of any way to implement
 pread on Windows (without going down the insanity-path of wrapping and
 potentially locking inside every IO operation)...

I don't want to steal the thunder of my coworker, who wrote the
implementation.  He plans to submit it upstream soon-ish.  It relies
on using the lpOverlapped argument to ReadFile(), with some additional
tomfoolery to make sure that the implicit position pointer for the
file descriptor doesn't get modified.

Stefan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Make the global packed_git variable static to sha1_file.c.

2014-02-12 Thread Stefan Zager
On Tue, Feb 11, 2014 at 11:29 PM, Chris Packham judge.pack...@gmail.com wrote:
 Hi,

 On 12/02/14 14:57, Stefan Zager wrote:
 From b4796d9d99c03b0b7cddd50808a41413e45f1129 Mon Sep 17 00:00:00 2001
 From: Stefan Zager sza...@chromium.org
 Date: Mon, 10 Feb 2014 16:55:12 -0800
 Subject: [PATCH] Make the global packed_git variable static to sha1_file.c.

 I'm not really qualified to comment on substance but there are some
 basic style issues w.r.t. whitespace namely using 4 spaces for indent
 and mixing tabs/spaces. This might seem pedantic for the first round of
 a patch but it does put off reviewers.

 From Documentation/CodingGuidelines:

  - We use tabs to indent, and interpret tabs as taking up to
8 spaces.

My bad, I will upload a fixed patch.  In my defense: I edited the code
in emacs and then ran M-x tabify over the entire file.  But that had
the unfortunate side effect of adding a bunch of whitespace-only
changes to the diff, illuminating the fact that there is already mixed
whitespace in the existing code.  So I had to go back and selectively
tabify my changes, and I clearly missed a bunch.

If anyone has a recommendation for a less labor-intensive way to do
this in emacs, I'd be very grateful.

Thanks,

Stefan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Erik Faye-Lund
On Wed, Feb 12, 2014 at 7:20 PM, Stefan Zager sza...@google.com wrote:
 On Wed, Feb 12, 2014 at 3:59 AM, Erik Faye-Lund kusmab...@gmail.com wrote:
 On Wed, Feb 12, 2014 at 2:54 AM, Stefan Zager sza...@chromium.org wrote:

 We are particularly concerned with the performance of msysgit, and we
 have already chalked up a significant performance gain by turning on
 the threading code in pack-objects (which was already enabled for
 posix platforms, but not on msysgit, owing to the lack of a correct
 pread implementation).

 How did you manage to do this? I'm not aware of any way to implement
 pread on Windows (without going down the insanity-path of wrapping and
 potentially locking inside every IO operation)...

 I don't want to steal the thunder of my coworker, who wrote the
 implementation.  He plans to submit it upstream soon-ish.  It relies
 on using the lpOverlapped argument to ReadFile(), with some additional
 tomfoolery to make sure that the implicit position pointer for the
 file descriptor doesn't get modified.

Is the code available somewhere? I'm especially interested in the
additional tomfoolery to make sure that the implicit position pointer
for the file descriptor doesn't get modified-part, as this was what I
ended up butting my head into when trying to do this myself.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Matthieu Moy
Stefan Zager sza...@chromium.org writes:

 I'm optimistic that parallelizing the stat calls will yield a further
 improvement.

It has already been mentionned in the thread, but in case you overlooked
it: did you look at core.preloadindex? It seems at least very close to
what you want.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Stefan Zager
On Wed, Feb 12, 2014 at 10:27 AM, Erik Faye-Lund kusmab...@gmail.com wrote:
 On Wed, Feb 12, 2014 at 7:20 PM, Stefan Zager sza...@google.com wrote:

 I don't want to steal the thunder of my coworker, who wrote the
 implementation.  He plans to submit it upstream soon-ish.  It relies
 on using the lpOverlapped argument to ReadFile(), with some additional
 tomfoolery to make sure that the implicit position pointer for the
 file descriptor doesn't get modified.

 Is the code available somewhere? I'm especially interested in the
 additional tomfoolery to make sure that the implicit position pointer
 for the file descriptor doesn't get modified-part, as this was what I
 ended up butting my head into when trying to do this myself.

https://chromium-review.googlesource.com/#/c/186104/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Stefan Zager
On Wed, Feb 12, 2014 at 10:33 AM, Matthieu Moy
matthieu@grenoble-inp.fr wrote:
 Stefan Zager sza...@chromium.org writes:

 I'm optimistic that parallelizing the stat calls will yield a further
 improvement.

 It has already been mentionned in the thread, but in case you overlooked
 it: did you look at core.preloadindex? It seems at least very close to
 what you want.

Ah yes, sorry, I overlooked that.  We have indeed turned on
core.preloadindex, and it does indeed speed up status.  That speedup
is reflected in my previous comments about our observations working
with chromium and blink.

Stefan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Erik Faye-Lund
On Wed, Feb 12, 2014 at 7:34 PM, Stefan Zager sza...@google.com wrote:
 On Wed, Feb 12, 2014 at 10:27 AM, Erik Faye-Lund kusmab...@gmail.com wrote:
 On Wed, Feb 12, 2014 at 7:20 PM, Stefan Zager sza...@google.com wrote:

 I don't want to steal the thunder of my coworker, who wrote the
 implementation.  He plans to submit it upstream soon-ish.  It relies
 on using the lpOverlapped argument to ReadFile(), with some additional
 tomfoolery to make sure that the implicit position pointer for the
 file descriptor doesn't get modified.

 Is the code available somewhere? I'm especially interested in the
 additional tomfoolery to make sure that the implicit position pointer
 for the file descriptor doesn't get modified-part, as this was what I
 ended up butting my head into when trying to do this myself.

 https://chromium-review.googlesource.com/#/c/186104/

ReOpenFile, that's fantastic. Thanks a lot!
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread David Kastrup
Stefan Zager sza...@chromium.org writes:

 On Tue, Feb 11, 2014 at 6:11 PM, Duy Nguyen pclo...@gmail.com wrote:

 I have no comments about thread safety improvements (well, not yet).
 If you have investigated about git performance on chromium
 repositories, could you please sum it up? Threading may be an option
 to improve performance, but it's probably not the only option.

 Well, the painful operations that we use frequently are pack-objects,
 checkout, status, and blame.

Have you checked the patch in
URL:http://thread.gmane.org/gmane.comp.version-control.git/241448 and
followups,
Message-ID: 1391454849-26558-1-git-send-email-...@gnu.org?

While this does not yet support -M and -C options, it's conceivable that
you don't use them in your server/scripts.

 Anything on Windows that touches a lot of files is miserable due to
 the usual file system slowness on Windows, and luafv.sys (the UAC file
 virtualization driver) seems to make it much worse.

There is an obvious solution here...  Dedicated hardware is not that
expensive.  Virtualization will always have a price.

 Blame is something that chromium and blink developers use heavily, and
 it is not unusual for a blame invocation on the blink repository to
 run for 30 seconds.  It seems like it should be possible to
 parallelize blame, but it requires pack file operations to be
 thread-safe.

Really, give the above patch a try.  I am taking longer to finish it
than anticipated (with a lot due to procrastination but that is,
unfortunately, a large part of my workflow), and it's cutting into my
paychecks (voluntary donations which to a good degree depend on timely
and nontrivial progress reports for my freely available work on GNU
LilyPond).

Note that it looks like the majority of the remaining time on GNU/Linux
tends to be spent in system time: I/O time, memory management.  And I
have an SSD drive.  When using packed repositories of considerable size,
decompression comes into play as well.  I don't think that you can hope
to get noticeably higher I/O throughput by multithreading, so really,
really, really consider dedicated hardware running on a native Linux
file system.

-- 
David Kastrup
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Make the global packed_git variable static to sha1_file.c.

2014-02-12 Thread David Kastrup
Stefan Zager sza...@google.com writes:

 On Tue, Feb 11, 2014 at 11:29 PM, Chris Packham judge.pack...@gmail.com 
 wrote:
 Hi,

 On 12/02/14 14:57, Stefan Zager wrote:
 From b4796d9d99c03b0b7cddd50808a41413e45f1129 Mon Sep 17 00:00:00 2001
 From: Stefan Zager sza...@chromium.org
 Date: Mon, 10 Feb 2014 16:55:12 -0800
 Subject: [PATCH] Make the global packed_git variable static to sha1_file.c.

 I'm not really qualified to comment on substance but there are some
 basic style issues w.r.t. whitespace namely using 4 spaces for indent
 and mixing tabs/spaces. This might seem pedantic for the first round of
 a patch but it does put off reviewers.

 From Documentation/CodingGuidelines:

  - We use tabs to indent, and interpret tabs as taking up to
8 spaces.

 My bad, I will upload a fixed patch.  In my defense: I edited the code
 in emacs and then ran M-x tabify over the entire file.  But that had
 the unfortunate side effect of adding a bunch of whitespace-only
 changes to the diff, illuminating the fact that there is already mixed
 whitespace in the existing code.  So I had to go back and selectively
 tabify my changes, and I clearly missed a bunch.

 If anyone has a recommendation for a less labor-intensive way to do
 this in emacs, I'd be very grateful.

C-c . RET linux RET before entering any changes.

-- 
David Kastrup
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] Make the global packed_git variable static to sha1_file.c.

2014-02-12 Thread szager
From 0a59547f3e95ddecf7606c5f259ae6177c5a104f Mon Sep 17 00:00:00 2001
From: Stefan Zager sza...@chromium.org
Date: Mon, 10 Feb 2014 16:55:12 -0800
Subject: [PATCH] Make the global packed_git variable static to sha1_file.c.

This is a first step in making the codebase thread-safe.  By and
large, the operations which might benefit from threading are those
that work with pack files (e.g., checkout, blame), so the focus of
this patch is stop leaking the global list of pack files outside of
sha1_file.c.

The next step will be to control access to the list of pack files
with a mutex.  However, that alone is not enough to make pack file
access thread safe.  Even in a read-only operation, the window list
associated with each pack file will need to be controlled.
Additionally, the global counters in sha1_file.c will need to be
controlled.

This patch is a pure refactor with no functional changes, so it
shouldn't require any additional tests.  Adding the actual locks
will be a functional change, and will require additional tests.

Signed-off-by: Stefan Zager sza...@chromium.org
---
 builtin/count-objects.c  |  44 ++-
 builtin/fsck.c   |  46 +++-
 builtin/gc.c |  26 +++
 builtin/pack-objects.c   | 188 ---
 builtin/pack-redundant.c |  37 +++---
 cache.h  |  16 +++-
 fast-import.c|   4 +-
 http-backend.c   |  28 ---
 http-push.c  |   4 +-
 http-walker.c|   2 +-
 pack-revindex.c  |  20 ++---
 server-info.c|  35 +
 sha1_file.c  |  35 -
 sha1_name.c  |  18 -
 14 files changed, 315 insertions(+), 188 deletions(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index a7f70cb..6554dfe 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -83,14 +83,32 @@ static char const * const count_objects_usage[] = {
NULL
 };
 
+struct pack_data {
+   unsigned long packed;
+   off_t size_pack;
+   unsigned long num_pack;
+};
+
+int pack_data_fn(struct packed_git *p, void *data)
+{
+   struct pack_data *pd = (struct pack_data *) data;
+   if (p-pack_local  !open_pack_index(p)) {
+   pd-packed += p-num_objects;
+   pd-size_pack += p-pack_size + p-index_size;
+   pd-num_pack++;
+   }
+   return 0;
+}
+
 int cmd_count_objects(int argc, const char **argv, const char *prefix)
 {
int i, verbose = 0, human_readable = 0;
const char *objdir = get_object_directory();
int len = strlen(objdir);
char *path = xmalloc(len + 50);
-   unsigned long loose = 0, packed = 0, packed_loose = 0;
+   unsigned long loose = 0, packed_loose = 0;
off_t loose_size = 0;
+   struct pack_data pd = {0,0,0};
struct option opts[] = {
OPT__VERBOSE(verbose, N_(be verbose)),
OPT_BOOL('H', human-readable, human_readable,
@@ -118,41 +136,29 @@ int cmd_count_objects(int argc, const char **argv, const 
char *prefix)
closedir(d);
}
if (verbose) {
-   struct packed_git *p;
-   unsigned long num_pack = 0;
-   off_t size_pack = 0;
struct strbuf loose_buf = STRBUF_INIT;
struct strbuf pack_buf = STRBUF_INIT;
struct strbuf garbage_buf = STRBUF_INIT;
-   if (!packed_git)
-   prepare_packed_git();
-   for (p = packed_git; p; p = p-next) {
-   if (!p-pack_local)
-   continue;
-   if (open_pack_index(p))
-   continue;
-   packed += p-num_objects;
-   size_pack += p-pack_size + p-index_size;
-   num_pack++;
-   }
+   prepare_packed_git();
+   foreach_packed_git(pack_data_fn, NULL, pd);
 
if (human_readable) {
strbuf_humanise_bytes(loose_buf, loose_size);
-   strbuf_humanise_bytes(pack_buf, size_pack);
+   strbuf_humanise_bytes(pack_buf, pd.size_pack);
strbuf_humanise_bytes(garbage_buf, size_garbage);
} else {
strbuf_addf(loose_buf, %lu,
(unsigned long)(loose_size / 1024));
strbuf_addf(pack_buf, %lu,
-   (unsigned long)(size_pack / 1024));
+   (unsigned long)(pd.size_pack / 1024));
strbuf_addf(garbage_buf, %lu,
(unsigned long)(size_garbage / 1024));
}
 
printf(count: %lu\n, loose);
printf(size: %s\n, loose_buf.buf);
-   printf(in-pack: %lu\n, packed);
- 

Re: Make the git codebase thread-safe

2014-02-12 Thread Stefan Zager
On Wed, Feb 12, 2014 at 10:50 AM, David Kastrup d...@gnu.org wrote:
 Stefan Zager sza...@chromium.org writes:

 Anything on Windows that touches a lot of files is miserable due to
 the usual file system slowness on Windows, and luafv.sys (the UAC file
 virtualization driver) seems to make it much worse.

 There is an obvious solution here...  Dedicated hardware is not that
 expensive.  Virtualization will always have a price.

Not sure I follow you.  We need to support people developing,
building, and testing on natively Windows machines.  And we need to
support users with reasonable hardware, including spinning disks.  If
we were only interested in optimizing for Google employees, each of
whom has one or more small nuclear reactors under their desk, this
would be easy.

 Blame is something that chromium and blink developers use heavily, and
 it is not unusual for a blame invocation on the blink repository to
 run for 30 seconds.  It seems like it should be possible to
 parallelize blame, but it requires pack file operations to be
 thread-safe.

 Really, give the above patch a try.  I am taking longer to finish it
 than anticipated (with a lot due to procrastination but that is,
 unfortunately, a large part of my workflow), and it's cutting into my
 paychecks (voluntary donations which to a good degree depend on timely
 and nontrivial progress reports for my freely available work on GNU
 LilyPond).

I will give that a try.  How much of a performance improvement have you clocked?

 Note that it looks like the majority of the remaining time on GNU/Linux
 tends to be spent in system time: I/O time, memory management.  And I
 have an SSD drive.  When using packed repositories of considerable size,
 decompression comes into play as well.  I don't think that you can hope
 to get noticeably higher I/O throughput by multithreading, so really,
 really, really consider dedicated hardware running on a native Linux
 file system.

I have a background in hardware, and I have much more faith in modern
disk schedulers :)

Stefan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] tests: turn on network daemon tests by default

2014-02-12 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 Agreed, but I think the only way to know the size of those fallouts is
 to try it and see who complains.  I would not normally be so cavalier
 with git itself, but I think for the test infrastructure, we have a
 small, tech-savvy audience that can help us iterate on it without too
 much pain.

There is another.

$ GIT_TEST_HTTPD=false sh t5537-fetch-shallow.sh 
ok 1 - setup
ok 2 - setup shallow clone
ok 3 - clone from shallow clone
ok 4 - fetch from shallow clone
ok 5 - fetch --depth from shallow clone
ok 6 - fetch --unshallow from shallow clone
ok 7 - fetch something upstream has but hidden by clients shallow boundaries
ok 8 - fetch that requires changes in .git/shallow is filtered
ok 9 - fetch --update-shallow
error: Can't use skip_all after running some tests

Under 'prove' test target, the way it exits causes:

*** prove ***
t5537-fetch-shallow.sh .. Dubious, test returned 1 (wstat 256, 0x100)
All 9 subtests passed

which leads to:

Test Summary Report
---
t5537-fetch-shallow.sh (Wstat: 256 Tests: 9 Failed: 0)
  Non-zero exit status: 1
  Parse errors: No plan found in TAP output


On the 'master' branch without these auto opt-in patches,

$ GIT_TEST_HTTPD= sh t5537-fetch-shallow.sh 
ok 1 - setup
ok 2 - setup shallow clone
ok 3 - clone from shallow clone
ok 4 - fetch from shallow clone
ok 5 - fetch --depth from shallow clone
ok 6 - fetch --unshallow from shallow clone
ok 7 - fetch something upstream has but hidden by clients shallow boundaries
ok 8 - fetch that requires changes in .git/shallow is filtered
ok 9 - fetch --update-shallow
skipping remaining tests, git built without http support
# passed all 9 test(s)
1..9

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread David Kastrup
Stefan Zager sza...@chromium.org writes:

 On Wed, Feb 12, 2014 at 10:50 AM, David Kastrup d...@gnu.org wrote:

 Really, give the above patch a try.  I am taking longer to finish it
 than anticipated (with a lot due to procrastination but that is,
 unfortunately, a large part of my workflow), and it's cutting into my
 paychecks (voluntary donations which to a good degree depend on timely
 and nontrivial progress reports for my freely available work on GNU
 LilyPond).

 I will give that a try.  How much of a performance improvement have
 you clocked?

Depends on file type and size.  With large files with lots of small
changes, performance improvements get more impressive.

Some ugly real-world examples are the Emacs repository, src/xdisp.c
(performance improvement about a factor of 3), a large file in the style
of /usr/share/dict/words clocking in at a factor of about 5.

Again, that's with an SSD and ext4 filesystem on GNU/Linux, and there
are no improvements in system time (I/O) except for patch 4 of the
series which helps perhaps 20% or so.

So the benefits of the patch will come into play mostly for big, bad
files on Windows: other than that, the I/O time is likely to be the
dominant player anyway.

If you have benchmarked the stuff, for annoying cases expect I/O time to
go down maybe 10-20%, and user time to drop by a factor of 4.  Under
GNU/Linux, that makes for a significant overall improvement.  On
Windows, the payback is likely quite less because of the worse I/O
performance.  Pity.

-- 
David Kastrup
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Make the global packed_git variable static to sha1_file.c.

2014-02-12 Thread Junio C Hamano
Stefan Zager sza...@google.com writes:

 If anyone has a recommendation for a less labor-intensive way to do
 this in emacs, I'd be very grateful.

This is not do this in emacs, but here is a possible approach.

You can ask git diff about what you changed, and actually apply
the change while fixing whitespace errors.  I.e.

git diff sha1_file.c | git apply --cached --whitespace=fix
git diff
git checkout sha1_file.c

The first step will add a cleaned-up version to your index.

The second diff (optional) is to see what whitespace errors are
introduced when going from that cleaned-up version to what you have
in the working tree.

With the last step you would update the working tree version to the
cleaned-up version from the index.

[alias]
wsadd = !sh -c 'git diff -- \$@\ | git apply --cached 
--whitespace=fix;\
git co -- ${1-.} \$@\' -
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Karsten Blees
Am 12.02.2014 19:37, schrieb Erik Faye-Lund:
 On Wed, Feb 12, 2014 at 7:34 PM, Stefan Zager sza...@google.com wrote:
 On Wed, Feb 12, 2014 at 10:27 AM, Erik Faye-Lund kusmab...@gmail.com wrote:
 On Wed, Feb 12, 2014 at 7:20 PM, Stefan Zager sza...@google.com wrote:

 I don't want to steal the thunder of my coworker, who wrote the
 implementation.  He plans to submit it upstream soon-ish.  It relies
 on using the lpOverlapped argument to ReadFile(), with some additional
 tomfoolery to make sure that the implicit position pointer for the
 file descriptor doesn't get modified.

 Is the code available somewhere? I'm especially interested in the
 additional tomfoolery to make sure that the implicit position pointer
 for the file descriptor doesn't get modified-part, as this was what I
 ended up butting my head into when trying to do this myself.

 https://chromium-review.googlesource.com/#/c/186104/
 
 ReOpenFile, that's fantastic. Thanks a lot!

...but should be loaded dynamically via GetProcAddress, or are we ready to drop 
XP support?

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Stefan Zager
On Wed, Feb 12, 2014 at 11:22 AM, Karsten Blees karsten.bl...@gmail.com wrote:
 Am 12.02.2014 19:37, schrieb Erik Faye-Lund:
 On Wed, Feb 12, 2014 at 7:34 PM, Stefan Zager sza...@google.com wrote:
 On Wed, Feb 12, 2014 at 10:27 AM, Erik Faye-Lund kusmab...@gmail.com 
 wrote:
 On Wed, Feb 12, 2014 at 7:20 PM, Stefan Zager sza...@google.com wrote:

 I don't want to steal the thunder of my coworker, who wrote the
 implementation.  He plans to submit it upstream soon-ish.  It relies
 on using the lpOverlapped argument to ReadFile(), with some additional
 tomfoolery to make sure that the implicit position pointer for the
 file descriptor doesn't get modified.

 Is the code available somewhere? I'm especially interested in the
 additional tomfoolery to make sure that the implicit position pointer
 for the file descriptor doesn't get modified-part, as this was what I
 ended up butting my head into when trying to do this myself.

 https://chromium-review.googlesource.com/#/c/186104/

 ReOpenFile, that's fantastic. Thanks a lot!

 ...but should be loaded dynamically via GetProcAddress, or are we ready to 
 drop XP support?

Right, that is an issue.  From our perspective, it's well past time to
drop XP support.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] blame.c: prepare_lines should not call xrealloc for every line

2014-02-12 Thread Junio C Hamano
David Kastrup d...@gnu.org writes:

 Making a single preparation run for counting the lines will avoid memory
 fragmentation.  Also, fix the allocated memory size which was wrong
 when sizeof(int *) != sizeof(int), and would have been too small
 for sizeof(int *)  sizeof(int), admittedly unlikely.

 Signed-off-by: David Kastrup d...@gnu.org
 ---

I think I took sizeof(int*)-sizeof(int) patch to the 'next' branch
already, which might have to conflict with this clean-up, but it
should be trivial to resolve.

Thanks for resending.  I was busy elsewhere (i.e. no feedback does
not mean silent rejection nor silent agreement at least from
me), and such a resend does help prevent patches fall thru cracks.

 diff --git a/builtin/blame.c b/builtin/blame.c
 index e44a6bb..1aefedf 100644
 --- a/builtin/blame.c
 +++ b/builtin/blame.c
 @@ -1772,25 +1772,41 @@ static int prepare_lines(struct scoreboard *sb)
  {
   const char *buf = sb-final_buf;
   unsigned long len = sb-final_buf_size;
 + const char *end = buf + len;
 + const char *p;
 + int *lineno;
 + int num = 0, incomplete = 0;
  
 + for (p = buf;;) {
 + p = memchr(p, '\n', end - p);
 + if (p) {
 + p++;
   num++;
 + continue;
   }
 + break;
   }
 +
 + if (len  end[-1] != '\n')
 + incomplete++; /* incomplete line at the end */
 +
 + sb-lineno = xmalloc(sizeof(*sb-lineno) * (num + incomplete + 1));
 + lineno = sb-lineno;
 +
 + *lineno++ = 0;
 + for (p = buf;;) {
 + p = memchr(p, '\n', end - p);
 + if (p) {
 + p++;
 + *lineno++ = p - buf;
 + continue;
 + }
 + break;
 + }
 +
 + if (incomplete)
 + *lineno++ = len;
 +
   sb-num_lines = num + incomplete;
   return sb-num_lines;
  }
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] tree-walk: be more specific about corrupt tree errors

2014-02-12 Thread Jeff King
When the tree-walker runs into an error, it just calls
die(), and the message is always corrupt tree file.
However, we are actually covering several cases here; let's
give the user a hint about what happened.

Let's also avoid using the word corrupt, which makes it
seem like the data bit-rotted on disk. Our sha1 check would
already have found that. These errors are ones of data that
is malformed in the first place.

Signed-off-by: Jeff King p...@peff.net
---
Michael and I have been looking off-list at some bogus trees (created by
a non-git.git implementation). git-fsck unhelpfully dies during the
tree-walk:

  $ git fsck
  Checking object directories: 100% (256/256), done.
  fatal: corrupt tree file

I think in the long run we want to either teach fsck to avoid the
regular tree-walker or to set a special continue as much as you can
flag. That will let us keep going to find more errors, do our usual fsck
error checks (which are more strict), and especially report on _which_
object was broken (we can't do that here because we are deep in the call
stack and may not even have a real object yet).

This patch at least gives us slightly more specific error messages (both
for fsck and for other commands). And it may provide a first step in
clarity if we follow the continue as much as you can path.

 tree-walk.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/tree-walk.c b/tree-walk.c
index 79dba1d..d53b084 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -28,11 +28,13 @@ static void decode_tree_entry(struct tree_desc *desc, const 
char *buf, unsigned
unsigned int mode, len;
 
if (size  24 || buf[size - 21])
-   die(corrupt tree file);
+   die(truncated tree file);
 
path = get_mode(buf, mode);
-   if (!path || !*path)
-   die(corrupt tree file);
+   if (!path)
+   die(malformed mode in tree entry);
+   if (!*path)
+   die(empty filename in tree entry);
len = strlen(path) + 1;
 
/* Initialize the descriptor entry */
@@ -81,7 +83,7 @@ void update_tree_entry(struct tree_desc *desc)
unsigned long len = end - (const unsigned char *)buf;
 
if (size  len)
-   die(corrupt tree file);
+   die(truncated tree file);
buf = end;
size -= len;
desc-buffer = buf;
-- 
1.8.5.2.500.g8060133
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Junio C Hamano
Stefan Zager sza...@chromium.org writes:

 ...  I used the Very Sleepy profiler
 to see where all the time was spent on Windows: 55% of the time was
 spent in OpenFile, and 25% in CloseFile (both in win32).

This is somewhat interesting.

When we check things out, checkout_paths() has a list of paths to be
checked out, and iterates over them and call checkout_entry().

I wonder if you can:

 - introduce a version of checkout_entry() that takes file
   descriptors to write to;

 - have an asynchronous helper threads that pre-open the paths to be
   written out and feed ce, file descriptor to be written to a
   queue;

 - restructure that loop so that it reads the ce, file descriptor
   to be written from the queue, performs the actual writing out,
   and then feeds file descriptor to be closed to another queue; and

 - have another asynchronous helper threads that reads file
   descriptor to be closed from the queue and close them.

Calls to write (and preparation of data to be written) will then
remain single-threaded, but it sounds like that codepath is not the
bottleneck in your measurement, so

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Make the global packed_git variable static to sha1_file.c.

2014-02-12 Thread Junio C Hamano
I'll locally fix up these style issues before commenting on the patch.

Thanks.

ERROR: space required after that ',' (ctx:VxV)
#78: FILE: builtin/count-objects.c:111:
+   struct pack_data pd = {0,0,0};
^

ERROR: space required after that ',' (ctx:VxV)
#78: FILE: builtin/count-objects.c:111:
+   struct pack_data pd = {0,0,0};
  ^

ERROR: space required after that ',' (ctx:VxV)
#171: FILE: builtin/fsck.c:683:
+   struct verify_packs_data vpd = {0,0,0};
 ^

ERROR: space required after that ',' (ctx:VxV)
#171: FILE: builtin/fsck.c:683:
+   struct verify_packs_data vpd = {0,0,0};
   ^

ERROR: space required after that ',' (ctx:VxV)
#541: FILE: builtin/pack-redundant.c:591:
+   struct add_pack_data apd = {filename,0,NULL};
^

ERROR: space required after that ',' (ctx:VxV)
#541: FILE: builtin/pack-redundant.c:591:
+   struct add_pack_data apd = {filename,0,NULL};
  ^

ERROR: space required after that ',' (ctx:VxV)
#565: FILE: builtin/pack-redundant.c:604:
+   struct add_pack_data apd = {NULL,0,NULL};
^

ERROR: space required after that ',' (ctx:VxV)
#565: FILE: builtin/pack-redundant.c:604:
+   struct add_pack_data apd = {NULL,0,NULL};
  ^

ERROR: space prohibited after that '-' (ctx:WxW)
#726: FILE: pack-revindex.c:47:
+   num = - 1 - num;
  ^

ERROR: space required after that ',' (ctx:VxV)
#901: FILE: sha1_name.c:195:
+   struct disambiguate_data d = {len,bin_pfx,ds};
 ^

ERROR: space required after that ',' (ctx:VxV)
#901: FILE: sha1_name.c:195:
+   struct disambiguate_data d = {len,bin_pfx,ds};
 ^

total: 11 errors, 0 warnings, 781 lines checked
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Stefan Zager
On Wed, Feb 12, 2014 at 12:06 PM, Junio C Hamano gits...@pobox.com wrote:
 Stefan Zager sza...@chromium.org writes:

 ...  I used the Very Sleepy profiler
 to see where all the time was spent on Windows: 55% of the time was
 spent in OpenFile, and 25% in CloseFile (both in win32).

 This is somewhat interesting.

 When we check things out, checkout_paths() has a list of paths to be
 checked out, and iterates over them and call checkout_entry().

 I wonder if you can:

  - introduce a version of checkout_entry() that takes file
descriptors to write to;

  - have an asynchronous helper threads that pre-open the paths to be
written out and feed ce, file descriptor to be written to a
queue;

  - restructure that loop so that it reads the ce, file descriptor
to be written from the queue, performs the actual writing out,
and then feeds file descriptor to be closed to another queue; and

  - have another asynchronous helper threads that reads file
descriptor to be closed from the queue and close them.

 Calls to write (and preparation of data to be written) will then
 remain single-threaded, but it sounds like that codepath is not the
 bottleneck in your measurement, so

Yes, I considered that as well.  At a minimum, that would still
require attr.c to implement thread locking, since attribute files must
be parsed to look for stream filters.  I have already done that work.

But I'm not sure it's the best long-term approach to add convoluted
custom threading solutions to each git operation as it appears on the
performance radar.  I'm hoping to make the entire code base more
thread-friendly, so that threading can be added in a more natural and
idiomatic (and less painful) way.

For example, the most natural way to add threading to checkout would
be in the loops over the index in check_updates() in unpack-trees.c.
If attr.c and sha1_file.c were thread-safe, then it would be possible
to thread checkout entirely in check_updates(), with a pretty compact
code change.  I have already done the work in attr.c; sha1_file.c is
hairier, but do-able.

Stefan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Introduce experimental remote object access mode

2014-02-12 Thread Shawn Pearce
On Tue, Feb 11, 2014 at 11:29 AM, Junio C Hamano gits...@pobox.com wrote:
 Shawn Pearce spea...@spearce.org writes:

 Why would you do this? Perhaps you need more time in your day
 to consume tea or coffee. Set GIT_RTT and enjoy a beverage.

 So the conclusion is that it is not practical to do a lazy fetch if
 it is done extremely naively at we want this object --- wait a bit
 and we'll give you level?

Yes, this is what I thought when someone proposed this hack in
sha1_file.c to me on Monday. So I ran a quick experiment to see if my
instinct was right.

 I am wondering if we can do a bit better, like we want this object
 --- wait a bit, ah that's a commit, so it is likely that you may
 want the trees and blobs associated with it, too, if not right now
 but in a near future, let me push a pack that holds them to you?

Ah, smart observation. That might work. However I doubt it.

I implemented a version of Git on top of Google Bigtable (and Apache
HBase and Apache Cassandra) multiple times using JGit. tl;dr: this
approach doesn't work in practice.


The naive implementation for these distributed NoSQL systems is to
store each object in its own row keyed by SHA-1, and lookup the object
when you want it. This is very slow and is more or less what this
stupid patch shows. Worse, none of them were able to even get close to
the 1ms latency I used in this example.

In another implementation (which I published into JGit as the DHT
backend) I stored a group of related commits together in a row. Row
target sizes were in the 1-2 MiB range when using pack style
compression for commits, so the average row held hundreds of commits.
Reading one commit would actually slurp back a number of related
commits. The idea was if we need commit A now we will need B, C, D, E
(its parents and ancestors) soon as the application walks the revision
history, like rev-list or pack-objects.

For a process like pack-objects this almost seems to work. If commits
are together we can slurp a group at a time to amortize the round trip
latency. Unfortunately the application can still go through hundreds
of commits faster than the real world RTT is. So I tried to fix this
by storing an extra metadata pointer in each row to identify the next
row, so next block of commits could start loading right away. Its
still slow, as the application can scan through data faster than the
RTT.

At least for pack generation the traversal code does commits and
builds up a list of all root trees. The root trees can be async loaded
in batches, but the depth first traversal is still a killer. There are
stalls while the application waits for the next subtree, even if you
cluster the trees also into groups using depth first traversal the way
the packer produces pack files today.


Junio's idea to cluster data by commit and its related trees and blobs
is just a different data organization. It may be necessary to make two
copies of the data, one clustered by commits and another by
commit+tree+blob to satisfy different access patterns. And in the
commit+tree+blob case you may need multiple redundant copies of blobs
near commits that use them if those commits are frequently accessed.
Its a lot of redundant disk space.

We always say disk is cheap, but disk is slow and not getting faster.
SSDs are helping, but SSDs are expensive and have size limitations
compared to spinning disk. Just making many copies of data isn't
necessarily a solution.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Make the global packed_git variable static to sha1_file.c.

2014-02-12 Thread Junio C Hamano
sza...@chromium.org writes:

 From 0a59547f3e95ddecf7606c5f259ae6177c5a104f Mon Sep 17 00:00:00 2001

Please drop this line.

 From: Stefan Zager sza...@chromium.org

Please drop this line and instead have it in your e-mail header.

 Date: Mon, 10 Feb 2014 16:55:12 -0800

The date in your e-mail header, which is the first time general
public saw this particular version of the patch, is used by default
as the author time.  Unless there is a compelling reason to override
that with an in-body header, please drop this line.

 Subject: [PATCH] Make the global packed_git variable static to sha1_file.c.

Please drop this line and instead have it in your e-mail header.

 This patch is a pure refactor with no functional changes,...

 diff --git a/builtin/count-objects.c b/builtin/count-objects.c
 index a7f70cb..6554dfe 100644
 --- a/builtin/count-objects.c
 +++ b/builtin/count-objects.c
 @@ -83,14 +83,32 @@ static char const * const count_objects_usage[] = {
   NULL
  };
  
 +struct pack_data {
 + unsigned long packed;
 + off_t size_pack;
 + unsigned long num_pack;
 +};
 +
 +int pack_data_fn(struct packed_git *p, void *data)

Can't/shouldn't this be static?

Also I'd suggest s/pack_data_fn/collect_pack_data/ or something.
_fn may be a good suffix for typedef'ed typename used in a
callback function, but for a concrete function, it only adds noise
without giving us anything new.

Yes, I know there are a few existing violators, but we
shouldn't make the codebase worse, using their existence due
to past carelessness as an excuse.

 diff --git a/cache.h b/cache.h
 index dc040fb..542a9d9 100644
 --- a/cache.h
 +++ b/cache.h
 ...
 +/* The 'hint' argument is for the commonly-used 'last found pack' 
 optimization.
 + * It can be NULL.
 + */

/*
 * Please try to have opening slash-asterisk and closing
 * asterisk-slash in a multi-line comment block on their
 * own lines by themselves.
 */

 +extern void foreach_packed_git(packed_git_foreach_fn fn, struct packed_git 
 *hint, void *data);
 +
 +extern size_t packed_git_count();
 +extern size_t packed_git_local_count();

extern size_t packed_git_count(void);
extern size_t packed_git_local_count(void);

 diff --git a/sha1_file.c b/sha1_file.c
 index 6e8c05d..aeeb7e6 100644
 --- a/sha1_file.c
 +++ b/sha1_file.c
 @@ -60,6 +60,7 @@ static struct cached_object empty_tree = {
   0
  };
  
 +static struct packed_git *packed_git;
  static struct packed_git *last_found_pack;
  
  static struct cached_object *find_cached_object(const unsigned char *sha1)
 @@ -468,7 +469,6 @@ static unsigned int pack_open_fds;
  static unsigned int pack_max_fds;
  static size_t peak_pack_mapped;
  static size_t pack_mapped;
 -struct packed_git *packed_git;

Hmm, any particular reason why only this variable and not others are
moved up?

 @@ -1091,6 +1091,37 @@ struct packed_git *add_packed_git(const char *path, 
 int path_len, int local)
   return p;
  }
  
 +void foreach_packed_git(packed_git_foreach_fn fn, struct packed_git *hint, 
 void *data)
 +{
 + struct packed_git *p;
 + if (hint  ((*fn)(hint, data)))
 + return;
 + for (p = packed_git; p; p = p-next)
 + if (p != hint  (*fn)(p, data))
 + return;

(mental note) In the new API, a non-zero return signals an early
return/break from the loop.

 +}
 +
 +size_t packed_git_count()

size_t packed_git_count(void)

 +{
 + size_t res = 0;
 + struct packed_git *p;
 +
 + for (p = packed_git; p; p = p-next)
 + ++res;

When pre- or post- increment does not make any difference (i.e. you
do not use its value), please stick to post-increment.  pre-increment
looks unusual in our codebase and becomes distracting while reading
it through.

Same comments for packed-git-local-count.

 diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
 index 541667f..bc3074b 100644
 --- a/builtin/pack-objects.c
 +++ b/builtin/pack-objects.c
 @@ -900,14 +900,45 @@ static int no_try_delta(const char *path)
   return 0;
  }
  
 +struct find_pack_data {
 + const unsigned char *sha1;
 + off_t offset;
 + struct packed_git *found_pack;
 + int exclude;
 + int found_non_local_pack;
 + int found_pack_keep;
 +};
 +
 +static int find_pack_fn(struct packed_git *p, void *data)
 +{
 + struct find_pack_data *fpd = (struct find_pack_data *) data;
 + off_t offset = find_pack_entry_one(fpd-sha1, p);
 + if (offset) {
 + if (!fpd-found_pack) {
 + if (!is_pack_valid(p)) {
 + warning(packfile %s cannot be accessed, 
 p-pack_name);
 + return 0;
 + }
 + fpd-offset = offset;
 + fpd-found_pack = p;
 + }
 + if (fpd-exclude)
 + return 1;
 + if (!p-pack_local)
 + fpd-found_non_local_pack = 1;
 + else if 

Re: [PATCH] tests: turn on network daemon tests by default

2014-02-12 Thread Jeff King
On Tue, Feb 11, 2014 at 03:58:27PM -0800, Junio C Hamano wrote:

 Sure. One immediate complaint is that I would probably need to do
 something like this in the build automation:
 
   if testing a branch without this patch
 then
   : do nothing
   else
   GIT_TEST_GIT_DAEMON=false
   fi
 
 Arguably, it is the fault of the current/original code that treated
 *any* non-empty value that is set in the environment variable as
 true---if it paid attention to GIT_TEST_GIT_DAEMON=NoThanks, we
 wouldn't have to have a workaround like this.

Yes, I didn't really think about build config that works reliably
between both versions (though personally, I think you should be building
with GIT_TEST_GIT_DAEMON=true :) ).

 I wonder if GIT_TEST_X=$(test_tristate $GIT_TEST_X) pattern can be
 made a bit more friendly, though.  For example, can we behave
 differently depending on the reason why $GIT_TEST_X is empty?
 
  - People who have *not* been opting in to the expensive tests have
not done anything special; GIT_TEST_X environment variable did
not exist for them (i.e. unset), and we used to skip when
$GIT_TEST_X is an empty string.
 
  - We want to encourage people who do not care to run these tests.
If people do not do anything, their $GIT_TEST_X will continue to
be an empty string without GIT_TEST_X variable in the
environment.
 
 If we let people who *do* want to opt out of the expensive tests by
 explicitly setting $GIT_TEST_X to an empty string in the new scheme,
 wouldn't the transition go a lot smoother?

Hmm. So you are suggesting that the old code treated undefined and
empty the same (as false). But that in the new code, we would treat
them _differently_, taking undefined to mean auto and empty to mean
false. I suppose that works, but it is rather unfortunate that the end
state we are left with (for all time) makes such a confusing and subtle
distinction.

I think this should work OK with the existing Makefile conventions. That
is, we do not ever set GIT_TEST_HTTPD in the Makefile ourselves, but
rely on it being either unset or set to whatever the user likes (this is
opposed to something like CFLAGS, where the distinction is long gone).

So I'm not excited about it, but I do not think there is any other
loophole through which we can maintain compatibility. If that's
important, I think we have to do it.

 The caller may have to pass the name of the variable:
 
   GIT_TEST_DAEMON=$(test_tristate GIT_TEST_DAEMON)

I don't think that's a big deal. I actually was tempted to just make
this:

  test_normalize_tristate GIT_TEST_DAEMON

in the first place, since you would always want to look at the
normalized value from there on out.

 and then the callee may become
 
   test_tristate () {
   variable=$1
 if eval test x\\${$variable+isset}\ = xisset

Hmm, today I learned about {foo:+bar} versus ${foo+bar}. I'm not
sure how that bit of shell trivia escaped me for so many years.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


What's cooking in git.git (Feb 2014, #04; Wed, 12)

2014-02-12 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.

As a workaround to make life easier for third-party tools, the
upcoming major release will be called Git 1.9.0 (not Git 1.9).
The first maintenance release for it will be Git 1.9.1, and the
major release after Git 1.9.0 will either be Git 2.0.0 or Git
1.10.0.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[New Topics]

* al/docs (2014-02-11) 4 commits
 - docs/git-blame: explain more clearly the example pickaxe use
 - docs/git-clone: clarify use of --no-hardlinks option
 - docs/git-remote: capitalize first word of initial blurb
 - docs/merge-strategies: remove hyphen from mis-merges

 A handful of documentation updates, all trivially harmless.

 Will merge to 'next'.


* jk/test-ports (2014-02-10) 2 commits
 - tests: auto-set git-daemon port
 - tests: auto-set LIB_HTTPD_PORT from test name
 (this branch is tangled with nd/http-fetch-shallow-fix.)

 Avoid having to assign port number to be used in tests manually.

 Will merge to 'next'.


* nd/daemonize-gc (2014-02-10) 2 commits
 - gc: config option for running --auto in background
 - daemon: move daemonize() to libgit.a

 Allow running gc --auto in the background.

 Will merge to 'next'.


* nd/gitignore-trailing-whitespace (2014-02-10) 2 commits
 - dir: ignore trailing spaces in exclude patterns
 - dir: warn about trailing spaces in exclude patterns

 Warn and then ignore trailing whitespaces in .gitignore files,
 unless they are quoted for fnmatch(3), e.g. path\ .


* nd/log-show-linear-break (2014-02-10) 1 commit
 - log: add --show-linear-break to help see non-linear history


* ss/completion-rec-sub-fetch-push (2014-02-11) 1 commit
 - completion: teach --recurse-submodules to fetch, pull and push


* ks/tree-diff-more (2014-02-12) 16 commits
 - tree-diff: reuse base str(buf) memory on sub-tree recursion
 - tree-diff: no need to call full diff_tree_sha1 from show_path()
 - tree-diff: rework diff_tree interface to be sha1 based
 - tree-diff: remove special-case diff-emitting code for empty-tree cases
 - tree-diff: simplify tree_entry_pathcmp
 - tree-diff: show_path prototype is not needed anymore
 - tree-diff: rename compare_tree_entry - tree_entry_pathcmp
 - tree-diff: move all action-taking code out of compare_tree_entry()
 - tree-diff: don't assume compare_tree_entry() returns -1,0,1
 - FIXUP!
 - tree-diff: consolidate code for emitting diffs and recursion in one place
 - tree-diff: show_tree() is not needed
 - tree-diff: no need to pass match to skip_uninteresting()
 - tree-diff: no need to manually verify that there is no mode change for a path
 - combine-diff: move changed-paths scanning logic into its own function
 - combine-diff: move show_log_first logic/action out of paths scanning
 (this branch uses ks/combine-diff and ks/tree-diff-walk.)


* jh/note-trees-record-blobs (2014-02-12) 1 commit
 - notes: Disallow reusing non-blob as a note object


* jk/run-network-tests-by-default (2014-02-12) 1 commit
 - tests: turn on network daemon tests by default

 Teach make test to run networking tests when possible by default.

 Needs a bit more work.
 e.g. $gmane/242013

--
[Stalled]

* po/everyday-doc (2014-01-27) 1 commit
 - Make 'git help everyday' work

 This may make the said command to emit something, but the source is
 not meant to be formatted into a manual pages to begin with, and
 also its contents are a bit stale.  It may be a good first step in
 the right direction, but needs more work to at least get the
 mark-up right before public consumption.

 Will hold.


* jk/branch-at-publish-rebased (2014-01-17) 5 commits
 - t1507 (rev-parse-upstream): fix typo in test title
 - implement @{publish} shorthand
 - branch_get: provide per-branch pushremote pointers
 - branch_get: return early on error
 - sha1_name: refactor upstream_mark

 Give an easier access to the tracking branches from other side in
 a triangular workflow by introducing B@{publish} that works in a
 similar way to how B@{upstream} does.

 Meant to be used as a basis for whatever Ram wants to build on.

 Will hold.


* rb/merge-prepare-commit-msg-hook (2014-01-10) 4 commits
 - merge: drop unused arg from abort_commit method signature
 - merge: make prepare_to_commit responsible for write_merge_state
 - t7505: ensure cleanup after hook blocks merge
 - t7505: add missing 

 Expose more merge states (e.g. $GIT_DIR/MERGE_MODE) to hooks that
 run during git merge.  The log message stresses too much on one
 hook, prepare-commit-msg, but it would equally apply to other hooks
 like post-merge, I think.

 Waiting for a reroll.


* jl/submodule-recursive-checkout (2013-12-26) 5 commits
 - Teach checkout to recursively checkout 

Re: [PATCH] tests: turn on network daemon tests by default

2014-02-12 Thread Jeff King
On Wed, Feb 12, 2014 at 11:06:43AM -0800, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  Agreed, but I think the only way to know the size of those fallouts is
  to try it and see who complains.  I would not normally be so cavalier
  with git itself, but I think for the test infrastructure, we have a
  small, tech-savvy audience that can help us iterate on it without too
  much pain.
 
 There is another.

I somehow pictured this while reading your email:
http://youtu.be/X4JVcvR7IM0

 $ GIT_TEST_HTTPD=false sh t5537-fetch-shallow.sh 
 ok 1 - setup
 ok 2 - setup shallow clone
 ok 3 - clone from shallow clone
 ok 4 - fetch from shallow clone
 ok 5 - fetch --depth from shallow clone
 ok 6 - fetch --unshallow from shallow clone
 ok 7 - fetch something upstream has but hidden by clients shallow boundaries
 ok 8 - fetch that requires changes in .git/shallow is filtered
 ok 9 - fetch --update-shallow
 error: Can't use skip_all after running some tests

Ah, yeah, I did not notice that t5537 does its own special handling of
GIT_TEST_HTTPD. I think it is wrong to do so, and it is already buggy in
the case that GIT_TEST_HTTPD is set, but apache cannot be started.

E.g., with the current master:

  $ sudo chmod -x `which apache2`
  $ GIT_TEST_HTTPD=1 ./t5537-fetch-shallow.sh
  ok 1 - setup
  ok 2 - setup shallow clone
  ok 3 - clone from shallow clone
  ok 4 - fetch from shallow clone
  ok 5 - fetch --depth from shallow clone
  ok 6 - fetch --unshallow from shallow clone
  ok 7 - fetch something upstream has but hidden by clients shallow boundaries
  ok 8 - fetch that requires changes in .git/shallow is filtered
  ok 9 - fetch --update-shallow
  error: Can't use skip_all after running some tests

lib-httpd was never designed to be included from anywhere except the
beginning of the file. But that wouldn't be right for t5537, because it
wants to run some of the tests, even if apache setup fails. The right
way to do it is probably to have lib-httpd do all of its work in a lazy
prereq. I don't know how clunky that will end up, though; it might be
simpler to just move the shallow http test into one of the http-fetch
scripts.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] tests: turn on network daemon tests by default

2014-02-12 Thread Junio C Hamano
Jeff King p...@peff.net writes:

   test_normalize_tristate GIT_TEST_DAEMON

Heh, great minds think alike.  This is what I am playing with,
without committing (because I do like your ask config if this is a
kind of various boolean 'false' representations, which I haven't
managed to add to it).


 t/lib-git-daemon.sh |  2 +-
 t/lib-httpd.sh  |  2 +-
 t/test-lib-functions.sh | 45 +++--
 3 files changed, 29 insertions(+), 20 deletions(-)

diff --git a/t/lib-git-daemon.sh b/t/lib-git-daemon.sh
index 36106de..615bf5d 100644
--- a/t/lib-git-daemon.sh
+++ b/t/lib-git-daemon.sh
@@ -16,7 +16,7 @@
 #  stop_git_daemon
 #  test_done
 
-GIT_TEST_GIT_DAEMON=$(test_tristate $GIT_TEST_GIT_DAEMON)
+test_tristate GIT_TEST_GIT_DAEMON
 if test $GIT_TEST_GIT_DAEMON = false
 then
skip_all=git-daemon testing disabled (unset GIT_TEST_GIT_DAEMON to 
enable)
diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 583fb14..f9c2e22 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -30,7 +30,7 @@
 # Copyright (c) 2008 Clemens Buchacher dri...@aon.at
 #
 
-GIT_TEST_HTTPD=$(test_tristate $GIT_TEST_HTTPD)
+test_tristate GIT_TEST_HTTPD
 if test $GIT_TEST_HTTPD = false
 then
skip_all=Network testing disabled (unset GIT_TEST_HTTPD to enable)
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 3cc09c0..21c5214 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -716,27 +716,36 @@ perl () {
command $PERL_PATH $@
 }
 
-# Normalize the value given in $1 to one of true, false, or auto. The
-# result is written to stdout. E.g.:
+# Given a variable $1, normalize the value of it to one of true,
+# false, or auto and store the result to it.
 #
-# GIT_TEST_HTTPD=$(test_tristate $GIT_TEST_HTTPD)
+# test_tristate GIT_TEST_HTTPD
 #
+# A variable set to an empty string is set to 'false'.
+# A variable set to 'false' or 'auto' keeps its value.
+# Anything else is set to 'true'.
+# An unset variable defaults to 'auto'.
+#
+# The last rule is to allow people to set the variable to an empty
+# string and export it to decline testing the particular feature
+# for versions both before and after this change.  We used to treat
+# both unset and empty variable as a signal for do not test and
+# took any non-empty string as please test.
+
 test_tristate () {
-   case $1 in
-   |auto)
-   echo auto
-   ;;
-   *)
-   # Rely on git-config to handle all the variants of
-   # true/1/on/etc that we allow.
-   if ! git -c magic.hack=$1 config --bool magic.hack 2/dev/null
-   then
-   # If git-config failed, it was some non-bool value like
-   # YesPlease. Default this to true for historical
-   # compatibility.
-   echo true
-   fi
-   esac
+   if eval test x\\${$1+isset}\ = xisset
+   then
+   # explicitly set
+   eval 
+   case \\$$1\ in
+   '') $1=false ;;
+   false | auto) ;;
+   *)  $1=true ;;
+   esac
+   
+   else
+   eval $1=auto
+   fi
 }
 
 # Exit the test suite, either by skipping all remaining tests or by
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Junio C Hamano
On Wed, Feb 12, 2014 at 12:27 PM, Stefan Zager sza...@chromium.org wrote:
 On Wed, Feb 12, 2014 at 12:06 PM, Junio C Hamano gits...@pobox.com wrote:
 Stefan Zager sza...@chromium.org writes:

 Calls to write (and preparation of data to be written) will then
 remain single-threaded, but it sounds like that codepath is not the
 bottleneck in your measurement, so

 Yes, I considered that as well.  At a minimum, that would still
 require attr.c to implement thread locking, since attribute files must
 be parsed to look for stream filters.  I have already done that work.

I would have imagined that use of the attribute system belongs to write and
preparation of data to be written category, i.e. the single threaded
part of the
kludge I outlined.

 But I'm not sure it's the best long-term approach to add convoluted
 custom threading solutions to each git operation as it appears on the
 performance radar.

Yeah, it depends on how clean and non-intrusive an abstraction we can make.
The kludge I outlined is certainly not very pretty.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Mike Hommey
On Wed, Feb 12, 2014 at 08:15:24PM +0100, David Kastrup wrote:
 Stefan Zager sza...@chromium.org writes:
 
  On Wed, Feb 12, 2014 at 10:50 AM, David Kastrup d...@gnu.org wrote:
 
  Really, give the above patch a try.  I am taking longer to finish it
  than anticipated (with a lot due to procrastination but that is,
  unfortunately, a large part of my workflow), and it's cutting into my
  paychecks (voluntary donations which to a good degree depend on timely
  and nontrivial progress reports for my freely available work on GNU
  LilyPond).
 
  I will give that a try.  How much of a performance improvement have
  you clocked?
 
 Depends on file type and size.  With large files with lots of small
 changes, performance improvements get more impressive.
 
 Some ugly real-world examples are the Emacs repository, src/xdisp.c
 (performance improvement about a factor of 3), a large file in the style
 of /usr/share/dict/words clocking in at a factor of about 5.
 
 Again, that's with an SSD and ext4 filesystem on GNU/Linux, and there
 are no improvements in system time (I/O) except for patch 4 of the
 series which helps perhaps 20% or so.
 
 So the benefits of the patch will come into play mostly for big, bad
 files on Windows: other than that, the I/O time is likely to be the
 dominant player anyway.

How much fragmentation does that add to the files, though?

Mike
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] contrib/diff-highlight: multibyte characters diff

2014-02-12 Thread Thomas Adam
On 12 February 2014 20:59, Jeff King p...@peff.net wrote:
 +sub decode {
 +   my $orig = shift;
 +   my $decoded = eval { decode_utf8($orig, Encode::FB_CROAK) };
 +   return defined $decoded ?

I'd still advocate checking $@ here, rather than the defined $decoded check.

 +  ($decoded, sub { encode_utf8(shift) }) :
 +  ($orig, sub { shift });
 +}
 +

-- Thomas Adam
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] contrib/diff-highlight: multibyte characters diff

2014-02-12 Thread Jeff King
On Wed, Feb 12, 2014 at 11:10:49PM +, Thomas Adam wrote:

 On 12 February 2014 20:59, Jeff King p...@peff.net wrote:
  +sub decode {
  +   my $orig = shift;
  +   my $decoded = eval { decode_utf8($orig, Encode::FB_CROAK) };
  +   return defined $decoded ?
 
 I'd still advocate checking $@ here, rather than the defined $decoded check.

I don't mind changing it, but for my edification, what is the advantage?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Mike Hommey
On Wed, Feb 12, 2014 at 12:00:19PM +0100, Karsten Blees wrote:
 Am 12.02.2014 04:43, schrieb Duy Nguyen:
  On Wed, Feb 12, 2014 at 9:02 AM, Robin H. Johnson robb...@gentoo.org 
  wrote:
  On Tue, Feb 11, 2014 at 05:54:51PM -0800,  Stefan Zager wrote:
  We in the chromium project have a keen interest in adding threading to
  git in the pursuit of performance for lengthy operations (checkout,
  status, blame, ...).  Our motivation comes from hitting some
  performance walls when working with repositories the size of chromium
  and blink:
  +1 from Gentoo on performance improvements for large repos.
 
  The main repository in the ongoing Git migration project looks to be in
  the 1.5GB range (and for those that want to propose splitting it up, we
  have explored that option and found it lacking), with very deep history
  (but no branches of note, and very few tags).
  
  From v1.9 shallow clone should work for all push/pull/clone... so
  history depth does not matter (on the client side). As for
  gentoo-x86's large worktree, using index v4 and avoid full-tree
  operations (e.g. status ., not status..) should make all
  operations reasonably fast. I plan to make status fast even without
  path limiting with the help of inotify, but that's not going to be
  finished soon. Did I miss anything else?
  
 
 Regarding git-status on msysgit, enable core.preloadindex and core.fscache 
 (as of 1.8.5.2).
 
 There's no inotify on Windows, and I gave up using ReadDirectoryChangesW to
 keep fscache up to date, as it _may_ report DOS file names (e.g. C:\PROGRA~1
 instead of C:\Program Files).

You can use GetLongPathNameW to get the latter from the former.

Mike
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread Karsten Blees
Am 13.02.2014 00:03, schrieb Mike Hommey:
 On Wed, Feb 12, 2014 at 12:00:19PM +0100, Karsten Blees wrote:
 Am 12.02.2014 04:43, schrieb Duy Nguyen:
 On Wed, Feb 12, 2014 at 9:02 AM, Robin H. Johnson robb...@gentoo.org 
 wrote:
 On Tue, Feb 11, 2014 at 05:54:51PM -0800,  Stefan Zager wrote:
 We in the chromium project have a keen interest in adding threading to
 git in the pursuit of performance for lengthy operations (checkout,
 status, blame, ...).  Our motivation comes from hitting some
 performance walls when working with repositories the size of chromium
 and blink:
 +1 from Gentoo on performance improvements for large repos.

 The main repository in the ongoing Git migration project looks to be in
 the 1.5GB range (and for those that want to propose splitting it up, we
 have explored that option and found it lacking), with very deep history
 (but no branches of note, and very few tags).

 From v1.9 shallow clone should work for all push/pull/clone... so
 history depth does not matter (on the client side). As for
 gentoo-x86's large worktree, using index v4 and avoid full-tree
 operations (e.g. status ., not status..) should make all
 operations reasonably fast. I plan to make status fast even without
 path limiting with the help of inotify, but that's not going to be
 finished soon. Did I miss anything else?


 Regarding git-status on msysgit, enable core.preloadindex and core.fscache 
 (as of 1.8.5.2).

 There's no inotify on Windows, and I gave up using ReadDirectoryChangesW to
 keep fscache up to date, as it _may_ report DOS file names (e.g. C:\PROGRA~1
 instead of C:\Program Files).
 
 You can use GetLongPathNameW to get the latter from the former.
 
 Mike
 

Except if its a delete or rename notification...my final ReadDirectoryChangesW 
version cached the files by their long _and_ short names, but was so complex 
that it slowed most commands down rather than speeding them up :-)
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] contrib/diff-highlight: multibyte characters diff

2014-02-12 Thread brian m. carlson
On Wed, Feb 12, 2014 at 06:27:40PM -0500, Jeff King wrote:
 On Wed, Feb 12, 2014 at 11:10:49PM +, Thomas Adam wrote:
 
  On 12 February 2014 20:59, Jeff King p...@peff.net wrote:
   +sub decode {
   +   my $orig = shift;
   +   my $decoded = eval { decode_utf8($orig, Encode::FB_CROAK) };
   +   return defined $decoded ?
  
  I'd still advocate checking $@ here, rather than the defined $decoded check.
 
 I don't mind changing it, but for my edification, what is the advantage?

The documentation for decode_utf8 isn't clear, but I don't know if it
can ever return undef.  What, for example, does it return if $orig is
not defined?  That's the benefit: it's immediately clear to the user
that you're interested in whether it threw an exception, rather than
whether it produced a given value.

That said, $DAYJOB is a Perl shop, and I would certainly not reject this
code in review, and depending on the situation, I might even write
something like this.  I personally think it's fine.

-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187


signature.asc
Description: Digital signature


Re: [PATCH] tests: turn on network daemon tests by default

2014-02-12 Thread Duy Nguyen
On Thu, Feb 13, 2014 at 5:12 AM, Jeff King p...@peff.net wrote:
 lib-httpd was never designed to be included from anywhere except the
 beginning of the file. But that wouldn't be right for t5537, because it
 wants to run some of the tests, even if apache setup fails. The right
 way to do it is probably to have lib-httpd do all of its work in a lazy
 prereq. I don't know how clunky that will end up, though; it might be
 simpler to just move the shallow http test into one of the http-fetch
 scripts.

I'll move it out later.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] contrib/diff-highlight: multibyte characters diff

2014-02-12 Thread Jeff King
On Thu, Feb 13, 2014 at 01:17:54AM +, brian m. carlson wrote:

 On Wed, Feb 12, 2014 at 06:27:40PM -0500, Jeff King wrote:
  On Wed, Feb 12, 2014 at 11:10:49PM +, Thomas Adam wrote:
  
   On 12 February 2014 20:59, Jeff King p...@peff.net wrote:
+sub decode {
+   my $orig = shift;
+   my $decoded = eval { decode_utf8($orig, Encode::FB_CROAK) };
+   return defined $decoded ?
   
   I'd still advocate checking $@ here, rather than the defined $decoded 
   check.
  
  I don't mind changing it, but for my edification, what is the advantage?
 
 The documentation for decode_utf8 isn't clear, but I don't know if it
 can ever return undef.  What, for example, does it return if $orig is
 not defined?  That's the benefit: it's immediately clear to the user
 that you're interested in whether it threw an exception, rather than
 whether it produced a given value.

I'd argue that I am more interested in whether it returned a value. Let
us imagine for a moment that decode_utf8 could return undef without
throwing an exception. What should the function return in such a case?

I think the only sensible thing is the original (and to indicate that
the result was not converted).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-12 Thread brian m. carlson
On Tue, Feb 11, 2014 at 05:54:51PM -0800, Stefan Zager wrote:
 To this end, I'd like to start submitting patches that make the code
 base generally more thread-safe and thread-friendly.  Right after this
 email, I'm going to send the first such patch, which makes the global
 list of pack files (packed_git) internal to sha1_file.c.

I'm definitely interested in this if it also works on POSIX systems.  At
work, we have a 7.6 GiB repo (packed)[0], so while performance is not
bad, I certainly wouldn't object if it were better.

[0] Using du -sh.  For comparison, the Linux kernel repo is 1.4 GiB.

-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187


signature.asc
Description: Digital signature


Re: feature request: text=input option in .gitattributes

2014-02-12 Thread Cameron Taggart
Thank you Torsten. Could some one help me clarify what this means?

https://www.kernel.org/pub/software/scm/git/docs/gitattributes.html
This ensures that all files that git considers to be text will have
normalized (LF) line endings in the repository. The core.eol
configuration variable controls which line endings git will use for
normalized files in your working directory; the default is to use the
native line ending for your platform, or CRLF if core.autocrlf is
set.

Does this mean?
1) the files in your working directory should only be set to CRLF if
your native line endings are core.eol=crlf and core.autocrlf=true?
2) the files in your working directory should be set to CRLF by
default if your native line endings are core.eol=crlf and not
core.autocrlf=input

I was hoping it would be #1, but it does not appear to be working that
way. The F# Compiler Service that I want to source index has
text=auto. It can be used to test.

git clone https://github.com/fsharp/FSharp.Compiler.Service fcs

git config --global core.eol crlf
git clone https://github.com/fsharp/FSharp.Compiler.Service fcsCoreEolCrlf

Please help clarify how git should be working, so that I can either
log a bug or a feature request.

cheers,
Cameron

On Wed, Feb 12, 2014 at 9:56 AM, Torsten Bögershausen tbo...@web.de wrote:
 On 2014-02-11 15.57, Cameron Taggart wrote:
 After requesting this as
 https://github.com/msysgit/msysgit/issues/164, I was told to take it
 upstream, so here I am.

 I would like a text=input feature added that has the same behavior as
 text=auto, except that it defaults to core.autocrlf=input behavior
 instead of core.autocrlf=true.
 If you want to normailze your repo, and keep it normalized,
 I can  recommend to use .gitattributes.

 Please see the excellent page here:
 https://www.kernel.org/pub/software/scm/git/docs/gitattributes.html
 And especially this part:

 $ echo * text=auto .gitattributes
 $ rm .git/index # Remove the index to force git to
 $ git reset # re-scan the working directory
 $ git status# Show files that will be normalized
 $ git add -u
 $ git add .gitattributes
 $ git commit -m Introduce end-of-line normalization



 /Torsten

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html